RLHF Slides

PAIR Lab

This series of slides explores Reinforcement Learning from Human Feedback (RLHF), a key technique for aligning AI systems with human intent. Starting with alignment (Lecture 1), we move to reinforcement learning (Lectures 2 and 3) and then to the critical role of human feedback (Lecture 4). Next, we focus on how RLHF is applied to Large Language Models (LLMs) to improve their alignment with human values (Lectures 5 and 6). Finally, we discuss other alignment methods for LLMs that relate to RLHF (Lectures 7 and 8).

graph TD
  subgraph Intro["Introduction: The Fundamentals of Alignment, RL, and Human Feedback"]
    A[Lecture 1: Fundamentals of Alignment] --> B[Lecture 2: Fundamentals of Reinforcement Learning]
    A --> D[Lecture 4: Fundamentals of Human Feedback]
  end
  subgraph Core["Core: How RLHF Resolves the Alignment Problem in LLMs"]
    B --> C[Lecture 3: Policy Optimization in Reinforcement Learning]
    C --> F[Lecture 6: RLHF in Language Models]
    D --> E[Lecture 5: Learning through Human Feedback]
    E --> F
  end
  subgraph Ext["Extension: Other Alignment Methods"]
    F --> G[Lecture 7: Alignment Methods in Language Models I]
    F --> H[Lecture 8: Alignment Methods in Language Models II]
  end
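
To make the pipeline in the diagram concrete before the lectures begin, here is a minimal, self-contained sketch of the two core RLHF stages: fitting a Bradley-Terry reward model from pairwise human preferences, then optimizing a policy against that reward with a KL penalty toward a reference policy. It uses a toy four-response bandit in place of an LLM; every name and hyperparameter (the simulated annotator, `beta`, `lr_pi`, and so on) is illustrative and not taken from the slides.

```python
# Toy RLHF loop on a 4-arm "response" bandit:
#   Stage 1: fit a Bradley-Terry reward model from pairwise human preferences.
#   Stage 2: improve a softmax policy with REINFORCE plus a KL penalty toward
#            the reference ("pre-trained") policy.
# All numbers and names are illustrative, not taken from the lectures.
import numpy as np

rng = np.random.default_rng(0)
n_responses = 4                                  # candidate responses to one fixed prompt
true_quality = np.array([0.0, 1.0, 2.0, 3.0])    # hidden "human" preference scale

# --- Stage 1: collect pairwise preferences and fit a reward model -----------
def sample_preference(i, j):
    """Simulated annotator: prefers i over j with Bradley-Terry probability."""
    p_i = 1.0 / (1.0 + np.exp(-(true_quality[i] - true_quality[j])))
    return (i, j) if rng.random() < p_i else (j, i)

pairs = [sample_preference(*rng.choice(n_responses, size=2, replace=False))
         for _ in range(500)]

reward = np.zeros(n_responses)                   # learned per-response reward r(y)
lr_rm = 0.05
for _ in range(200):                             # gradient ascent on BT log-likelihood
    grad = np.zeros(n_responses)
    for winner, loser in pairs:
        p_win = 1.0 / (1.0 + np.exp(-(reward[winner] - reward[loser])))
        grad[winner] += 1.0 - p_win
        grad[loser] -= 1.0 - p_win
    reward += lr_rm * grad / len(pairs)

# --- Stage 2: optimize the policy against the reward model ------------------
def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

ref_logits = np.zeros(n_responses)               # uniform reference policy
logits = ref_logits.copy()
beta, lr_pi = 0.1, 0.5                           # KL penalty weight, policy learning rate

for _ in range(300):
    pi, ref = softmax(logits), softmax(ref_logits)
    y = rng.choice(n_responses, p=pi)            # sample a response from the current policy
    # KL-shaped reward, as in standard KL-regularized RLHF objectives
    shaped = reward[y] - beta * (np.log(pi[y]) - np.log(ref[y]))
    grad_logp = -pi                              # d log pi(y) / d logits ...
    grad_logp[y] += 1.0                          # ... = one_hot(y) - pi
    logits += lr_pi * shaped * grad_logp         # REINFORCE update

print("learned rewards:", np.round(reward, 2))
print("final policy:   ", np.round(softmax(logits), 2))
```

At LLM scale the same structure holds: the reward model is a fine-tuned language model trained on preference pairs, and the policy step is usually PPO rather than plain REINFORCE, with the KL penalty keeping the tuned model close to the pre-trained reference. Lectures 5 and 6 cover these pieces in detail.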