RLHF Slides

PAIR Lab

This series of slides will explore Reinforcement Learning from Human Feedback (RLHF), an essential advancement in AI. Starting with alignment (Lecture 1), we will move to Reinforcement Learning (Lectures 2 and 3) and then examine the critical role of Human Feedback (Lecture 4). Next, we will focus on how RLHF is applied to Large Language Models (LLMs) to enhance their alignment with human values (Lectures 5 and 6). Finally, we will discuss other alignment methods for LLMs related to RLHF (Lectures 7 and 8).

Introduction: The Fundamentals of Alignment, RL, and Human Feedback
Lecture 1: Fundamentals of Alignment
Lecture 2: Fundamentals of Reinforcement Learning
Lecture 3: Policy Optimization in Reinforcement Learning
Lecture 4: Fundamentals of Human Feedback

Core: How RLHF Resolves the Alignment Problem in LLMs
Lecture 5: Learning through Human Feedback
Lecture 6: RLHF in Language Models

Extension: Other Alignment Methods
Lecture 7: Alignment Methods in Language Models I
Lecture 8: Alignment Methods in Language Models II