Post Training Scaling Law

Boyuan Chen

Introduction: OpenAI o1 in the Post-Training RL Era – OpenAI o1’s role in the new post-training paradigm.
Performance Analysis of OpenAI o1 – Evaluation of OpenAI o1’s performance and its improvements.
Post-Training Scaling Law – Discussion of how performance scales during the post-training stage.
Technical Details – In-depth analysis of:

Self-play RL – Overview of RL using self-play.
Reward Model – Different approaches, including the process reward model, generative reward model, critic model, and self-critiquing.
CoT & MCTS Application – Application of CoT reasoning and MCTS.
STaR & Quiet STaR – Explanation of the standard STaR technique and its variations.
OpenAI o1 Tech Path Forecast – Predictions on the future technological developments for OpenAI o1.

Potential Future Directions – Exploration of:

The capacity limits of large models, synthetic data, and test-time search techniques.
Chain of reasoning’s implications for AI safety.

Analysis of Future Technical Directions – A forward-looking view on the development and impact of these technologies.

Each section addresses critical aspects of OpenAI o1’s development, technical analysis, and future possibilities in post-training and AI safety.


The Progress of Alignment

Yaodong Yang

This lecture, delivered by Professor Yaodong Yang, delves deeply into the historical background and development of AI alignment technology and discusses the critical role of AI alignment in managing AI ethics and safety.

Main Content:

Historical Foundation: The origins of intent and value alignment theory in cybernetics.
Modern AI Risks: Analysis of extinction risks posed by AI systems and the importance of maintaining human control over these systems.
AI Alignment Technologies: Exploring methods from RLHF to Aligner, with a focus on ensuring that AI systems align with complex human values through robustness, interpretability, and ethical compliance.

The lecture concludes with a forward-looking perspective on future challenges and solutions in AI alignment, emphasizing the necessity of collaboration and the importance of innovative thinking in safely navigating the future of AI development.


Intro to AI Alignment

Yaodong Yang

A talk on AI alignment given at RL China. AI alignment is a huge field, including not only mature basic methods but also newer areas such as scalable oversight and mechanistic interpretability. The macro goal of AI alignment can be summarized by the RICE principles: Robustness, Interpretability, Controllability, and Ethicality. The talk also identifies Learning from Feedback, Addressing Distributional Shift, and Assurance as the three core subfields of AI alignment today. Together they form a continuously updated and iteratively improved alignment loop.

Value Alignment

Tianyi Qiu

This enlightening talk delves into the issue of value alignment of AI systems, exploring its history, theoretical frameworks, and its pivotal role in contemporary AI research. The talk begins with a review of the origins of machine ethics and early theoretical studies on how AI systems could align with human values, discussing how these foundational theories have evolved and intersect with common AI alignment research. Further, the talk explores the cutting-edge areas of value alignment, analyzing how computational social choice is applied to AI systems to incorporate diverse values and democratic inputs, providing a novel approach to ethical AI development. Additionally, the talk addresses the necessary socio-technical evaluations for assessing value alignment in the real world.