The Progress of Alignment

Yaodong Yang

This lecture, delivered by Professor Yaodong Yang, delves deeply into the historical background and development of AI alignment technology and discusses the critical role of AI alignment in managing AI ethics and safety.

Main Content:

  • Historical Foundation: The origins of intent and value alignment theory in cybernetics.
  • Modern AI Risks: Analysis of extinction risks posed by AI systems and the importance of maintaining human control over these systems.
  • AI Alignment Technologies: Exploring methods from RLHF to the Aligner, focusing on ensuring AI aligns with complex human values through robustness, interpretability, and ethical compliance.

The lecture concludes with a forward-looking perspective on future challenges and solutions in AI alignment, emphasizing the necessity of collaboration and the importance of innovative thinking in safely navigating the future of AI development.

Intro to AI Alignment

Yaodong Yang

A talk on AI alignment given at RL China. AI alignment is a huge field that includes not only mature basic methods but also research directions such as scalable oversight and mechanistic interpretability. The macro goal of AI alignment can be summarized as the RICE principles: Robustness, Interpretability, Controllability, and Ethicality. The talk also identifies Learning from Feedback, Addressing Distributional Shift, and Assurance as the three core subfields of AI alignment today. Together they form a continuously updated and iteratively improved alignment loop.

Value Alignment

Tianyi Qiu

This enlightening talk delves into the issue of value alignment of AI systems, exploring its history, its theoretical frameworks, and its pivotal role in contemporary AI research. The talk begins with a review of the origins of machine ethics and early theoretical studies on how AI systems could align with human values, discussing how these foundational theories have evolved and how they intersect with mainstream AI alignment research. The talk then explores cutting-edge areas of value alignment, analyzing how computational social choice is applied to AI systems to incorporate diverse values and democratic inputs, providing a novel approach to ethical AI development. Finally, the talk addresses the socio-technical evaluations needed to assess value alignment in the real world.