TOP-10 Papers Recommended in 2024-01

Papers	Authors	Published in	Date
Safe RLHF: Safe Reinforcement Learning from Human Feedback	Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang	International Conference on Learning Representations	2023-10
SALMON: Self-Alignment with Principle-Following Reward Models	Zhiqing Sun, Yikang Shen, Hongxin Zhang, Qinhong Zhou, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan	International Conference on Learning Representations	2023-10
Understanding the Effects of RLHF on LLM Generalisation and Diversity	Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu	International Conference on Learning Representations	2023-10
Calibrating Sequence likelihood Improves Conditional Language Generation	Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu	International Conference on Learning Representations	2023-10
Statistical Rejection Sampling Improves Preference Optimization	Tianqi Liu, Yao Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu	International Conference on Learning Representations	2023-09
RL with KL penalties is better viewed as Bayesian inference	Tomasz Korbak, Ethan Perez, Christopher L Buckley	Annual Meeting of the Association for Computational Linguistics	2022-05
Residual Energy-Based Models for Text	Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc’Aurelio Ranzato, Arthur Szlam	International Conference on Learning Representations	2020-10
Learning Transformer Programs	Dan Friedman, Alexander Wettig, Danqi Chen	Advances in Neural Information Processing Systems	2023-06
Self-Alignment with Instruction Backtranslation	Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Luke Zettlemoyer, Omer Levy, Jason Weston, Mike Lewis	International Conference on Learning Representations	2023-08
The Consensus Game: Language Model Generation via Equilibrium Search	Athul Paul Jacob, Yikang Shen, Gabriele Farina, Jacob Andreas	International Conference on Learning Representations	2023-10

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Authors: Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang

Published in: International Conference on Learning Representations

Date: 2023-10

SALMON: Self-Alignment with Principle-Following Reward Models

Authors: Zhiqing Sun, Yikang Shen, Hongxin Zhang, Qinhong Zhou, Zhenfang Chen, David Cox, Yiming Yang, Chuang Gan

Published in: International Conference on Learning Representations

Date: 2023-10

Understanding the Effects of RLHF on LLM Generalisation and Diversity

Authors: Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis, Jelena Luketina, Eric Hambro, Edward Grefenstette, Roberta Raileanu

Published in: International Conference on Learning Representations

Date: 2023-10

Calibrating Sequence likelihood Improves Conditional Language Generation

Authors: Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu

Published in: International Conference on Learning Representations

Date: 2023-10

Statistical Rejection Sampling Improves Preference Optimization

Authors: Tianqi Liu, Yao Zhao, Rishabh Joshi, Misha Khalman, Mohammad Saleh, Peter J. Liu, Jialu Liu

Published in: International Conference on Learning Representations

Date: 2023-09

RL with KL penalties is better viewed as Bayesian inference

Authors: Tomasz Korbak, Ethan Perez, Christopher L Buckley

Published in: Annual Meeting of the Association for Computational Linguistics

Date: 2022-05

Residual Energy-Based Models for Text

Authors: Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc’Aurelio Ranzato, Arthur Szlam

Published in: International Conference on Learning Representations

Date: 2020-10

Learning Transformer Programs

Authors: Dan Friedman, Alexander Wettig, Danqi Chen

Published in: Advances in Neural Information Processing Systems

Date: 2023-06

Self-Alignment with Instruction Backtranslation

Authors: Xian Li, Ping Yu, Chunting Zhou, Timo Schick, Luke Zettlemoyer, Omer Levy, Jason Weston, Mike Lewis

Published in: International Conference on Learning Representations

Date: 2023-08

The Consensus Game: Language Model Generation via Equilibrium Search

Authors: Athul Paul Jacob, Yikang Shen, Gabriele Farina, Jacob Andreas

Published in: International Conference on Learning Representations

Date: 2023-10