A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity (arXiv:2401.01967, published Jan 3, 2024)
Secrets of RLHF in Large Language Models Part I: PPO (arXiv:2307.04964, published Jul 11, 2023)
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders (arXiv:2404.05961, published Apr 9, 2024)
Direct Preference Optimization: Your Language Model is Secretly a Reward Model (arXiv:2305.18290, published May 29, 2023)
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (arXiv:2404.02948, published Apr 3, 2024)
BookSum: A Collection of Datasets for Long-form Narrative Summarization (arXiv:2105.08209, published May 18, 2021)