henern's Collections
Training
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Paper
•
2403.03507
•
Published
•
184
RAFT: Adapting Language Model to Domain Specific RAG
Paper
•
2403.10131
•
Published
•
69
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
64
InternLM2 Technical Report
Paper
•
2403.17297
•
Published
•
30
sDPO: Don't Use Your Data All at Once
Paper
•
2403.19270
•
Published
•
41
ReFT: Representation Finetuning for Language Models
Paper
•
2404.03592
•
Published
•
92
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper
•
2402.13753
•
Published
•
115
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
Paper
•
2404.06395
•
Published
•
22
ORPO: Monolithic Preference Optimization without Reference Model
Paper
•
2403.07691
•
Published
•
64
Rho-1: Not All Tokens Are What You Need
Paper
•
2404.07965
•
Published
•
90
Learn Your Reference Model for Real Good Alignment
Paper
•
2404.09656
•
Published
•
83
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper
•
2404.14219
•
Published
•
255
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Paper
•
2404.13208
•
Published
•
39
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Paper
•
2403.08763
•
Published
•
50
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper
•
2405.12130
•
Published
•
47
RLHF Workflow: From Reward Modeling to Online RLHF
Paper
•
2405.07863
•
Published
•
67
SimPO: Simple Preference Optimization with a Reference-Free Reward
Paper
•
2405.14734
•
Published
•
11
DialSim: A Real-Time Simulator for Evaluating Long-Term Dialogue Understanding of Conversational Agents
Paper
•
2406.13144
•
Published
•
11
Paper
•
2407.10671
•
Published
•
161
Improving Text Embeddings with Large Language Models
Paper
•
2401.00368
•
Published
•
80
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper
•
2404.05961
•
Published
•
65
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages
Paper
•
2407.19672
•
Published
•
56
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Paper
•
2407.18248
•
Published
•
32
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Paper
•
2407.19594
•
Published
•
20
Gemma 2: Improving Open Language Models at a Practical Size
Paper
•
2408.00118
•
Published
•
76
The Llama 3 Herd of Models
Paper
•
2407.21783
•
Published
•
110
LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Paper
•
2408.07055
•
Published
•
66
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Paper
•
2408.08152
•
Published
•
53
Controllable Text Generation for Large Language Models: A Survey
Paper
•
2408.12599
•
Published
•
64
Training Language Models to Self-Correct via Reinforcement Learning
Paper
•
2409.12917
•
Published
•
136
A Survey on the Honesty of Large Language Models
Paper
•
2409.18786
•
Published
•
32
Paper
•
2410.05258
•
Published
•
169
Addition is All You Need for Energy-efficient Language Models
Paper
•
2410.00907
•
Published
•
145
Training Large Language Models to Reason in a Continuous Latent Space
Paper
•
2412.06769
•
Published
•
75