LLM_RL - a sunlightsgy Collection

sunlightsgy's Collections

LLM_RL

updated 5 days ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published 11 days ago • 75