Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published 3 days ago • 25
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent Paper • 2402.09844 • Published Feb 15, 2024 • 21