Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 4 days ago • 56
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published 11 days ago • 46
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models Paper • 2501.03124 • Published 7 days ago • 13
Test-time Computing: from System-1 Thinking to System-2 Thinking Paper • 2501.02497 • Published 8 days ago • 34
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling Paper • 2412.15084 • Published 25 days ago • 13
Training Software Engineering Agents and Verifiers with SWE-Gym Paper • 2412.21139 • Published 14 days ago • 20
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published 14 days ago • 34
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper • 2412.19723 • Published 17 days ago • 78
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction Paper • 2501.01957 • Published 10 days ago • 35
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published 6 days ago • 55
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published 28 days ago • 17
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published 10 days ago • 29
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published 5 days ago • 74
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 5 days ago • 196
Deepthink and Reasoning Collection Best for Deepthink and Reasoning • 12 items • Updated 9 days ago • 14
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while only using <10% more VRAM than BnB 4-bit • 8 items • Updated about 10 hours ago • 22