-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 106 -
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper • 2501.10120 • Published • 43 -
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
Paper • 2501.09775 • Published • 29 -
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
Paper • 2501.10132 • Published • 19
Collections
Discover the best community collections!
Collections including paper arxiv:2502.04306
-
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper • 2501.09686 • Published • 36 -
Optimizing Large Language Model Training Using FP4 Quantization
Paper • 2501.17116 • Published • 35 -
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Paper • 2502.02508 • Published • 21 -
On Teacher Hacking in Language Model Distillation
Paper • 2502.02671 • Published • 17
-
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
Paper • 2502.00674 • Published • 12 -
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 51 -
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 188 -
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
Paper • 2502.01142 • Published • 23
-
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 257 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 90 -
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper • 2501.05366 • Published • 95 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 85
-
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 257 -
Transformer^2: Self-adaptive LLMs
Paper • 2501.06252 • Published • 53 -
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot
Paper • 2501.09012 • Published • 10 -
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper • 2501.09747 • Published • 23
-
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper • 2412.14161 • Published • 51 -
Training Software Engineering Agents and Verifiers with SWE-Gym
Paper • 2412.21139 • Published • 22 -
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 82 -
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
Paper • 2408.00764 • Published • 1
-
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 55 -
FlowMind: Automatic Workflow Generation with LLMs
Paper • 2404.13050 • Published • 34 -
How Far Can We Go with Practical Function-Level Program Repair?
Paper • 2404.12833 • Published • 7 -
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Paper • 2404.18796 • Published • 69