Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2502.06060

Reasoning, Thinking, RL and Test-Time Scaling

about 17 hours ago

Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search

Paper • 2412.18319 • Published Dec 24, 2024 • 37
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46
Efficiently Serving LLM Reasoning Programs with Certaindex

Paper • 2412.20993 • Published Dec 30, 2024 • 36
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 46

o3-mini vs DeepSeek-R1: Which One is Safer?

Paper • 2501.18438 • Published Jan 30 • 22
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Paper • 2502.06394 • Published 28 days ago • 86
Fully Autonomous AI Agents Should Not be Developed

Paper • 2502.02649 • Published Feb 4 • 27
LM2: Large Memory Models

Paper • 2502.06049 • Published 29 days ago • 30

Social Deduction LLM (AAMAS 2025)

Pretrained models for "Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning" (AAMAS 2025 Version)

bidiptas/amongus_base

Updated Jan 20 • 37 • 4
bidiptas/amongus_crewmate

Updated Jan 20 • 32 • 1
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

Paper • 2502.06060 • Published 28 days ago • 34

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 35
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 51
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 66
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 48

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 66
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

Paper • 2502.06060 • Published 28 days ago • 34
MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 18 days ago • 177
SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published 18 days ago • 92

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 27
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 40
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 53
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 51
Training Software Engineering Agents and Verifiers with SWE-Gym

Paper • 2412.21139 • Published Dec 30, 2024 • 22
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 82
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

Paper • 2408.00764 • Published Aug 1, 2024 • 1

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs