Benchmarking AI Models in Software Engineering: A Review, Search Tool, and Enhancement Protocol • Paper • 2503.05860 • Published 6 days ago • 6
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning • Paper • 2503.07572 • Published 3 days ago • 25
Implicit Reasoning in Transformers is Reasoning through Shortcuts • Paper • 2503.07604 • Published 3 days ago • 17
Gemini Embedding: Generalizable Embeddings from Gemini • Paper • 2503.07891 • Published 3 days ago • 22
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL • Paper • 2503.07536 • Published 3 days ago • 62
State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models • Paper • 2503.03499 • Published 8 days ago • 5
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models • Paper • 2503.06749 • Published 4 days ago • 20
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation • Paper • 2503.06680 • Published 4 days ago • 17
Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning • Paper • 2503.07002 • Published 3 days ago • 36
An Empirical Study on Eliciting and Improving R1-like Reasoning Models • Paper • 2503.04548 • Published 7 days ago • 8
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation • Paper • 2503.04872 • Published 7 days ago • 14
Learning from Failures in Multi-Attempt Reinforcement Learning • Paper • 2503.04808 • Published 9 days ago • 15
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning • Paper • 2503.05592 • Published 6 days ago • 24
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model • Paper • 2503.05132 • Published 6 days ago • 44
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching • Paper • 2503.05179 • Published 6 days ago • 42
Token-Efficient Long Video Understanding for Multimodal LLMs • Paper • 2503.04130 • Published 7 days ago • 76
Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks • Paper • 2503.04378 • Published 7 days ago • 6
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion • Paper • 2503.04222 • Published 7 days ago • 12
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation • Paper • 2503.02972 • Published 9 days ago • 23