Submitted by AlexCuadron 53 The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks · 16 authors 2
Submitted by turrf 49 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model · 115 authors 3
Submitted by taesiri 37 ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models · 23 authors 5
Submitted by yifanzhang114 29 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment · 20 authors 5
Submitted by Shengkun 17 DarwinLM: Evolutionary Structured Pruning of Large Language Models · 5 authors 7
Submitted by rinong 16 ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation · 4 authors 2
Submitted by lukasz-staniszewski 11 Precise Parameter Localization for Textual Generation in Diffusion Models · 5 authors 2
Submitted by deqing 11 FoNE: Precise Single-Token Number Embeddings via Fourier Features · 5 authors 3
Submitted by DGurgurov 9 Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages · 4 authors 2
Submitted by Asaf-Yehudai 8 Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models · 6 authors 2
Submitted by abenechehab 8 AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting · 6 authors 2
Submitted by xuxw98 6 Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding · 6 authors 2
Submitted by SP4595 5 STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning · 7 authors 2
Submitted by cmhungsteve 4 V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models · 6 authors 2
Submitted by akhaliq 3 CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages · 10 authors 2
Submitted by mjbuehler 3 Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model · 2 authors 2
Submitted by z-hb 3 MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers · 6 authors 2
Submitted by nielsr 2 Cluster and Predict Latent Patches for Improved Masked Image Modeling · 5 authors 2