The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published 25 days ago • 182
Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models Paper • 2412.18609 • Published Dec 24, 2024 • 17
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks Paper • 2412.18072 • Published Dec 24, 2024 • 18
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution Paper • 2501.02976 • Published Jan 6 • 55
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning Paper • 2501.03226 • Published Jan 6 • 41
Virgo: A Preliminary Exploration on Reproducing o1-like MLLM Paper • 2501.01904 • Published Jan 3 • 32
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings Paper • 2501.01257 • Published Jan 2 • 50
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published Dec 30, 2024 • 40
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization Paper • 2412.18525 • Published Dec 24, 2024 • 75
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published 27 days ago • 47
NoLiMa: Long-Context Evaluation Beyond Literal Matching Paper • 2502.05167 • Published about 1 month ago • 15