REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper β’ 2501.03262 β’ Published 9 days ago β’ 72
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though Paper β’ 2501.04682 β’ Published 4 days ago β’ 73
TACO Models Collection This collection contains the best-performing TACO models based on LLaMA-3/Qwen2 and SigLIP/CLIP. β’ 3 items β’ Updated 23 days ago β’ 8
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis Paper β’ 2412.19723 β’ Published 16 days ago β’ 78
Cosmos World Foundation Model Platform for Physical AI Paper β’ 2501.03575 β’ Published 6 days ago β’ 55
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper β’ 2501.04519 β’ Published 5 days ago β’ 194
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper β’ 2412.13663 β’ Published 26 days ago β’ 121
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling β’ 3 items β’ Updated 24 days ago β’ 123
The Open Source Advantage in Large Language Models (LLMs) Paper β’ 2412.12004 β’ Published 27 days ago β’ 9
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper β’ 2412.09604 β’ Published Dec 12, 2024 β’ 35
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper β’ 2412.10360 β’ Published about 1 month ago β’ 137
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper β’ 2412.08737 β’ Published Dec 11, 2024 β’ 52
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper β’ 2412.09596 β’ Published Dec 12, 2024 β’ 92
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper β’ 2412.08443 β’ Published Dec 11, 2024 β’ 38