CompCap: Improving Multimodal Large Language Models with Composite Captions Paper • 2412.05243 • Published Dec 6, 2024 • 18
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment Paper • 2412.04814 • Published Dec 6, 2024 • 45
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Paper • 2412.05237 • Published Dec 6, 2024 • 47
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models Paper • 2412.05939 • Published Dec 8, 2024 • 14
Chimera: Improving Generalist Model with Domain-Specific Experts Paper • 2412.05983 • Published Dec 8, 2024 • 9
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance Paper • 2412.06673 • Published Dec 9, 2024 • 11
Perception Tokens Enhance Visual Reasoning in Multimodal Language Models Paper • 2412.03548 • Published Dec 4, 2024 • 17
Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation Paper • 2412.07334 • Published Dec 10, 2024 • 16
SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts Paper • 2412.05552 • Published Dec 7, 2024 • 4
OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation Paper • 2412.09585 • Published Dec 12, 2024 • 10
Multimodal Latent Language Modeling with Next-Token Diffusion Paper • 2412.08635 • Published Dec 11, 2024 • 43
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 52
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 92
VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding Paper • 2412.02186 • Published Dec 3, 2024 • 22
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 39
Large Concept Models: Language Modeling in a Sentence Representation Space Paper • 2412.08821 • Published Dec 11, 2024 • 13
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 60