MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval • Paper • 2412.14475 • Published 25 days ago • 53
WavePulse: Real-time Content Analytics of Radio Livestreams • Paper • 2412.17998 • Published 20 days ago • 10
Bridging the Data Provenance Gap Across Text, Speech and Video • Paper • 2412.17847 • Published 25 days ago • 8
No More Adam: Learning Rate Scaling at Initialization is All You Need • Paper • 2412.11768 • Published 28 days ago • 41
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining • Paper • 2501.00958 • Published 11 days ago • 92
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics • Paper • 2501.04686 • Published 4 days ago • 45
MLLM-as-a-Judge for Image Safety without Human Labeling • Paper • 2501.00192 • Published 13 days ago • 23