OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? Paper • 2501.05510 • Published 18 days ago • 39
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 24 days ago • 87
MM-Ego: Towards Building Egocentric Multimodal LLMs Paper • 2410.07177 • Published Oct 9, 2024 • 22 • 3
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images Paper • 2403.11703 • Published Mar 18, 2024 • 17
HelloJiang/LAION-CLIP-ConvNeXt-Large-512 Zero-Shot Image Classification • Updated Jan 5, 2024 • 5 • 1
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model Paper • 2311.13231 • Published Nov 22, 2023 • 27
Woodpecker: Hallucination Correction for Multimodal Large Language Models Paper • 2310.16045 • Published Oct 24, 2023 • 16