- SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
  Paper • 2502.14786 • Published • 128
- Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
  Paper • 2502.14846 • Published • 13
- RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers
  Paper • 2502.14377 • Published • 12
Liu
Liudawp
AI & ML interests
None yet
Recent Activity
updated a collection 17 days ago: ai tech
updated a collection 17 days ago: ai tech
liked a model 19 days ago: microsoft/OmniParser-v2.0
Organizations
None yet
Collections
1
models
None public yet
datasets
None public yet