UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 11 days ago • 28
EGC: Image Generation and Classification via a Diffusion Energy-Based Model Paper • 2304.02012 • Published Apr 4, 2023 • 1
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models Paper • 2404.13013 • Published Apr 19, 2024 • 31