MinMo: A Multimodal Large Language Model for Seamless Voice Interaction Paper • 2501.06282 • Published Jan 10, 2025 • 45
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Paper • 2412.09626 • Published Dec 12, 2024 • 20
SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning Paper • 2408.05517 • Published Aug 10, 2024 • 2
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model Paper • 2411.19108 • Published Nov 28, 2024 • 19
StyleBooth: Image Style Editing with Multimodal Instruction Paper • 2404.12154 • Published Apr 18, 2024
FaceChain: A Playground for Human-centric Artificial Intelligence Generated Content Paper • 2308.14256 • Published Aug 28, 2023 • 1
Minimum Tuning to Unlock Long Output from LLMs with High Quality Data as the Key Paper • 2410.10210 • Published Oct 14, 2024 • 6
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14, 2024 • 55
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models Paper • 2410.07133 • Published Oct 9, 2024 • 19
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer Paper • 2410.00086 • Published Sep 30, 2024 • 12
InstructVideo: Instructing Video Diffusion Models with Human Feedback Paper • 2312.12490 • Published Dec 19, 2023 • 18
SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing Paper • 2312.11392 • Published Dec 18, 2023 • 20
DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models Paper • 2312.09767 • Published Dec 15, 2023 • 27
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion Paper • 2312.04433 • Published Dec 7, 2023 • 10
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation Paper • 2312.04483 • Published Dec 7, 2023 • 7
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models Paper • 2311.04145 • Published Nov 7, 2023 • 34
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models Paper • 2309.00986 • Published Sep 2, 2023 • 20