Abstract
Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning. To resolve these trade-offs, we propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. Unlike distillation, IMM requires neither pre-training initialization nor the optimization of two networks; unlike Consistency Models, IMM guarantees distribution-level convergence and remains stable under various hyperparameters and standard model architectures. IMM surpasses diffusion models on ImageNet-256x256 with an FID of 1.99 using only 8 inference steps and achieves a state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
Community
Inductive Moment Matching (IMM) is a new generative paradigm that can be trained from scratch. IMM does not rely on the core concepts of diffusion models or Flow Matching; it has a single objective, uses a single model, and trains stably. IMM achieves 1.99 FID on ImageNet-256x256 with 8 sampling steps and 1.98 FID on CIFAR-10 with 2 steps.