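This paper centers on improvements to the MeanFlow framework. Its core contributions are a more stable training objective and a more flexible guidance mechanism, which allow a one-step generative model to reach an FID of 1.72 on ImageNet 256x256, a 50% improvement over the original MeanFlow, without any distillation. The result significantly narrows the gap between one-step and multi-step generative models.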
A novel FlowViT-Diff framework that integrates a Vision Transformer (ViT) with an enhanced denoising diffusion probabilistic model (DDPM) for super-resolution reconstruction of high-resolution flow ...