May 24, 2024, 4:50 a.m. | Maciej Kilian, Varun Jampani, Luke Zettlemoyer

cs.CV updates on arXiv.org

arXiv:2405.13218v1 Announce Type: new
Abstract: Nearly every recent image synthesis approach, including diffusion, masked-token prediction, and next-token prediction, uses a Transformer network architecture. Despite this common backbone, there has been no direct, compute controlled comparison of how these approaches affect performance and efficiency. We analyze the scalability of each approach through the lens of compute budget measured in FLOPs. We find that token prediction methods, led by next-token prediction, significantly outperform diffusion on prompt following. On image quality, while next-token …
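The abstract's core methodology is a compute-controlled comparison: each approach is evaluated at matched FLOP budgets rather than matched parameter counts or training steps. A minimal sketch of that framing (not the authors' code), using the common 6·N·D approximation for Transformer training FLOPs, where N is parameter count and D is tokens seen; the budget and model sizes below are hypothetical:

```python
# Minimal sketch (not the paper's code): matching approaches on a fixed
# compute budget measured in FLOPs, using the ~6 * params * tokens
# approximation for Transformer training cost.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate Transformer training FLOPs: ~6 * N * D."""
    return 6.0 * n_params * n_tokens

def tokens_for_budget(budget_flops: float, n_params: float) -> float:
    """Tokens a model of a given size can train on within a FLOP budget."""
    return budget_flops / (6.0 * n_params)

if __name__ == "__main__":
    budget = 1e21  # hypothetical compute budget in FLOPs
    # At a fixed budget, larger models see proportionally fewer tokens;
    # a compute-controlled study sweeps this tradeoff per approach.
    for n_params in (1e8, 1e9, 1e10):  # hypothetical model sizes
        print(f"{n_params:.0e} params -> "
              f"{tokens_for_budget(budget, n_params):.2e} tokens")
```

Under this accounting, two methods are comparable only when their (params, tokens) pairs land on the same FLOP budget, which is the control the abstract says prior comparisons lacked.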

