Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
May 24, 2024, 4:50 a.m. | Maciej Kilian, Varun Jampani, Luke Zettlemoyer
cs.CV updates on arXiv.org
Abstract: Nearly every recent image synthesis approach, including diffusion, masked-token prediction, and next-token prediction, uses a Transformer network architecture. Despite this common backbone, there has been no direct, compute-controlled comparison of how these approaches affect performance and efficiency. We analyze the scalability of each approach through the lens of compute budget measured in FLOPs. We find that token prediction methods, led by next-token prediction, significantly outperform diffusion on prompt following. On image quality, while next-token …