all AI news
Analyzing and Improving the Training Dynamics of Diffusion Models
March 21, 2024, 4:43 a.m. | Tero Karras, Miika Aittala, Jaakko Lehtinen, Janne Hellsten, Timo Aila, Samuli Laine
cs.LG updates on arXiv.org arxiv.org
Abstract: Diffusion models currently dominate the field of data-driven image synthesis with their unparalleled scaling to large datasets. In this paper, we identify and rectify several causes for uneven and ineffective training in the popular ADM diffusion model architecture, without altering its high-level structure. Observing uncontrolled magnitude changes and imbalances in both the network activations and weights over the course of training, we redesign the network layers to preserve activation, weight, and update magnitudes on expectation. …
abstract architecture arxiv cs.ai cs.cv cs.lg cs.ne data data-driven datasets diffusion diffusion model diffusion models dynamics identify image improving large datasets paper popular scaling stat.ml synthesis training type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Consultant - Artificial Intelligence & Data (Google Cloud Data Engineer) - MY / TH
@ Deloitte | Kuala Lumpur, MY