March 21, 2024, 4:43 a.m. | Tero Karras, Miika Aittala, Jaakko Lehtinen, Janne Hellsten, Timo Aila, Samuli Laine

cs.LG updates on arXiv.org

arXiv:2312.02696v2 Announce Type: replace-cross
Abstract: Diffusion models currently dominate the field of data-driven image synthesis with their unparalleled scaling to large datasets. In this paper, we identify and rectify several causes for uneven and ineffective training in the popular ADM diffusion model architecture, without altering its high-level structure. Observing uncontrolled magnitude changes and imbalances in both the network activations and weights over the course of training, we redesign the network layers to preserve activation, weight, and update magnitudes on expectation. …
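To make the "preserve magnitudes on expectation" idea concrete, here is a minimal sketch of a weight-normalized linear layer in the spirit the abstract describes. The class name MPLinear and the exact normalization are assumptions for illustration, not the paper's implementation; the point is only that re-normalizing weights on every forward pass keeps both weight and activation magnitudes from drifting during training.

```python
import torch
import torch.nn as nn

class MPLinear(nn.Module):
    """Illustrative magnitude-preserving linear layer (hypothetical sketch)."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Initialization scale is irrelevant: the weight is re-normalized on
        # every forward pass, so only its direction is learned.
        self.weight = nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize each output row to unit norm. For inputs with unit
        # per-element variance, E[y_i^2] = ||w_i||^2 = 1, so the expected
        # activation magnitude is preserved through the layer.
        w = self.weight / self.weight.norm(dim=1, keepdim=True).clamp_min(1e-8)
        return x @ w.t()

# Quick sanity check: standard-normal inputs keep roughly unit output scale.
layer = MPLinear(256, 256)
x = torch.randn(4096, 256)
print(layer(x).std())  # close to 1.0
```

Because the effective weight always has unit norm, gradient updates can only rotate it rather than grow it, which is one way to keep update magnitudes balanced across layers as the abstract suggests.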

