Nov. 15, 2022, 4:32 a.m. | /u/ButterscotchLost421

Machine Learning www.reddit.com

Hey,

I am currently training a diffusion model on CIFAR.

The network is very similar to the code in the annotated diffusion model blog post ([https://huggingface.co/blog/annotated-diffusion](https://huggingface.co/blog/annotated-diffusion)).

Checking Yang Song's code for CIFAR-10 ([https://github.com/yang-song/score\_sde](https://github.com/yang-song/score_sde)), I see that the DM is trained for a staggering 1,300,000 epochs.

One epoch takes 7 seconds on the machine (NVIDIA A100-SXM4-40GB).

So overall training would take roughly 2,500 hours, i.e. about a hundred days?
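
As a quick sanity check on that estimate, here is a minimal sketch that just multiplies out the two numbers from the post (1,300,000 epochs at 7 seconds each). The function name `total_training_time` is made up for illustration, and the calculation assumes the per-epoch time stays constant for the whole run.

```python
def total_training_time(epochs, seconds_per_epoch):
    """Return (hours, days) for a run, assuming a constant time per epoch."""
    total_seconds = epochs * seconds_per_epoch
    hours = total_seconds / 3600  # 3600 seconds in an hour
    days = hours / 24             # 24 hours in a day
    return hours, days


# Numbers taken from the post: 1,300,000 epochs, 7 seconds per epoch.
hours, days = total_training_time(1_300_000, 7.0)
print(f"{hours:.0f} hours, {days:.0f} days")  # prints "2528 hours, 105 days"
```

So the rough figure of 2,500 hours, or about a hundred days, follows from the stated numbers.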

What am I doing wrong? Was the …

