[D] DreamBooth Stable Diffusion training in 10 GB VRAM, using xformers, 8bit adam, gradient checkpointing and caching latents.
Oct. 2, 2022, 1:37 a.m. | /u/0x00groot
Machine Learning www.reddit.com
Colab: https://colab.research.google.com/github/ShivamShrirao/diffusers/blob/main/examples/dreambooth/DreamBooth_Stable_Diffusion.ipynb
Tested on a Tesla T4 GPU on Google Colab. It is still fairly fast, with no further precision loss compared to the previous 12 GB version. I have also added a table to help choose the best flags for your memory and speed requirements.
|mixed precision|`train_batch_size`|`gradient_accumulation_steps`|`gradient_checkpointing`|`use_8bit_adam`|VRAM usage (GB)|Speed (it/s)|
|:-|:-|:-|:-|:-|:-|:-|
|fp16|1|1|TRUE|TRUE|9.92|0.93|
|no|1|1|TRUE|TRUE|10.08|0.42|
|fp16|2|1|TRUE|TRUE|10.4|0.66|
|fp16|1|1|FALSE|TRUE|11.17|1.14|
|no|1|1|FALSE|TRUE|11.17|0.49|
|fp16|1|2|TRUE|TRUE|11.56|1|
|fp16|2|1|FALSE|TRUE|13.67|0.82|
|fp16|1|2|FALSE|TRUE|13.7|0.83|
|fp16|1|1|TRUE|FALSE|15.79|0.77|
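A small helper to query that table programmatically, e.g. "what is the fastest combination that fits my GPU?" This is purely illustrative; the numbers are just the T4 measurements above:

```python
from dataclasses import dataclass

@dataclass
class Config:
    mixed_precision: str          # "fp16" or "no"
    train_batch_size: int
    gradient_accumulation_steps: int
    gradient_checkpointing: bool
    use_8bit_adam: bool
    vram_gb: float                # measured VRAM usage
    speed_it_per_s: float         # measured training speed

# The benchmark table from the post (Tesla T4, Google Colab).
CONFIGS = [
    Config("fp16", 1, 1, True,  True,   9.92, 0.93),
    Config("no",   1, 1, True,  True,  10.08, 0.42),
    Config("fp16", 2, 1, True,  True,  10.40, 0.66),
    Config("fp16", 1, 1, False, True,  11.17, 1.14),
    Config("no",   1, 1, False, True,  11.17, 0.49),
    Config("fp16", 1, 2, True,  True,  11.56, 1.00),
    Config("fp16", 2, 1, False, True,  13.67, 0.82),
    Config("fp16", 1, 2, False, True,  13.70, 0.83),
    Config("fp16", 1, 1, True,  False, 15.79, 0.77),
]

def fastest_config(vram_budget_gb: float) -> Config:
    """Return the fastest tested flag combination that fits the VRAM budget."""
    candidates = [c for c in CONFIGS if c.vram_gb <= vram_budget_gb]
    if not candidates:
        raise ValueError(f"No tested configuration fits in {vram_budget_gb} GB")
    return max(candidates, key=lambda c: c.speed_it_per_s)
```

For a 10 GB card, only the first row fits (9.92 GB at 0.93 it/s); with 12 GB, disabling gradient checkpointing becomes the fastest option at 1.14 it/s.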
It might also work on a 10 GB RTX 3080 now, but I haven't tested that. Let me know if anybody here can try it.
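For reference, the table's columns correspond to flags of the diffusers DreamBooth training script. A typical low-VRAM launch (the first row of the table) might look roughly like this; the model path, data directories, prompt, and hyperparameter values are placeholders, and exact flag names can differ between diffusers versions:

```shell
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="CompVis/stable-diffusion-v1-4" \
  --instance_data_dir="./instance_images" \
  --output_dir="./dreambooth_out" \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --use_8bit_adam \
  --mixed_precision="fp16" \
  --learning_rate=5e-6 \
  --max_train_steps=800
```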