May 25, 2023, 11:10 p.m. | /u/arg_max

Machine Learning


I did a deep dive into diffusers for my NeurIPS submission and found something that I consider kind of weird. I don't really have anyone to discuss it with, so I thought I'd just post it here to see if somebody has an idea what's going on and whether this is a well-known phenomenon.

So, conditioning in Stable Diffusion: you have a prompt, something like "an image of a dog". This prompt gets encoded via a CLIP model into …

