May 15, 2024, 4:45 a.m. | Chengde Lin, Xijun Lu, Guangxi Chen

cs.CV updates on

arXiv:2405.08114v1 Announce Type: new
Abstract: Synthesizing high-quality photorealistic images conditioned on textual descriptions is very challenging. Generative Adversarial Networks (GANs), the classical models for this task, frequently suffer from low consistency between images and their text descriptions and from insufficient richness in the synthesized images. Recently, conditional affine transformations (CAT), such as conditional batch normalization and instance normalization, have been applied to different layers of GANs to control content synthesis in images. CAT is a multi-layer perceptron that independently predicts data …
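The abstract is cut off before it says what the CAT predicts, so the following is only a generic sketch of one common form of conditional affine transformation (conditional instance normalization), not the method proposed in the paper. It assumes that two small MLPs map a text embedding to a per-channel scale and shift, which are then applied to normalized feature channels. All names (`init_mlp`, `conditional_affine`, the layer sizes) are hypothetical and chosen for illustration.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_mlp(in_dim, hidden_dim, out_dim, rng):
    """Random two-layer MLP parameters (hypothetical sizes, illustration only)."""
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]
    b2 = [0.0] * out_dim
    return w1, b1, w2, b2


def mlp(x, w1, b1, w2, b2):
    """Two-layer perceptron with a ReLU between the layers."""
    hidden = [max(0.0, h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


def instance_norm(channel, eps=1e-5):
    """Normalize one channel to zero mean and (approximately) unit variance."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return [(v - mean) / math.sqrt(var + eps) for v in channel]


def conditional_affine(features, text_emb, gamma_mlp, beta_mlp):
    """Scale and shift each normalized channel using text-predicted parameters.

    `features` is a list of channels, each a flat list of activations.
    Assumption: gamma and beta come from two separate MLPs.
    """
    gammas = mlp(text_emb, *gamma_mlp)  # one scale per channel
    betas = mlp(text_emb, *beta_mlp)    # one shift per channel
    return [[g * v + b for v in instance_norm(channel)]
            for channel, g, b in zip(features, gammas, betas)]


if __name__ == "__main__":
    rng = random.Random(0)
    text_emb = [0.5, -1.0, 0.25, 2.0]            # stand-in sentence embedding
    features = [[1.0, 2.0, 3.0, 4.0],            # channel 0
                [0.0, 10.0, -10.0, 5.0]]         # channel 1
    out = conditional_affine(features, text_emb,
                             init_mlp(4, 8, 2, rng), init_mlp(4, 8, 2, rng))
    for channel in out:
        print([round(v, 3) for v in channel])
```

In this sketch the text embedding never touches the image features directly. It only sets each channel's mean (via the shift) and spread (via the scale) after normalization, which is the sense in which such layers "control content synthesis".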


