May 15, 2023, 12:46 a.m. | Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin

cs.CL updates on arXiv.org arxiv.org

Diffusion models, which have emerged to become popular text-to-image
generation models, can produce high-quality and content-rich images guided by
textual prompts. However, there are limitations to semantic understanding and
commonsense reasoning in existing models when the input prompts are concise
narrative, resulting in low-quality image generation. To improve the capacities
for narrative prompts, we propose a simple-yet-effective parameter-efficient
fine-tuning approach called the Semantic Understanding and Reasoning adapter
(SUR-adapter) for pre-trained diffusion models. To reach this goal, we first
collect and …

arxiv become diffusion diffusion models image image generation image generation models images language language models large language models low narrative popular prompts quality reasoning semantic text text-to-image understanding

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US