TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation

April 30, 2024, 4:47 a.m. | Junhao Cheng, Baiqiao Yin, Kaixin Cai, Minbin Huang, Hanhui Li, Yuxin He, Xi Lu, Yue Li, Yifei Li, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

cs.CV updates on arXiv.org arxiv.org

arXiv:2404.18919v1 Announce Type: new
Abstract: Recent advances in diffusion models make it possible to generate high-quality, stunning images from text. However, multi-turn image generation, which is in high demand in real-world scenarios, still faces challenges in maintaining semantic consistency between images and texts, as well as contextual consistency of the same subject across multiple interactive turns. To address this issue, we introduce TheaterGen, a training-free framework that integrates large language models (LLMs) and text-to-image (T2I) models to provide the capability of multi-turn …
