CosmicMan: A Text-to-Image Foundation Model for Humans
April 2, 2024, 7:48 p.m. | Shikai Li, Jianglin Fu, Kaiyuan Liu, Wentao Wang, Kwan-Yee Lin, Wayne Wu
cs.CV updates on arXiv.org
Abstract: We present CosmicMan, a text-to-image foundation model specialized for generating high-fidelity human images. Unlike current general-purpose foundation models that are stuck in the dilemma of inferior quality and text-image misalignment for humans, CosmicMan enables generating photo-realistic human images with meticulous appearance, reasonable structure, and precise text-image alignment with detailed dense descriptions. At the heart of CosmicMan's success are the new reflections and perspectives on data and models: (1) We found that data quality and a …