March 5, 2024, 2:49 p.m. | Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner

cs.CV updates on arXiv.org

arXiv:2403.01807v1 Announce Type: new
Abstract: 3D asset generation is getting massive amounts of attention, inspired by the recent success of text-guided 2D content creation. Existing text-to-3D methods use pretrained text-to-image diffusion models in an optimization problem or fine-tune them on synthetic data, which often results in non-photorealistic 3D objects without backgrounds. In this paper, we present a method that leverages pretrained text-to-image models as a prior and learns to generate multi-view images in a single denoising process from real-world data. …

