all AI news
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
April 10, 2024, 4:46 a.m. | Fengyuan Shi, Jiaxi Gu, Hang Xu, Songcen Xu, Wei Zhang, Limin Wang
cs.CV updates on arXiv.org arxiv.org
Abstract: Diffusion models have made tremendous progress in text-driven image and video generation. Now text-to-image foundation models are widely applied to various downstream image synthesis tasks, such as controllable image generation and image editing, while downstream video synthesis tasks are less explored for several reasons. First, it requires huge memory and computation overhead to train a video generation foundation model. Even with video foundation models, additional costly training is still required for downstream video synthesis tasks. …
abstract arxiv cs.ai cs.cv diffusion diffusion models editing foundation framework free general image image generation progress synthesis tasks text text-to-image training type via video video diffusion video generation
More from arxiv.org / cs.CV updates on arXiv.org
Compact 3D Scene Representation via Self-Organizing Gaussian Grids
1 day, 12 hours ago |
arxiv.org
Fingerprint Matching with Localized Deep Representation
1 day, 12 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne