May 24, 2024, 4:52 a.m. | Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao

cs.CV updates on arXiv.org arxiv.org

arXiv:2310.07697v2 Announce Type: replace
Abstract: Recent works have successfully extended large-scale text-to-image models to the video domain, producing promising results but at a high computational cost and requiring a large amount of video data. In this work, we introduce ConditionVideo, a training-free approach to text-to-video generation based on the provided condition, video, and input text, by leveraging the power of off-the-shelf text-to-image generation methods (e.g., Stable Diffusion). ConditionVideo generates realistic dynamic videos from random noise or given scene videos. Our …

abstract arxiv computational cost cs.cv data domain free image replace results scale text text-to-image text-to-video training type video video data video generation work

AI Focused Biochemistry Postdoctoral Fellow

@ Lawrence Berkeley National Lab | Berkeley, CA

Senior Data Engineer

@ Displate | Warsaw

Associate Director, IT Business Partner, Cell Therapy Analytical Development

@ Bristol Myers Squibb | Warren - NJ

Solutions Architect

@ Lloyds Banking Group | London 125 London Wall

Senior Lead Cloud Engineer

@ S&P Global | IN - HYDERABAD ORION

Software Engineer

@ Applied Materials | Bengaluru,IND