May 24, 2024, 4:52 a.m. | Bo Peng, Xinyuan Chen, Yaohui Wang, Chaochao Lu, Yu Qiao

cs.CV updates on arXiv.org

arXiv:2310.07697v2 Announce Type: replace
Abstract: Recent works have successfully extended large-scale text-to-image models to the video domain, producing promising results but at a high computational cost and requiring a large amount of video data. In this work, we introduce ConditionVideo, a training-free approach to text-to-video generation based on the provided condition, video, and input text, by leveraging the power of off-the-shelf text-to-image generation methods (e.g., Stable Diffusion). ConditionVideo generates realistic dynamic videos from random noise or given scene videos. Our …
