June 6, 2024, 4:49 a.m. | Haoran Cheng, Liang Peng, Linxuan Xia, Yuepeng Hu, Hengjia Li, Qinglin Lu, Xiaofei He, Boxi Wu

cs.CV updates on arXiv.org

arXiv:2406.03215v1 Announce Type: new
Abstract: Significant advancements in video diffusion models have brought substantial progress to the field of text-to-video (T2V) synthesis. However, existing T2V synthesis models struggle to accurately generate complex motion dynamics, which reduces video realism. One possible solution is to collect massive amounts of data and train the model on it, but this would be extremely expensive. To alleviate this problem, we reformulate the typical T2V generation process as a search-based generation pipeline. …


