VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
March 21, 2024, 4:42 a.m. | Yumeng Li, William Beluch, Margret Keuper, Dan Zhang, Anna Khoreva
cs.LG updates on arXiv.org
Abstract: Despite tremendous progress in the field of text-to-video (T2V) synthesis, open-source T2V diffusion models struggle to generate longer videos with dynamically varying and evolving content. They tend to synthesize quasi-static videos, ignoring the visual change over time implied by the text prompt. At the same time, scaling these models to enable longer, more dynamic video synthesis often remains computationally intractable. To address this challenge, we introduce the concept of Generative Temporal Nursing (GTN), where we aim …