March 21, 2024, 4:42 a.m. | Yumeng Li, William Beluch, Margret Keuper, Dan Zhang, Anna Khoreva

cs.LG updates on arXiv.org

arXiv:2403.13501v1 Announce Type: cross
Abstract: Despite tremendous progress in the field of text-to-video (T2V) synthesis, open-source T2V diffusion models struggle to generate longer videos with dynamically varying and evolving content. They tend to synthesize quasi-static videos, ignoring the visual change over time implied by the text prompt. At the same time, scaling these models to enable longer, more dynamic video synthesis often remains computationally intractable. To address this challenge, we introduce the concept of Generative Temporal Nursing (GTN), where we aim …

