March 12, 2024, 4:47 a.m. | Youyuan Zhang, Xuan Ju, James J. Clark

cs.CV updates on arXiv.org

arXiv:2403.06269v1 Announce Type: new
Abstract: Diffusion models have demonstrated remarkable capabilities in text-to-image and text-to-video generation, opening up possibilities for video editing based on textual input. However, the computational cost associated with sequential sampling in diffusion models poses challenges for efficient video editing. Existing approaches relying on image generation models for video editing suffer from time-consuming one-shot fine-tuning, additional condition extraction, or DDIM inversion, making real-time applications impractical. In this work, we propose FastVideoEdit, an efficient zero-shot video editing approach …
