GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning

Feb. 21, 2024, 5:46 a.m. | Jiaxi Lv, Yi Huang, Mingfu Yan, Jiancheng Huang, Jianzhuang Liu, Yifan Liu, Yafei Wen, Xiaoxin Chen, Shifeng Chen

cs.CV updates on arXiv.org

arXiv:2311.12631v2 Announce Type: replace
Abstract: Recent advances in text-to-video generation have harnessed the power of diffusion models to create visually compelling content conditioned on text prompts. However, they usually encounter high computational costs and often struggle to produce videos with coherent physical motions. To tackle these issues, we propose GPT4Motion, a training-free framework that leverages the planning capability of large language models such as GPT, the physical simulation strength of Blender, and the excellent image generation ability of text-to-image diffusion …
