Aug. 31, 2022, 1:13 a.m. | Seung Hyun Lee, Gyeongrok Oh, Wonmin Byeon, Chanyoung Kim, Won Jeong Ryoo, Sang Ho Yoon, Hyunjun Cho, Jihyun Bae, Jinkyu Kim, Sangpil Kim

cs.CV updates on arXiv.org arxiv.org

The recent success of StyleGAN demonstrates that the pre-trained StyleGAN latent
space is useful for realistic video generation. However, the generated motion
in the video is usually not semantically meaningful due to the difficulty of
determining the direction and magnitude in the StyleGAN latent space. In this
paper, we propose a framework to generate realistic videos by leveraging a
multimodal (sound-image-text) embedding space. As sound provides the temporal
context of the scene, our framework learns to generate a video that is
semantically …

