Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
June 14, 2024, 4:48 a.m. | Changan Chen, Puyuan Peng, Ami Baid, Zihui Xue, Wei-Ning Hsu, David Harwath, Kristen Grauman
cs.CV updates on arXiv.org arxiv.org
Abstract: Generating realistic audio for human interactions is important for many applications, such as creating sound effects for films or virtual reality games. Existing approaches implicitly assume total correspondence between the video and audio during training, yet many sounds happen off-screen and have weak to no correspondence with the visuals -- resulting in uncontrolled ambient sounds or hallucinations at test time. We propose a novel ambient-aware audio generation model, AV-LDM. We devise a novel audio-conditioning mechanism …
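The abstract does not detail AV-LDM's architecture, so the following is only a hypothetical toy sketch of the general idea it describes: a denoiser that is conditioned on both a video embedding and a separate ambient-audio embedding, so that the ambient stream can be attenuated or swapped at test time to control background sound. The function and dimension names are illustrative assumptions, and a single linear map stands in for the real diffusion network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions -- not taken from the paper.
LATENT_DIM, VIDEO_DIM, AMBIENT_DIM = 8, 4, 4

def toy_denoiser(noisy_latent, video_feat, ambient_feat, weights):
    """Stand-in for a conditioned denoising network: the noisy audio
    latent is concatenated with both conditioning signals and passed
    through one linear layer."""
    x = np.concatenate([noisy_latent, video_feat, ambient_feat])
    return weights @ x

weights = rng.standard_normal((LATENT_DIM, LATENT_DIM + VIDEO_DIM + AMBIENT_DIM))
noisy = rng.standard_normal(LATENT_DIM)
video = rng.standard_normal(VIDEO_DIM)

# Because the ambient signal enters through its own conditioning slot,
# it can be zeroed out at inference to suppress background sound,
# or replaced to transplant a different ambience.
with_ambient = toy_denoiser(noisy, video, rng.standard_normal(AMBIENT_DIM), weights)
no_ambient = toy_denoiser(noisy, video, np.zeros(AMBIENT_DIM), weights)
```

Zeroing (rather than removing) the ambient slot keeps the network input shape fixed, which is why separating the two conditioning streams gives test-time control without retraining.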