Text-to-Audio Generation Synchronized with Videos

March 14, 2024, 4:42 a.m. | Shentong Mo, Jing Shi, Yapeng Tian

cs.LG updates on arXiv.org arxiv.org

arXiv:2403.07938v1 Announce Type: cross
Abstract: In recent times, the focus on text-to-audio (TTA) generation has intensified, as researchers strive to synthesize audio from textual descriptions. However, most existing methods, though leveraging latent diffusion models to learn the correlation between audio and text embeddings, fall short when it comes to maintaining a seamless synchronization between the produced audio and its video. This often results in discernible audio-visual mismatches. To bridge this gap, we introduce a groundbreaking benchmark for Text-to-Audio generation that …


