[R] WavJourney: Compositional Audio Creation with Large Language Models - University of Surrey 2023

Aug. 25, 2023, 6:43 p.m. | /u/Singularian2501

Machine Learning www.reddit.com

Paper: [https://arxiv.org/abs/2307.14335](https://arxiv.org/abs/2307.14335)

Github: [https://github.com/Audio-AGI/WavJourney](https://github.com/Audio-AGI/WavJourney)

Project Page: [https://audio-agi.github.io/WavJourney_demopage/](https://audio-agi.github.io/WavJourney_demopage/)

Demo: [https://huggingface.co/spaces/Audio-AGI/WavJourney](https://huggingface.co/spaces/Audio-AGI/WavJourney)

Abstract:

>Large Language Models (LLMs) have shown great promise in integrating diverse expert models to tackle intricate language and vision tasks. Despite their significance in advancing the field of Artificial Intelligence Generated Content (AIGC), their potential in intelligent audio content creation remains unexplored. In this work, we tackle the problem of creating audio content with storylines encompassing speech, music, and sound effects, guided by text instructions. We present WavJourney, a system …
