[Research] [Project] Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model
May 4, 2023, 4:44 p.m. | /u/bideex
Machine Learning www.reddit.com
Code: [https://github.com/declare-lab/tango](https://github.com/declare-lab/tango)
Demo: [https://huggingface.co/spaces/declare-lab/tango](https://huggingface.co/spaces/declare-lab/tango)
Project: [https://tango-web.github.io/](https://tango-web.github.io/)
Abstract: The immense scale of recent large language models (LLMs) enables many interesting properties, such as instruction- and chain-of-thought-based fine-tuning, which have significantly improved zero- and few-shot performance in many natural language processing (NLP) tasks. Inspired by such successes, we adopt one such instruction-tuned LLM, FLAN-T5, as the text encoder for text-to-audio (TTA) generation, a task where the goal is to generate audio from its textual description. The prior works …