July 25, 2023, 1 p.m. | Anthony Alford

InfoQ - AI, ML & Data Engineering www.infoq.com

Meta recently announced Voicebox, a speech generation model that can perform text-to-speech (TTS) synthesis in six languages, as well as edit and remove noise from speech recordings. Voicebox is trained on over 50k hours of audio data and outperforms previous state-of-the-art models on several TTS benchmarks.

By Anthony Alford

ai anthony art audio benchmarks data deep learning edit languages meta ml & data engineering natural language processing neural networks noise six speech speech generation state synthesis text text-to-speech tts voicebox

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US