Jan. 31, 2024, 4:46 p.m. | Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi, Alexandre Défossez

cs.LG updates on arXiv.org arxiv.org

We tackle the task of conditional music generation. We introduce MusicGen, a
single Language Model (LM) that operates over several streams of compressed
discrete music representation, i.e., tokens. Unlike prior work, MusicGen is
comprised of a single-stage transformer LM together with efficient token
interleaving patterns, which eliminates the need for cascading several models,
e.g., hierarchically or upsampling. Following this approach, we demonstrate how
MusicGen can generate high-quality samples, both mono and stereo, while being
conditioned on textual description or melodic …

arxiv cs.sd interleaving language language model music musicgen music generation patterns prior representation simple stage together token tokens transformer work

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote