June 7, 2024, 4:44 a.m. | Zhichao Huang, Chutong Meng, Tom Ko

cs.LG updates on arXiv.org

arXiv:2309.00169v2 Announce Type: replace-cross
Abstract: With the recent rapid growth of large language models (LLMs), discrete speech tokenization has played an important role in injecting speech into LLMs. However, this discretization causes a loss of information, which impairs overall performance. To improve the performance of these discrete speech tokens, we present RepCodec, a novel speech representation codec for semantic speech tokenization. In contrast to audio codecs, which reconstruct the raw audio, RepCodec learns a vector quantization codebook through reconstructing …
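As a rough illustration of why discretization loses information, the sketch below shows generic nearest-neighbor vector quantization: each continuous frame is replaced by the index of its closest codebook vector, and the gap between a frame and its codebook vector is the information lost. This is not RepCodec's architecture or training procedure; the frames and codebook are random, untrained stand-ins, and every name (`quantize`, `reconstruct`, `mean_squared_error`) is hypothetical.

```python
import math
import random


def quantize(frames, codebook):
    """Map each frame to the index of its nearest codebook vector (Euclidean)."""
    tokens = []
    for frame in frames:
        distances = [math.dist(frame, code) for code in codebook]
        tokens.append(distances.index(min(distances)))
    return tokens


def reconstruct(tokens, codebook):
    """Replace each discrete token with its codebook vector."""
    return [codebook[t] for t in tokens]


def mean_squared_error(frames, recon):
    """Average squared difference per dimension between frames and reconstructions."""
    total = sum((a - b) ** 2 for f, r in zip(frames, recon) for a, b in zip(f, r))
    return total / (len(frames) * len(frames[0]))


# Assumption: random vectors stand in for speech representations and for a
# codebook; a real system would learn the codebook from data.
rng = random.Random(0)
dim, num_codes, num_frames = 4, 8, 20
codebook = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_codes)]
frames = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_frames)]

tokens = quantize(frames, codebook)      # discrete token ids, one per frame
recon = reconstruct(tokens, codebook)    # what survives after discretization
error = mean_squared_error(frames, recon)
print(tokens)
print(f"reconstruction MSE: {error:.4f}")
```

The reconstruction error is nonzero whenever a frame does not coincide exactly with a codebook vector, which is the loss the abstract refers to; a better-fitted codebook reduces that gap.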


