Aug. 25, 2022, 1:11 a.m. | Georgia Maniati, Alexandra Vioni, Nikolaos Ellinas, Karolos Nikitaras, Konstantinos Klapsas, June Sig Sung, Gunu Jho, Aimilios Chalamandaris, Pirros T

cs.LG updates on arXiv.org arxiv.org

In this work, we present the SOMOS dataset, the first large-scale mean
opinion scores (MOS) dataset consisting solely of neural text-to-speech (TTS)
samples. It can be employed to train automatic MOS prediction systems focused
on the assessment of modern synthesizers, and can stimulate advancements in
acoustic model evaluation. It consists of 20K synthetic utterances of the LJ
Speech voice, a public domain speech dataset which is a common benchmark for
building neural acoustic models and vocoders. Utterances are generated from …

