Web: http://arxiv.org/abs/2206.07666

June 16, 2022, 1:12 a.m. | Jan Lehečka, Josef V. Psutka, Josef Psutka

cs.CL updates on arXiv.org

Czech is a distinctive language because of the large differences between its
formal and colloquial forms of speech. While the formal (written) form is
used mainly in official documents, literature, and public speeches, the
colloquial (spoken) form is widely used in casual speech. This
gap introduces serious problems for ASR systems, especially when training or
evaluating ASR models on datasets containing a lot of colloquial speech, such
as the MALACH project. In this paper, we are …


