all AI news
Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data. (arXiv:2203.11476v3 [cs.CL] UPDATED)
cs.CL updates on arXiv.org arxiv.org
Human speakers encode information into raw speech which is then decoded by
the listeners. This complex relationship between encoding (production) and
decoding (perception) is often modeled separately. Here, we test how encoding
and decoding of lexical semantic information can emerge automatically from raw
speech in unsupervised generative deep convolutional networks that combine the
production and perception principles of speech. We introduce, to our knowledge,
the most challenging objective in unsupervised lexical learning: a network that
must learn unique representations for …
arxiv data encoding information modeling semantic speech speech recognition