Oct. 31, 2022, 1:15 a.m. | Ramon Sanabria, Hao Tang, Sharon Goldwater

cs.CL updates on arXiv.org

Given the strong results of self-supervised models on various tasks, there
have been surprisingly few studies exploring self-supervised representations
for acoustic word embeddings (AWE), fixed-dimensional vectors representing
variable-length spoken word segments. In this work, we study several
pre-trained models and pooling methods for constructing AWEs with
self-supervised representations. Owing to the contextualized nature of
self-supervised representations, we hypothesize that simple pooling methods,
such as averaging, might already be useful for constructing AWEs. When
evaluating on a standard word discrimination task, …
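To make the pooling hypothesis concrete, here is a minimal sketch of mean-pooling an AWE from frame-level self-supervised representations and comparing two segments with cosine similarity, in the style of a same/different word discrimination check. The wav2vec 2.0 checkpoint from HuggingFace transformers is an assumption for illustration; the paper studies several pre-trained models and pooling methods, not necessarily this setup.

```python
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

# Assumed checkpoint for illustration only; the paper compares
# multiple self-supervised models, not necessarily this one.
MODEL_NAME = "facebook/wav2vec2-base"

extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_NAME)
model = Wav2Vec2Model.from_pretrained(MODEL_NAME)
model.eval()

def mean_pool_awe(waveform: torch.Tensor, sample_rate: int = 16000) -> torch.Tensor:
    """Map a variable-length spoken word segment to a fixed-dimensional
    vector by averaging its frame-level self-supervised representations."""
    inputs = extractor(waveform.numpy(), sampling_rate=sample_rate,
                       return_tensors="pt")
    with torch.no_grad():
        frames = model(**inputs).last_hidden_state  # shape (1, T, D)
    return frames.mean(dim=1).squeeze(0)            # shape (D,)

# Toy usage: random tensors stand in for two real word segments.
seg_a, seg_b = torch.randn(12000), torch.randn(9000)
sim = torch.nn.functional.cosine_similarity(
    mean_pool_awe(seg_a), mean_pool_awe(seg_b), dim=0)
print(f"cosine similarity: {sim.item():.3f}")
```

In word discrimination evaluations, such pairwise similarities are typically thresholded to decide whether two segments are instances of the same word.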
