Feb. 9, 2024, 5:47 a.m. | Maxime Fily Guillaume Wisniewski Severine Guillaume Gilles Adda Alexis Michaud

cs.CL updates on arXiv.org arxiv.org

In the highly constrained context of low-resource language studies, we explore vector representations of speech from a pretrained model to determine their level of abstraction with regard to the audio signal. We propose a new unsupervised method using ABX tests on audio recordings with carefully curated metadata to shed light on the type of information present in the representations. ABX tests determine whether the representations computed by a multilingual speech model encode a given characteristic. Three experiments are devised: one …

abstraction audio audio recordings context cross-lingual cs.cl cs.sd dimensions eess.as explore language low regard scale signal speech studies tests unsupervised vector

Doctoral Researcher (m/f/div) in Automated Processing of Bioimages

@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena

Research Scholar (Technical Research)

@ Centre for the Governance of AI | Hybrid; Oxford, UK

Lead Software Engineer, Machine Learning

@ Monarch Money | Remote (US)

Investigator, Data Science

@ GSK | Stevenage

Alternance - Assistant.e Chef de Projet Data Business Intelligence (H/F)

@ Pernod Ricard | FR - Paris - The Island

Expert produit Big Data & Data Science - Services Publics - Nantes

@ Sopra Steria | Nantes, France