Cem Mil Podcasts: A Spoken Portuguese Document Corpus. (arXiv:2209.11871v1 [cs.CL])
Sept. 27, 2022, 1:14 a.m. | Edgar Tanaka, Ann Clifton, Joana Correia, Sharmistha Jat, Rosie Jones, Jussi Karlgren, Winstead Zhu
cs.CL updates on arXiv.org
This document describes the Portuguese language podcast dataset released by
Spotify for academic research purposes. We give an overview of how the data was
sampled, some basic statistics over the collection, as well as brief
information on the distribution of Brazilian and Portuguese dialects.