Web: http://arxiv.org/abs/2204.09634

June 20, 2022, 1:11 a.m. | Samuel Lipping, Parthasaarathy Sudarsanam, Konstantinos Drossos, Tuomas Virtanen

cs.LG updates on arXiv.org

Audio question answering (AQA) is a multimodal translation task in which a
system analyzes an audio signal and a natural language question to generate a
desirable natural language answer. In this paper, we introduce Clotho-AQA, a
dataset for audio question answering consisting of 1991 audio files, each
between 15 and 30 seconds in duration, selected from the Clotho dataset. For each
audio file, we collect six different questions and corresponding answers by
crowdsourcing using Amazon Mechanical Turk. The questions and answers …

