July 13, 2022, 5:34 p.m. | Dorien Herremans

Towards Data Science - Medium towardsdatascience.com

Dealing with audio can complicate any machine learning task. In this tutorial, we go over how to build a neural network in PyTorch that takes audio files directly as input and converts them into fine-tunable spectrograms. To do this, we use nnAudio [1] and PyTorch.
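To make the idea of a fine-tunable spectrogram concrete, here is a minimal sketch that uses only PyTorch. It is not nnAudio's API and not the tutorial's model: the class names `TrainableSpectrogram` and `KeywordClassifier`, the sizes `n_fft=64` and `hop_length=16`, the four output classes, and the random input are all assumptions chosen for illustration. The sketch stores the windowed Fourier basis as learnable convolution kernels, so the gradient of the classification loss can adjust the spectrogram itself.

```python
import math

import torch
import torch.nn as nn


class TrainableSpectrogram(nn.Module):
    """Magnitude STFT computed with conv1d; the Fourier kernels are learnable."""

    def __init__(self, n_fft=64, hop_length=16):
        super().__init__()
        self.hop_length = hop_length
        n_bins = n_fft // 2 + 1
        t = torch.arange(n_fft, dtype=torch.float32)
        k = torch.arange(n_bins, dtype=torch.float32).unsqueeze(1)
        window = torch.hann_window(n_fft)
        angle = 2 * math.pi * k * t / n_fft  # shape (n_bins, n_fft)
        # Start from the windowed DFT basis; training may then fine-tune it.
        self.cos_kernel = nn.Parameter((torch.cos(angle) * window).unsqueeze(1))
        self.sin_kernel = nn.Parameter((torch.sin(angle) * window).unsqueeze(1))

    def forward(self, wave):
        x = wave.unsqueeze(1)  # (batch, samples) -> (batch, 1, samples)
        real = nn.functional.conv1d(x, self.cos_kernel, stride=self.hop_length)
        imag = nn.functional.conv1d(x, self.sin_kernel, stride=self.hop_length)
        # Small epsilon keeps the square root differentiable at zero energy.
        return torch.sqrt(real ** 2 + imag ** 2 + 1e-8)  # (batch, n_bins, frames)


class KeywordClassifier(nn.Module):
    """Toy classifier: trainable spectrogram, time averaging, linear layer."""

    def __init__(self, n_classes, n_fft=64, hop_length=16):
        super().__init__()
        self.spec = TrainableSpectrogram(n_fft, hop_length)
        self.head = nn.Linear(n_fft // 2 + 1, n_classes)

    def forward(self, wave):
        spec = torch.log1p(self.spec(wave))  # compress the dynamic range
        return self.head(spec.mean(dim=-1))  # average over time, then classify


torch.manual_seed(0)
model = KeywordClassifier(n_classes=4)   # 4 classes is a placeholder value
waves = torch.randn(2, 1600)             # random stand-in for two audio clips
logits = model(waves)
logits.sum().backward()                  # gradients reach the Fourier kernels
```

Because the kernels are `nn.Parameter` objects, any optimizer built from `model.parameters()` updates them together with the classifier weights. Freezing them with `requires_grad_(False)` would give a fixed spectrogram front end instead.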

This tutorial builds a classifier on the Google Speech Commands dataset v2 for the keyword spotting (KWS) task, which is a sound classification problem. Our model will predict the word (text) that matches …

