March 30, 2024, 8:23 p.m. | /u/Amgadoz

Machine Learning www.reddit.com

Hey everyone!

I recently compared all the open-source Whisper-based packages that support long-form transcription.

Long-form transcription means transcribing audio files that are longer than 30 seconds.

This can be useful if you want to chat with a YouTube video, a podcast, etc.
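To make the 30-second limit concrete, here is a minimal sketch of the naive way to handle a longer file: cut it into consecutive windows of at most 30 seconds and transcribe each one. The function name `split_into_windows` and the 16 kHz default sample rate are assumptions for illustration; the packages compared here each use their own, more involved strategies, so this is not how any of them is actually implemented.

```python
def split_into_windows(num_samples, sample_rate=16_000, window_seconds=30.0):
    """Return (start, end) sample indices for consecutive windows.

    Hypothetical helper: fixed-size, non-overlapping windows.
    The last window may be shorter than the others.
    """
    if num_samples < 0 or sample_rate <= 0 or window_seconds <= 0:
        raise ValueError("num_samples must be >= 0; rate and window must be > 0")

    window = int(sample_rate * window_seconds)  # samples per window
    return [
        (start, min(start + window, num_samples))
        for start in range(0, num_samples, window)
    ]


# A 75-second clip at 16 kHz needs three windows: 30 s, 30 s and 15 s.
windows = split_into_windows(75 * 16_000)
print(len(windows))
```

Cutting at fixed boundaries can split a word in half, which is one reason real long-form pipelines tend to rely on overlap, timestamps, or voice activity detection instead.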

I compared the following packages:

1. OpenAI's official whisper package
2. Hugging Face Transformers
3. Hugging Face BetterTransformer
4. FasterWhisper
5. WhisperX
6. Whisper.cpp

I compared them in the following areas:

1. Accuracy - using word error …
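Assuming the accuracy metric cut off above is word error rate (WER), here is a minimal sketch of how that metric is commonly computed: the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` and the lowercase-and-split normalization are illustrative assumptions, not necessarily what this comparison used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper: normalization is just lowercasing and
    whitespace splitting, which is simpler than most real evaluations.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # ref word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(score, 3))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.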
