[D] Speech to Text Word Level Timestamps Accuracy Issue | allainews.com

April 23, 2024, 7:18 p.m. | /u/Mindless-Ordinary485

Machine Learning www.reddit.com

I've had a lot of success with Whisper when it comes to transcription, but its word-level timestamps seem to be slightly inaccurate. From my understanding, this is expected: "Whisper cannot provide reliable word timestamps, because the END-TO-END models like Transformer using cross-entropy training criterion are not designed for reliably estimating word timestamps." ([https://www.youtube.com/watch?v=H576iCWt1Co&t=192s](https://www.youtube.com/watch?v=H576iCWt1Co&t=192s)) For my use case, I need precise word-level timestamps, because I'm inserting audio after specific words. This becomes problematic when I do an insertion and the back part …
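To make the insertion step described above concrete, here is a minimal sketch in plain Python. It is not the poster's code. It assumes the audio is already decoded into a flat list of mono float samples, and that each word is a dict with `word`, `start`, and `end` keys (times in seconds), which is an assumed shape for word-level output. The names `quietest_cut` and `insert_after_word` are hypothetical. Because the reported end time may be slightly off, the sketch moves the cut to the lowest-energy frame within a small window around it. This is a simple heuristic swapped in for illustration, not a fix for the underlying timestamp accuracy.

```python
import string
from typing import Dict, List, Sequence


def quietest_cut(samples: Sequence[float], center: int, radius: int, frame: int) -> int:
    """Return the midpoint of the lowest-energy frame within +/- radius of center."""
    center = min(max(center, 0), len(samples))
    lo = max(0, center - radius)
    hi = min(len(samples), center + radius)
    best_idx, best_energy = center, float("inf")
    for start in range(lo, max(lo, hi - frame) + 1, frame):
        window = samples[start:start + frame]
        if not window:
            continue
        # Mean squared amplitude as a crude loudness measure.
        energy = sum(x * x for x in window) / len(window)
        if energy < best_energy:
            best_idx, best_energy = start + len(window) // 2, energy
    return best_idx


def insert_after_word(
    samples: Sequence[float],
    sample_rate: int,
    words: List[Dict],
    target: str,
    clip: Sequence[float],
    search_ms: int = 100,
    frame_ms: int = 10,
) -> List[float]:
    """Insert clip after the first occurrence of target, cutting at a quiet point."""
    for w in words:
        # Word strings may carry leading spaces or punctuation (assumption).
        text = w["word"].strip().strip(string.punctuation).lower()
        if text == target.lower():
            reported = int(round(w["end"] * sample_rate))
            radius = int(sample_rate * search_ms / 1000)
            frame = max(1, int(sample_rate * frame_ms / 1000))
            cut = quietest_cut(samples, reported, radius, frame)
            return list(samples[:cut]) + list(clip) + list(samples[cut:])
    raise ValueError(f"word {target!r} not found")
```

The search window (`search_ms`) bounds how far the cut can move, so this only helps when the timestamp error is smaller than the window and there is an actual pause between the words.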


