Dec. 20, 2023, 4:22 p.m. | /u/sanchitgandhi99

Machine Learning www.reddit.com

Speculative decoding gives 2x faster Whisper inference while ensuring exactly the same outputs, making it the perfect drop-in replacement for existing Whisper pipelines ⚡️

![Figure from the post](https://preview.redd.it/nq1xnd517h7c1.png?width=2604&format=png&auto=webp&s=5e2b9f662f5bbb7fb0183e0f822fe2bb86330e16)

Check out the [blog post](https://huggingface.co/blog/whisper-speculative-decoding) and accompanying Google Colab, or continue reading for details 👇

## How does it work? 🧐

Speculative decoding uses a smaller, faster model to assist the generation of a slower, larger one 🤝 The smaller model auto-regressively drafts candidate tokens, and the larger model then checks them with a single validation forward pass, accepting every draft token that matches its own prediction. Because a token is only emitted when the larger model agrees with it, the final output is guaranteed to be identical to what the larger model would have generated on its own, but with far fewer slow auto-regressive passes through the large model.
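The draft-then-verify loop can be sketched with toy stand-in "models" (plain Python functions rather than real networks; `large_model`, `small_model`, and all names here are illustrative, not the actual Whisper implementation). The point of the sketch is the correctness guarantee: the result matches plain greedy decoding with the large model alone.

```python
def large_model(prefix):
    # Toy stand-in for the slow, accurate model: the next token is a
    # deterministic function of the prefix (greedy decoding).
    return (sum(prefix) * 7 + len(prefix)) % 10

def small_model(prefix):
    # Toy stand-in for the fast draft model: agrees with the large
    # model most of the time, but not always.
    tok = large_model(prefix)
    return (tok + 1) % 10 if len(prefix) % 4 == 3 else tok

def greedy_decode(model, prompt, n_tokens):
    # Ordinary auto-regressive greedy decoding: one model call per token.
    seq = list(prompt)
    for _ in range(n_tokens):
        seq.append(model(seq))
    return seq

def speculative_decode(prompt, n_tokens, n_draft=4):
    seq = list(prompt)
    target_len = len(prompt) + n_tokens
    while len(seq) < target_len:
        # 1. Draft: the small model proposes n_draft tokens auto-regressively.
        draft = greedy_decode(small_model, seq, n_draft)[len(seq):]
        # 2. Verify: the large model checks the draft tokens (in a real
        #    implementation this is a single batched forward pass).
        for tok in draft:
            correct = large_model(seq)
            if tok == correct:
                seq.append(tok)       # draft token accepted
            else:
                seq.append(correct)   # mismatch: keep the large model's token
                break                 # and discard the rest of the draft
            if len(seq) == target_len:
                break
    return seq

prompt = [1, 2, 3]
assert speculative_decode(prompt, 20) == greedy_decode(large_model, prompt, 20)
```

Every emitted token is, by construction, exactly the token the large model would have chosen at that position, which is why the outputs match bit-for-bit; the speed-up in the real system comes from verifying a whole draft in one forward pass instead of generating token by token.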
