Nov. 22, 2023, 9:28 a.m. | /u/lexected

r/MachineLearning | www.reddit.com

**TL;DR:** Organize your neurons into a tree to get 78x faster inference (theoretical limit is 341x).

This was demonstrated on BERT-base, where this change preserved 96% of its downstream GLUE performance. For a quick comparison, DistilBERT offers 1.6x acceleration while preserving 97% of GLUE performance.
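For intuition, here is a minimal NumPy sketch of the conditional tree descent behind fast feedforward networks (FFFs), the mechanism the paper swaps in for BERT's dense feedforward layers. All names here (`fff_forward`, `w_in`, `b_in`, `w_out`) are made up for illustration, and this simplification uses ReLU where the paper uses GeLU and omits the soft routing used during training; the linked repo has the real, optimized implementation.

```python
import numpy as np

def fff_forward(x, w_in, b_in, w_out, depth):
    """Single-token inference through a fast-feedforward-style layer.

    The 2**depth - 1 neurons live in an implicit heap-ordered binary tree;
    only the `depth` neurons on one root-to-leaf path are ever evaluated,
    instead of all of them as in a dense feedforward layer.
    """
    y = np.zeros(w_out.shape[1])
    node = 0  # start at the root of the tree
    for _ in range(depth):
        c = x @ w_in[node] + b_in[node]   # this node's pre-activation
        y += max(c, 0.0) * w_out[node]    # ReLU'd contribution to the output
        # the sign of the pre-activation picks which child to descend into
        node = 2 * node + (2 if c > 0 else 1)
    return y

# Toy usage: depth 12 gives 2**12 - 1 = 4095 neurons but touches only 12
# per inference, hence the quoted theoretical limit of 4095 / 12 ≈ 341x.
rng = np.random.default_rng(0)
d_model, depth = 768, 12
n_nodes = 2**depth - 1
y = fff_forward(rng.standard_normal(d_model),
                rng.standard_normal((n_nodes, d_model)),
                rng.standard_normal(n_nodes),
                rng.standard_normal((n_nodes, d_model)),
                depth)
```

The gap between the theoretical 341x and the measured 78x comes down to how well conditional, branchy memory access can be implemented on current hardware and BLAS libraries, which is a point the paper itself discusses.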

This is a [HuggingFace Featured Paper from 11/21/2023](https://huggingface.co/papers/2311.10770).

Paper: [https://arxiv.org/abs/2311.10770](https://arxiv.org/abs/2311.10770)

Code: [https://github.com/pbelcak/UltraFastBERT](https://github.com/pbelcak/UltraFastBERT)

Model: [https://huggingface.co/pbelcak/UltraFastBERT-1x11-long](https://huggingface.co/pbelcak/UltraFastBERT-1x11-long)

Abstract:

>Language models only really need to use an exponential fraction of their neurons for individual inferences.
>
>As proof, we …
