[R] Exponentially Faster Language Modelling
Nov. 22, 2023, 9:28 a.m. | /u/lexected
Machine Learning www.reddit.com
The approach was demonstrated on BERT-base, where it preserved 96% of downstream GLUE performance. For a quick comparison, DistilBERT offers a 1.6x acceleration while preserving 97% of GLUE performance.
This is a [HuggingFace Featured Paper from 11/21/2023](https://huggingface.co/papers/2311.10770).
Paper: [https://arxiv.org/abs/2311.10770](https://arxiv.org/abs/2311.10770)
Code: [https://github.com/pbelcak/UltraFastBERT](https://github.com/pbelcak/UltraFastBERT)
Model: [https://huggingface.co/pbelcak/UltraFastBERT-1x11-long](https://huggingface.co/pbelcak/UltraFastBERT-1x11-long)
Abstract:
>Language models only really need to use an exponential fraction of their neurons for individual inferences.
>
>As proof, we …
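Per the linked paper, the speedup comes from replacing BERT's dense feedforward layers with tree-structured "fast feedforward" networks, which evaluate only the neurons on a single root-to-leaf path per inference. Below is a minimal NumPy sketch of that conditional-execution idea; the function name, heap-style node layout, and ReLU activation are illustrative choices, not the authors' exact formulation:

```python
import numpy as np

def fff_inference(x, w_in, w_out, depth):
    """Hard-routed inference through a heap-ordered binary tree of neurons.

    Children of node i sit at 2*i+1 and 2*i+2. Only the depth+1 neurons
    on one root-to-leaf path are evaluated, so per-token cost grows with
    O(log width) rather than O(width) as in a dense feedforward layer.
    """
    y = np.zeros(w_out.shape[1])
    node = 0
    for _ in range(depth + 1):
        pre = w_in[node] @ x                       # this node's pre-activation
        y += max(pre, 0.0) * w_out[node]           # ReLU contribution to output
        node = 2 * node + (1 if pre <= 0 else 2)   # descend left/right by sign
    return y

# Tiny demo: a depth-3 tree has 15 neurons, but only 4 are ever evaluated.
rng = np.random.default_rng(0)
depth, d_in, d_out = 3, 8, 8
num_nodes = 2 ** (depth + 1) - 1
w_in = rng.standard_normal((num_nodes, d_in))
w_out = rng.standard_normal((num_nodes, d_out))
out = fff_inference(rng.standard_normal(d_in), w_in, w_out, depth)
```

During training the real model routes softly (differentiably) through the tree; the hard sign-based descent shown here corresponds to the fast inference mode the post's speedup claims are about.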