April 18, 2022, 4:03 p.m. | Ricky Costa

Towards AI - Medium pub.towardsai.net

Photo by Rishabh Pandoh on Unsplash

How fast can BERT go with sparsity?

Here’s a Little Secret:

If you want to measure how fast 19 sparse BERT models run inference, all you need is a YAML file and 16 GB of RAM. And spoiler alert:

… they run on CPUs.

… and they’re super fast!

The latest feature from Neural Magic’s DeepSparse repo is the DeepSparse Server! And the objective of this article is to show not only how …

