Aug. 16, 2022, 5:48 p.m. | Heiko Hotz

Towards Data Science - Medium towardsdatascience.com

Deploy large language models with bnb-Int8 for Hugging Face

Photo by Saffu on Unsplash

What is this about?

In this tutorial we will deploy BigScience’s BLOOM model, one of the most impressive large language models (LLMs), to an Amazon SageMaker endpoint. To do so, we will use the bitsandbytes (bnb) Int8 integration for models from the Hugging Face (HF) Hub. Because Int8 weights take roughly half the memory of 16-bit weights, we can run large models that previously wouldn’t fit into our GPUs.
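
To get a feel for why Int8 matters, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. The helper `weight_memory_gb` and the parameter counts are illustrative assumptions, not part of the tutorial's code, and the estimate ignores activations and the small share of values that an Int8 scheme may keep in higher precision.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Illustrative model sizes (assumed for this sketch, not taken from the tutorial).
model_sizes = {"3B parameters": 3e9, "176B parameters": 176e9}

for name, num_params in model_sizes.items():
    fp16_gb = weight_memory_gb(num_params, bytes_per_param=2)  # 16-bit weights
    int8_gb = weight_memory_gb(num_params, bytes_per_param=1)  # 8-bit weights
    print(f"{name}: fp16 ~ {fp16_gb:.0f} GB, int8 ~ {int8_gb:.0f} GB")
```

Halving the bytes per weight halves this estimate, which is what can bring a model within the memory of a given GPU instance.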

The code …
