June 21, 2023, 3 p.m. | Venelin Valkov


How to deploy a fine-tuned LLM (Falcon 7B) with QLoRA to production?

After training Falcon 7B with QLoRA on a custom dataset, the next step is deploying the model to production. In this tutorial, we'll use HuggingFace Inference Endpoints to build and deploy our model behind a REST API.
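Below is a minimal sketch (not the exact code from the video) of the two steps involved: merging the QLoRA adapter back into the base Falcon 7B model and pushing the merged checkpoint to the Hub, then calling the resulting Inference Endpoint over its REST API. The repo names, endpoint URL, token, and generation parameters are placeholders/assumptions.

import requests
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "tiiuae/falcon-7b"                          # assumed base model
ADAPTER_REPO = "your-username/falcon-7b-qlora-adapter"   # hypothetical adapter repo
MERGED_REPO = "your-username/falcon-7b-qlora-merged"     # hypothetical target repo

# Load the base model, merge the LoRA weights into it, and push a plain
# transformers checkpoint that an Inference Endpoint can serve directly.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, trust_remote_code=True
)
model = PeftModel.from_pretrained(base, ADAPTER_REPO)
model = model.merge_and_unload()

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model.push_to_hub(MERGED_REPO)
tokenizer.push_to_hub(MERGED_REPO)

# After creating an Inference Endpoint for MERGED_REPO in the HuggingFace UI,
# the endpoint exposes a REST API that accepts a JSON payload.
ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"  # placeholder
HF_TOKEN = "hf_..."  # your HuggingFace access token

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}", "Content-Type": "application/json"},
    json={
        "inputs": "How do I deploy a fine-tuned LLM?",
        "parameters": {"max_new_tokens": 128, "temperature": 0.7},
    },
)
print(response.json())

Serving a merged checkpoint (rather than loading the adapter at inference time) keeps the endpoint setup simple, at the cost of storing a full copy of the model weights.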

Discord: https://discord.gg/UaNPxVD6tv
Prepare for the Machine Learning interview: https://mlexpert.io
Subscribe: http://bit.ly/venelin-subscribe

Cloud image by macrovector-official

#chatgpt #gpt4 #llms #artificialintelligence #promptengineering #chatbot #transformers #python #pytorch

