March 18, 2024, 7:15 a.m. | /u/Aggravating-Floor-38

Natural Language Processing | www.reddit.com

I'm trying to deploy a Mistral 7B API endpoint for a RAG application I'm building. There are a few major things I'm confused about. I'm GPU poor :( so I was planning on using AWS SageMaker to deploy the model. The two-month free plan includes 125 hours of m4.xlarge or m5.xlarge instance time per month for inference. Would that be enough to set up an endpoint for quantized Mistral (I'm thinking 5-bit)? And if you don't have a GPU …
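For reference, a CPU-only serving setup for a 5-bit quantized Mistral 7B could look something like the sketch below, using llama-cpp-python with a GGUF build of the model behind a small FastAPI app. The model filename, thread count, and route name are assumptions, and the app would still need to be packaged into a container (for example via SageMaker's bring-your-own-container flow) before it could run as a SageMaker endpoint.

    # Minimal sketch: CPU-only endpoint for a 5-bit quantized Mistral 7B GGUF.
    # Assumptions: the GGUF path, thread count, and route are illustrative only.
    from fastapi import FastAPI
    from pydantic import BaseModel
    from llama_cpp import Llama

    app = FastAPI()

    # Q5_K_M is a common 5-bit quantization; this filename is hypothetical.
    llm = Llama(
        model_path="mistral-7b-instruct.Q5_K_M.gguf",
        n_ctx=4096,   # room for the RAG prompt plus retrieved chunks
        n_threads=4,  # m5.xlarge exposes 4 vCPUs
    )

    class Query(BaseModel):
        prompt: str
        max_tokens: int = 256

    @app.post("/generate")
    def generate(q: Query):
        out = llm(q.prompt, max_tokens=q.max_tokens, temperature=0.2)
        return {"text": out["choices"][0]["text"]}

Run locally with something like `uvicorn main:app --host 0.0.0.0 --port 8080`. A Q5_K_M build of Mistral 7B is roughly 5 GB on disk, so it should fit in the 16 GB of RAM on an m5.xlarge, but CPU-only generation will likely be on the order of a few tokens per second, which may or may not be acceptable for an interactive RAG app.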

