March 18, 2024, 7:19 a.m. | /u/Aggravating-Floor-38

Machine Learning www.reddit.com

I'm trying to deploy a Mistral 7B API endpoint for a RAG application I'm building, and there are a few major things I'm confused about. I'm GPU poor :( so I was planning to use AWS SageMaker to deploy the model. The 2-month free plan includes 125 hours per month of m4.xlarge or m5.xlarge instances for inference. Would that be enough to set up an endpoint for a quantized Mistral (I'm thinking 5-bit)? And like if you don't have a GPU …
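A rough back-of-the-envelope check may help frame the question. The sketch below estimates the memory needed for the weights alone at 5 bits per weight and compares it to the 16 GiB of RAM on an m5.xlarge. The parameter count (about 7.24 billion) is an approximation, the helper name `weight_memory_gib` is made up for illustration, and the estimate ignores the KV cache, runtime overhead, and the fact that real 5-bit formats usually store slightly more than 5 bits per weight. It says nothing about latency on a CPU-only instance.

```python
# Rough sizing sketch; all constants are approximations or assumptions.

def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory for the model weights only, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

MISTRAL_7B_PARAMS = 7.24e9    # approximate parameter count (assumption)
BITS_PER_WEIGHT = 5.0         # nominal 5-bit quantization
INSTANCE_RAM_GIB = 16         # m4.xlarge / m5.xlarge: 4 vCPUs, 16 GiB, no GPU
FREE_HOURS_PER_MONTH = 125    # figure quoted in the post

weights_gib = weight_memory_gib(MISTRAL_7B_PARAMS, BITS_PER_WEIGHT)
fits_in_ram = weights_gib < INSTANCE_RAM_GIB
hours_per_day = FREE_HOURS_PER_MONTH / 30  # average uptime if spread over a month

print(f"weights: ~{weights_gib:.1f} GiB, fits in RAM: {fits_in_ram}")
print(f"free hours cover ~{hours_per_day:.1f} h/day of endpoint uptime")
```

Under these assumptions the weights fit in memory with room to spare, so the tighter limits are likely to be CPU inference speed and the roughly four hours per day of uptime that 125 hours allows.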

