March 18, 2024, 7:15 a.m. | /u/Aggravating-Floor-38

Natural Language Processing www.reddit.com

I'm trying to deploy a Mistral 7B API endpoint for a RAG application I'm building. A few major things I'm confused about: I'm GPU poor :( so I was planning on using AWS SageMaker to deploy the model. The 2-month free plan includes 125 hours per month of m4.xlarge or m5.xlarge instances for inference. Would that be enough to set up an endpoint for quantized Mistral (I'm thinking 5-bit)? And like if you don't have a GPU …
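For the "would that be enough" part, a rough back-of-the-envelope check of the weight footprint may help. This is only a sketch under stated assumptions: the parameter count (about 7.24 billion) and the 16 GiB of RAM on an m4.xlarge or m5.xlarge are figures not given in the post, and the helper name `quantized_weight_gib` is made up for illustration.

```python
def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 7.24e9         # assumption: approximate Mistral 7B parameter count
BITS_PER_WEIGHT = 5.0     # the 5-bit quantization mentioned in the post
INSTANCE_RAM_GIB = 16.0   # assumption: memory on an m4.xlarge / m5.xlarge

weights_gib = quantized_weight_gib(N_PARAMS, BITS_PER_WEIGHT)
headroom_gib = INSTANCE_RAM_GIB - weights_gib

print(f"weights:  {weights_gib:.2f} GiB")
print(f"headroom: {headroom_gib:.2f} GiB of {INSTANCE_RAM_GIB:.0f} GiB")
```

Under these assumptions the weights come to a bit over 4 GiB, so they should fit in memory. The estimate covers weights only: real 5-bit formats usually store a little more than 5 bits per weight, and the KV cache and runtime need extra room. These instance types are also CPU-only, so fitting in RAM says nothing about how fast generation will be.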

