Seeking advice on optimizing response time and handling multiple requests on an AWS instance with an NVIDIA A10G GPU

April 11, 2024, 6:27 a.m. | Jaydeep Biswas

DEV Community (dev.to)

Hey everyone,

I'm having trouble optimizing the response time of a model hosted on my AWS instance. Here's the setup: a g5.xlarge instance with a single NVIDIA A10G GPU and 24 GB of VRAM. I recently fine-tuned mistralai/Mistral-7B-Instruct-v0.2 on my custom data and merged the result with the base model. I also applied quantization to optimize further.
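As a rough sanity check on this setup, the memory taken by the model weights alone can be estimated from the parameter count and the bits used per parameter. The sketch below is only back-of-the-envelope arithmetic: the round figure of 7 billion parameters, the helper name `weight_memory_gib`, and the list of precisions are assumptions made for illustration, and the estimate ignores the KV cache, activations, and framework overhead.

```python
# Back-of-the-envelope estimate of weight memory at different precisions.
# Assumptions (not from the original post): ~7e9 parameters as a round
# figure for a 7B model, and that only the weights are counted.

def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


NUM_PARAMS = 7.0e9   # assumed round parameter count
VRAM_GIB = 24        # A10G VRAM as stated in the post, treated here as GiB

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        gib = weight_memory_gib(NUM_PARAMS, bits)
        fits = gib < VRAM_GIB
        print(f"{label}: ~{gib:.1f} GiB of weights, fits in {VRAM_GIB} GiB: {fits}")
```

Under these assumptions the weights fit in VRAM at every precision listed, which suggests (but does not prove) that the slow responses come from something other than the model simply not fitting on the GPU.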

However, when I send a request to my fine-tuned model, it's taking approximately 3 minutes to respond, even …

