[Project] Simple FastAPI service to serve LLAMA-2 7B chat model
Aug. 16, 2023, 12:01 p.m. | /u/JacekPlocharczyk
Machine Learning www.reddit.com
I wrote a simple FastAPI service to serve the Llama-2 7B chat model for our internal use (just to avoid using ChatGPT in our prototypes).
I thought it could also be beneficial for you to use it if needed.
Feel free to play with it [https://github.com/mowa-ai/llm-as-a-service](https://github.com/mowa-ai/llm-as-a-service)
Tested on an Nvidia L4 (24 GB) with a `g2-standard-8` VM on GCP.
Any feedback welcome :)