Aug. 6, 2023, 10:06 p.m. | /u/amindiro

Machine Learning www.reddit.com

Hello,

I have been working on an OpenAI-compatible API for serving LLAMA-2 models written entirely in Rust. It supports offloading computation to Nvidia GPU and Metal acceleration for GGML models !

Here is the project link: [Cria- Local LLAMA2 API](https://github.com/AmineDiro/cria)

You can use it as an OpenAI replacement (check out the included \`Langchain\` example in the project).

This is an ongoing project, I have implemented the \`embeddings\` and \`completions\` routes. The \`chat-completion\` route will be here very soon!

Really interested …

api check computation example gpu langchain llama llama2 machinelearning metal nvidia nvidia gpu openai project replacement rust

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York