I made my own batching/caching API over the weekend. 200+ tk/s with Mistral 5.0bpw exl2 on an RTX 3090. It was for a personal project, and it's not complete, but happy holidays! It will probably just run in your LLM Conda env without installing anyth

Dec. 27, 2023, 9:22 a.m. | /u/Educational_Ice151

Ai Prompt Programming www.reddit.com

