[Project] LLM inference with vLLM and AMD: Achieving LLM inference parity with Nvidia

Oct. 28, 2023, 2:06 a.m. | /u/openssp

Machine Learning www.reddit.com

I wanted to share some news from the GPU world that could change the picture for LLM inference. vLLM has been ported to ROCm 5.6, which brings it to AMD GPUs and marks a real step forward for AMD in LLM inference. The code is on [GitHub](https://github.com/EmbeddedLLM/vllm-rocm).

The result? AMD's MI210 now almost matches Nvidia's A100 in LLM inference performance. This matters because it could make AMD a more viable option for LLM inference tasks, which traditionally have …

