Oct. 27, 2023, 5:18 p.m. | /u/faschu

Machine Learning www.reddit.com

What are the benefits of using an H100 over an A100 (both at 80 GB and both using FP16) for LLM inference?



Looking at the datasheets for both GPUs, the H100 has twice the max FLOPS, but they have almost the same memory bandwidth (about 2000 GB/s). Since memory bandwidth dominates LLM inference, I wonder what benefits the H100 offers. One benefit could, of course, be the ability to use FP8 (which is extremely useful), but I'm interested in the difference in …
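For intuition, here is a rough roofline-style sketch of batch-1 decode. The TFLOPS and bandwidth figures are approximate datasheet values (A100 80GB SXM and the ~2000 GB/s H100 PCIe the question refers to), and the 13B FP16 model is purely an illustrative assumption:

```python
# Back-of-envelope roofline estimate for single-stream FP16 decode.
# Figures are approximate datasheet values; the model size is an assumption.

GPUS = {
    #                    (peak dense FP16 tensor TFLOP/s, memory bandwidth GB/s)
    "A100 80GB (SXM)":  (312, 2039),
    "H100 80GB (PCIe)": (756, 2000),
}

# Hypothetical model: 13B parameters stored in FP16 (2 bytes per parameter).
params = 13e9
weight_bytes = params * 2

for name, (tflops, bw_gbs) in GPUS.items():
    # Batch-1 decode reads every weight once per token, so the best case is
    # bandwidth-bound: tokens/s <= bandwidth / bytes of weights.
    bw_bound_tok_s = bw_gbs * 1e9 / weight_bytes
    # A dense forward pass needs roughly 2 FLOPs per parameter per token.
    compute_bound_tok_s = tflops * 1e12 / (2 * params)
    print(f"{name}: bandwidth-bound ~{bw_bound_tok_s:.0f} tok/s, "
          f"compute-bound ~{compute_bound_tok_s:.0f} tok/s")
```

Under these assumptions both cards hit nearly the same bandwidth-bound ceiling for single-stream decode, which is why the extra FLOPS mostly pay off at larger batch sizes, longer prompts (prefill), or with FP8.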

