GPT-Fast - blazingly fast inference with PyTorch (w/ Horace He)

March 7, 2024, 3:23 p.m. | Aleksa Gordić - The AI Epiphany

Aleksa Gordić - The AI Epiphany www.youtube.com


Horace He joined us today to talk about how to make inference fast using only native PyTorch operations!

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
https://pytorch.org/blog/accelerating-generative-ai-2/
https://github.com/pytorch-labs/gpt-fast
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

⌚️ Timetable:
00:00 - 00:45 Intro
00:45 - 02:23 HyperStack GPUs! (sponsored)
02:23 - 08:40 What is GPT-Fast?
08:40 - 28:15 PyTorch compile
28:15 - 32:15 int8 quantization
32:15 - 40:12 Speculative Decoding
40:12 - 42:05 int4 quantization
42:05 - 45:25 Putting it all together, tensor …


