Oct. 10, 2023, 3:06 p.m. | Andy Lo

Towards Data Science - Medium towardsdatascience.com

Matrix Multiplication on GPU

How to achieve state-of-the-art matrix multiplication performance in CUDA.

“A minimalist art taking inspiration from Matrix Multiplication, in the style of vaporwave” by DALL·E 2

This blog post came from a sudden realisation of how little I knew about how matrix multiplication works on the GPU. Having done so many ML projects, I feel I ought to understand how the most important operation in ML works: What is this “Tensor Core” thing? Why does everyone say “ …
