Dec. 24, 2023, 3:47 p.m. | Yannic Kilcher


#mamba #s4 #ssm

OUTLINE:
0:00 - Introduction
0:45 - Transformers vs RNNs vs S4
6:10 - What are state space models?
12:30 - Selective State Space Models
17:55 - The Mamba architecture
22:20 - The SSM layer and forward propagation (a toy forward pass is sketched in code after this outline)
31:15 - Utilizing GPU memory hierarchy
34:05 - Efficient computation via prefix sums / parallel scans (a toy scan is sketched after this outline)
36:01 - Experimental results and comments
38:00 - A brief look at the code
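To make the outline's SSM segments concrete, here is a minimal NumPy sketch of one selective-SSM forward pass. It is not the paper's implementation: the softplus step size, zero-order-hold discretization of A, and input-dependent B and C follow the paper, but the projection matrices W_delta, W_B, W_C, all shapes, the simplified Euler-style discretization of B, and the omission of the paper's skip connection are illustrative assumptions.

```python
import numpy as np

def selective_ssm(x, A, W_delta, W_B, W_C):
    """x: (L, D) input sequence. A: (D, N) diagonal state matrix per channel.
    W_delta, W_B, W_C are hypothetical projections producing the
    input-dependent parameters that make the SSM 'selective'."""
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                           # one length-N state per channel
    y = np.zeros((L, D))
    for t in range(L):
        delta = np.log1p(np.exp(x[t] @ W_delta))   # softplus step size, (D,)
        B, C = x[t] @ W_B, x[t] @ W_C              # input-dependent, (N,) each
        A_bar = np.exp(delta[:, None] * A)         # ZOH discretization, (D, N)
        B_bar = delta[:, None] * B[None, :]        # simplified Euler-style input map
        h = A_bar * h + B_bar * x[t][:, None]      # recurrence h_t = A_bar h_{t-1} + B_bar x_t
        y[t] = h @ C                               # readout y_t = C h_t
    return y

# toy usage; negative-real A keeps exp(delta * A) in (0, 1) for stability
rng = np.random.default_rng(0)
L, D, N = 8, 4, 16
x = rng.standard_normal((L, D))
A = -np.exp(rng.standard_normal((D, N)))
y = selective_ssm(x, A, rng.standard_normal((D, D)),
                  rng.standard_normal((D, N)), rng.standard_normal((D, N)))
print(y.shape)  # (8, 4)
```

And for the parallel-scan segment: a recurrence of the form h_t = a_t * h_{t-1} + b_t composes associatively, so all h_t can be computed with a prefix scan in O(log L) depth instead of a length-L sequential loop. Below is a toy Hillis-Steele-style scan in NumPy; the real Mamba kernel fuses the scan with the GPU memory-hierarchy management covered at 31:15, which this sketch ignores.

```python
import numpy as np

def combine(e1, e2):
    """Compose two affine maps h -> a*h + b, applying e1 first, then e2."""
    a1, b1 = e1
    a2, b2 = e2
    return a2 * a1, a2 * b1 + b2

def parallel_scan(a, b):
    """Inclusive Hillis-Steele scan over the elements (a_t, b_t)."""
    a, b = a.copy(), b.copy()
    L, shift = len(a), 1
    while shift < L:
        # each position combines with the element `shift` steps earlier;
        # all positions at one level could run in parallel on a GPU
        new_a, new_b = combine((a[:-shift], b[:-shift]), (a[shift:], b[shift:]))
        a[shift:], b[shift:] = new_a, new_b
        shift *= 2
    return b  # b_t now equals h_t, taking h_{-1} = 0

# sanity check against the sequential recurrence h_t = a_t * h_{t-1} + x_t
rng = np.random.default_rng(0)
a, x = rng.uniform(0.5, 1.0, 16), rng.standard_normal(16)
h, ref = 0.0, []
for t in range(16):
    h = a[t] * h + x[t]
    ref.append(h)
print(np.allclose(parallel_scan(a, x), ref))  # True
```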
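The two sketches correspond to the 22:20 and 34:05 segments respectively; treat both as pedagogical toys under the stated assumptions, not as the paper's CUDA implementation.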


Paper (Mamba: Linear-Time Sequence Modeling with Selective State Spaces): https://arxiv.org/abs/2312.00752

Abstract:
Foundation models, now powering most of the …

