Dec. 24, 2023, 3:47 p.m. | Yannic Kilcher

#mamba #s4 #ssm

OUTLINE:
0:00 - Introduction
0:45 - Transformers vs RNNs vs S4
6:10 - What are state space models?
12:30 - Selective State Space Models
17:55 - The Mamba architecture
22:20 - The SSM layer and forward propagation
31:15 - Utilizing GPU memory hierarchy
34:05 - Efficient computation via prefix sums / parallel scans
36:01 - Experimental results and comments
38:00 - A brief look at the code
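The outline's chapters on the SSM layer and on efficient computation via parallel scans rest on one fact: the SSM recurrence is linear, so its prefix states can be combined with an associative operator. The sketch below is a hypothetical illustration (not the paper's code): it uses scalar coefficients for clarity, where Mamba's selective SSM layers use diagonal matrices A and input-dependent B, C.

```python
# Minimal sketch of the linear recurrence behind SSM layers,
#     h_t = a_t * h_{t-1} + b_t,
# computed two ways: as a sequential loop and as a scan with an associative
# combine step. Scalars stand in for the paper's (diagonal) matrices.

def sequential_scan(coeffs, inputs):
    """Sequential recurrence: h_t = a_t * h_{t-1} + b_t, with h_{-1} = 0."""
    h, out = 0.0, []
    for a, b in zip(coeffs, inputs):
        h = a * h + b
        out.append(h)
    return out

def combine(left, right):
    """Associative operator on (a, b) pairs: composition of two affine maps
    h -> a*h + b. Associativity is what allows a parallel prefix scan."""
    a1, b1 = left
    a2, b2 = right
    return (a2 * a1, a2 * b1 + b2)

def associative_scan(coeffs, inputs):
    """Inclusive scan with `combine`; the b-component of each prefix is h_t.

    Written as a left fold for clarity; because `combine` is associative,
    the same prefixes can be computed in O(log T) parallel steps on a GPU.
    """
    acc, out = (1.0, 0.0), []
    for pair in zip(coeffs, inputs):
        acc = combine(acc, pair)
        out.append(acc[1])
    return out

a = [0.5, 0.9, 0.1, 0.7]
x = [1.0, 2.0, 3.0, 4.0]
assert sequential_scan(a, x) == associative_scan(a, x)
```

Both paths perform the same multiply-adds in the same order here, so the results match exactly; the point of the associative formulation is that the fold can be regrouped into a parallel prefix sum, which is the trick discussed in the 34:05 chapter.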


Paper: https://arxiv.org/abs/2312.00752

Abstract:
Foundation models, now powering most of the …

