Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Paper Explained)
Dec. 24, 2023, 3:47 p.m. | Yannic Kilcher
OUTLINE:
0:00 - Introduction
0:45 - Transformers vs RNNs vs S4
6:10 - What are state space models?
12:30 - Selective State Space Models
17:55 - The Mamba architecture
22:20 - The SSM layer and forward propagation
31:15 - Utilizing GPU memory hierarchy
34:05 - Efficient computation via prefix sums / parallel scans
36:01 - Experimental results and comments
38:00 - A brief look at the code
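As background for the outline's segments on state space models (6:10) and selective state spaces (12:30), here is a minimal NumPy sketch of the discretized, input-dependent recurrence the video covers. The function name, shapes, and the zero-order-hold-for-A / Euler-for-B discretization are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal sketch of a selective SSM evaluated as a sequential recurrence.
# Illustrative only; the paper's actual kernel fuses this scan on-GPU.
import numpy as np

def selective_ssm(x, A, B, C, delta):
    """x: (L, D) input sequence; A: (D, N) diagonal state matrix;
    B, C: (L, N) input-dependent matrices; delta: (L, D) step sizes.
    The time-dependence of B, C, delta is the 'selectivity'."""
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                                 # hidden state
    y = np.zeros((L, D))
    for t in range(L):
        A_bar = np.exp(delta[t][:, None] * A)            # ZOH: exp(delta*A)
        B_bar = delta[t][:, None] * B[t][None, :]        # Euler: delta*B
        h = A_bar * h + B_bar * x[t][:, None]            # state update
        y[t] = (h * C[t][None, :]).sum(axis=-1)          # readout
    return y

# Tiny usage example with random, hypothetical inputs:
rng = np.random.default_rng(0)
L_, D_, N_ = 16, 4, 8
y = selective_ssm(rng.standard_normal((L_, D_)),
                  -np.exp(rng.standard_normal((D_, N_))),  # negative-real A
                  rng.standard_normal((L_, N_)),
                  rng.standard_normal((L_, N_)),
                  0.1 * np.exp(0.1 * rng.standard_normal((L_, D_))))
```

Because delta, B, and C vary with the input, the fixed-convolution trick of S4 no longer applies, which is why the paper computes this recurrence with a hardware-aware scan instead.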
Paper: https://arxiv.org/abs/2312.00752
Abstract:
Foundation models, now powering most of the …
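The outline's segment on prefix sums / parallel scans (34:05) rests on the fact that the recurrence h_t = a_t * h_{t-1} + b_t composes affine maps, and affine composition is associative, so the prefixes can be computed in O(log L) parallel steps. A hedged sketch of the combine operator (shown sequentially here; a GPU kernel would apply it in a tree):

```python
# Sketch of the associative combine behind the parallel scan. Names are
# illustrative; this is not the paper's CUDA kernel.
import numpy as np

def combine(left, right):
    """Compose two affine maps h -> a*h + b, left applied first."""
    a1, b1 = left
    a2, b2 = right
    return a2 * a1, a2 * b1 + b2

def scan(a, b):
    """Inclusive prefix scan over the maps (a_t, b_t); returns h_1..h_L
    for h_0 = 0. Associativity of `combine` is what permits a tree."""
    out = np.empty_like(b)
    acc = (a[0], b[0])
    out[0] = acc[1]
    for t in range(1, len(a)):
        acc = combine(acc, (a[t], b[t]))
        out[t] = acc[1]
    return out

# Agreement with the naive sequential recurrence:
rng = np.random.default_rng(0)
a, b = rng.standard_normal(8), rng.standard_normal(8)
h, ref = 0.0, []
for t in range(8):
    h = a[t] * h + b[t]
    ref.append(h)
assert np.allclose(scan(a, b), ref)
```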