Feb. 6, 2024, 6:41 a.m. | /u/Yossarian_1234

*Source:* Machine Learning (www.reddit.com)

*Link:* [https://arxiv.org/abs/2402.03170](https://arxiv.org/abs/2402.03170)

*Authors:* Riccardo Grazzi\*, Julien Siems\*, Simon Schrodi, Thomas Brox, Frank Hutter

\*equal contribution

*Abstract:* This work provides empirical evidence that Mamba, a newly proposed selective structured state space model, has in-context learning (ICL) capabilities similar to those of transformers. We evaluate Mamba on tasks involving simple function approximation as well as more complex natural language processing problems. Our results demonstrate that, across both categories of tasks, Mamba matches the performance of transformer models for ICL. Further analysis reveals that, like …
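For readers unfamiliar with the "simple function approximation" setting mentioned in the abstract, here is a minimal sketch of the standard ICL-on-function-classes setup commonly used for this kind of evaluation. It is illustrative only and assumes a linear function class; the function `make_icl_regression_prompt`, its dimensions, and its parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def make_icl_regression_prompt(n_examples=20, dim=8, rng=None):
    """Sample a random linear function f(x) = w.x and build a context
    of (x_i, f(x_i)) pairs followed by a held-out query point x_q.

    The model under test (e.g. Mamba or a transformer) would be fed the
    interleaved sequence [x_1, y_1, ..., x_n, y_n, x_q] and scored on
    how closely its prediction for x_q matches f(x_q)."""
    rng = rng or np.random.default_rng()
    w = rng.standard_normal(dim)                 # hidden task vector
    xs = rng.standard_normal((n_examples + 1, dim))
    ys = xs @ w                                  # targets for each input
    context_x, query_x = xs[:-1], xs[-1]
    context_y, query_y = ys[:-1], ys[-1]
    return context_x, context_y, query_x, query_y

# A model that "learns in context" predicts query_y from the (x, y)
# pairs alone, with no weight updates.
cx, cy, qx, qy = make_icl_regression_prompt()
print(cx.shape, cy.shape, qx.shape, float(qy))
```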
