April 5, 2024, 4:47 a.m. | Arnab Sen Sharma, David Atkinson, David Bau

cs.CL updates on arXiv.org arxiv.org

arXiv:2404.03646v1 Announce Type: new
Abstract: We investigate the mechanisms of factual recall in the Mamba state space model. Our work is inspired by previous findings in autoregressive transformer language models suggesting that their knowledge recall is localized to particular modules at specific token locations; we therefore ask whether factual recall in Mamba can be similarly localized. To investigate this, we conduct four lines of experiments on Mamba. First, we apply causal tracing or interchange interventions to localize key components inside …

abstract arxiv cs.cl editing knowledge language language models locations mamba modules recall space state state space model token transformer transformer language models type work

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior Software Engineer, Generative AI (C++)

@ SoundHound Inc. | Toronto, Canada