June 27, 2024, 4:42 a.m. | Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

cs.CL updates on arXiv.org

arXiv:2406.18400v1 Announce Type: new
Abstract: Large Language Models (LLMs) have the capacity to store and recall facts. Through experimentation with open-source models, we observe that this ability to retrieve facts can be easily manipulated by changing the context, even without altering its factual meaning. These findings highlight that LLMs might behave like an associative memory model, where certain tokens in the context serve as clues for retrieving facts. We mathematically explore this property by studying how transformers, the building blocks of …

Subjects: cs.CL, cs.LG, stat.ML
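A minimal sketch of the kind of context-manipulation probe the abstract describes, assuming a Hugging Face causal LM; the model (gpt2) and prompts are illustrative stand-ins, not the paper's actual experimental setup:

```python
# Sketch (not the paper's protocol): check whether a meaning-preserving change
# of context shifts which fact an open-source LM retrieves as the next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: any small open-source causal LM suffices for the demo
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def top_next_tokens(prompt: str, k: int = 5):
    """Return the k most likely next tokens and their probabilities."""
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]      # logits at the final position
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(tok.decode(int(i)).strip(), float(p)) for i, p in zip(top.indices, top.values)]

# Same underlying fact; the second prompt pads the context without changing its meaning.
original = "The Eiffel Tower is located in the city of"
perturbed = "As everyone knows, speaking of famous landmarks, the Eiffel Tower is located in the city of"

print("original :", top_next_tokens(original))
print("perturbed:", top_next_tokens(perturbed))
```

Comparing the two next-token distributions illustrates the associative-memory reading: extra context tokens act as retrieval cues that can shift the recalled fact even when the stated facts are unchanged.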
