March 13, 2024, 4:47 a.m. | Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin

cs.CL updates on arXiv.org

arXiv:2403.07805v1 Announce Type: new
Abstract: Recent developments in Language Models (LMs) have shown their effectiveness in NLP tasks, particularly in knowledge-intensive tasks. However, the mechanisms underlying knowledge storage and memory access within their parameters remain elusive. In this paper, we investigate whether a generative LM (e.g., GPT-2) is able to access its memory sequentially or randomly. Through carefully-designed synthetic tasks, covering the scenarios of full recitation, selective recitation and grounded question answering, we reveal that LMs manage to sequentially access …
