Feb. 13, 2024, 5:45 a.m. | Yuhan Chen Ang Lv Ting-En Lin Changyu Chen Yuchuan Wu Fei Huang Yongbin Li Rui Yan

cs.LG updates on arXiv.org

In this paper, we demonstrate that an inherent waveform pattern in the attention allocation of large language models (LLMs) significantly affects their performance on tasks demanding a high degree of context awareness, such as using LLMs for tool use. Specifically, crucial information in the context may be overlooked by the model when it falls in a trough zone of the attention waveform, leading to degraded performance. To address this issue, we propose a novel inference method named Attention Buckets. …
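As a rough illustration of the trough-zone idea described above (not the paper's actual Attention Buckets algorithm), one can take the average attention each context position receives and flag positions whose attention falls well below the peak; the function name, threshold, and toy weights below are all hypothetical:

```python
def find_trough_positions(attn_weights, threshold_ratio=0.5):
    """Flag context positions whose received attention falls below
    threshold_ratio * peak attention, i.e. likely 'trough' positions
    where crucial information risks being overlooked."""
    peak = max(attn_weights)
    return [i for i, a in enumerate(attn_weights) if a < threshold_ratio * peak]

# Toy per-position attention averages exhibiting a waveform pattern:
waveform = [0.9, 0.3, 0.8, 0.2, 0.85]
print(find_trough_positions(waveform))  # positions 1 and 3 sit in troughs
```

Under this sketch, information placed at the flagged positions would be the candidate "trough zone" content; the paper's method instead reshapes the waveform at inference time so no single position is consistently under-attended.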

