LongHeads: Multi-Head Attention is Secretly a Long Context Processor
Feb. 19, 2024, 5:47 a.m. | Yi Lu, Xin Zhou, Wei He, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
cs.CL updates on arXiv.org
Abstract: Large language models (LLMs) have achieved impressive performance in numerous domains but often struggle to process lengthy inputs effectively and efficiently, due to limited length generalization and attention's quadratic computational demands. Many have sought to mitigate this by restricting the attention window to within the pre-trained length. However, these methods introduce new issues such as ignoring the middle context and requiring additional training. To address these problems, we propose LongHeads, a training-free framework that enhances LLMs' long …
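The truncated abstract names the core ingredients: attention over the full context costs quadratically in its length, and a training-free fix is to have each head attend only within windows no longer than the pre-trained length. Below is a minimal illustrative sketch of one such head, written in NumPy. The chunk-selection rule (scoring chunks by their mean key) and all names here are assumptions for illustration, not the paper's exact LongHeads algorithm.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def chunked_head_attention(q, K, V, chunk_len=4, top_k=2):
    """One attention head over a long context, restricted to a few chunks.

    q: (d,) query vector; K, V: (n, d) key/value caches.
    Instead of attending to all n positions (quadratic in context length),
    the head scores fixed-size chunks and attends only within the top_k
    highest-scoring ones, so its effective length stays in-distribution.
    """
    n, d = K.shape
    n_chunks = n // chunk_len
    Kc = K[: n_chunks * chunk_len].reshape(n_chunks, chunk_len, d)
    Vc = V[: n_chunks * chunk_len].reshape(n_chunks, chunk_len, d)

    # Score each chunk by its mean key (hypothetical selection rule,
    # standing in for whatever chunk representation the paper uses).
    chunk_scores = Kc.mean(axis=1) @ q
    picked = np.argsort(chunk_scores)[-top_k:]

    # Standard scaled dot-product attention over the selected chunks only.
    K_sel = Kc[picked].reshape(-1, d)
    V_sel = Vc[picked].reshape(-1, d)
    weights = softmax(K_sel @ q / np.sqrt(d))
    return weights @ V_sel

# Toy usage: 32 cached positions, head dim 8, effective window of 8 tokens.
rng = np.random.default_rng(0)
out = chunked_head_attention(rng.normal(size=8),
                             rng.normal(size=(32, 8)),
                             rng.normal(size=(32, 8)))
print(out.shape)  # (8,)

With many heads each picking different chunks, the model as a whole can cover a long input while no single head ever attends beyond a short, in-distribution span, which is the intuition behind treating multi-head attention as a long-context processor.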