LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
March 20, 2024, 4:43 a.m. | Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Li
Source: cs.LG updates on arXiv.org
Abstract: This paper focuses on task-agnostic prompt compression for better generalizability and efficiency. Considering the redundancy in natural language, existing approaches compress prompts by removing tokens or lexical units according to their information entropy obtained from a causal language model such as LLaMa-7B. The challenge is that information entropy may be a suboptimal compression metric: (i) it only leverages unidirectional context and may fail to capture all essential information needed for prompt compression; (ii) it is …
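The entropy-based baseline that the abstract describes (dropping the tokens that carry the least information) can be illustrated with a minimal sketch. This is not the paper's method, and it is not what the cited existing approaches run in practice: they score tokens with a causal language model such as LLaMa-7B, whereas this sketch substitutes a smoothed unigram frequency model so that it stays self-contained. The function name `compress_by_surprisal`, the `keep_ratio` parameter, and the reference corpus are all hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def compress_by_surprisal(tokens, reference_tokens, keep_ratio=0.5):
    """Keep the highest-surprisal tokens, preserving their original order.

    Assumption: surprisal comes from an add-one-smoothed unigram model built
    on `reference_tokens`, standing in for a causal language model.
    """
    counts = Counter(reference_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra bucket for unseen tokens

    def surprisal(tok):
        # Self-information in bits: rare tokens score high, frequent ones low.
        p = (counts[tok] + 1) / (total + vocab)
        return -math.log2(p)

    n_keep = max(1, round(len(tokens) * keep_ratio))
    # Rank positions by descending surprisal; ties go to the earlier position.
    ranked = sorted(range(len(tokens)), key=lambda i: (-surprisal(tokens[i]), i))
    kept = sorted(ranked[:n_keep])  # restore original word order
    return [tokens[i] for i in kept]


reference = "the the the the of of a a cat sat".split()
prompt = "the cat sat on the mat".split()
compressed = compress_by_surprisal(prompt, reference, keep_ratio=0.5)
```

Because this stand-in scores each token without any context at all, it is even more limited than the unidirectional scoring the abstract criticizes in point (i); it only shows the mechanics of ranking tokens by information content and pruning the lowest-ranked ones.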