[R] Adapting Language Models to Compress Contexts
May 31, 2023, 11:27 a.m. | /u/Balance-
Machine Learning | www.reddit.com
[Alexis Chevalier](https://arxiv.org/search/cs?searchtype=author&query=Chevalier%2C+A), [Alexander Wettig](https://arxiv.org/search/cs?searchtype=author&query=Wettig%2C+A), [Anirudh Ajith](https://arxiv.org/search/cs?searchtype=author&query=Ajith%2C+A), [Danqi Chen](https://arxiv.org/search/cs?searchtype=author&query=Chen%2C+D)
>Transformer-based language models (LMs) are powerful and widely-applicable tools, but their usefulness is constrained by a finite context window and the expensive computational cost of processing long text documents. We propose to adapt pre-trained LMs into AutoCompressors. These models are capable of compressing long contexts into compact summary vectors, which are then accessible to the model as soft prompts. Summary vectors are trained with an unsupervised objective, whereby long documents …