Web: https://www.reddit.com/r/MachineLearning/comments/vdsqhl/r_generalpurpose_longcontext_autoregressive/

June 16, 2022, 6:34 p.m. | /u/Singularian2501


Paper: [https://arxiv.org/abs/2202.07765](https://arxiv.org/abs/2202.07765)

DeepMind: [https://www.deepmind.com/publications/perceiver-ar-general-purpose-long-context-autoregressive-generation](https://www.deepmind.com/publications/perceiver-ar-general-purpose-long-context-autoregressive-generation)

Abstract:

>Real-world data is high-dimensional: a book, image, or musical performance can easily contain hundreds of thousands of elements even after compression. However, the most commonly used autoregressive models, Transformers, are prohibitively expensive to scale to the number of inputs and layers needed to capture this long-range structure. We develop Perceiver AR, an autoregressive, modality-agnostic architecture which uses cross-attention to map long-range inputs to a small number of latents while also maintaining end-to-end causal masking. **Perceiver …
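For anyone who wants a concrete picture of "cross-attention to a small number of latents with causal masking", here is a rough NumPy sketch. It is not the paper's code. It assumes each latent is tied to one of the final positions of the input, and that a latent may only attend to inputs at or before its own position. The function name `causal_cross_attention` and the random projection matrices are hypothetical stand-ins for learned weights.

```python
import numpy as np

def causal_cross_attention(x, num_latents, rng):
    """Map a long input x of shape (seq_len, dim) to num_latents vectors.

    Sketch only: single head, random (untrained) projections.
    """
    seq_len, dim = x.shape
    # Hypothetical projections standing in for learned weights.
    w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim)
                     for _ in range(3))

    # Assumption: queries come from the last num_latents input positions.
    q = x[-num_latents:] @ w_q          # (num_latents, dim)
    k = x @ w_k                         # (seq_len, dim)
    v = x @ w_v                         # (seq_len, dim)

    scores = q @ k.T / np.sqrt(dim)     # (num_latents, seq_len)

    # Causal mask: latent i sits at input position seq_len - num_latents + i
    # and may only see keys at positions <= its own.
    q_pos = np.arange(seq_len - num_latents, seq_len)[:, None]
    k_pos = np.arange(seq_len)[None, :]
    scores = np.where(k_pos <= q_pos, scores, -np.inf)

    # Numerically stable softmax over the key axis.
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))        # 16 inputs, 8 features each
latents, attn = causal_cross_attention(x, num_latents=4, rng=rng)
print(latents.shape, attn.shape)
```

The point of the sketch is the cost: the score matrix is `num_latents × seq_len` rather than `seq_len × seq_len`, so any later layers only have to operate on the small latent array.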

