Nov. 22, 2022, 5:19 p.m. | /u/korec1234

Machine Learning www.reddit.com

Efficient Transformers with Dynamic Token Pooling

Paper: [https://arxiv.org/pdf/2211.09761.pdf](https://arxiv.org/pdf/2211.09761.pdf)

Github: [https://github.com/PiotrNawrot/dynamic-pooling](https://github.com/PiotrNawrot/dynamic-pooling)

Twitter: [https://twitter.com/PontiEdoardo/status/1593607268980891648](https://twitter.com/PontiEdoardo/status/1593607268980891648)


Abstract:

Transformers achieve unrivalled performance in modelling language, but remain inefficient in terms of memory and time complexity. A possible remedy is to reduce the sequence length in the intermediate layers by pooling fixed-length segments of tokens. Nevertheless, natural units of meaning, such as words or phrases, display varying sizes. To address this mismatch, we equip language models with a dynamic-pooling mechanism, which predicts segment boundaries in …

machinelearning pooling transformers
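Below is a minimal, hedged sketch of the core idea described in the abstract: a per-token boundary predictor decides where segments end, and tokens inside each variable-length segment are mean-pooled so deeper layers operate on a shorter sequence. This is not the paper's implementation (see the GitHub repo above for that); the names `BoundaryPredictor` and `dynamic_pool`, the hard 0.5 threshold, and mean pooling are illustrative assumptions only.

```python
# Minimal sketch of dynamic token pooling (illustrative, not the paper's code).
import torch
import torch.nn as nn


class BoundaryPredictor(nn.Module):
    """Predicts, for each token, whether it ends a segment (hypothetical module)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 1)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (seq_len, d_model) -> hard 0/1 boundary decisions: (seq_len,)
        # The paper learns boundaries end-to-end or from linguistic supervision;
        # here we simply threshold a sigmoid for illustration.
        logits = self.proj(hidden).squeeze(-1)
        return (torch.sigmoid(logits) > 0.5).long()


def dynamic_pool(hidden: torch.Tensor, boundaries: torch.Tensor) -> torch.Tensor:
    """Mean-pool each variable-length segment into a single vector.

    hidden:     (seq_len, d_model)
    boundaries: (seq_len,) with 1 marking the last token of a segment
    returns:    (num_segments, d_model)
    """
    # Segment id per token: cumulative boundary count, shifted so the token
    # that closes a segment still belongs to that segment.
    seg_ids = torch.cumsum(boundaries, dim=0) - boundaries
    num_segments = int(seg_ids.max().item()) + 1
    pooled = torch.zeros(num_segments, hidden.size(-1), dtype=hidden.dtype)
    counts = torch.zeros(num_segments, dtype=hidden.dtype)
    pooled.index_add_(0, seg_ids, hidden)
    counts.index_add_(0, seg_ids, torch.ones_like(seg_ids, dtype=hidden.dtype))
    return pooled / counts.unsqueeze(-1)


if __name__ == "__main__":
    torch.manual_seed(0)
    seq_len, d_model = 12, 16
    hidden = torch.randn(seq_len, d_model)
    boundaries = BoundaryPredictor(d_model)(hidden)
    pooled = dynamic_pool(hidden, boundaries)
    print(hidden.shape, "->", pooled.shape)  # (12, 16) -> (k, 16) with k <= 12
```

Because segments follow predicted boundaries rather than a fixed stride, the pooled sequence length adapts to the input, which is what lets natural units of varying size (words, phrases) map to single pooled positions.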
