Web: https://www.reddit.com/r/deeplearning/comments/vh6c8f/how_perform_attentionbased_transformers_on_local/

June 21, 2022, 6:05 a.m. | /u/rezayazdanfar

Deep Learning reddit.com

Demand for time series forecasting is hard to ignore in any industry, whether energy, healthcare, or elsewhere. Recently, Transformers have emerged as strong architectures for making complex predictions in deep learning. These Transformers are built around attention, and full self-attention has a mathematical operation known as Scaled Dot-Product Attention at its core. This attention suffers from two problems: 1. it is locality-agnostic, and 2. it creates a memory bottleneck. This article is about solving these two problems.
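For reference, here is a minimal NumPy sketch of the standard Scaled Dot-Product Attention operation the post refers to (the function name and toy sizes are illustrative, not taken from the article). The L x L score matrix is where the quadratic memory bottleneck comes from, and because each score is a plain dot product between single time steps, the operation by itself has no notion of local context:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention.

    Q, K, V: arrays of shape (L, d) for a sequence of length L.
    The score matrix Q @ K.T has shape (L, L), so memory grows
    quadratically with sequence length (the memory bottleneck).
    Each score compares two individual time steps, so the operation
    is agnostic to local shape/context around each point.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # (L, L) score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ V                                 # (L, d) attended output

# Toy usage: a length-1000 series already requires a 1000 x 1000 score matrix.
L, d = 1000, 64
rng = np.random.default_rng(0)
Q = rng.standard_normal((L, d))
K = rng.standard_normal((L, d))
V = rng.standard_normal((L, d))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (1000, 64)
```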


link: [https://rezayazdanfar.medium.com/how-perform-attention-based-transformers-on-local-sensitivity-3475094027cf](https://rezayazdanfar.medium.com/how-perform-attention-based-transformers-on-local-sensitivity-3475094027cf)

