[D] Is a single channel enough for Positional Encoding in Transformers?
March 29, 2024, 12:04 p.m. | /u/bnqj
Machine Learning | www.reddit.com
I’ve been exploring a new approach to positional encoding that I’m calling VAPE - Vector Addition Positional Encoding.
**The Method**:
* borrow a fixed number of channels from the queries and keys,
* run a cumulative (prefix) sum across the sequence length on these borrowed channels (i.e., add the vectors together),
* normalize: divide each summed vector by the square root of its magnitude,
* these channels are now position-aware,
* so concatenate them back onto the queries and keys (a minimal sketch follows below).
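Here is a minimal PyTorch sketch of how I read the steps above. The function name `vape`, the `num_pos_channels` argument, and the small `eps` guard are illustrative choices, not fixed by the method:

```python
import torch

def vape(x: torch.Tensor, num_pos_channels: int, eps: float = 1e-6) -> torch.Tensor:
    """Sketch of Vector Addition Positional Encoding.

    x: (batch, seq_len, d_head) queries or keys.
    Borrows the first `num_pos_channels` channels, replaces them with a
    normalized running sum over the sequence, and concatenates back.
    """
    pos, rest = x[..., :num_pos_channels], x[..., num_pos_channels:]
    # Cumulative (prefix) sum across the sequence dimension: position i
    # now holds the vector sum of the borrowed channels at positions 0..i.
    summed = pos.cumsum(dim=1)
    # Normalize: divide each summed vector by the square root of its
    # magnitude, so its length grows like sqrt(|v|) rather than linearly
    # with position. `eps` avoids division by zero (my addition).
    magnitude = summed.norm(dim=-1, keepdim=True)
    summed = summed / magnitude.clamp_min(eps).sqrt()
    # Concatenate the position-aware channels back onto the rest.
    return torch.cat([summed, rest], dim=-1)
```

Applied to both the queries and keys before the attention scores are computed, the borrowed channels make the dot product position-dependent without any learned positional parameters.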