[D] Is a single channel enough for Positional Encoding in Transformers?
March 29, 2024, 12:04 p.m. | /u/bnqj
Machine Learning | www.reddit.com
I’ve been exploring a new approach to positional encoding that I’m calling VAPE - Vector Addition Positional Encoding.
**The Method**:
* borrow a fixed number of channels from the queries and keys,
* run a cumulative (prefix) sum across the sequence length on these borrowed channels (i.e., add the vectors together),
* normalize: divide each summed vector by the square root of its magnitude,
* these channels are now position-aware,
* so concatenate them back onto the queries and keys (a minimal sketch follows below).
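Here is a minimal PyTorch sketch of how I read the steps above. The function name `vape`, the `num_pos_channels` argument, and the small `eps` guard are illustrative choices, not fixed by the method:

```python
import torch

def vape(x: torch.Tensor, num_pos_channels: int, eps: float = 1e-6) -> torch.Tensor:
    """Sketch of Vector Addition Positional Encoding.

    x: (batch, seq_len, d_head) queries or keys.
    Borrows the first `num_pos_channels` channels, replaces them with a
    normalized running sum over the sequence, and concatenates back.
    """
    pos, rest = x[..., :num_pos_channels], x[..., num_pos_channels:]
    # Cumulative (prefix) sum across the sequence dimension: position i
    # now holds the vector sum of the borrowed channels at positions 0..i.
    summed = pos.cumsum(dim=1)
    # Normalize: divide each summed vector by the square root of its
    # magnitude, so its length grows like sqrt(|v|) rather than linearly
    # with position. `eps` avoids division by zero (my addition).
    magnitude = summed.norm(dim=-1, keepdim=True)
    summed = summed / magnitude.clamp_min(eps).sqrt()
    # Concatenate the position-aware channels back onto the rest.
    return torch.cat([summed, rest], dim=-1)
```

Applied to both the queries and keys before the attention scores are computed, the borrowed channels make the dot product position-dependent without any learned positional parameters.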