Jan. 25, 2024, 12:44 a.m. | /u/Instantinopaul

Machine Learning www.reddit.com

I'm finally wrapping my head around the attention mechanism, but one piece still eludes me: the matrix magic behind q, k, and v.

I get the whole matrix multiplication dance at a theoretical level, but what **mathematical property** actually dictates which matrix gets to be the **query (q)**, the **key (k)**, and the **value (v)**? Is it just some random assignment, or is there deeper logic at play?


Here's what I've gathered so far:

* All three matrices come from …
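
One way to look at the q/k/v question is a minimal NumPy sketch of standard scaled dot-product attention, softmax(QKᵀ/√d)V. The function name `attention`, the weight names `w_a`, `w_b`, `w_c`, and the toy sizes are all assumptions made for illustration, not anything from a particular library. In this sketch the three weight matrices are created in exactly the same way; the only thing that makes one of them the "query" projection is the slot it is passed into.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, w_q, w_k, w_v):
    # The roles are fixed by position in the formula, not by the matrices themselves.
    q = x @ w_q                                # what each token is looking for
    k = x @ w_k                                # what each token is matched against
    v = x @ w_v                                # what each token contributes to the output
    scores = q @ k.T / np.sqrt(k.shape[-1])    # query-key similarity, scaled by sqrt(d)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights                # weighted average of the values

# Toy sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
seq_len, d_model, d_head = 4, 8, 8
x = rng.normal(size=(seq_len, d_model))
w_a, w_b, w_c = (rng.normal(size=(d_model, d_head)) for _ in range(3))

out, weights = attention(x, w_a, w_b, w_c)

# Passing the same matrices into different slots changes the result.
out_swapped, _ = attention(x, w_b, w_a, w_c)
```

In a trained model these projections are typically learned parameters, so each one ends up shaped by the gradients its slot receives rather than by any property assigned to it in advance.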
