[D] Transformer multi-head attention implementation
Jan. 19, 2024, 2:48 a.m. | /u/Melodic_Stomach_2704
Machine Learning www.reddit.com
```python
# Project query/key/value, split heads, and move the head dim forward
query, key, value = [
    lin(x).view(nbatches, -1, self.h, self.d_k).transpose(1, 2)
    for lin, x in zip(self.linears, (query, key, value))
]
```
Here, `lin(x)` is reshaped to `(nbatches, -1, self.h, self.d_k)`, and then dimensions 1 and 2 are transposed, giving the shape `(nbatches, self.h, -1, self.d_k)`.
I'm failing to understand why …
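For anyone tracing the shapes, here is a minimal standalone sketch (assuming PyTorch, with toy dimensions I picked for illustration) of what the `view` + `transpose(1, 2)` accomplishes: moving the head axis next to the batch axis so each head's `(seq_len, d_k)` slice can go through one batched matmul.

```python
import torch

# Toy dimensions (illustrative, not from the original post)
nbatches, seq_len, h, d_k = 2, 5, 8, 64
d_model = h * d_k  # 512

x = torch.randn(nbatches, seq_len, d_model)
lin = torch.nn.Linear(d_model, d_model)

# view: (nbatches, seq_len, d_model) -> (nbatches, seq_len, h, d_k)
# transpose(1, 2): -> (nbatches, h, seq_len, d_k)
q = lin(x).view(nbatches, -1, h, d_k).transpose(1, 2)
print(q.shape)  # torch.Size([2, 8, 5, 64])

# With heads acting as a batch-like dimension, the per-head attention
# scores come out of a single batched matmul:
scores = q @ q.transpose(-2, -1)
print(scores.shape)  # torch.Size([2, 8, 5, 5])
```

Without the transpose, the tensor would be `(nbatches, seq_len, h, d_k)` and the matmul over the last two dims would mix heads within a position instead of positions within a head.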