all AI news
Question regarding the transformer architecture
Jan. 25, 2022, 5:39 p.m. | /u/CleverProgrammer12
Deep Learning www.reddit.com
I am trying to implement a transformer in PyTorch from scratch. We feed into the decoder block what the transformer has previously generated. In my understanding, the output of the decoder block should have dimension (according to the tutorial referenced below)
(batch_size, Ty, trg_vocab_size)
where Ty is the length of the input to the decoder. Do we average over it? Because we want it to generate only one word at a time, right? Why is the output of the decoder (transformer block) dependent …
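Not an official answer from the thread, but a minimal sketch of the point in question, using NumPy stand-in logits instead of a real PyTorch decoder (shapes and names like `Ty` and `trg_vocab_size` follow the question): the decoder emits one distribution per target position because, under the causal mask, position t is trained to predict token t+1. At inference you do not average over Ty; you take only the logits at the last position to pick the next word.

```python
import numpy as np

# Pretend these are the decoder's output logits with the shape from the
# question: (batch_size, Ty, trg_vocab_size). Ty is the length of the
# target sequence fed to the decoder so far.
batch_size, Ty, trg_vocab_size = 2, 4, 10
rng = np.random.default_rng(0)
logits = rng.standard_normal((batch_size, Ty, trg_vocab_size))

# One distribution per position exists so that training can predict all
# shifted targets in parallel. For generation, slice the LAST position:
next_token_logits = logits[:, -1, :]            # (batch_size, trg_vocab_size)
next_token = next_token_logits.argmax(axis=-1)  # greedy pick, (batch_size,)

print(next_token_logits.shape)  # (2, 10)
print(next_token.shape)         # (2,)
```

The generated token would then be appended to the decoder input and the loop repeated, so Ty grows by one each step.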
More from www.reddit.com / Deep Learning
Final Year Project Ideas
4 days, 11 hours ago |
www.reddit.com
Conditioning mechanism in DiT
5 days, 2 hours ago |
www.reddit.com
Learning deep learning from scratch
5 days, 14 hours ago |
www.reddit.com
Deep Learning Theory and Interpretability books and resources
6 days, 10 hours ago |
www.reddit.com