Question regarding the transformer-architecture
Web: https://www.reddit.com/r/deeplearning/comments/scj82w/question_regarding_the_transformerarchitecture/
Jan. 25, 2022, 5:39 p.m. | /u/CleverProgrammer12
Deep Learning reddit.com
I am trying to implement a transformer in PyTorch from scratch. We feed into the decoder block what the transformer has previously generated. In my understanding, the output of the decoder block should have dimension (according to the tutorial referenced below)
(batch_size, Ty, trg_vocab_size)
where Ty is the length of the input to the decoder. Do we average over it? Because we want it to generate only one word at a time, right? Why is the output of the decoder (transformer block) dependent …
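A small shape sketch may clarify the question. The decoder keeps one set of logits per target position because, during training with teacher forcing, every position predicts the next token in parallel; at inference you do not average over Ty, you keep only the last position's logits to generate the single next token. The snippet below uses random NumPy values as a stand-in for a real PyTorch decoder's output, and all names are illustrative, not from the tutorial:

```python
import numpy as np

# Stand-in for a decoder's output: logits for every target position.
batch_size, Ty, trg_vocab_size = 2, 5, 10
logits = np.random.randn(batch_size, Ty, trg_vocab_size)

# Training (teacher forcing): all Ty positions predict their next token
# simultaneously, so the full (batch, Ty, vocab) tensor feeds the loss.
assert logits.shape == (batch_size, Ty, trg_vocab_size)

# Inference: to emit exactly one new token per step, slice out the logits
# at the last position and (here) greedily take the argmax.
next_token = logits[:, -1, :].argmax(axis=-1)
assert next_token.shape == (batch_size,)
```

So the Ty dimension is not averaged away; it exists so training can supervise every position at once, while generation simply ignores all but the final position.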