Jan. 25, 2022, 5:39 p.m. | /u/CleverProgrammer12

Deep Learning www.reddit.com

I am trying to implement transformers in PyTorch from scratch. If we feed into the decoder block what the transformer has previously generated, then in my understanding the output of the decoder block should be of dimension (according to the tutorial referenced below):

(batch_size, Ty, trg_vocab_size) 

Ty is the length of the input to the decoder. Do we average over it? Because we want it to generate only one word at a time, right? Why is the output of the decoder (transformer block) dependent …
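To make the shapes concrete, here is a minimal sketch using PyTorch's built-in nn.TransformerDecoder as a stand-in for my from-scratch version; all the sizes here (batch_size, Tx, Ty, d_model, trg_vocab_size) are made up purely for illustration:

    import torch
    import torch.nn as nn

    # made-up sizes, just for illustration
    batch_size, Tx, Ty = 2, 7, 5
    d_model, nhead, trg_vocab_size = 32, 4, 100

    # stand-in for the from-scratch decoder: a small built-in decoder stack
    # plus a linear projection to the target vocabulary
    decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
    decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
    to_vocab = nn.Linear(d_model, trg_vocab_size)

    memory = torch.randn(batch_size, Tx, d_model)   # encoder output
    tgt = torch.randn(batch_size, Ty, d_model)      # embeddings of tokens generated so far

    # causal mask so position t only attends to positions <= t
    tgt_mask = torch.triu(torch.full((Ty, Ty), float("-inf")), diagonal=1)

    out = decoder(tgt, memory, tgt_mask=tgt_mask)   # (batch_size, Ty, d_model)
    logits = to_vocab(out)                          # (batch_size, Ty, trg_vocab_size)
    print(logits.shape)                             # torch.Size([2, 5, 100])

    # the part I am unsure about: do we average over Ty, or just read off
    # the last position to pick the next word?
    next_token_logits = logits[:, -1, :]            # (batch_size, trg_vocab_size)

So the decoder produces a full (batch_size, Ty, trg_vocab_size) tensor at every step, and my confusion is what to do with the Ty dimension when we only want one next word.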

architecture deeplearning transformer
