Nov. 15, 2023, 1:23 p.m. | /u/duffano

Deep Learning www.reddit.com

Dear all,

I had a look at the encoder-decoder architecture following the seminal paper "Attention is all you need".

After doing my own experiments and further reading, I found many sources saying that the (maximum) input lengths of the encoder and decoder are usually the same, or that there is no practical reason to use different lengths (see e.g. [https://stats.stackexchange.com/questions/603535/in-transformers-for-the-maximum-length-of-encoders-input-sequences-and-decoder](https://stats.stackexchange.com/questions/603535/in-transformers-for-the-maximum-length-of-encoders-input-sequences-and-decoder)).

What puzzles me is the "usually". I want to understand this at the mathematical level, and I …
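As a minimal sketch of why nothing in the math forces the two lengths to match: cross-attention takes its queries from the decoder and its keys/values from the encoder, so the source and target (maximum) lengths are independent hyperparameters. The example below uses PyTorch's `nn.Transformer` as a stand-in for the paper's architecture, with arbitrary, assumed length values.

```python
# Sketch only: an encoder-decoder Transformer accepts source and target
# sequences of different lengths, because cross-attention draws queries
# from the decoder and keys/values from the encoder.
import torch
import torch.nn as nn

d_model = 64
max_src_len, max_tgt_len = 128, 32   # deliberately different maxima (assumed values)

# One learned positional embedding table per maximum length.
src_pos = nn.Embedding(max_src_len, d_model)
tgt_pos = nn.Embedding(max_tgt_len, d_model)

model = nn.Transformer(d_model=d_model, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2,
                       batch_first=True)

batch, src_len, tgt_len = 8, 100, 20  # actual lengths, each within its own maximum
src = torch.randn(batch, src_len, d_model) + src_pos(torch.arange(src_len))
tgt = torch.randn(batch, tgt_len, d_model) + tgt_pos(torch.arange(tgt_len))

out = model(src, tgt)
print(out.shape)  # torch.Size([8, 20, 64]) -- output follows the target length
```

The "usually the same" in practice is a matter of convenience (shared tokenizers, shared positional tables, simpler batching), not a mathematical requirement.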
