all AI news
Encoder-Decoder Model
Nov. 15, 2023, 1:23 p.m. | /u/duffano
Deep Learning www.reddit.com
I had a look at the encoder-decoder architecture following the seminal paper "Attention is all you need".
After running my own experiments and doing further reading, I found many sources saying that the (maximum) input lengths of the encoder and decoder are usually the same, or that in practice there is no reason to use different lengths (see e.g. [https://stats.stackexchange.com/questions/603535/in-transformers-for-the-maximum-length-of-encoders-input-sequences-and-decoder](https://stats.stackexchange.com/questions/603535/in-transformers-for-the-maximum-length-of-encoders-input-sequences-and-decoder)).
What puzzles me is the "usually". I want to understand this at the mathematical level, and I …
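Nothing in the attention math forces the two lengths to match: the decoder's cross-attention queries over the encoder's outputs work for any pair of sequence lengths, and the output length follows the decoder side. A minimal NumPy sketch of single-head cross-attention (illustrative shapes and random weights, not from the post) makes this concrete:

```python
import numpy as np

# Minimal cross-attention sketch: decoder states attend over encoder
# outputs. The two sequence lengths are deliberately different to show
# that the math never requires them to be equal.
d_model = 8
len_enc, len_dec = 12, 5           # different encoder/decoder lengths

rng = np.random.default_rng(0)
enc_out = rng.normal(size=(len_enc, d_model))   # encoder "memory"
dec_in  = rng.normal(size=(len_dec, d_model))   # decoder hidden states

# Random projection matrices stand in for learned weights.
Wq = rng.normal(size=(d_model, d_model))
Wk = rng.normal(size=(d_model, d_model))
Wv = rng.normal(size=(d_model, d_model))

Q = dec_in  @ Wq                   # (len_dec, d_model) - queries from decoder
K = enc_out @ Wk                   # (len_enc, d_model) - keys from encoder
V = enc_out @ Wv                   # (len_enc, d_model) - values from encoder

# Scaled dot-product attention: scores are (len_dec, len_enc),
# i.e. a rectangular matrix whenever the lengths differ.
scores  = Q @ K.T / np.sqrt(d_model)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
out = weights @ V                  # (len_dec, d_model)

print(out.shape)                   # output length follows the decoder
```

So the "usually" reflects engineering convention (shared positional-embedding tables, simpler batching), not a mathematical constraint.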