Q: predicting only one token in autoregressive model training? | allainews.com

May 13, 2022, 1:33 a.m. | /u/Novel_Cucumber_1588

Deep Learning www.reddit.com

I'm training an autoregressive model (transformer encoder + decoder) where text is given as input and the output is also text (both tokenized).

I've been using NLL loss, and even though the loss decreased significantly, the predictions are just a single token repeated over and over.

For example:

input: hello world

output: aaaaaaaaaaaaaaaaa
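One way to put numbers on this symptom is sketched below. It is not taken from the post: the function names and the toy token ids are hypothetical, and only the Python standard library is used. The first function computes the per-token NLL of a "model" that ignores the input and always predicts the empirical token frequencies of the targets; if the training NLL has only dropped to roughly this level, the decrease may not reflect any use of the input. The second function measures how much of a generated sequence is taken up by its most frequent token.

```python
import math
from collections import Counter


def unigram_nll(target_sequences):
    """Average per-token NLL (in nats) of an input-agnostic baseline that
    always predicts the empirical token frequencies of the targets."""
    counts = Counter(tok for seq in target_sequences for tok in seq)
    total = sum(counts.values())
    return -sum(c / total * math.log(c / total) for c in counts.values())


def top_token_share(predicted_tokens):
    """Fraction of a predicted sequence occupied by its most frequent token."""
    if not predicted_tokens:
        return 0.0
    most_common_count = Counter(predicted_tokens).most_common(1)[0][1]
    return most_common_count / len(predicted_tokens)


# Hypothetical toy data: token ids are made up for illustration.
targets = [[5, 7, 7, 2], [5, 7, 3, 2]]
baseline = unigram_nll(targets)

# A degenerate output like "aaaaaaaaaaaaaaaaa" maps to one repeated id.
share = top_token_share([7] * 17)

print(f"unigram baseline NLL: {baseline:.3f} nats/token")
print(f"top-token share of prediction: {share:.2f}")
```

Comparing the reported training loss against `baseline` (in the same units and with the same padding handling) is only a rough sanity check, but it helps separate "the loss is low" from "the loss is low relative to a model that never looks at the input".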



I've been looking into the model architecture and loss function, but haven't found any bugs yet.
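The post does not show the training code, so the following is only a sketch of two things that are commonly checked in the loss path of an encoder-decoder setup: that decoder inputs and labels are shifted by one position relative to each other, and that padding positions are excluded from the NLL. The ids `bos_id`, `eos_id`, and `pad_id` and both function names are hypothetical; plain Python lists stand in for tensors.

```python
import math


def make_decoder_pair(target_ids, bos_id, eos_id):
    """Build teacher-forcing inputs and labels from one target sequence.

    The decoder sees [BOS, t1, ..., tn] and is trained to predict
    [t1, ..., tn, EOS], so labels are the inputs shifted left by one.
    """
    decoder_input = [bos_id] + list(target_ids)
    labels = list(target_ids) + [eos_id]
    return decoder_input, labels


def masked_nll(log_probs, labels, pad_id):
    """Mean NLL over non-padding positions.

    log_probs[i][v] is the log-probability of token v at position i.
    """
    total, count = 0.0, 0
    for step_log_probs, label in zip(log_probs, labels):
        if label == pad_id:
            continue  # padding must not contribute to the loss
        total -= step_log_probs[label]
        count += 1
    return total / count if count else 0.0


# Hypothetical ids: 0 = pad, 1 = BOS, 2 = EOS.
dec_in, labels = make_decoder_pair([11, 12, 13], bos_id=1, eos_id=2)
print(dec_in)   # [1, 11, 12, 13]
print(labels)   # [11, 12, 13, 2]

# Uniform predictions over a 4-token vocabulary, last position is padding.
uniform = [[math.log(0.25)] * 4 for _ in range(3)]
loss = masked_nll(uniform, labels=[1, 2, 0], pad_id=0)
print(f"{loss:.4f}")  # ln(4), since padding is skipped
```

If the labels are accidentally identical to the decoder inputs, or padding dominates the averaged loss, the reported NLL can fall sharply without the model learning to generate useful text, so these are reasonable places to look before the architecture itself.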



Could you suggest any tips …


