Q: predicting only one token in autoregressive model training?
May 13, 2022, 1:33 a.m. | /u/Novel_Cucumber_1588
Deep Learning www.reddit.com
where a text is given as input and the output is also text (both tokenized).
I've been using NLL loss, and even though the loss decreased significantly, the predictions are just a single token repeated over and over.
For example:
input: hello world
output: aaaaaaaaaaaaaaaaa
I've been looking into the model architecture and loss function, but haven't found any bugs in them yet.
Could you suggest any tips …
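One possible sanity check for this symptom, sketched here under assumptions that are not part of the post: a model that ignores its context and only learns token frequencies still lowers the NLL from the uniform level toward the unigram entropy of the targets, while greedy decoding then returns the single most frequent token at every step. Comparing the training loss with that context-free baseline, and measuring how much of the output is one token, can show whether this is what is happening. The toy ids below are hypothetical, with 0 standing in for a frequent id such as padding; the helper names are illustrative only.

```python
import math
from collections import Counter


def unigram_nll(targets):
    """Average NLL (nats per token) of a context-free model that
    always predicts the empirical token frequencies of `targets`."""
    counts = Counter(targets)
    total = len(targets)
    return -sum(c / total * math.log(c / total) for c in counts.values())


def dominant_token_share(predicted):
    """Return (token, fraction) for the most common predicted token."""
    token, count = Counter(predicted).most_common(1)[0]
    return token, count / len(predicted)


# Hypothetical toy data: target token ids and a degenerate prediction.
targets = [0, 0, 0, 0, 0, 0, 5, 7, 5, 9]
predicted = [0] * 10

# NLL of a uniform guess over the tokens that occur in the targets.
uniform = math.log(len(set(targets)))
# NLL reachable without using the input at all.
baseline = unigram_nll(targets)
token, share = dominant_token_share(predicted)

print(f"uniform NLL:  {uniform:.3f}")
print(f"unigram NLL:  {baseline:.3f}")
print(f"dominant token {token} covers {share:.0%} of the output")
```

If the training loss sits close to the unigram value rather than well below it, the drop in loss may reflect learned token frequencies only. If the dominant token is a padding id, it may be worth checking whether padded positions are excluded from the loss.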