all AI news
[D] Question on the loss function in DeepMind's Beyond Human Data paper. Why use reward-weighted loss if the reward is only ever 1 or 0, as opposed to just training on successes?
Dec. 31, 2023, 3:12 p.m. | /u/30299578815310
Machine Learning www.reddit.com
Later in the paper they say use reward-weighted negative log-likelihood loss for training.
If the reward is only ever 0 or 1 though, isn't this just normal negative log-likelihood loss, but where you only train on the success (the gradient …
complexity extra gradient isn likelihood loss machinelearning mods negative normal paper success train training
More from www.reddit.com / Machine Learning
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne