[D] Decision Transformer Alignment should be better than DeepMind ReST
Aug. 30, 2023, 1:47 a.m. | /u/seventh_day123
Machine Learning www.reddit.com
See the tech report: [https://arxiv.org/abs/2308.12050v1](https://arxiv.org/abs/2308.12050v1)
We train an SFT model and a reward model (RM), then align the LLM with either DT or MLE with filtering (ReST), using the RM together with the SFT datasets and samples generated by the SFT model.
https://preview.redd.it/195op5q636lb1.png?width=1081&format=png&auto=webp&s=a9fa862e8a9ab05819484af8619f73d918fdc26a
DT is the Decision Transformer alignment.
MLE is the ReST-like alignment (maximum likelihood training on filtered samples).
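To make the difference between the two setups concrete, here is a minimal sketch of how the training data could be prepared in each case. It is not taken from the tech report: the `Sample` class, the `<reward_k>` control token format, the bucket count, and the threshold are all hypothetical choices for illustration. The ReST-like path drops low-reward samples before MLE training, while the DT-style path keeps every sample and conditions the prompt on a discretized RM score.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float  # score assigned by the reward model (RM)


def rest_like_filter(samples, threshold):
    """ReST-like: keep only samples whose RM score reaches the threshold."""
    return [s for s in samples if s.reward >= threshold]


def reward_bucket(reward, low, high, num_buckets):
    """Map a reward to an integer bucket in [0, num_buckets - 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(reward, low), high)
    frac = (clipped - low) / (high - low)
    return min(int(frac * num_buckets), num_buckets - 1)


def dt_style_example(sample, low, high, num_buckets=5):
    """DT-style: keep the sample and prepend a reward token to the prompt.

    The "<reward_k>" token format is an assumption made for this sketch.
    """
    bucket = reward_bucket(sample.reward, low, high, num_buckets)
    return f"<reward_{bucket}> {sample.prompt}", sample.response


# Toy data standing in for RM-scored SFT data and SFT-model generations.
samples = [
    Sample("Q1", "good answer", 0.9),
    Sample("Q1", "weak answer", 0.25),
    Sample("Q2", "ok answer", 0.5),
]

# ReST-like: train with plain MLE on the filtered subset only.
filtered = rest_like_filter(samples, threshold=0.5)

# DT-style: train on all samples, each conditioned on its reward bucket.
conditioned = [dt_style_example(s, low=0.0, high=1.0) for s in samples]
```

Under this sketch, inference for the DT-style model would prepend the highest reward token to the prompt, so low-reward samples still contribute training signal instead of being discarded.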
https://preview.redd.it/u6x28fook5lb1.png?width=1118&format=png&auto=webp&s=4a87898129c1238c00071d43809f5daf440b26d8