March 15, 2024, 6:32 p.m. | 1littlecoder

Source: 1littlecoder (www.youtube.com)

From the abstract:

We present Quiet-STaR, a generalization of STaR in which LMs learn to
generate rationales at each token to explain future text, improving their
predictions. We address key challenges, including 1) the computational cost
of generating continuations, 2) the fact that the LM does not initially know
how to generate or use internal thoughts, and 3) the need to predict beyond
individual next tokens. To resolve these, we propose a tokenwise parallel
sampling algorithm, using learnable tokens indicating a thought’s start …

