Oct. 4, 2023, 4:23 p.m.

Simon Willison's Weblog (simonwillison.net)

Think before you speak: Training Language Models With Pause Tokens


Another example of how much low-hanging fruit remains to be discovered in basic Large Language Model research: this team from Carnegie Mellon and Google Research note that, since an LLM runs its neural network once for each token of input and output, appending "pause" tokens that produce no output gives the model extra forward passes in which to "think" before committing to its answer.
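The inference-time mechanism is simple enough to sketch. Below is a minimal, hypothetical Python illustration of decoding with pause tokens: `model_step`, the token strings, and `num_pauses` are stand-ins for illustration, not the paper's actual implementation (which uses a learnable <pause> embedding and gets the full benefit only when the model is also pretrained and finetuned with pauses).

```python
# Minimal sketch of inference with pause tokens, assuming a generic
# one-token-at-a-time model interface. `model_step`, the token strings,
# and `num_pauses` are hypothetical, not the paper's code.

PAUSE = "<pause>"
EOS = "<eos>"

def generate_with_pauses(model_step, prompt_tokens, num_pauses=10, max_new_tokens=50):
    """Append <pause> tokens to the prompt, then decode as usual.

    Each appended pause is one extra forward pass: the model processes
    it as input, but whatever it would emit at that position is ignored.
    Real output only begins after the last pause has been consumed.
    """
    tokens = list(prompt_tokens) + [PAUSE] * num_pauses
    output = []
    for _ in range(max_new_tokens):
        next_token = model_step(tokens)  # one forward pass per generated token
        if next_token == EOS:
            break
        tokens.append(next_token)
        output.append(next_token)
    return output

if __name__ == "__main__":
    # Toy stand-in model that ignores its input and replays a fixed
    # answer, just to show the calling convention.
    replies = iter(["Paris", EOS])
    answer = generate_with_pauses(lambda toks: next(replies),
                                  ["Capital", "of", "France", "?"])
    print(answer)  # ['Paris']
```

The intuition behind the trick: a transformer spends a fixed amount of computation per token, so K appended pauses buy K extra passes over the prompt before the first real output token has to be produced.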
