April 13, 2023, 5 p.m. | Wolfram | www.youtube.com

A conversation about large language models, specifically why and how ChatGPT works.

Read Stephen Wolfram's blog: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work

Check out Wolfram Machine Learning, a core part of the Wolfram Language: https://wolfr.am/ml

Chapters
0:00 Intro
0:55 Beyond just Numbers
1:28 Images: LeNet trained on MNIST data
6:06 Text: GPT-2 Transformer Trained on WebText Data
9:45 What Happens After Tokenization?
12:18 How Do We Get the Final Output?
15:22 Why Is It Called Attention?
17:56 Continuing Sentences
20:50 Training ChatGPT
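The "Why Is It Called Attention?" chapter refers to the transformer's scaled dot-product attention, in which each query scores every key and the output is a weighted average of the value vectors. A minimal single-query sketch in plain Python (the vectors and function names here are illustrative, not taken from the video):

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(query, keys, values):
    # Scaled dot-product attention for a single query vector:
    # score each key against the query, softmax the scores into
    # weights, then return the weighted average of the values.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    out = [sum(w * v[i] for w, v in zip(weights, values))
           for i in range(len(values[0]))]
    return out, weights

# Toy example: the query aligns with the first key, so the
# output leans toward the first value vector.
q = [1.0, 0.0]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
out, weights = attention(q, K, V)
```

In a real transformer the queries, keys, and values are learned linear projections of the token embeddings, and many such attention heads run in parallel; this sketch shows only the core weighting step the chapter title alludes to.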

