April 27, 2024, 9:06 a.m. | /u/zouharvi

Machine Learning www.reddit.com

I recently made a video covering our work on the mathematical aspects of tokenization, specifically:
- formalization of tokenization as compression
- bounds on the optimality of Byte-Pair Encoding
- link between tokenization entropy and performance

I'd be very grateful for any feedback as I'm still learning how to make educational videos. Thank you!

https://youtu.be/yeEZpf4BlDA

