April 25, 2024, 3:12 p.m. | Maykeye

DEV Community dev.to

Modern tokenizers come in all form! Some, like qwen, support ~150 000 tokens.


Byte level models support 256 tokens.


Can we go lower?


(There should be "you were so busy asking if you could" meme, but dev.to complains)


But the answer is yes. MambaBit comes with just 2 tokens. One token for bit 0, one token for bit 1. That's it. Yet somehow it still produces something which is not completely random.


Behold the most cursed becomes



Behold the most …

dev form llm machinelearning meme modern qwen support token tokens

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US