Tokenization counts: the impact of tokenization on arithmetic in frontier LLMs

Feb. 26, 2024, 5:42 a.m. | Aaditya K. Singh, DJ Strouse

cs.LG updates on arXiv.org (arxiv.org)

arXiv:2402.14903v1 Announce Type: cross
Abstract: Tokenization, the division of input text into input tokens, is an often overlooked aspect of the large language model (LLM) pipeline and could be the source of useful or harmful inductive biases. Historically, LLMs have relied on byte pair encoding, without care to specific input domains. With the increased use of LLMs for reasoning, various number-specific tokenization schemes have been adopted, with popular models like LLaMa and PaLM opting for single-digit tokenization while GPT-3.5 and …
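To make the idea of single-digit tokenization concrete, here is a minimal Python sketch. It is not the tokenizer of LLaMa, PaLM, or any other model: the function name `single_digit_tokenize` and the regular expression are assumptions made purely for illustration. It shows only the one property the abstract names, that every digit of a number becomes its own token.

```python
import re


def single_digit_tokenize(text: str) -> list[str]:
    """Toy tokenizer (hypothetical, for illustration only).

    Every digit becomes its own token, runs of letters stay together,
    and each punctuation or operator character is a separate token.
    Whitespace is dropped. Real LLM tokenizers are far more involved.
    """
    # \d         -> exactly one digit per token
    # [^\W\d]+   -> a run of word characters that are not digits
    # [^\w\s]    -> a single non-word, non-space character (e.g. '+', '=')
    return re.findall(r"\d|[^\W\d]+|[^\w\s]", text)


if __name__ == "__main__":
    print(single_digit_tokenize("123 + 45 = 168"))
```

Under this scheme an n-digit number always costs n tokens, so the token boundaries line up with decimal place values regardless of the number's magnitude.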
