all AI news
SpaceByte: Towards Deleting Tokenization from Large Language Modeling
April 23, 2024, 4:43 a.m. | Kevin Slagle
cs.LG updates on arXiv.org arxiv.org
Abstract: Tokenization is widely used in large language models because it significantly improves performance. However, tokenization imposes several disadvantages, such as performance biases, increased adversarial vulnerability, decreased character-level modeling performance, and increased modeling complexity. To address these disadvantages without sacrificing performance, we propose SpaceByte, a novel byte-level decoder architecture that closes the performance gap between byte-level and subword autoregressive language modeling. SpaceByte consists of a byte-level Transformer model, but with extra larger transformer blocks inserted in …
abstract adversarial arxiv biases complexity cs.ai cs.cl cs.lg decoder disadvantages however language language models large language large language models modeling novel performance tokenization type vulnerability
More from arxiv.org / cs.LG updates on arXiv.org
Testing the Segment Anything Model on radiology data
1 day, 20 hours ago |
arxiv.org
Calorimeter shower superresolution
1 day, 20 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US