CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation. (arXiv:2103.06874v4 [cs.CL] UPDATED)
cs.LG updates on arXiv.org
Pipelined NLP systems have largely been superseded by end-to-end neural
modeling, yet nearly all commonly-used models still require an explicit
tokenization step. While recent tokenization approaches based on data-derived
subword lexicons are less brittle than manually engineered tokenizers, these
techniques are not equally suited to all languages, and the use of any fixed
vocabulary may limit a model's ability to adapt. In this paper, we present
CANINE, a neural encoder that operates directly on character sequences, without
explicit tokenization or …
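To illustrate what "operating directly on character sequences" can look like in practice, here is a minimal sketch (an assumption for illustration, not the paper's actual pipeline): a tokenization-free model can take raw Unicode code points as input ids, with no learned vocabulary at all.

```python
# Sketch: map text to a sequence of Unicode code points, the kind of
# vocabulary-free input a character-level encoder like CANINE consumes.
# Illustrative only; the paper's real model adds hashing and downsampling.

def to_codepoints(text: str) -> list[int]:
    """Return each character's Unicode code point; no tokenizer or lexicon needed."""
    return [ord(ch) for ch in text]

ids = to_codepoints("CANINE")
print(ids)  # [67, 65, 78, 73, 78, 69]
```

Because every string maps deterministically to code points, this input scheme works identically across languages and scripts, sidestepping the fixed-vocabulary limitation the abstract describes.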