Aug. 13, 2022, 3:38 a.m. | /u/QLaHPD

Machine Learning www.reddit.com

Usually, when you create an NLP model, you use a library that tokenizes the text. The network receives these tokens as input and, as output, has to predict which token (class) is most likely.
Why not use a prior network that receives the raw text as input and generates a learned output for the main network? I believe that letting the neural network itself tokenize the text is the best way to process …

machinelearning nlp tokenization
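
For illustration, here is a minimal sketch of the idea, assuming PyTorch: a small "prior" network embeds raw UTF-8 bytes and downsamples them into learned vectors ("soft tokens"), which the main network consumes and trains with end to end. The module names, the strided-convolution downsampling, and the byte-prediction head are assumptions made for the sketch, not details from the post.

# Sketch only: a learned, end-to-end "tokenizer" in front of a main model.
# All sizes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class LearnedTokenizer(nn.Module):
    """Byte-level front end: embeds raw bytes and downsamples them with a
    strided convolution so the main network sees fewer, learned 'tokens'."""
    def __init__(self, d_model=256, downsample=4):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)      # one entry per possible byte value
        self.pool = nn.Conv1d(d_model, d_model,
                              kernel_size=downsample, stride=downsample)

    def forward(self, byte_ids):                          # (batch, n_bytes)
        x = self.byte_embed(byte_ids)                     # (batch, n_bytes, d_model)
        x = self.pool(x.transpose(1, 2)).transpose(1, 2)  # downsample along the byte axis
        return x                                          # (batch, n_bytes // downsample, d_model)

class MainModel(nn.Module):
    """Standard Transformer encoder over the learned token vectors; here it
    predicts a byte (0-255) in place of the usual vocabulary token classes."""
    def __init__(self, d_model=256, n_layers=2, n_heads=4):
        super().__init__()
        self.tokenizer = LearnedTokenizer(d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 256)

    def forward(self, byte_ids):
        tokens = self.tokenizer(byte_ids)                 # learned tokenization, trained jointly
        return self.head(self.encoder(tokens))

# Usage: raw text goes in as bytes; both networks share one loss and optimizer.
text = "Why not let the network tokenize the text itself?"
byte_ids = torch.tensor([list(text.encode("utf-8"))])
logits = MainModel()(byte_ids)
print(logits.shape)                                       # (1, n_bytes // 4, 256)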
