Aug. 13, 2022, 3:38 a.m. | /u/QLaHPD

Machine Learning www.reddit.com

Usually, when you create an NLP model, you use a library that tokenizes the text. The network receives these tokens as input and, as output, has to predict which token (class) is most likely.
Why not use a prior network that receives the raw text as input and generates a learned output for the main network? I believe that letting the neural network itself tokenize the text is the best way to process …

machinelearning nlp tokenization
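
For illustration, here is a minimal sketch of the idea, assuming PyTorch: a small "prior" network embeds raw UTF-8 bytes and downsamples them into learned vectors ("soft tokens"), which the main network consumes and trains with end to end. The module names, the strided-convolution downsampling, and the byte-prediction head are assumptions made for the sketch, not details from the post.

# Sketch only: a learned, end-to-end "tokenizer" in front of a main model.
# All sizes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class LearnedTokenizer(nn.Module):
    """Byte-level front end: embeds raw bytes and downsamples them with a
    strided convolution so the main network sees fewer, learned 'tokens'."""
    def __init__(self, d_model=256, downsample=4):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)      # one entry per possible byte value
        self.pool = nn.Conv1d(d_model, d_model,
                              kernel_size=downsample, stride=downsample)

    def forward(self, byte_ids):                          # (batch, n_bytes)
        x = self.byte_embed(byte_ids)                     # (batch, n_bytes, d_model)
        x = self.pool(x.transpose(1, 2)).transpose(1, 2)  # downsample along the byte axis
        return x                                          # (batch, n_bytes // downsample, d_model)

class MainModel(nn.Module):
    """Standard Transformer encoder over the learned token vectors; here it
    predicts a byte (0-255) in place of the usual vocabulary token classes."""
    def __init__(self, d_model=256, n_layers=2, n_heads=4):
        super().__init__()
        self.tokenizer = LearnedTokenizer(d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 256)

    def forward(self, byte_ids):
        tokens = self.tokenizer(byte_ids)                 # learned tokenization, trained jointly
        return self.head(self.encoder(tokens))

# Usage: raw text goes in as bytes; both networks share one loss and optimizer.
text = "Why not let the network tokenize the text itself?"
byte_ids = torch.tensor([list(text.encode("utf-8"))])
logits = MainModel()(byte_ids)
print(logits.shape)                                       # (1, n_bytes // 4, 256)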
