April 30, 2022, 12:55 p.m. | /u/radjeep

Data Science www.reddit.com

This [Keras article / tutorial here][1] does perform text standardization, i.e., removing HTML elements, punctuation, etc. from the text dataset. However, there is a distinct lack of any stemming or lemmatization before the vectorization step.
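To make the distinction concrete, here is a minimal pure-Python sketch of the kind of standardization described above (lowercasing, dropping HTML tags, stripping punctuation). The `standardize` helper and its regexes are my own assumptions for illustration, not the tutorial's actual code, which may work differently.

```python
import re
import string

# Hypothetical helper: mirrors the kind of cleanup described in the post,
# not the tutorial's exact implementation.
_TAG_RE = re.compile(r"<[^>]+>")                                  # crude HTML tag matcher
_PUNCT_RE = re.compile(f"[{re.escape(string.punctuation)}]")      # ASCII punctuation only


def standardize(text: str) -> str:
    """Lowercase, replace HTML tags with spaces, drop punctuation, collapse whitespace."""
    lowered = text.lower()
    no_html = _TAG_RE.sub(" ", lowered)
    no_punct = _PUNCT_RE.sub("", no_html)
    return " ".join(no_punct.split())


if __name__ == "__main__":
    print(standardize("Great movie!<br />Loved it."))
    # Inflected forms survive standardization as separate tokens.
    print(standardize("Running, runs").split())
```

Note that nothing here merges word forms: "running" and "runs" come out as two different tokens, which is exactly the gap that stemming or lemmatization would address.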

I have a bit of experience in deep learning, but I am very new to NLP, and I just learned (from a [different tutorial on Udemy][2], which BTW was using Bag of Words) that using either a stemmer or a lemmatizer helps in …
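For a Bag of Words setup like the one mentioned above, one generally known effect of stemming is that inflected forms collapse into a shared token, so the vocabulary shrinks and counts pool together. The sketch below illustrates that with a deliberately crude suffix stripper; `toy_stem`, `SUFFIXES`, and `bag_of_words` are hypothetical names, and this is far simpler than a real stemmer such as Porter's.

```python
from collections import Counter

# Toy rule set (an assumption for illustration), much cruder than a real stemmer.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(token: str) -> str:
    """Strip the first matching suffix if at least 3 characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(tokens, stem=False):
    """Count token occurrences, optionally stemming each token first."""
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return Counter(tokens)


tokens = "the actor acted and the actors kept acting".split()
raw = bag_of_words(tokens)
stemmed = bag_of_words(tokens, stem=True)

if __name__ == "__main__":
    print(len(raw), "distinct tokens without stemming")
    print(len(stemmed), "distinct tokens with toy stemming")
```

Whether this kind of merging helps a learned embedding model as much as it helps a count-based one is a separate question, and this sketch does not try to answer it.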

datascience keras lemmatization stemming
