Nov. 17, 2023, 12:20 p.m. | Krikor Postalian-Yrausquin

Towards AI - Medium

NLP (doc2vec from scratch) & Clustering: Classification of news reports based on the content of the text

Using NLP (doc2vec), with deep and customized text cleaning, and then clustering (Birch) to find topics in the text of news articles.

In this example, I use NLP (Doc2Vec) and clustering algorithms to try to classify news by topic.

There are many ways to do this type of classification, such as using supervised methods (a tagged dataset), using clustering and using a specific …

articles classification clustering data science example machine learning nlp python reports text topics unsupervised learning

Machine Learning Postdoctoral Fellow

@ Lawrence Berkeley National Lab | Berkeley, Ca

Senior Data Engineer (Microsoft Azure)

@ Capco | UK - London

Senior Data Analyst

@ Publicis Groupe | Bengaluru, India

Senior Data Engineer

@ Press Ganey | Chicago, IL, United States

Senior Data Scientist (remote from EU)

@ PriceHubble | Vienna, Vienna, Austria - Remote

Data Science Co-op

@ Novelis | Atlanta, GA, United States