July 6, 2022, 8:13 p.m. | Dan Robinson

Towards Data Science - Medium towardsdatascience.com

Overcome LDA’s Shortcomings with Embedded Topic Models

The 2003 paper Latent Dirichlet Allocation established LDA as what is now probably the best-known and most widely used algorithm for topic modeling (Blei et al. 2003). Yet despite its ubiquity and longevity, those experienced with LDA are familiar with its limitations. In addition to its instability, detailed below, LDA requires substantial text preprocessing to obtain good results. Even putting aside the implementation details, LDA …

Tags: bert, hdbscan, lda, machine learning, modeling, topic modeling
