April 29, 2023, 7:17 p.m. | /u/Romcom1398

Machine Learning www.reddit.com

I have a dataset that includes, among other columns, book descriptions and a label for whether each book has gone viral. I want to extract topics from the descriptions by first applying TF-IDF (I also need TF-IDF because I want to use SMOTE, which requires numerical data) and then running LDA on the result. I have a few questions:

1. Do I fit the TF-IDF on the training data and then transform the validation and test data with that? …
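For illustration, the fit-on-train, transform-on-everything-else pattern asked about in question 1 can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the poster's actual pipeline: the helper names (`tokenize`, `fit_tfidf`, `transform_tfidf`), the toy descriptions, the whitespace tokenizer, and the smoothed IDF formula are all hypothetical choices made for the example. Library vectorizers such as scikit-learn's `TfidfVectorizer` expose the same split through separate `fit` and `transform` methods.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def fit_tfidf(train_docs):
    """Learn the vocabulary and IDF weights from the training documents only."""
    n_docs = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    vocab = {term: idx for idx, term in enumerate(sorted(doc_freq))}
    # Smoothed IDF (an assumed variant); every term in vocab gets a weight.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0
           for term, df in doc_freq.items()}
    return vocab, idf


def transform_tfidf(docs, vocab, idf):
    """Vectorize documents with an already fitted vocabulary and IDF."""
    rows = []
    for doc in docs:
        # Terms never seen during fitting are dropped, not added to the vocabulary.
        counts = Counter(t for t in tokenize(doc) if t in vocab)
        row = [0.0] * len(vocab)
        for term, count in counts.items():
            row[vocab[term]] = count * idf[term]
        norm = math.sqrt(sum(v * v for v in row))
        if norm > 0:
            row = [v / norm for v in row]  # L2-normalize each document vector
        rows.append(row)
    return rows


# Toy data standing in for the book descriptions (hypothetical).
train_descriptions = [
    "a wizard school adventure",
    "a detective solves a murder",
    "a love story in paris",
]
test_descriptions = ["a dragon adventure in paris"]

vocab, idf = fit_tfidf(train_descriptions)                  # fit on training data only
X_train = transform_tfidf(train_descriptions, vocab, idf)
X_test = transform_tfidf(test_descriptions, vocab, idf)     # reuse the fitted state
```

Because the vocabulary and IDF weights come from the training split alone, the word "dragon" in the test description is ignored rather than leaking into the feature space. Under the same reasoning, a resampling step such as SMOTE would presumably be applied to `X_train` only, never to the validation or test matrices.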
