June 1, 2023, 7:14 p.m. | /u/LargeBrick7

Natural Language Processing www.reddit.com

I have a corpus with a list of sentences and I want to build an n-gram language model with it. I use the padded_everygram_pipeline function from NLTK to build my ngrams and then fit a model. This works fine. I want to calculate the perplexity with lm.perplexity(test_data). The NLTK docs say that the function expects a list of ngrams. If I transform my test data with padded_everygram_pipeline into ngrams, the lm.perplexity function gives me a ZeroDivisionError. What is …

function language language model languagetechnology list ngrams nltk test
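A common cause, and a minimal sketch of one way to fix it (assuming an NLTK model of fixed order; the Laplace smoothing and toy sentences below are illustrative assumptions, not from the original post): the generators returned by padded_everygram_pipeline can only be consumed once, so if they have already been iterated the perplexity call ends up averaging over zero ngrams and divides by zero; lm.perplexity also expects a flat iterable of ngram tuples rather than the pipeline's nested per-sentence generators. Building the test ngrams explicitly sidesteps both issues:

    from nltk.lm import Laplace
    from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
    from nltk.util import ngrams

    order = 3

    # Tokenised corpora: each sentence is a list of tokens (toy data for illustration).
    train_sents = [["the", "cat", "sat"], ["the", "dog", "ran"]]
    test_sents = [["the", "cat", "ran"]]

    # Training side: padded_everygram_pipeline yields the everygrams plus the
    # padded token stream used to build the vocabulary.
    train_data, padded_vocab = padded_everygram_pipeline(order, train_sents)
    lm = Laplace(order)  # smoothing keeps unseen test ngrams from giving infinite perplexity
    lm.fit(train_data, padded_vocab)

    # Evaluation side: build a flat list of highest-order ngram tuples by hand
    # instead of reusing padded_everygram_pipeline on the test sentences.
    test_ngrams = [
        ng
        for sent in test_sents
        for ng in ngrams(pad_both_ends(sent, n=order), order)
    ]
    print(lm.perplexity(test_ngrams))

Scoring only the highest-order ngrams, rather than every 1- to n-gram the pipeline produces, is also the conventional way to report perplexity for an n-gram model; per-sentence perplexities can be obtained by calling lm.perplexity on each sentence's ngram list separately.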
