April 24, 2023, 12:48 a.m. | Karthick Prasad Gunasekaran, B Chase Babrich, Saurabh Shirodkar, Hee Hwang

cs.CL updates on arXiv.org arxiv.org

We explore the problem of predicting the publication period of a text document,
such as a news article, using the text from that document. To do so,
we created our own extensive labeled dataset of over 350,000 news articles
published by The New York Times over six decades. We then provide an
implementation of a simple Naive Bayes baseline model, which surprisingly
achieves decent performance in terms of accuracy. Finally, for our approach, we
use a pretrained BERT model fine-tuned …
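
To make the Naive Bayes baseline idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier that assigns a text to a publication period. It is not the authors' implementation: the class name `NaiveBayesPeriodClassifier`, the whitespace tokenizer, the add-one smoothing, the decade labels, and the four toy sentences are all assumptions made up for illustration, and the real dataset and preprocessing are not described in this excerpt.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesPeriodClassifier:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.doc_counts = Counter(labels)        # documents per period
        self.word_counts = defaultdict(Counter)  # word frequencies per period
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # log P(period) + sum over words of log P(word | period)
            score = math.log(n_docs / self.total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy corpus invented for this sketch; not taken from the paper's dataset.
train_texts = [
    "astronauts land on the moon",
    "president speaks about the space race",
    "new website launches on the internet",
    "email and internet reshape the office",
]
train_labels = ["1960s", "1960s", "1990s", "1990s"]

model = NaiveBayesPeriodClassifier().fit(train_texts, train_labels)
print(model.predict("the internet changes email"))  # prints 1990s
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing keeps a word that never appeared in one period from driving that period's probability to zero.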

