Feb. 6, 2024, 5:54 a.m. | Darija Medvecki Bojana Ba\v{s}aragin Adela Ljaji\'c Nikola Milo\v{s}evi\'c

cs.CL updates on arXiv.org arxiv.org

This paper presents the results of the first application of BERTopic, a state-of-the-art topic modeling technique, to short text written in a morphologi-cally rich language. We applied BERTopic with three multilingual embed-ding models on two levels of text preprocessing (partial and full) to evalu-ate its performance on partially preprocessed short text in Serbian. We also compared it to LDA and NMF on fully preprocessed text. The experiments were conducted on a dataset of tweets expressing hesitancy toward COVID-19 vaccination. Our …

application art bertopic case cs.ai cs.cl embed language modeling multilingual paper performance state text topic modeling transformer

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

.NET Software Engineer (AI Focus)

@ Boskalis | Papendrecht, Netherlands