April 30, 2024, 4:43 a.m. | Saloni Mittal, Vidula Magdum, Omkar Dhekane, Sharayu Hiwarkhedkar, Raviraj Joshi

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.18216v1 Announce Type: cross
Abstract: The availability of text or topic classification datasets in the low-resource Marathi language is limited, typically consisting of fewer than 4 target labels, with some achieving nearly perfect accuracy. In this work, we introduce L3Cube-MahaNews, a Marathi text classification corpus that focuses on News headlines and articles. This corpus stands out as the largest supervised Marathi Corpus, containing over 1.05L records classified into a diverse range of 12 categories. To accommodate different document lengths, MahaNews …

arxiv classification cs.cl cs.lg datasets document text type

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US