April 2, 2024, 7:43 p.m. | Jakub Piskorski, Michał Marcińczuk, Roman Yangarber

cs.LG updates on arXiv.org

arXiv:2404.00482v1 Announce Type: cross
Abstract: This paper presents a corpus manually annotated with named entities for six Slavic languages - Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. This work is the result of a series of shared tasks, conducted in 2017-2023 as a part of the Workshops on Slavic Natural Language Processing. The corpus consists of 5 017 documents on seven topics. The documents are annotated with five classes of named entities. Each entity is described by a category, a …

