April 2, 2024, 7:43 p.m. | Jakub Piskorski, Micha{\l} Marci\'nczuk, Roman Yangarber

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.00482v1 Announce Type: cross
Abstract: This paper presents a corpus manually annotated with named entities for six Slavic languages - Bulgarian, Czech, Polish, Slovenian, Russian, and Ukrainian. This work is the result of a series of shared tasks, conducted in 2017-2023 as a part of the Workshops on Slavic Natural Language Processing. The corpus consists of 5 017 documents on seven topics. The documents are annotated with five classes of named entities. Each entity is described by a category, a …

abstract arxiv cross-lingual cs.ai cs.cl cs.lg czech language language processing languages natural natural language natural language processing paper part processing series six tasks type work workshops

Data Scientist (m/f/x/d)

@ Symanto Research GmbH & Co. KG | Spain, Germany

NUSolve Innovation Assistant/Associate in Data Science'

@ Newcastle University | Newcastle, GB

Data Engineer (Snowflake)

@ Unit4 | Lisbon, Portugal

Lead Data Engineer

@ Provident Bank | Woodbridge, NJ, US

Specialist Solutions Engineer (Data Science/Machine Learning)

@ Databricks | London, United Kingdom

Staff Software Engineer, Data Mirgrations

@ Okta | Canada