March 26, 2024, 4:52 a.m. | Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Šuppa, Hila Gonen, Joseph Marvin Imperial, Börje F. Karlsson, Peiqin Lin, Nikola Ljubeši…

cs.CL updates on arXiv.org

arXiv:2311.09122v2 Announce Type: replace
Abstract: We introduce Universal NER (UNER), an open, community-driven project to develop gold-standard NER benchmarks in many languages. The overarching goal of UNER is to provide high-quality, cross-lingually consistent annotations to facilitate and standardize multilingual NER research. UNER v1 contains 18 datasets annotated with named entities in a cross-lingually consistent schema across 12 diverse languages. In this paper, we detail the dataset creation and composition of UNER; we also provide initial modeling baselines on both in-language …

