Towards Responsible Natural Language Annotation for the Varieties of Arabic. (arXiv:2203.09597v1 [cs.CL])
March 21, 2022, 1:11 a.m. | A. Stevie Bergman, Mona T. Diab
cs.CL updates on arXiv.org
When building NLP models, there is a tendency to aim for broader coverage,
often overlooking cultural and (socio)linguistic nuance. In this position
paper, we make the case for care and attention to such nuances, particularly in
dataset annotation, as well as the inclusion of cultural and linguistic
expertise in the process. We present a playbook for responsible dataset
creation for polyglossic, multidialectal languages. This work is informed by a
study on Arabic annotation of social media content.