GeniL: A Multilingual Dataset on Generalizing Language
April 10, 2024, 4:47 a.m. | Aida Mostafazadeh Davani, Sagar Gubbi, Sunipa Dev, Shachi Dave, Vinodkumar Prabhakaran
cs.CL updates on arXiv.org (arxiv.org)
Abstract: LLMs are increasingly transforming our digital ecosystem, but they often inherit societal biases learned from their training data, for instance stereotypes associating certain attributes with specific identity groups. Whether and how these biases are mitigated may depend on the specific use case, but being able to effectively detect instances of stereotype perpetuation is a crucial first step. Current methods to assess the presence of stereotypes in generated language rely on simple template- or co-occurrence-based measures, …