all AI news
GeniL: A Multilingual Dataset on Generalizing Language
April 10, 2024, 4:47 a.m. | Aida Mostafazadeh Davani, Sagar Gubbi, Sunipa Dev, Shachi Dave, Vinodkumar Prabhakaran
cs.CL updates on arXiv.org arxiv.org
Abstract: LLMs are increasingly transforming our digital ecosystem, but they often inherit societal biases learned from their training data, for instance stereotypes associating certain attributes with specific identity groups. While whether and how these biases are mitigated may depend on the specific use cases, being able to effectively detect instances of stereotype perpetuation is a crucial first step. Current methods to assess presence of stereotypes in generated language rely on simple template or co-occurrence based measures, …
abstract arxiv biases cases cs.cl data dataset digital digital ecosystem ecosystem identity instance language llms multilingual stereotypes training training data type use cases
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Data Engineer - Takealot Group (Takealot.com | Superbalist.com | Mr D Food)
@ takealot.com | Cape Town