March 18, 2024, 4:43 a.m. | Tomasz Limisiewicz, David Mareček, Tomáš Musil

stat.ML updates on arXiv.org (arxiv.org)

arXiv:2310.18913v3 Announce Type: replace-cross
Abstract: Large language models are becoming the go-to solution for an ever-growing number of tasks. However, with growing capacity, models are prone to relying on spurious correlations stemming from biases and stereotypes present in the training data. This work proposes a novel method for detecting and mitigating gender bias in language models. We perform causal analysis to identify problematic model components and discover that mid-upper feed-forward layers are most prone to conveying bias. Based on the …
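To make the causal-analysis idea concrete, here is a minimal sketch (not the paper's exact procedure) of one common style of intervention: zero-ablate the feed-forward (MLP) sub-layers of a mid-upper band of transformer blocks and observe how the gap between gendered continuations changes. The model choice (gpt2), the prompt, the layer range, and zero-ablation itself are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: measure a gendered log-probability gap before and after
# ablating the MLP outputs of mid-upper transformer layers in GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The nurse said that"          # illustrative prompt, an assumption
he_id = tokenizer(" he")["input_ids"][0]
she_id = tokenizer(" she")["input_ids"][0]

def gender_gap():
    """Log-probability gap between ' he' and ' she' as the next token."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]
    logprobs = torch.log_softmax(logits, dim=-1)
    return (logprobs[he_id] - logprobs[she_id]).item()

baseline_gap = gender_gap()

# Ablate the feed-forward sub-layers of blocks 6-9 (of GPT-2's 12 blocks);
# the band is an assumed stand-in for "mid-upper layers".
def zero_mlp(module, inputs, output):
    return torch.zeros_like(output)

hooks = [model.transformer.h[i].mlp.register_forward_hook(zero_mlp)
         for i in range(6, 10)]
ablated_gap = gender_gap()
for h in hooks:
    h.remove()

print(f"gap before ablation:            {baseline_gap:+.3f}")
print(f"gap after ablating mid-upper MLPs: {ablated_gap:+.3f}")
```

A large shrink in the gap after ablating only those layers would point to the feed-forward components of that band as carriers of the bias, which is the kind of localization the abstract describes.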

