Race and ethnicity data for first, middle, and last names. (arXiv:2208.12443v1 [stat.OT])
Aug. 29, 2022, 1:10 a.m. | Evan T. R. Rosenman, Santiago Olivella, Kosuke Imai
cs.LG updates on arXiv.org
We provide the largest compiled publicly available dictionaries of first,
middle, and last names for the purpose of imputing race and ethnicity using,
for example, Bayesian Improved Surname Geocoding (BISG). The dictionaries are
based on the voter files of six Southern states that collect self-reported
racial data upon voter registration. Our data cover a much larger scope of
names than any comparable dataset, containing roughly one million first names,
1.1 million middle names, and 1.4 million surnames. Individuals are categorized …
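
To illustrate how name dictionaries of this kind are typically used, here is a minimal sketch of a BISG-style Bayesian update, extended to first and middle names under the usual assumption that the name and location components are conditionally independent given race. The function name `bisg_posterior`, the group labels "A", "B", "C", and every probability below are hypothetical placeholders chosen for illustration; they are not taken from the paper's dictionaries or its category scheme.

```python
from typing import Dict, Iterable

Probs = Dict[str, float]


def bisg_posterior(p_group_given_surname: Probs,
                   likelihoods: Iterable[Probs]) -> Probs:
    """Combine a surname-based prior with per-group likelihoods.

    p_group_given_surname: P(group | surname), the prior.
    likelihoods: tables of P(evidence | group), e.g. for a first name,
        a middle name, or a geographic unit. Treating them as
        conditionally independent given the group is an assumption.
    Returns P(group | all evidence), normalized to sum to 1.
    """
    unnormalized = dict(p_group_given_surname)
    for table in likelihoods:
        for group in unnormalized:
            unnormalized[group] *= table[group]

    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All groups have zero probability for this input.")
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical numbers for illustration only.
surname_prior = {"A": 0.6, "B": 0.3, "C": 0.1}          # P(group | surname)
first_name_lik = {"A": 0.002, "B": 0.001, "C": 0.004}   # P(first name | group)
location_lik = {"A": 0.01, "B": 0.03, "C": 0.01}        # P(location | group)

if __name__ == "__main__":
    posterior = bisg_posterior(surname_prior, [first_name_lik, location_lik])
    for group, prob in posterior.items():
        print(f"{group}: {prob:.2f}")
```

A middle-name table, where available, would be passed as one more entry in `likelihoods`; omitting a table simply leaves that evidence out of the update.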