Aug. 29, 2022, 1:10 a.m. | Evan T. R. Rosenman, Santiago Olivella, Kosuke Imai

cs.LG updates on arXiv.org arxiv.org

We provide the largest compiled publicly available dictionaries of first,
middle, and last names for the purpose of imputing race and ethnicity using,
for example, Bayesian Improved Surname Geocoding (BISG). The dictionaries are
based on the voter files of six Southern states that collect self-reported
racial data upon voter registration. Our data cover a much larger scope of
names than any comparable dataset, containing roughly one million first names,
1.1 million middle names, and 1.4 million surnames. Individuals are categorized …
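For readers unfamiliar with BISG, the core update can be sketched in a few lines of Python. This is a minimal illustration of the standard Bayes-rule step, in which the posterior for a race category is proportional to P(race | surname) times P(location | race). It is not the authors' implementation. The function name `bisg_posterior`, the group labels, and all probabilities below are hypothetical placeholders, not values from the dictionaries.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and geography evidence with Bayes' rule.

    posterior(r) is proportional to P(r | surname) * P(geo | r),
    then normalized so the values sum to one.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has nonzero probability")
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs for illustration only (not drawn from the paper's data).
# P(race | surname), as would be looked up in a surname dictionary:
p_race_given_surname = {"group_a": 0.6, "group_b": 0.3, "group_c": 0.1}
# P(location | race), as would be derived from census geography counts:
p_geo_given_race = {"group_a": 0.001, "group_b": 0.004, "group_c": 0.002}

posterior = bisg_posterior(p_race_given_surname, p_geo_given_race)
```

In this sketch the geographic evidence shifts probability toward `group_b` even though the surname alone favors `group_a`. First- and middle-name dictionaries could enter as additional multiplicative terms of the form P(name | race), under the assumption that names are conditionally independent given race.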

