May 25, 2022, 1:11 a.m. | Sara Lafia, Lizhou Fan, Libby Hemphill

cs.CL updates on arXiv.org arxiv.org

Discovering authoritative links between publications and the datasets that
they use can be a labor-intensive process. We introduce a natural language
processing pipeline that retrieves and reviews publications for informal
references to research datasets, which complements the work of data librarians.
We first describe the components of the pipeline and then apply it to expand an
authoritative bibliography linking thousands of social science studies to the
data-related publications in which they are used. The pipeline increases recall
for literature to …

academic arxiv data dl language language processing literature natural natural language natural language processing pipeline processing

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Engineer

@ Parker | New York City

Sr. Data Analyst | Home Solutions

@ Three Ships | Raleigh or Charlotte, NC