A Natural Language Processing Pipeline for Detecting Informal Data References in Academic Literature. (arXiv:2205.11651v1 [cs.DL])
cs.CL updates on arXiv.org
Discovering authoritative links between publications and the datasets that
they use can be a labor-intensive process. We introduce a natural language
processing pipeline that retrieves and reviews publications for informal
references to research datasets, which complements the work of data librarians.
We first describe the components of the pipeline and then apply it to expand an
authoritative bibliography linking thousands of social science studies to the
data-related publications in which they are used. The pipeline increases recall
for literature to …