Oct. 21, 2022, 1:18 a.m. | Nils Dycke, Ilia Kuznetsov, Iryna Gurevych

cs.CL updates on arXiv.org

The shift towards publicly available text sources has enabled language
processing at unprecedented scale, yet leaves under-serviced the domains where
public and openly licensed data is scarce. Proactively collecting text data for
research is a viable strategy to address this scarcity, but it lacks a systematic
methodology that takes into account the many ethical, legal and
confidentiality-related aspects of data collection. Our work presents a case
study on proactive data collection in peer review -- a challenging and
under-resourced NLP domain. We outline …

