May 16, 2022, 1:11 a.m. | Fan Bai, Alan Ritter, Wei Xu

cs.CL updates on arXiv.org

Recent work has demonstrated that pre-training in-domain language models can boost performance when adapting to a new domain. However, the costs associated with pre-training raise an important question: given a fixed budget, what steps should an NLP practitioner take to maximize performance? In this paper, we view domain adaptation with a constrained budget as a consumer choice problem, where the goal is to select an optimal combination of data annotation and pre-training. We measure annotation costs of three procedural text …

arxiv budget domain adaptation
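
The consumer-choice framing above can be made concrete as a small budget-constrained search. The following is a minimal sketch, not the authors' method: it enumerates hypothetical combinations of annotation and pre-training spend under a fixed budget and picks the one with the highest estimated utility. The cost figures and the utility function are illustrative assumptions, not values from the paper.

```python
from itertools import product

BUDGET = 1000.0  # total budget in dollars (hypothetical)

# Hypothetical per-unit costs: annotating one document vs. one hour of
# in-domain pre-training compute.
COST_PER_ANNOTATED_DOC = 2.0
COST_PER_PRETRAIN_HOUR = 5.0


def estimated_performance(n_docs: int, pretrain_hours: int) -> float:
    """Toy stand-in for a performance estimate (e.g., dev-set F1).

    Assumes diminishing returns in both annotation and pre-training;
    in practice this would be replaced with measurements from pilot runs.
    """
    return 0.6 * (1 - 0.99 ** n_docs) + 0.4 * (1 - 0.95 ** pretrain_hours)


best = None
for n_docs, hours in product(range(0, 501, 25), range(0, 201, 10)):
    cost = n_docs * COST_PER_ANNOTATED_DOC + hours * COST_PER_PRETRAIN_HOUR
    if cost > BUDGET:
        continue  # skip combinations that exceed the budget constraint
    score = estimated_performance(n_docs, hours)
    if best is None or score > best[0]:
        best = (score, n_docs, hours, cost)

score, n_docs, hours, cost = best
print(f"Best combo: {n_docs} annotated docs + {hours}h pre-training "
      f"(cost ${cost:.0f}, est. performance {score:.3f})")
```

The design choice here is just brute-force enumeration over a coarse grid, which is enough to illustrate the trade-off; the paper itself studies which combinations are optimal under varying budgets rather than prescribing this search procedure.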
