May 16, 2022, 1:11 a.m. | Fan Bai, Alan Ritter, Wei Xu

cs.CL updates on arXiv.org

Recent work has demonstrated that pre-training in-domain language models can boost performance when adapting to a new domain. However, the costs associated with pre-training raise an important question: given a fixed budget, what steps should an NLP practitioner take to maximize performance? In this paper, we view domain adaptation with a constrained budget as a consumer choice problem, where the goal is to select an optimal combination of data annotation and pre-training. We measure annotation costs of three procedural text …

arxiv budget domain adaptation
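
The consumer-choice framing above can be made concrete as a small budget-constrained search. The following is a minimal sketch, not the authors' method: it enumerates hypothetical combinations of annotation and pre-training spend under a fixed budget and picks the one with the highest estimated utility. The cost figures and the utility function are illustrative assumptions, not values from the paper.

```python
from itertools import product

BUDGET = 1000.0  # total budget in dollars (hypothetical)

# Hypothetical per-unit costs: annotating one document vs. one hour of
# in-domain pre-training compute.
COST_PER_ANNOTATED_DOC = 2.0
COST_PER_PRETRAIN_HOUR = 5.0


def estimated_performance(n_docs: int, pretrain_hours: int) -> float:
    """Toy stand-in for a performance estimate (e.g., dev-set F1).

    Assumes diminishing returns in both annotation and pre-training;
    in practice this would be replaced with measurements from pilot runs.
    """
    return 0.6 * (1 - 0.99 ** n_docs) + 0.4 * (1 - 0.95 ** pretrain_hours)


best = None
for n_docs, hours in product(range(0, 501, 25), range(0, 201, 10)):
    cost = n_docs * COST_PER_ANNOTATED_DOC + hours * COST_PER_PRETRAIN_HOUR
    if cost > BUDGET:
        continue  # skip combinations that exceed the budget constraint
    score = estimated_performance(n_docs, hours)
    if best is None or score > best[0]:
        best = (score, n_docs, hours, cost)

score, n_docs, hours, cost = best
print(f"Best combo: {n_docs} annotated docs + {hours}h pre-training "
      f"(cost ${cost:.0f}, est. performance {score:.3f})")
```

The design choice here is just brute-force enumeration over a coarse grid, which is enough to illustrate the trade-off; the paper itself studies which combinations are optimal under varying budgets rather than prescribing this search procedure.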
