Nov. 15, 2022, 2:16 a.m. | Kabir Ahuja, Monojit Choudhury, Sandipan Dandapat

cs.CL updates on arXiv.org arxiv.org

Borrowing ideas from {\em Production functions} in micro-economics, in this
paper we introduce a framework to systematically evaluate the performance and
cost trade-offs between machine-translated and manually-created labelled data
for task-specific fine-tuning of massively multilingual language models. We
illustrate the effectiveness of our framework through a case-study on the
TyDIQA-GoldP dataset. One of the interesting conclusions of the study is that
if the cost of machine translation is greater than zero, the optimal
performance at least cost is always achieved …

arxiv cost data economics few-shot learning machine modeling performance trade

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Analyst

@ SEAKR Engineering | Englewood, CO, United States

Data Analyst II

@ Postman | Bengaluru, India

Data Architect

@ FORSEVEN | Warwick, GB

Director, Data Science

@ Visa | Washington, DC, United States

Senior Manager, Data Science - Emerging ML

@ Capital One | McLean, VA