April 25, 2024, 5:44 p.m. | Saksham Bassi, Duygu Ataman, Kyunghyun Cho

cs.CL updates on arXiv.org

arXiv:2404.15928v1 Announce Type: new
Abstract: A model's capacity to generalize its knowledge to interpret unseen inputs with different characteristics is crucial for building robust and reliable machine learning systems. Language model evaluation tasks lack informative metrics about model generalization, and their applicability in a new setting is measured using task- and language-specific downstream performance, which is often unavailable for many languages and tasks. In this paper, we explore a set of efficient and reliable measures that could aid in computing …

