May 1, 2024, 4:47 a.m. | Vaishak Narayanan, Prabin Raj KP, Saifudheen Nouphal

cs.CL updates on arXiv.org arxiv.org

arXiv:2404.19254v1 Announce Type: new
Abstract: Current evaluation benchmarks for question answering (QA) in Indic languages often rely on machine translation of existing English datasets. This approach suffers from the bias and inaccuracies inherent in machine translation, producing datasets that may not reflect the true capabilities of extractive QA (EQA) models for Indic languages. This paper proposes a new benchmark designed specifically for evaluating Hindi EQA models and discusses a methodology for doing the same for any task. This method leverages large language …

