April 4, 2024, 4:42 a.m. | Nataliia Kholodna, Sahib Julka, Mohammad Khodadadi, Muhammed Nurullah Gumus, Michael Granitzer

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.02261v1 Announce Type: cross
Abstract: Low-resource languages face significant barriers in AI development due to limited linguistic resources and expertise for data labeling, rendering them rare and costly. The scarcity of data and the absence of preexisting tools exacerbate these challenges, especially since these languages may not be adequately represented in various NLP datasets. To address this gap, we propose leveraging the potential of LLMs in the active learning loop for data annotation. Initially, we conduct evaluations to assess inter-annotator …

abstract active learning ai development annotations arxiv challenges cs.ai cs.cl cs.ir cs.lg data data labeling development expertise face labeling language language model languages large language large language model llms loop low rendering resources them tools type

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Business Intelligence Manager

@ Sanofi | Budapest

Principal Engineer, Data (Hybrid)

@ Homebase | Toronto, Ontario, Canada