From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation

April 16, 2024, 4:43 a.m. | Artur Kiulian, Anton Polishko, Mykola Khandoga, Oryna Chubych, Jack Connor, Raghav Ravishankar, Adarsh Shirawalmath

cs.LG updates on arXiv.org

arXiv:2404.09138v1 Announce Type: cross
Abstract: In the rapidly advancing field of AI and NLP, generative large language models (LLMs) stand at the forefront of innovation, showcasing unparalleled abilities in text understanding and generation. However, the limited representation of low-resource languages like Ukrainian poses a notable challenge, restricting the reach and relevance of this technology. Our paper addresses this by fine-tuning the open-source Gemma and Mistral LLMs with Ukrainian datasets, aiming to improve their linguistic proficiency and benchmarking them against other …
