Aya 23: Open Weight Releases to Further Multilingual Progress | allainews.com

May 27, 2024, 4:49 a.m. | Viraat Aryabumi, John Dang, Dwarak Talupuru, Saurabh Dash, David Cairuz, Hangyu Lin, Bharat Venkitesh, Madeline Smith, Kelly Marchisio, Sebastian Rude

cs.CL updates on arXiv.org arxiv.org

arXiv:2405.15032v1 Announce Type: new
Abstract: This technical report introduces Aya 23, a family of multilingual language models. Aya 23 builds on the recent release of the Aya model (Üstün et al., 2024), focusing on pairing a highly performant pre-trained model with the recently released Aya collection (Singh et al., 2024). The result is a powerful multilingual large language model serving 23 languages, expanding state-of-the-art language modeling capabilities to approximately half of the world's population. The Aya model covered 101 languages …
