all AI news
ArabianGPT: Native Arabic GPT-based Large Language
Feb. 26, 2024, 5:43 a.m. | Anis Koubaa, Adel Ammar, Lahouari Ghouti, Omar Najar, Serry Sibaee
cs.LG updates on arXiv.org arxiv.org
Abstract: The predominance of English and Latin-based large language models (LLMs) has led to a notable deficit in native Arabic LLMs. This discrepancy is accentuated by the prevalent inclusion of English tokens in existing Arabic models, detracting from their efficacy in processing native Arabic's intricate morphology and syntax. Consequently, there is a theoretical and practical imperative for developing LLMs predominantly focused on Arabic linguistic elements. To address this gap, this paper proposes ArabianGPT, a series of …
abstract arabic arxiv cs.ai cs.cl cs.lg deficit english gpt inclusion language language models large language large language models llms processing syntax tokens type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
#13721 - Data Engineer - AI Model Testing
@ Qualitest | Miami, Florida, United States
Elasticsearch Administrator
@ ManTech | 201BF - Customer Site, Chantilly, VA