Fostering the Ecosystem of Open Neural Encoders for Portuguese with Albertina PT* Family

March 5, 2024, 2:52 p.m. | Rodrigo Santos, João Rodrigues, Luís Gomes, João Silva, António Branco, Henrique Lopes Cardoso, Tomás Freitas Osório, Bernardo Leite

cs.CL updates on arXiv.org

arXiv:2403.01897v1
Abstract: To foster the neural encoding of Portuguese, this paper contributes foundation encoder models that represent an expansion of the still very scarce ecosystem of large language models specifically developed for this language that are fully open, in the sense that they are open source and openly distributed for free under an open license for any purpose, thus including research and commercial usages. Like most languages other than English, Portuguese is low-resourced in terms of these …
