March 5, 2024, 2:52 p.m. | Rodrigo Santos, João Rodrigues, Luís Gomes, João Silva, António Branco, Henrique Lopes Cardoso, Tomás Freitas Osório, Bernardo Leite

cs.CL updates on arXiv.org arxiv.org

arXiv:2403.01897v1 Announce Type: new
Abstract: To foster the neural encoding of Portuguese, this paper contributes foundation encoder models that represent an expansion of the still very scarce ecosystem of large language models specifically developed for this language that are fully open, in the sense that they are open source and openly distributed for free under an open license for any purpose, thus including research and commercial usages. Like most languages other than English, Portuguese is low-resourced in terms of these …

