Feb. 27, 2024, 5:50 a.m. | Adrien Bazoge, Emmanuel Morin, Beatrice Daille, Pierre-Antoine Gourraud

cs.CL updates on arXiv.org

arXiv:2402.16689v1 Announce Type: new
Abstract: Recently, pretrained language models based on BERT have been introduced for the French biomedical domain. Although these models have achieved state-of-the-art results on biomedical and clinical NLP tasks, they are constrained by a limited input sequence length of 512 tokens, which poses challenges when applied to clinical notes. In this paper, we present a comparative study of three adaptation strategies for long-sequence models, leveraging the Longformer architecture. We conducted evaluations of these models on 16 …
