April 24, 2023, 12:48 a.m. | Vitaly Shalumov, Harel Haskey

cs.CL updates on arXiv.org

In this paper, we fill an existing gap in the resources available to the
Hebrew NLP community by providing the largest pre-training dataset to date,
HeDC4; a state-of-the-art pre-trained language model, HeRo, for standard-length
inputs; and an efficient transformer, LongHeRo, for long input sequences. The HeRo
model was evaluated on sentiment analysis, named entity recognition,
and question answering tasks, while the LongHeRo model was evaluated on a
document classification task with a dataset composed …

