May 15, 2023, 12:46 a.m. | Ilias Chalkidis, Nicolas Garneau, Catalina Goanta, Daniel Martin Katz, Anders Søgaard

cs.CL updates on arXiv.org

In this work, we conduct a detailed analysis of the performance of legal-oriented pre-trained language models (PLMs). We examine the interplay between their original objective, acquired knowledge, and legal language understanding capacities, which we define as the upstream, probing, and downstream performance, respectively. We consider not only the models' size but also the pre-training corpora used as important dimensions in our study. To this end, we release a multinational English legal corpus (LeXFiles) and a legal knowledge probing benchmark (LegalLAMA) …
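As a rough illustration of what cloze-style legal knowledge probing looks like in practice, here is a minimal sketch using a masked language model. It is not the paper's benchmark code: the model ID points to a publicly available legal PLM (nlpaueb/legal-bert-base-uncased), and the probe sentence is a made-up example rather than an item from LegalLAMA.

```python
# Minimal sketch of masked-LM probing in the spirit of LegalLAMA:
# give a legal-oriented PLM a cloze-style legal sentence and inspect
# its top predictions for the masked legal term.
from transformers import pipeline

# Illustrative choice of legal PLM; any masked LM on the Hub works here.
fill_mask = pipeline("fill-mask", model="nlpaueb/legal-bert-base-uncased")

# Hypothetical probe: the model must recover a legal term from context.
probe = "The applicant filed a [MASK] against the respondent before the tribunal."

for pred in fill_mask(probe, top_k=5):
    print(f"{pred['token_str']:>12}  score={pred['score']:.3f}")
```

Ranking the model's candidate fillers against the gold legal term, aggregated over many such probes, is the kind of signal a probing benchmark reports, separately from upstream (pre-training loss) and downstream (fine-tuned task) performance.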
