CELLS: A Parallel Corpus for Biomedical Lay Language Generation. (arXiv:2211.03818v1 [cs.CL])
Nov. 9, 2022, 2:15 a.m. | Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
cs.CL updates on arXiv.org (arxiv.org)
Recent lay language generation systems have used Transformer models trained
on a parallel corpus to increase health information accessibility. However, the
applicability of these models is constrained by the limited size and topical
breadth of available corpora. We introduce CELLS, the largest (63k pairs) and
broadest-ranging (12 journals) parallel corpus for lay language generation. The
abstract and the corresponding lay language summary are written by domain
experts, assuring the quality of our dataset. Furthermore, qualitative
evaluation of expert-authored plain language …