April 9, 2024, 4:42 a.m. | Shreyasi Mandal, Ashutosh Modi

cs.LG updates on arXiv.org

arXiv:2404.04510v1 Announce Type: cross
Abstract: Large Language Models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies. This research investigates LLMs' robustness, consistency, and faithful reasoning when performing Natural Language Inference (NLI) on breast cancer Clinical Trial Reports (CTRs) in the context of SemEval 2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. We examine the reasoning capabilities of LLMs and their …
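To make the task concrete, below is a minimal sketch of the NLI setup described in SemEval 2024 Task 2: a clinical trial report serves as the premise, a candidate statement as the hypothesis, and the model must label the pair as Entailment or Contradiction. The prompt wording and the `query_llm` helper are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch of the NLI-over-CTRs setup (SemEval 2024 Task 2): given a clinical
# trial report (premise) and a statement (hypothesis), ask an LLM whether the
# statement is entailed by or contradicts the report.
# `query_llm` is a hypothetical callable standing in for whatever model API is used.

def build_nli_prompt(ctr_text: str, statement: str) -> str:
    """Compose an entailment prompt from a CTR excerpt and a candidate statement."""
    return (
        "Clinical Trial Report:\n"
        f"{ctr_text}\n\n"
        f"Statement: {statement}\n\n"
        "Does the report entail or contradict the statement? "
        "Answer with exactly one word: Entailment or Contradiction."
    )


def classify(ctr_text: str, statement: str, query_llm) -> str:
    """Map the model's free-text answer onto the two task labels."""
    answer = query_llm(build_nli_prompt(ctr_text, statement)).strip().lower()
    return "Entailment" if answer.startswith("entail") else "Contradiction"


if __name__ == "__main__":
    # Toy stand-in for a real model call, only to make the sketch runnable.
    def dummy_llm(prompt: str) -> str:
        return "Entailment"

    print(classify(
        "The intervention arm received 20 mg of drug X daily.",
        "Patients in the intervention arm were given drug X.",
        dummy_llm,
    ))
```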
