March 27, 2024, 4:48 a.m. | Shijia Zhou, Leonie Weissweiler, Taiqi He, Hinrich Schütze, David R. Mortensen, Lori Levin

cs.CL updates on arXiv.org

arXiv:2403.17760v1 Announce Type: new
Abstract: In this paper, we make a contribution that can be understood from two perspectives: from an NLP perspective, we introduce a small challenge dataset for NLI with large lexical overlap, which minimises the possibility of models discerning entailment solely based on token distinctions, and show that GPT-4 and Llama 2 fail it with strong bias. We then create further challenging sub-tasks in an effort to explain this failure. From a Computational Linguistics perspective, we identify …

