May 20, 2022, 1:11 a.m. | Antonia Creswell, Murray Shanahan, Irina Higgins

cs.CL updates on arXiv.org

Large language models (LLMs) have been shown to be capable of impressive
few-shot generalisation to new tasks. However, they still tend to perform
poorly on multi-step logical reasoning problems. Here we carry out a
comprehensive evaluation of LLMs on 50 tasks that probe different aspects of
logical reasoning. We show that language models tend to perform fairly well at
single-step inference or entailment tasks, but struggle to chain together
multiple reasoning steps to solve more complex problems. In light …

ai, arxiv, inference, language, language models, large language models, reasoning
