April 9, 2024, 4:44 a.m. | Jordan Meadows, Marco Valentino, Damien Teney, Andre Freitas

cs.LG updates on arXiv.org

arXiv:2305.12563v2 Announce Type: replace-cross
Abstract: This paper proposes a methodology for generating and perturbing detailed derivations of equations at scale, aided by a symbolic engine, to evaluate the generalisability of Transformers to out-of-distribution mathematical reasoning problems. Instantiating the framework in the context of sequence classification tasks, we compare the capabilities of GPT-4, GPT-3.5, and a canon of fine-tuned BERT models, exploring the relationship between specific operators and generalisation failure via the perturbation of reasoning aspects such as symmetry and variable …

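As a rough illustration of the kind of pipeline the abstract describes, the sketch below uses SymPy as the symbolic engine to build a toy two-step derivation and apply two perturbations of the sort the abstract mentions, variable renaming and equality symmetry, before linearising each version as input for a sequence classifier. The function names, the toy derivation, and the `[SEP]` separator are illustrative assumptions, not the authors' actual framework.

```python
# Minimal sketch (not the paper's pipeline): generate a small symbolic
# derivation, perturb it, and linearise it for sequence classification.
import sympy as sp


def generate_derivation():
    """Build a toy two-step derivation with a symbolic engine:
    a premise equation followed by its derivative with respect to x."""
    x = sp.Symbol("x")
    y = sp.Function("y")(x)
    rhs = sp.sin(x) * sp.exp(x)
    premise = sp.Eq(y, rhs)
    step = sp.Eq(sp.Derivative(y, x), sp.diff(rhs, x))
    return [premise, step]


def perturb_rename(steps):
    """Variable-renaming perturbation: rename x -> u in the final step
    only, so the derivation is no longer notationally consistent."""
    x, u = sp.symbols("x u")
    perturbed = list(steps)
    perturbed[-1] = perturbed[-1].subs(x, u)
    return perturbed


def perturb_symmetry(steps):
    """Symmetry perturbation: swap the two sides of the final equation,
    probing whether a classifier is sensitive to equality symmetry."""
    perturbed = list(steps)
    last = perturbed[-1]
    perturbed[-1] = sp.Eq(last.rhs, last.lhs)
    return perturbed


def to_example(steps, label):
    """Linearise a derivation into one string for a sequence classifier
    (e.g. a fine-tuned BERT model), paired with a label."""
    return " [SEP] ".join(sp.latex(eq) for eq in steps), label


if __name__ == "__main__":
    valid = generate_derivation()
    examples = [
        to_example(valid, "original"),
        to_example(perturb_rename(valid), "renamed"),
        to_example(perturb_symmetry(valid), "symmetry-swapped"),
    ]
    for text, label in examples:
        print(f"{label:17s} {text}")
```

At scale, the same idea would be driven by sampling many premises and operator sequences rather than a single hand-written example, which is what allows the resulting in- and out-of-distribution splits to isolate specific operators and perturbations.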
