April 17, 2024, 4:43 a.m. | Enric Boix-Adsera, Omid Saremi, Emmanuel Abbe, Samy Bengio, Etai Littwin, Joshua Susskind

arXiv:2310.09753v2
Abstract: We investigate the capabilities of transformer models on relational reasoning tasks. In these tasks, models are trained on a set of strings encoding abstract relations, and are then tested out-of-distribution on data that contains symbols that did not appear in the training dataset. We prove that for any relational reasoning task in a large family of tasks, transformers learn the abstract relations and generalize to the test set when trained by gradient descent on sufficiently …
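
The train/test setup described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical "same/different" relation over pairs of symbols, chosen here only for illustration; the abstract does not specify the tasks, and the symbol sets, the `make_examples` helper, and the string format are all assumptions rather than the authors' construction. The point is only that the test strings use symbols that never occur in training, so a model must learn the relation rather than memorize symbols.

```python
import random

def make_examples(symbols, n, rng):
    """Build n strings 'a b', labeled 1 if the two symbols match, else 0.

    Hypothetical helper for illustration; not taken from the paper.
    """
    examples = []
    for _ in range(n):
        a = rng.choice(symbols)
        # About half the time repeat the symbol (label 1),
        # otherwise pick a different symbol (label 0).
        if rng.random() < 0.5:
            b = a
        else:
            b = rng.choice([s for s in symbols if s != a])
        examples.append((f"{a} {b}", int(a == b)))
    return examples

rng = random.Random(0)

# Assumed, disjoint alphabets: test symbols never appear in training.
train_symbols = ["A", "B", "C", "D"]
test_symbols = ["W", "X", "Y", "Z"]

train_set = make_examples(train_symbols, 8, rng)
test_set = make_examples(test_symbols, 8, rng)

for text, label in train_set[:3]:
    print("train:", text, "->", label)
for text, label in test_set[:3]:
    print("test: ", text, "->", label)
```

The relation (same vs. different) is identical in both sets; only the symbols change, which is what makes the test set out-of-distribution in the sense the abstract describes.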


