June 27, 2024, 4:42 a.m. | Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg


arXiv:2210.13382v5 Announce Type: replace-cross
Abstract: Language models show a surprising range of capabilities, but the source of their apparent competence is unclear. Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see? We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello. Although the network has no a priori …
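The question the abstract raises — does the network build an internal representation of board state, or just surface statistics? — is typically tested with probes: small classifiers trained to read a candidate "world state" out of the model's hidden activations. Below is a minimal, self-contained sketch of that idea. Everything here is synthetic and illustrative: the activations `H` are fabricated to linearly encode a toy board state `S`, and the sizes are arbitrary, not those of the paper's Othello model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: H stands in for hidden activations (n_positions x d_model)
# taken from a sequence model, and S for a candidate world state
# (n_positions x n_squares), e.g. board-square occupancy.
# A linear probe asks: can S be read off H with a single linear map?
n, d_model, n_squares = 500, 64, 8          # toy sizes, not the paper's
S = rng.integers(0, 2, size=(n, n_squares)).astype(float)  # synthetic board states
W_true = rng.normal(size=(n_squares, d_model))
H = S @ W_true + 0.01 * rng.normal(size=(n, d_model))      # activations encode S

def fit_linear_probe(H, S):
    """Least-squares linear probe: W minimizing ||H W - S||^2."""
    W, *_ = np.linalg.lstsq(H, S, rcond=None)
    return W

W_probe = fit_linear_probe(H, S)
# Threshold the probe's output and compare against the true binary state.
accuracy = ((H @ W_probe > 0.5) == (S > 0.5)).mean()
```

If the probe recovers the state far above chance, that is evidence the representation is present and linearly decodable; in this synthetic setup it succeeds by construction, whereas on a real model the probe's accuracy is the experimental result.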

