Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
June 27, 2024, 4:42 a.m. | Kenneth Li, Aspen K. Hopkins, David Bau, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
Source: cs.CL updates on arXiv.org
Abstract: Language models show a surprising range of capabilities, but the source of their apparent competence is unclear. Do these networks just memorize a collection of surface statistics, or do they rely on internal representations of the process that generates the sequences they see? We investigate this question by applying a variant of the GPT model to the task of predicting legal moves in a simple board game, Othello. Although the network has no a priori …
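The abstract describes training a GPT variant purely on sequences of Othello moves, with no built-in knowledge of the game. A minimal sketch of that setup is below: moves are tokenized as board squares (a1=0 through h8=63), and a toy bigram frequency model stands in for the transformer as the next-move predictor. The tokenization scheme, the `BigramMoveModel` class, and the training sequences are all illustrative assumptions, not the paper's actual code or data.

```python
from collections import defaultdict

def encode(move):
    """Map a move like 'd3' to a token id in 0..63 (a1=0 .. h8=63).
    Assumed tokenization; the paper's exact vocabulary may differ."""
    col = ord(move[0]) - ord('a')
    row = int(move[1]) - 1
    return row * 8 + col

def decode(tok):
    """Inverse of encode: token id back to algebraic square."""
    return chr(ord('a') + tok % 8) + str(tok // 8 + 1)

class BigramMoveModel:
    """Toy stand-in for the GPT variant: counts next-move frequencies
    and predicts the most common successor of a given move."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, games):
        for game in games:
            toks = [encode(m) for m in game]
            for a, b in zip(toks, toks[1:]):
                self.counts[a][b] += 1

    def predict(self, move):
        successors = self.counts[encode(move)]
        if not successors:
            return None
        return decode(max(successors, key=successors.get))

# Hypothetical opening lines used as training sequences.
games = [
    ["d3", "c3", "c4", "e3"],
    ["d3", "c5", "e6", "f5"],
    ["d3", "c3", "d6", "c5"],
]
model = BigramMoveModel()
model.fit(games)
print(model.predict("d3"))  # most frequent successor of d3 in the toy data
```

The paper's point is that even though the model only ever sees such move sequences, probing its activations suggests it builds an internal representation of the board state; a bigram table like this one obviously cannot, which is what makes the transformer result interesting.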