Feb. 13, 2024, 5:44 a.m. | Philipp Schoenegger Peter S. Park Ezra Karger Philip E. Tetlock

cs.LG updates on arXiv.org

Large language models (LLMs) show impressive capabilities, matching and sometimes exceeding human performance in many domains. This study explores the potential of LLMs to augment judgement in forecasting tasks. We evaluated the impact of two GPT-4-Turbo assistants on forecasting accuracy: one designed to provide high-quality advice ('superforecasting'), and the other designed to be overconfident and base-rate-neglecting. Participants (N = 991) had the option to consult their assigned LLM assistant throughout the study, in contrast to a control group that used …
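The excerpt does not state which scoring rule the study used, but forecasting accuracy in this literature is commonly measured with the Brier score (mean squared error between a probability forecast and the binary outcome). The sketch below is illustrative only, with made-up probabilities and outcomes; the function name and the group labels are assumptions, not taken from the paper.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities (0..1)
    and binary resolved outcomes (0 or 1). Lower is better."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Hypothetical forecasts from an LLM-assisted participant vs. a control
# participant on the same three questions (invented numbers).
assisted = [0.8, 0.3, 0.6]
control = [0.6, 0.5, 0.5]
resolved = [1, 0, 1]  # 1 = event occurred, 0 = it did not

print(brier_score(assisted, resolved))  # ≈ 0.097
print(brier_score(control, resolved))   # = 0.220
```

A lower mean Brier score for the assisted group would indicate an accuracy gain from consulting the LLM assistant; comparing the two conditions on shared questions is the natural analysis for the design described above.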
