March 21, 2024, 4:48 a.m. | Adam Tauman Kalai, Santosh S. Vempala

cs.CL updates on arXiv.org

arXiv:2311.14648v3 Announce Type: replace
Abstract: Recent language models generate false but plausible-sounding text with surprising frequency. Such "hallucinations" are an obstacle to the usability of language-based AI systems and can harm people who rely on their outputs. This work shows that there is an inherent statistical lower bound on the rate at which pretrained language models hallucinate certain types of facts, one that has nothing to do with the transformer LM architecture or data quality. For "arbitrary" facts whose veracity cannot be determined from …
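The abstract is truncated before the bound itself is stated. As a rough, non-authoritative illustration of the kind of statistic such a lower bound can be built from, the sketch below computes a Good-Turing-style "monofact" rate on a toy corpus: the share of fact mentions that occur exactly once. The corpus, function name, and interpretation are assumptions made here for illustration, not details taken from the paper.

```python
from collections import Counter

def monofact_rate(fact_mentions):
    """Fraction of fact mentions whose fact occurs exactly once in the corpus.

    This is the Good-Turing missing-mass estimate N1/N: the probability
    mass of facts seen only once, a rough proxy for facts a model has
    essentially no evidence about. (Illustrative sketch only; not the
    paper's definition.)
    """
    if not fact_mentions:
        return 0.0
    counts = Counter(fact_mentions)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(fact_mentions)

# Hypothetical corpus of "arbitrary facts" (person -> birthday pairs).
corpus = [
    ("alice", "March 3"), ("bob", "July 9"), ("alice", "March 3"),
    ("carol", "Jan 1"), ("dave", "Oct 30"),
]
print(f"monofact rate: {monofact_rate(corpus):.2f}")  # 3 of 5 mentions are singletons -> 0.60
```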
