Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge

Feb. 19, 2024, 5:43 a.m. | Genglin Liu, Xingyao Wang, Lifan Yuan, Yangyi Chen, Hao Peng

cs.LG updates on arXiv.org

arXiv:2311.09731v2 Announce Type: replace-cross
Abstract: Can large language models (LLMs) express their uncertainty in situations where they lack sufficient parametric knowledge to generate reasonable responses? This work aims to systematically investigate LLMs' behaviors in such situations, emphasizing the trade-off between honesty and helpfulness. To tackle the challenge of precisely determining LLMs' knowledge gaps, we diagnostically create unanswerable questions containing non-existent concepts or false premises, ensuring that they are outside the LLMs' vast training data. By compiling a benchmark, UnknownBench, which …
