Dissociation of Faithful and Unfaithful Reasoning in LLMs

May 27, 2024, 4:49 a.m. | Evelyn Yee, Alice Li, Chenyu Tang, Yeon Ho Jung, Ramamohan Paturi, Leon Bergen

cs.CL updates on arXiv.org

arXiv:2405.15092v1 Announce Type: cross
Abstract: Large language models (LLMs) improve their performance in downstream tasks when they generate Chain of Thought reasoning text before producing an answer. Our research investigates how LLMs recover from errors in Chain of Thought, reaching the correct final answer despite mistakes in the reasoning text. Through analysis of these error recovery behaviors, we find evidence for unfaithfulness in Chain of Thought, but we also identify many clear examples of faithful error recovery behaviors. We identify …


