June 5, 2024, 4:52 a.m. | Guangliang Liu, Haitao Mao, Bochuan Cao, Zhiyu Xue, Kristen Johnson, Jiliang Tang, Rongrong Wang

cs.CL updates on arXiv.org

arXiv:2406.02378v1 Announce Type: new
Abstract: Large Language Models (LLMs) can improve their responses when instructed to do so, a capability known as self-correction. When the instructions do not specify what is wrong with the response, the model is said to leverage its intrinsic self-correction capability. Self-correction has shown empirical success in various applications, e.g., text detoxification and social bias mitigation. However, leveraging this self-correction capability may not always be effective, as it has the potential to revise …


