Nov. 20, 2023, 5:40 p.m. | /u/gladystyen | r/MachineLearning (www.reddit.com)

Hi Reddit,

I recently did an internship at Google and wrote a paper on LLM self-correction. We released a dataset of Chain-of-Thought reasoning steps, generated using PaLM 2, and annotated with the location of the first logical error. Thought some folks here might be interested!

Paper link: [https://arxiv.org/abs/2311.08516](https://arxiv.org/abs/2311.08516)

GitHub link: [https://github.com/WHGTyen/BIG-Bench-Mistake](https://github.com/WHGTyen/BIG-Bench-Mistake)
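For anyone who wants to poke at the data, here's a minimal sketch of loading one task file from a local clone of the repo. The file name (`word_sorting.jsonl`) and field names (`steps`, `mistake_index`) are assumptions about the release format for illustration only, not confirmed details from the post:

```python
import json
from pathlib import Path

# Minimal sketch: read one task file from a local clone of BIG-Bench-Mistake.
# NOTE: the file name and field names below are assumptions, not confirmed
# by the post -- check the repo README for the actual schema.
path = Path("BIG-Bench-Mistake/word_sorting.jsonl")

with path.open() as f:
    traces = [json.loads(line) for line in f if line.strip()]

for trace in traces[:3]:
    steps = trace["steps"]                # generated chain-of-thought steps
    mistake = trace.get("mistake_index")  # index of first logical error, or None if the trace is correct
    print(f"{len(steps)} steps, first mistake at: {mistake}")
```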

# TL;DR

Recently, Google DeepMind showed that [LLMs cannot self-correct reasoning errors without external feedback](https://arxiv.org/abs/2310.01798). We wanted to investigate this and set out to answer these questions:

1. Can …
