[R] The Expressive Power of Transformers with Chain of Thought
Jan. 6, 2024, 7:23 a.m. | /u/Wiskkey
Machine Learning | www.reddit.com
Abstract:
>Recent theoretical work has identified surprisingly simple reasoning problems, such as checking if two nodes in a graph are connected or simulating finite-state machines, that are provably unsolvable by standard transformers that answer immediately after reading their input. However, in practice, transformers' reasoning can be improved by allowing them to use a "chain of thought" or "scratchpad", i.e., generate and condition on a sequence of intermediate tokens before answering. Motivated by …
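To make the "scratchpad" idea concrete, here is a minimal sketch (not from the paper) of why intermediate tokens help with one of the problems the abstract mentions, simulating a finite-state machine: a decoder that writes the current state after each input symbol only has to compute one transition per step, whereas a model forced to answer immediately must compute the final state in a single shot. The names (`FSM`, `run_with_scratchpad`) are illustrative, not from the source.

```python
from typing import Dict, List, Tuple

class FSM:
    """A deterministic finite-state machine."""
    def __init__(self, transitions: Dict[Tuple[str, str], str], start: str):
        self.transitions = transitions  # (state, symbol) -> next state
        self.start = start

def run_with_scratchpad(fsm: FSM, inputs: List[str]) -> List[str]:
    """Emit one 'intermediate token' (the current state) per input symbol,
    mimicking a chain of thought: each step conditions only on the most
    recently generated token plus the next input symbol."""
    scratchpad = [fsm.start]
    for symbol in inputs:
        scratchpad.append(fsm.transitions[(scratchpad[-1], symbol)])
    return scratchpad  # the final entry is the answer

# Example: parity of the number of 1s in a bit string.
parity = FSM({("even", "0"): "even", ("even", "1"): "odd",
              ("odd", "0"): "odd", ("odd", "1"): "even"}, start="even")
print(run_with_scratchpad(parity, list("1101")))
# ['even', 'odd', 'even', 'even', 'odd'] -> answer: 'odd'
```

Each entry in the returned list stands in for a generated token the model can condition on at the next step, which is the mechanism the abstract describes.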