Sept. 18, 2023, 2:02 p.m. | /u/30299578815310

r/MachineLearning | www.reddit.com

Recent work shows that trained transformers can perform multi-step gradient descent on mesa-objectives within their forward pass. This holds even for linear transformers, which effectively optimize linear models over deep feature representations computed by earlier layers.

[https://arxiv.org/pdf/2309.05858.pdf](https://arxiv.org/pdf/2309.05858.pdf)
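The paper's constructions are more general, but the core identity is easy to see in a toy case: with the softmax removed, a single linear-attention head with hand-chosen weights produces exactly the prediction of one gradient-descent step on an in-context least-squares objective. Below is a minimal NumPy sketch of that identity; the task, the weights, and the learning rate `eta` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 32                  # feature dim, number of in-context examples
w_star = rng.normal(size=d)   # hypothetical ground-truth linear map
X = rng.normal(size=(n, d))   # in-context inputs x_j
y = X @ w_star                # in-context targets y_j
x_test = rng.normal(size=d)   # query input

eta = 0.5                     # learning rate of the mesa gradient step (assumed)

# (1) One explicit gradient-descent step on the in-context least-squares
#     loss L(w) = 1/(2n) * sum_j (w.x_j - y_j)^2, starting from w = 0.
grad = -(X.T @ y) / n         # dL/dw evaluated at w = 0
w_one_step = -eta * grad
pred_gd = w_one_step @ x_test

# (2) The same prediction from a single *linear* attention head (no softmax):
#     keys = x_j, query = x_test, values = (eta / n) * y_j.
keys, query = X, x_test
values = (eta / n) * y
pred_attn = values @ (keys @ query)

assert np.allclose(pred_gd, pred_attn)
print(pred_gd, pred_attn)
```

Stacking such layers corresponds to taking multiple gradient steps, which is the sense in which the forward pass can implement multi-step mesa-optimization.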

For those unfamiliar, [instrumental convergence](https://en.wikipedia.org/wiki/Instrumental_convergence) is the idea that agents with different terminal goals will tend to converge on similar instrumental subgoals, such as acquiring resources, gaining power, and self-preservation. A famous thought experiment, known as the paperclip maximizer, is the …

