[P] Treeformer: hard attention + decision trees = causal language modelling
March 24, 2024, 2:59 p.m. | /u/jessielesbian
Machine Learning www.reddit.com
The Treeformer is a decision tree with transformer-like attention. It works by classifying attention heads according to the predicted next token. Each attention head holds the list of all previous tokens plus a single lookback variable.
[Treeformer attention head state](https://preview.redd.it/34zfkcr1paqc1.png?width=382&format=png&auto=webp&s=cfc4e5834f40f04406e72de3ee34b5421eee323a)
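To make the head state concrete, here is a minimal Python sketch of one attention head as described above: a list of all previous tokens plus one lookback variable. The names `HeadState`, `push`, and `peek` are hypothetical, and so is the reading of the lookback variable as an offset counted back from the most recent token; the post does not spell out these details, so treat this as an illustration rather than the author's implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HeadState:
    """Hypothetical state of one Treeformer attention head."""

    # All previous tokens seen so far (integer token IDs are an assumption).
    tokens: list[int] = field(default_factory=list)
    # The single lookback variable; its exact semantics are assumed here.
    lookback: int = 0

    def push(self, token: int) -> None:
        """Append the newest token to the history."""
        self.tokens.append(token)

    def peek(self) -> Optional[int]:
        """Return the token `lookback` positions before the most recent one,
        or None if that position falls outside the history (assumed behavior)."""
        idx = len(self.tokens) - 1 - self.lookback
        if 0 <= idx < len(self.tokens):
            return self.tokens[idx]
        return None


state = HeadState()
for tok in (5, 7, 9):
    state.push(tok)
state.lookback = 1
print(state.peek())
```

With `lookback = 0` the head would look at the most recent token under this assumed convention; a tree node could then branch on the value that `peek()` returns.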
The Treeformer has 3 types of decision tree nodes: dynamic relative lookback, static relative lookback, and static absolute lookback.
Dynamic relative lookback checks if a …