April 22, 2024, 4:45 a.m. | Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao

cs.CV updates on arXiv.org

arXiv:2206.09325v2 Announce Type: replace
Abstract: Motivated by biological evolution, this paper explains the rationality of the Vision Transformer by analogy with the proven, practical Evolutionary Algorithm (EA) and derives that the two share a consistent mathematical formulation. Then, inspired by effective EA variants, we propose a novel pyramid EATFormer backbone built solely from the proposed \emph{EA-based Transformer} (EAT) block, which consists of three residual parts, i.e., \emph{Multi-Scale Region Aggregation} (MSRA), \emph{Global and Local Interaction} (GLI), and \emph{Feed-Forward Network} (FFN) modules, to model multi-scale, …
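The abstract names the three residual sub-modules of the EAT block but not their internals. The following is a minimal sketch of how such a block could be assembled in PyTorch; the residual composition of MSRA, GLI, and FFN follows the abstract, while all concrete choices (depth-wise convolution branches, kernel sizes, the channel split between the global attention path and the local path, expansion ratio) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class MSRA(nn.Module):
    """Multi-Scale Region Aggregation (sketch): parallel depth-wise convs
    with different kernel sizes, summed to aggregate multi-scale context.
    Kernel sizes here are assumed, not taken from the paper."""
    def __init__(self, dim, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim) for k in kernel_sizes
        )
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):  # x: (B, C, H, W)
        return self.proj(sum(branch(x) for branch in self.branches))


class GLI(nn.Module):
    """Global and Local Interaction (sketch): split channels into a global
    self-attention path and a local depth-wise conv path, then concatenate.
    The 50/50 channel split is an assumption for illustration."""
    def __init__(self, dim, num_heads=4, local_ratio=0.5):
        super().__init__()
        self.local_dim = int(dim * local_ratio)
        self.global_dim = dim - self.local_dim
        self.local = nn.Conv2d(self.local_dim, self.local_dim, 3,
                               padding=1, groups=self.local_dim)
        self.attn = nn.MultiheadAttention(self.global_dim, num_heads, batch_first=True)

    def forward(self, x):  # x: (B, C, H, W)
        B, C, H, W = x.shape
        xl, xg = x.split([self.local_dim, self.global_dim], dim=1)
        xl = self.local(xl)                           # local path
        seq = xg.flatten(2).transpose(1, 2)           # (B, H*W, C_g) for attention
        xg, _ = self.attn(seq, seq, seq)              # global path
        xg = xg.transpose(1, 2).reshape(B, self.global_dim, H, W)
        return torch.cat([xl, xg], dim=1)


class EATBlock(nn.Module):
    """EA-based Transformer block (sketch): three residual sub-modules,
    MSRA -> GLI -> FFN, as listed in the abstract."""
    def __init__(self, dim):
        super().__init__()
        self.msra = MSRA(dim)
        self.gli = GLI(dim)
        self.ffn = nn.Sequential(
            nn.Conv2d(dim, dim * 4, 1), nn.GELU(), nn.Conv2d(dim * 4, dim, 1)
        )

    def forward(self, x):
        x = x + self.msra(x)
        x = x + self.gli(x)
        x = x + self.ffn(x)
        return x


# Usage example: one block applied to a feature map of 64 channels.
if __name__ == "__main__":
    feats = torch.randn(2, 64, 56, 56)
    print(EATBlock(64)(feats).shape)  # torch.Size([2, 64, 56, 56])
```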

algorithm arxiv cs.cv cs.et improving transformer type vision
