May 12, 2023, 12:45 a.m. | Xinyu Liu, Houwen Peng, Ningxin Zheng, Yuqing Yang, Han Hu, Yixuan Yuan

cs.CV updates on arXiv.org arxiv.org

Vision transformers have shown great success due to their high model
capabilities. However, their remarkable performance is accompanied by heavy
computation costs, which makes them unsuitable for real-time applications. In
this paper, we propose a family of high-speed vision transformers named
EfficientViT. We find that the speed of existing transformer models is commonly
bounded by memory inefficient operations, especially the tensor reshaping and
element-wise functions in MHSA. Therefore, we design a new building block with
a sandwich layout, i.e., using …

applications arxiv attention computation costs family memory paper performance real-time real-time applications speed success transformer transformers vision

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote