July 13, 2023, 5:16 a.m. | /u/bono-93

Machine Learning www.reddit.com

I have put together experiments and examples on accelerating ViT (Vision Transformer) using methods such as TensorRT, FasterTransformer, and xFormers. The experiments were run on a single A100 GPU as the baseline. - [https://github.com/bnabis93/vision-language-examples/tree/main/acceleration](https://github.com/bnabis93/vision-language-examples/tree/main/acceleration)
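A minimal latency-benchmark sketch of the kind of baseline measurement described above, assuming a timm ViT-B/16 model; the exact model and measurement loop in the repo may differ.

```python
import time

import timm
import torch

device = torch.device("cuda")  # e.g. a single A100
model = timm.create_model("vit_base_patch16_224", pretrained=False).eval().to(device)
x = torch.randn(1, 3, 224, 224, device=device)

with torch.inference_mode():
    # Warm up so kernel compilation and caching do not skew the timings.
    for _ in range(10):
        model(x)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()

print(f"Mean latency: {(time.perf_counter() - start) / 100 * 1000:.2f} ms")
```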


With xFormers, I tried applying sparse attention and memory-efficient attention to ViT, but inference actually became slower, so I excluded those results.
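For reference, a minimal sketch of calling xFormers' memory-efficient attention directly; wiring it into a ViT block (for example by patching timm's attention forward pass) is not shown here and would follow the repo's own approach.

```python
import torch
import xformers.ops as xops

B, N, H, D = 1, 197, 12, 64  # batch, tokens (ViT-B/16 at 224x224), heads, head dim
q = torch.randn(B, N, H, D, device="cuda", dtype=torch.float16)
k = torch.randn(B, N, H, D, device="cuda", dtype=torch.float16)
v = torch.randn(B, N, H, D, device="cuda", dtype=torch.float16)

# Fused attention that avoids materializing the full N x N attention matrix.
out = xops.memory_efficient_attention(q, k, v)  # shape (B, N, H, D)
print(out.shape)
```

One plausible factor, stated as a guess rather than a measured conclusion: ViT's token sequences are short (~197 tokens at 224x224), so the fused kernel's overhead may outweigh its memory savings, which could explain the slowdown observed above.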


Generally, just performing the TensorRT conversion already improves latency significantly. In the case of FasterTransformer, optimized kernels are not provided in fp32, …
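A minimal sketch of the TensorRT path: export the ViT to ONNX first, then build an engine with trtexec. The model choice and flags here are assumptions, not necessarily the exact settings used in the repo.

```python
import timm
import torch

# Export a ViT-B/16 (assumed model) to ONNX for TensorRT conversion.
model = timm.create_model("vit_base_patch16_224", pretrained=False).eval()
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy,
    "vit_b16.onnx",
    input_names=["input"],
    output_names=["logits"],
    opset_version=17,
)

# Then build the engine, e.g. with FP16 enabled:
#   trtexec --onnx=vit_b16.onnx --fp16 --saveEngine=vit_b16.engine
```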

