all AI news
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP. (arXiv:2207.07284v2 [cs.CV] UPDATED)
Sept. 13, 2022, 1:15 a.m. | Zhicai Wang, Yanbin Hao, Xingyu Gao, Hao Zhang, Shuo Wang, Tingting Mu, Xiangnan He
cs.CV updates on arXiv.org arxiv.org
Vision multi-layer perceptrons (MLPs) have shown promising performance in
computer vision tasks, and become the main competitor of CNNs and vision
Transformers. They use token-mixing layers to capture cross-token interactions,
as opposed to the multi-head self-attention mechanism used by Transformers.
However, the heavily parameterized token-mixing layers naturally lack
mechanisms to capture local information and multi-granular non-local relations,
thus their discriminative power is restrained. To tackle this issue, we propose
a new positional spacial gating unit (PoSGU). It exploits the attention …
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Data Analyst - Associate
@ JPMorgan Chase & Co. | Mumbai, Maharashtra, India
Staff Data Engineer (Data Platform)
@ Coupang | Seoul, South Korea
AI/ML Engineering Research Internship
@ Keysight Technologies | Santa Rosa, CA, United States
Sr. Director, Head of Data Management and Reporting Execution
@ Biogen | Cambridge, MA, United States
Manager, Marketing - Audience Intelligence (Senior Data Analyst)
@ Delivery Hero | Singapore, Singapore