MobileViTv3: Mobile-Friendly Vision Transformer with Simple and Effective Fusion of Local, Global and Input Features. (arXiv:2209.15159v2 [cs.CV] UPDATED)
Oct. 7, 2022, 1:16 a.m. | Shakti N. Wadekar, Abhishek Chaurasia
cs.CV updates on arXiv.org (arxiv.org)
MobileViT (MobileViTv1) combines convolutional neural networks (CNNs) and
vision transformers (ViTs) to create light-weight models for mobile vision
tasks. Although the main MobileViTv1-block helps achieve competitive
state-of-the-art results, the fusion block inside the MobileViTv1-block creates
scaling challenges and has a complex learning task. We propose simple and
effective changes to the fusion block to create the MobileViTv3-block, which
addresses the scaling challenges and simplifies the learning task. The
MobileViTv3-XXS, XS and S models built with our proposed MobileViTv3-block outperform
MobileViTv1 …