March 28, 2024, 4:42 a.m. | Badri N. Patro, Vinay P. Namboodiri, Vijay S. Agneeswaran

cs.LG updates on arXiv.org

arXiv:2403.18063v1 Announce Type: cross
Abstract: Vision transformers have been explored through diverse architectures such as ViT, PVT, and Swin, which aim to improve the attention mechanism and make it more efficient. Separately, the need for local information led to incorporating convolutions into transformers, as in CPVT and CvT. Global information is captured with a complex Fourier basis that performs global token mixing, as in AFNO, GFNet, and Spectformer. We advocate combining three …

