March 22, 2024, 4:45 a.m. | Junyi Wu, Bin Duan, Weitai Kang, Hao Tang, Yan Yan

cs.CV updates on arXiv.org

arXiv:2403.14552v1 Announce Type: new
Abstract: While Transformers have rapidly gained popularity in various computer vision applications, post-hoc explanations of their internal mechanisms remain largely unexplored. Vision Transformers extract visual information by representing image regions as transformed tokens and integrating them via attention weights. However, existing post-hoc explanation methods merely consider these attention weights, neglecting crucial information from the transformed tokens, which fails to accurately illustrate the rationales behind the models' predictions. To incorporate the influence of token transformation into interpretation, …
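The abstract's point — that attention weights alone ignore the magnitudes of the transformed (value) tokens — can be illustrated with a minimal single-head self-attention sketch in NumPy. This is not the paper's proposed method; it contrasts a plain attention-weight relevance score with a norm-weighted variant (in the spirit of value-norm analyses of attention), using random toy projections:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 8                       # 4 tokens, 8-dim embeddings (toy sizes)
X = rng.normal(size=(n, d))       # token representations

# Single-head self-attention with random projection matrices
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

scores = Q @ K.T / np.sqrt(d)
A = np.exp(scores - scores.max(axis=-1, keepdims=True))
A = A / A.sum(axis=-1, keepdims=True)     # attention weights, rows sum to 1

# Attention-only relevance of token j for token i: just A[i, j].
# Norm-weighted relevance: scale A[i, j] by the norm of the transformed
# token v_j, then re-normalize. The transformed-token magnitudes can
# reorder which inputs look important.
vnorm = np.linalg.norm(V, axis=-1)
R = A * vnorm
R = R / R.sum(axis=-1, keepdims=True)

i = 0
print("attention-only ranking:", np.argsort(-A[i]))
print("norm-weighted ranking: ", np.argsort(-R[i]))
```

Both matrices are row-stochastic, so they are directly comparable as relevance distributions; whenever the value norms are uneven, the two rankings can diverge, which is the gap the abstract highlights.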

