all AI news
Vision Transformers Need Registers
April 15, 2024, 4:45 a.m. | Timoth\'ee Darcet, Maxime Oquab, Julien Mairal, Piotr Bojanowski
cs.CV updates on arXiv.org arxiv.org
Abstract: Transformers have recently emerged as a powerful tool for learning visual representations. In this paper, we identify and characterize artifacts in feature maps of both supervised and self-supervised ViT networks. The artifacts correspond to high-norm tokens appearing during inference primarily in low-informative background areas of images, that are repurposed for internal computations. We propose a simple yet effective solution based on providing additional tokens to the input sequence of the Vision Transformer to fill that …
abstract arxiv cs.cv feature identify images inference low maps networks norm paper tokens tool transformers type vision vision transformers visual vit
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
AI Engineering Manager
@ M47 Labs | Barcelona, Catalunya [Cataluña], Spain