Web: http://arxiv.org/abs/2206.06829

June 16, 2022, 1:13 a.m. | Peixian Chen, Mengdan Zhang, Yunhang Shen, Kekai Sheng, Yuting Gao, Xing Sun, Ke Li, Chunhua Shen (Tencent Youtu Lab)

cs.CV updates on arXiv.org arxiv.org

Vision transformers (ViTs) are changing the landscape of object detection
approaches. A natural use of ViTs in detection is to replace the CNN-based
backbone with a transformer-based backbone, which is straightforward and
effective, at the price of a considerable computational burden during
inference. A more subtle use is the DETR family, which eliminates the need for
many hand-designed components in object detection but introduces a decoder
that takes an extra-long time to converge. As a result, transformer-based object
detection cannot prevail …
