Web: http://arxiv.org/abs/2104.13921

May 13, 2022, 1:11 a.m. | Xiuye Gu, Tsung-Yi Lin, Weicheng Kuo, Yin Cui

cs.LG updates on arXiv.org

We aim to advance open-vocabulary object detection, which detects objects
described by arbitrary text inputs. The fundamental challenge is the
availability of training data. It is costly to further scale up the number of
classes contained in existing object detection datasets. To overcome this
challenge, we propose ViLD, a training method via Vision and Language knowledge
Distillation. Our method distills the knowledge from a pretrained
open-vocabulary image classification model (teacher) into a two-stage detector
(student). Specifically, we use the teacher …
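The teacher-to-student distillation described above can be illustrated with a small, generic sketch. The abstract is truncated here, so this is not the paper's actual training objective: the L1 distance on L2-normalized embeddings, the cosine-similarity classifier over text embeddings, and the names `distillation_loss` and `classify_by_text` are all assumptions chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def distillation_loss(student_embs, teacher_embs):
    """Mean L1 distance between normalized student and teacher embeddings.

    Assumption: one embedding per region from each model, same dimension.
    """
    if not student_embs or len(student_embs) != len(teacher_embs):
        raise ValueError("need equally sized, non-empty embedding lists")
    total = 0.0
    for s, t in zip(student_embs, teacher_embs):
        s_n, t_n = l2_normalize(s), l2_normalize(t)
        total += sum(abs(a - b) for a, b in zip(s_n, t_n))
    return total / len(student_embs)


def classify_by_text(region_emb, text_embs):
    """Pick the label whose text embedding is most cosine-similar to the region.

    `text_embs` maps an arbitrary text label to its (hypothetical) embedding,
    which is what lets the label set change without retraining.
    """
    r = l2_normalize(region_emb)
    scores = {
        label: sum(a * b for a, b in zip(r, l2_normalize(emb)))
        for label, emb in text_embs.items()
    }
    return max(scores, key=scores.get)
```

In this sketch the student is pulled toward the teacher's embedding space by `distillation_loss`, and new categories are handled at inference time by adding entries to `text_embs` rather than by collecting more box annotations.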

