all AI news
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models. (arXiv:2209.15639v1 [cs.CV])
Oct. 3, 2022, 1:15 a.m. | Weicheng Kuo, Yin Cui, Xiuye Gu, AJ Piergiovanni, Anelia Angelova
cs.CV updates on arXiv.org arxiv.org
We present F-VLM, a simple open-vocabulary object detection method built upon
Frozen Vision and Language Models. F-VLM simplifies the current multi-stage
training pipeline by eliminating the need for knowledge distillation or
detection-tailored pretraining. Surprisingly, we observe that a frozen VLM: 1)
retains the locality-sensitive features necessary for detection, and 2) is a
strong region classifier. We finetune only the detector head and combine the
detector and VLM outputs for each region at inference time. F-VLM shows
compelling scaling behavior and …
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Senior ML Researcher - 3D Geometry Processing | 3D Shape Generation | 3D Mesh Data
@ Promaton | Europe
Principal Data Engineer
@ RS21 | Remote
SQL/Power BI Developer
@ ICF | Virginia Remote Office (VA99)
Senior Machine Learning Engineer (Canada Remote)
@ Fullscript | Ottawa, ON
Software Engineer - MLOps.
@ Renesas Electronics | Toyosu, Japan
Junior Data Scientist / Artificial Intelligence consultant
@ Deloitte | Luxembourg, LU