Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection. (arXiv:2207.03482v1 [cs.CV])
July 8, 2022, 1:12 a.m. | Hanoona Rasheed, Muhammad Maaz, Muhammad Uzair Khattak, Salman Khan, Fahad Shahbaz Khan
cs.CV updates on arXiv.org
Existing open-vocabulary object detectors typically enlarge their vocabulary
sizes by leveraging different forms of weak supervision. This helps generalize
to novel objects at inference. Two popular forms of weak supervision used in
open-vocabulary detection (OVD) are a pretrained CLIP model and image-level
supervision. We note that neither of these modes of supervision is optimally
aligned with the detection task: CLIP is trained with image-text pairs and lacks
precise localization of objects, while image-level supervision has been used
with heuristics that do …