all AI news
A Generative Approach for Wikipedia-Scale Visual Entity Recognition
March 5, 2024, 2:49 p.m. | Mathilde Caron, Ahmet Iscen, Alireza Fathi, Cordelia Schmid
cs.CV updates on arXiv.org arxiv.org
Abstract: In this paper, we address web-scale visual entity recognition, specifically the task of mapping a given query image to one of the 6 million existing entities in Wikipedia. One way of approaching a problem of such scale is using dual-encoder models (eg CLIP), where all the entity names and query images are embedded into a unified space, paving the way for an approximate k-NN search. Alternatively, it is also possible to re-purpose a captioning model …
abstract arxiv clip cs.cv encoder generative image mapping paper query recognition scale type visual web wikipedia
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Business Data Scientist, gTech Ads
@ Google | Mexico City, CDMX, Mexico
Lead, Data Analytics Operations
@ Zocdoc | Pune, Maharashtra, India