Zero-Shot Video Captioning with Evolving Pseudo-Tokens. (arXiv:2207.11100v2 [cs.CV] UPDATED)
July 29, 2022, 1:12 a.m. | Yoad Tewel, Yoav Shalev, Roy Nadler, Idan Schwartz, Lior Wolf
cs.CV updates on arXiv.org
We introduce a zero-shot video captioning method that employs two frozen
networks: the GPT-2 language model and the CLIP image-text matching model. The
matching score is used to steer the language model toward generating a sentence
that has a high average matching score to a subset of the video frames. Unlike
zero-shot image captioning methods, our work considers the entire sentence at
once. This is achieved by optimizing, during the generation process, part of
the prompt from scratch, by modifying …
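
The scoring idea in the abstract, a sentence rated by its average matching score over a subset of video frames, can be illustrated with a minimal sketch. It assumes the sentence and the frames have already been embedded by an image-text matching model such as CLIP. The embeddings here are plain arrays, and the function names `average_matching_score` and `pick_best_caption` are hypothetical, not taken from the paper. The sketch covers only the scoring step; it does not reproduce the authors' prompt optimization or the GPT-2 generation loop.

```python
import numpy as np


def average_matching_score(text_emb, frame_embs):
    """Mean cosine similarity between one sentence embedding and several frame embeddings.

    text_emb:   shape (d,)   hypothetical embedding of a candidate sentence
    frame_embs: shape (n, d) hypothetical embeddings of n sampled video frames
    """
    text = np.asarray(text_emb, dtype=float)
    frames = np.asarray(frame_embs, dtype=float)
    # Normalize so that the dot product equals cosine similarity.
    text = text / np.linalg.norm(text)
    frames = frames / np.linalg.norm(frames, axis=1, keepdims=True)
    # One similarity per frame, averaged over the frame subset.
    return float(np.mean(frames @ text))


def pick_best_caption(candidate_embs, frame_embs):
    """Return the index of the candidate with the highest average score, plus all scores."""
    scores = [average_matching_score(c, frame_embs) for c in candidate_embs]
    return int(np.argmax(scores)), scores
```

With real models, the arrays would come from the text and image encoders of the matching model. In the described method, this kind of score steers generation rather than ranking finished candidates as done here.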