Zero-Shot Video Captioning with Evolving Pseudo-Tokens. (arXiv:2207.11100v1 [cs.CV])
July 25, 2022, 1:12 a.m. | Yoad Tewel, Yoav Shalev, Roy Nadler, Idan Schwartz, Lior Wolf
Source: cs.CV updates on arXiv.org
We introduce a zero-shot video captioning method that employs two frozen
networks: the GPT-2 language model and the CLIP image-text matching model. The
matching score is used to steer the language model toward generating a sentence
that has a high average matching score to a subset of the video frames. Unlike
zero-shot image captioning methods, our work considers the entire sentence at
once. This is achieved by optimizing, during the generation process, part of
the prompt from scratch, by modifying …
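
The selection criterion described above, a sentence whose matching score is high on average across a subset of video frames, can be illustrated with a small sketch. This is not the authors' method: it does not touch GPT-2, CLIP, or the prompt optimization, and the embeddings are hypothetical toy vectors standing in for CLIP text and image embeddings. Cosine similarity is assumed as the matching score, and the function names are illustrative only.

```python
import math
from typing import Mapping, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_matching_score(sentence_emb: Vector, frame_embs: Sequence[Vector]) -> float:
    """Mean similarity between one sentence embedding and each sampled frame."""
    return sum(cosine_similarity(sentence_emb, f) for f in frame_embs) / len(frame_embs)


def best_candidate(candidates: Mapping[str, Vector], frame_embs: Sequence[Vector]) -> str:
    """Return the candidate sentence with the highest average score over the frames."""
    return max(candidates, key=lambda s: average_matching_score(candidates[s], frame_embs))


# Toy data (assumption): 2-D vectors in place of real CLIP embeddings.
frames = [[1.0, 0.0], [0.8, 0.6]]
candidates = {
    "a dog runs": [1.0, 0.2],
    "a car parks": [0.0, 1.0],
}

chosen = best_candidate(candidates, frames)
```

In this sketch the score is computed on a whole candidate sentence rather than token by token, which mirrors the abstract's point about considering the entire sentence at once. How the actual method uses the score to steer generation is not covered here.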