Web: http://arxiv.org/abs/2205.05739

May 13, 2022, 1:10 a.m. | Avinash Madasu, Junier Oliva, Gedas Bertasius

cs.CV updates on arXiv.org

The majority of traditional text-to-video retrieval systems operate in static
environments, i.e., there is no interaction between the user and the agent
beyond the initial textual query provided by the user. This can be suboptimal
if the initial query is ambiguous, which can lead to many incorrectly
retrieved videos. To overcome this limitation, we propose a novel framework for
Video Retrieval using Dialog (ViReD), which enables the user to interact with
an AI agent via multiple rounds of dialog. The …


