March 27, 2024, 4:45 a.m. | Ganlong Zhao, Guanbin Li, Weikai Chen, Yizhou Yu

cs.CV updates on arXiv.org

arXiv:2403.17334v1 Announce Type: new
Abstract: Recent advances in Iterative Vision-and-Language Navigation (IVLN) introduce a more meaningful and practical paradigm of VLN by maintaining the agent's memory across tours of scenes. Although long-term memory aligns better with the persistent nature of the VLN task, it makes it harder to utilize the highly unstructured navigation memory under extremely sparse supervision. To this end, we propose OVER-NAV, which aims to go over and beyond the current state of the art in IVLN techniques. …

