all AI news
Leveraging Large Language Model-based Room-Object Relationships Knowledge for Enhancing Multimodal-Input Object Goal Navigation
March 22, 2024, 4:46 a.m. | Leyuan Sun, Asako Kanezaki, Guillaume Caron, Yusuke Yoshiyasu
cs.CV updates on arXiv.org arxiv.org
Abstract: Object-goal navigation is a crucial engineering task for the community of embodied navigation; it involves navigating to an instance of a specified object category within unseen environments. Although extensive investigations have been conducted on both end-to-end and modular-based, data-driven approaches, fully enabling an agent to comprehend the environment through perceptual knowledge and perform object-goal navigation as efficiently as humans remains a significant challenge. Recently, large language models have shown potential in this task, thanks to …
arxiv cs.ai cs.cv cs.ro knowledge language language model large language large language model multimodal navigation object relationships room type
More from arxiv.org / cs.CV updates on arXiv.org
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
2 days, 13 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
RL Analytics - Content, Data Science Manager
@ Meta | Burlingame, CA
Research Engineer
@ BASF | Houston, TX, US, 77079