March 14, 2024, 4:46 a.m. | Mohammad Nazeri, Junzhe Wang, Amirreza Payandeh, Xuesu Xiao

arXiv:2403.08109v1 Announce Type: cross
Abstract: Humans excel at efficiently navigating through crowds without collision by focusing on specific visual regions relevant to navigation. However, most robotic visual navigation methods rely on deep learning models pre-trained on vision tasks, which prioritize salient objects -- not necessarily relevant to navigation and potentially misleading. Alternative approaches train specialized navigation models from scratch, requiring significant computation. On the other hand, self-supervised learning has revolutionized computer vision and natural language processing, but its application to …

