Aug. 10, 2023, 4:49 a.m. | Rui Liu, Xiaohan Wang, Wenguan Wang, Yi Yang

cs.CV updates on arXiv.org arxiv.org

Vision-language navigation (VLN), which entails an agent to navigate 3D
environments following human instructions, has shown great advances. However,
current agents are built upon panoramic observations, which hinders their
ability to perceive 3D scene geometry and easily leads to ambiguous selection
of panoramic view. To address these limitations, we present a BEV Scene Graph
(BSG), which leverages multi-step BEV representations to encode scene layouts
and geometric cues of indoor environment under the supervision of 3D detection.
During navigation, BSG builds …

agents arxiv bird current environments geometry graph human language leads limitations navigation scene geometry vision

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US