Web: http://arxiv.org/abs/2201.10788

Jan. 27, 2022, 2:10 a.m. | Sinan Tan, Mengmeng Ge, Di Guo, Huaping Liu, Fuchun Sun

cs.CV updates on arXiv.org

In the Vision-and-Language Navigation task, an embodied agent follows
linguistic instructions and navigates to a specific goal. The task is important
in many practical scenarios and has attracted extensive attention from both the
computer vision and robotics communities. However, most existing work uses only
RGB images and neglects the 3D semantic information of the scene. To this end,
we develop a novel self-supervised training framework to encode the voxel-level
3D semantic reconstruction into a 3D semantic representation. Specifically, a
region query task …

