Feb. 25, 2022, 2:10 a.m. | Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev

cs.CV updates on arXiv.org arxiv.org

Following language instructions to navigate in unseen environments is a
challenging problem for autonomous embodied agents. The agent not only needs to
ground languages in visual scenes, but also should explore the environment to
reach its target. In this work, we propose a dual-scale graph transformer
(DUET) for joint long-term action planning and fine-grained cross-modal
understanding. We build a topological map on-the-fly to enable efficient
exploration in global action space. To balance the complexity of large action
space reasoning and …

arxiv cv global graph language navigation scale think transformer vision

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Analyst - Associate

@ JPMorgan Chase & Co. | Mumbai, Maharashtra, India

Staff Data Engineer (Data Platform)

@ Coupang | Seoul, South Korea

AI/ML Engineering Research Internship

@ Keysight Technologies | Santa Rosa, CA, United States

Sr. Director, Head of Data Management and Reporting Execution

@ Biogen | Cambridge, MA, United States

Manager, Marketing - Audience Intelligence (Senior Data Analyst)

@ Delivery Hero | Singapore, Singapore