March 14, 2024, 4:46 a.m. | Ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong

cs.CV updates on arXiv.org arxiv.org

arXiv:2403.08355v1 Announce Type: cross
Abstract: Enabling home-assistant robots to perceive and manipulate a diverse range of 3D objects based on human language instructions is a pivotal challenge. Prior research has predominantly focused on simplistic and task-oriented instructions, e.g., "Slide the top drawer open". However, many real-world tasks demand intricate multi-step reasoning, and without human instructions, these will become extremely difficult for robot manipulation. To address these challenges, we introduce a comprehensive benchmark, NrVLM, comprising 15 distinct manipulation tasks, containing over …

