April 25, 2024, 7:43 p.m. | Zerui Chen, Shizhe Chen, Cordelia Schmid, Ivan Laptev

cs.LG updates on arXiv.org

arXiv:2404.15709v1 Announce Type: cross
Abstract: In this work, we aim to learn a unified vision-based policy for a multi-fingered robot hand to manipulate different objects in diverse poses. Though prior work has demonstrated that human videos can benefit policy learning, performance improvement has been limited by physically implausible trajectories extracted from videos. Moreover, reliance on privileged object information such as ground-truth object states further limits the applicability in realistic scenarios. To address these limitations, we propose a new framework ViViDex …
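As a rough illustration of the distinction the abstract draws, the sketch below contrasts a privileged policy that consumes ground-truth object state with a vision-based policy that consumes raw image observations instead. This is a minimal, hypothetical PyTorch sketch: all module names, dimensions, and architecture choices are assumptions for illustration, not the paper's actual ViViDex design.

```python
import torch
import torch.nn as nn


class StatePolicy(nn.Module):
    """Privileged policy: needs ground-truth object state, which is
    hard to obtain outside simulation (hypothetical example)."""

    def __init__(self, state_dim=13, proprio_dim=24, action_dim=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(state_dim + proprio_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, object_state, proprio):
        # Concatenate privileged object state with hand proprioception.
        return self.mlp(torch.cat([object_state, proprio], dim=-1))


class VisionPolicy(nn.Module):
    """Vision-based policy: visual features stand in for the privileged
    object state (hypothetical encoder, not the paper's architecture)."""

    def __init__(self, proprio_dim=24, action_dim=16, feat_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + proprio_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, rgb, proprio):
        # Encode the RGB observation instead of reading object state.
        feat = self.encoder(rgb)
        return self.mlp(torch.cat([feat, proprio], dim=-1))


if __name__ == "__main__":
    policy = VisionPolicy()
    rgb = torch.randn(2, 3, 84, 84)      # batch of camera images
    proprio = torch.randn(2, 24)         # hand joint positions
    print(policy(rgb, proprio).shape)    # torch.Size([2, 16])
```

Because the vision-based policy depends only on observations available on a real robot, it avoids the privileged-information requirement the abstract identifies as a limitation.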

