March 11, 2024, 4:42 a.m. | Yixuan Huang, Jialin Yuan, Chanho Kim, Pupul Pradhan, Bryan Chen, Li Fuxin, Tucker Hermans

cs.LG updates on arXiv.org

arXiv:2309.15278v2
Abstract: Robots need a memory of previously observed but currently occluded objects to work reliably in realistic environments. We investigate the problem of encoding object-oriented memory into a multi-object manipulation reasoning and planning framework. We propose DOOM and LOOM, which leverage transformer relational dynamics to encode the history of trajectories given partial-view point clouds and an object discovery and tracking engine. Our approaches can perform multiple challenging tasks, including reasoning with occluded objects, novel …

