Web: http://arxiv.org/abs/2109.03891

Sept. 15, 2022, 1:13 a.m. | Wentao Yuan, Chris Paxton, Karthik Desingh, Dieter Fox

cs.CV updates on arXiv.org

Sequential manipulation tasks require a robot to perceive the state of an
environment and plan a sequence of actions leading to a desired goal state. In
such tasks, the ability to reason about spatial relations among object entities
from raw sensor inputs is crucial for determining when a task has been
completed and which actions can be executed. In this work, we propose SORNet
(Spatial Object-Centric Representation Network), a framework for learning
object-centric representations from RGB images conditioned …


