R3M: A Universal Visual Representation for Robot Manipulation. (arXiv:2203.12601v3 [cs.RO] UPDATED)
Nov. 21, 2022, 2:14 a.m. | Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta
cs.CV updates on arXiv.org arxiv.org
We study how visual representations pre-trained on diverse human video data
can enable data-efficient learning of downstream robotic manipulation tasks.
Concretely, we pre-train a visual representation on the Ego4D human video
dataset using a combination of time-contrastive learning, video-language
alignment, and an L1 penalty to encourage sparse and compact representations.
The resulting representation, R3M, can be used as a frozen perception module
for downstream policy learning. Across a suite of 12 simulated robot
manipulation tasks, we find that R3M improves …
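As a rough illustration of two of the training signals named in the abstract, the sketch below combines a time-contrastive term with an L1 sparsity penalty on toy embedding vectors. It is not the authors' implementation: the function names, the use of negative Euclidean distance as the similarity measure, the single-negative setup, and the `l1_weight` value are all assumptions made for this example, and the video-language alignment term is left out entirely.

```python
import math

def l1_penalty(z):
    """Mean absolute value of an embedding; smaller means sparser."""
    return sum(abs(v) for v in z) / len(z)

def neg_l2(a, b):
    """Similarity as negative Euclidean distance (an assumption of this sketch)."""
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def time_contrastive_loss(anchor, near, far):
    """InfoNCE-style loss with one negative.

    The temporally near frame (positive) should be more similar to the
    anchor than the temporally far frame (negative).
    """
    s_pos = neg_l2(anchor, near)
    s_neg = neg_l2(anchor, far)
    # Numerically stable log(exp(s_pos) + exp(s_neg)).
    m = max(s_pos, s_neg)
    log_denom = m + math.log(math.exp(s_pos - m) + math.exp(s_neg - m))
    # Negative log-probability assigned to the positive pair.
    return log_denom - s_pos

def combined_loss(anchor, near, far, l1_weight=1e-5):
    """Time-contrastive term plus a weighted L1 penalty on all embeddings.

    l1_weight is a hypothetical placeholder, not a value from the paper.
    """
    tcn = time_contrastive_loss(anchor, near, far)
    sparsity = sum(l1_penalty(z) for z in (anchor, near, far)) / 3
    return tcn + l1_weight * sparsity

if __name__ == "__main__":
    anchor, near, far = [0.0, 0.0], [0.1, 0.0], [3.0, 4.0]
    print(combined_loss(anchor, near, far))
```

When the near frame is closer to the anchor than the far frame, the contrastive term falls below log 2, which is its value when the two frames are equally distant from the anchor. In practice the embeddings would come from an image encoder and the loss would be averaged over batches of video clips.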