Feb. 7, 2024, 5:47 a.m. | Zhaohui Jiang Paul Weng

cs.CV updates on arXiv.org arxiv.org

To improve the sample efficiency of vision-based deep reinforcement learning (RL), we propose a novel method, called SPIRL, to automatically extract important patches from input images. Following Masked Auto-Encoders, SPIRL is based on Vision Transformer models pre-trained in a self-supervised fashion to reconstruct images from randomly-sampled patches. These pre-trained models can then be exploited to detect and select salient patches, defined as hard to reconstruct from neighboring patches. In RL, the SPIRL agent processes selected salient patches via an attention …

auto cs.ai cs.cv data efficiency extract fashion images novel pre-trained models reinforcement reinforcement learning sample transformer transformer models unsupervised vision

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Sr. VBI Developer II

@ Atos | Texas, US, 75093

Wealth Management - Data Analytics Intern/Co-op Fall 2024

@ Scotiabank | Toronto, ON, CA