LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning

June 18, 2024, 4:49 a.m. | Dantong Niu, Yuvan Sharma, Giscard Biamby, Jerome Quenum, Yutong Bai, Baifeng Shi, Trevor Darrell, Roei Herzig

cs.LG updates on arXiv.org arxiv.org

arXiv:2406.11815v1 Announce Type: cross
Abstract: In recent years, instruction-tuned Large Multimodal Models (LMMs) have been successful at several tasks, including image captioning and visual question answering; yet leveraging these models remains an open question for robotics. Prior LMMs for robotics applications have been extensively trained on language and action data, but their ability to generalize in different settings has often been less than desired. To address this, we introduce LLARVA, a model trained with a novel instruction tuning method that …
