EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models | allainews.com

June 11, 2024, 4:42 a.m. | Mengfei Du, Binhao Wu, Zejun Li, Xuanjing Huang, Zhongyu Wei

cs.CL updates on arXiv.org arxiv.org

arXiv:2406.05756v1 Announce Type: cross
Abstract: The recent rapid development of Large Vision-Language Models (LVLMs) has indicated their potential for embodied tasks.However, the critical skill of spatial understanding in embodied environments has not been thoroughly evaluated, leaving the gap between current LVLMs and qualified embodied intelligence unknown. Therefore, we construct EmbSpatial-Bench, a benchmark for evaluating embodied spatial understanding of LVLMs.The benchmark is automatically derived from embodied scenes and covers 6 spatial relationships from an egocentric perspective.Experiments expose the insufficient capacity of …

abstract arxiv benchmarking construct cs.ai cs.cl cs.cv cs.mm current development embodied embodied intelligence environments gap however intelligence language language models potential skill spatial tasks type understanding vision vision-language vision-language models

More from arxiv.org / cs.CL updates on arXiv.org

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector 2 days, 10 hours ago | arxiv.org

abstract arxiv audio cs.cl +22

Can Large Language Model Summarizers Adapt to Diverse Scientific Communication Goals? 2 days, 10 hours ago | arxiv.org

abstract adapt arxiv communication +23

ReFT: Reasoning with Reinforced Fine-Tuning 2 days, 10 hours ago | arxiv.org

abstract annotations arxiv capability +22

Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability 2 days, 10 hours ago | arxiv.org

abstract accuracy arxiv cs.cl +13

Exploring Defeasibility in Causal Reasoning 2 days, 10 hours ago | arxiv.org

abstract arxiv causal causal reasoning +7

Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial … 2 days, 10 hours ago | arxiv.org

abstract annotation arxiv capacity +26

Theory of Mind for Multi-Agent Collaboration via Large Language Models 2 days, 10 hours ago | arxiv.org

abstract agent agents arxiv +28

Enhancing Text-based Knowledge Graph Completion with Zero-Shot Large Language Models: A Focus on Semantic Enhancement 2 days, 10 hours ago | arxiv.org

arxiv cs.ai cs.cl focus +12

A Large Language Model Approach to Educational Survey Feedback Analysis 2 days, 10 hours ago | arxiv.org

abstract analysis arxiv capabilities +27

Data Scientist

@ Ford Motor Company | Chennai, Tamil Nadu, India

View on ai-jobs.net

Systems Software Engineer, Graphics

@ Parallelz | Vancouver, British Columbia, Canada - Remote

View on ai-jobs.net

Engineering Manager - Geo Engineering Team (F/H/X)

@ AVIV Group | Paris, France

View on ai-jobs.net

Data Analyst

@ Microsoft | San Antonio, Texas, United States

View on ai-jobs.net

Azure Data Engineer

@ TechVedika | Hyderabad, India

View on ai-jobs.net

Senior Data & AI Threat Detection Researcher (Cortex)

@ Palo Alto Networks | Tel Aviv-Yafo, Israel

View on ai-jobs.net