A Survey of Video Datasets for Grounded Event Understanding | allainews.com

June 17, 2024, 4:46 a.m. | Kate Sanders, Benjamin Van Durme

cs.CV updates on arXiv.org

arXiv:2406.09646v1
Abstract: While existing video benchmarks largely consider specialized downstream tasks like retrieval or question-answering (QA), contemporary multimodal AI systems must be capable of well-rounded common-sense reasoning akin to human visual understanding. A critical component of human temporal-visual perception is our ability to identify and cognitively model "things happening", or events. Historically, video benchmark tasks have implicitly tested for this ability (e.g., video captioning, in which models describe visual events with natural language), but they do not …
