April 9, 2024, 4:46 a.m. | Tao Wu, Runyu He, Gangshan Wu, Limin Wang

cs.CV updates on arXiv.org arxiv.org

arXiv:2404.04565v1 Announce Type: new
Abstract: Video-based visual relation detection tasks, such as video scene graph generation, play important roles in fine-grained video understanding. However, current video visual relation detection datasets have two main limitations that hinder progress in this area. First, they do not explore complex human-human interactions in multi-person scenarios. Second, the relation types in existing datasets have relatively low-level semantics and can often be recognized from appearance or simple prior information, without the need for …

