April 9, 2024, 4:46 a.m. | Tao Wu, Runyu He, Gangshan Wu, Limin Wang

cs.CV updates on arXiv.org

arXiv:2404.04565v1 Announce Type: new
Abstract: Video-based visual relation detection tasks, such as video scene graph generation, play important roles in fine-grained video understanding. However, current video visual relation detection datasets have two main limitations that hinder research progress in this area. First, they do not explore complex human-human interactions in multi-person scenarios. Second, the relation types in existing datasets have relatively low-level semantics and can often be recognized by appearance or simple prior information, without the need for …

