ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network. (arXiv:2207.09927v1 [cs.CV])
July 21, 2022, 1:12 a.m. | Nikolaos Gkalelis, Dimitrios Daskalakis, Vasileios Mezaris
cs.CV updates on arXiv.org arxiv.org
This paper proposes ViGAT, a pure-attention bottom-up approach for event
recognition and explanation in video. ViGAT uses an object detector together
with a Vision Transformer (ViT) backbone network to derive object and frame
features, and a head network to process these features. The ViGAT head
consists of graph attention network (GAT) blocks factorized along the spatial
and temporal dimensions in order to effectively capture both local and
long-term dependencies between objects or frames.
Moreover, …
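
The spatial/temporal factorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the attention block below is a simplified dot-product attention over a fully connected graph with no learned weights, the object features are random stand-ins for detector/ViT outputs, and the per-node weights (the average attention each node receives) are only an assumed, rough proxy for an explanation score. All function names are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_block(nodes):
    """Simplified attention over a fully connected graph of feature vectors.

    Returns (updated_nodes, node_weights). node_weights[j] is the average
    attention that node j receives from all nodes (an assumed saliency proxy).
    """
    n, dim = len(nodes), len(nodes[0])
    scale = math.sqrt(dim)
    # Row i holds the attention that node i pays to every node j.
    attn = [softmax([dot(q, k) / scale for k in nodes]) for q in nodes]
    updated = [
        [sum(attn[i][j] * nodes[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]
    weights = [sum(attn[i][j] for i in range(n)) / n for j in range(n)]
    return updated, weights


def mean_pool(nodes):
    """Average a list of feature vectors into a single vector."""
    return [sum(col) / len(nodes) for col in zip(*nodes)]


def factorized_head(object_feats):
    """object_feats[t][k] is the feature vector of object k in frame t."""
    frame_feats, object_weights = [], []
    # Spatial step: attend over the objects within each frame separately.
    for objs in object_feats:
        updated, w = attention_block(objs)
        frame_feats.append(mean_pool(updated))
        object_weights.append(w)
    # Temporal step: attend over the per-frame features across the video.
    updated, frame_weights = attention_block(frame_feats)
    return mean_pool(updated), frame_weights, object_weights


# Toy input: 4 frames, 3 objects per frame, 8-dimensional features.
random.seed(0)
T, K, D = 4, 3, 8
feats = [[[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
         for _ in range(T)]
video_feat, frame_weights, object_weights = factorized_head(feats)
```

Factorizing this way means attention is computed over K objects per frame and then over T frames, rather than over all T×K object nodes at once. In this sketch, `frame_weights` and each row of `object_weights` sum to 1, so they can be read as relative importance within the video or within a frame.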