Cross-Modal Graph with Meta Concepts for Video Captioning. (arXiv:2108.06458v3 [cs.CV] UPDATED)

Aug. 2, 2022, 2:13 a.m. | Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao

cs.CV updates on arXiv.org arxiv.org

Video captioning aims to render complex visual content as text
descriptions, which requires the model to fully understand video scenes,
including objects and their interactions. Prevailing methods adopt
off-the-shelf object detection networks to generate object proposals and use an
attention mechanism to model the relations between objects. They often miss
semantic concepts that the pretrained model does not define and fail to identify
exact predicate relationships between objects. In this paper, we investigate an
open research task of generating text descriptions …


