Diverse Video Captioning by Adaptive Spatio-temporal Attention (arXiv:2208.09266v1 [cs.CV])

Aug. 22, 2022, 1:14 a.m. | Zohreh Ghaderi, Leonard Salewski, Hendrik P. A. Lensch

cs.CV updates on arXiv.org

To generate proper captions for videos, the inference needs to identify
relevant concepts and pay attention to the spatial relationships between them
as well as to the temporal development in the clip. Our end-to-end
encoder-decoder video captioning framework incorporates two transformer-based
architectures: an adapted transformer for a single joint spatio-temporal video
analysis and a self-attention-based decoder for advanced text
generation. Furthermore, we introduce an adaptive frame selection scheme to
reduce the number of required incoming frames while maintaining …
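
The excerpt above is cut off before it says how frames are chosen, so the following is only an illustrative sketch of what frame selection can look like, not the authors' scheme. It assumes a simple stand-in heuristic: keep a frame only when it differs enough from the most recently kept frame. The names `mean_abs_diff`, `select_frames`, and `threshold`, and the flattened-list frame representation, are all hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of pixel intensities.
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and equally sized")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_frames(frames: List[Frame], threshold: float) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame.

    Hypothetical stand-in for an adaptive selection scheme: the first frame
    is always kept, and a later frame is kept only when its difference to
    the most recently kept frame exceeds `threshold`.
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


# Toy clip: six 4-pixel frames with two abrupt content changes.
clip = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 0.9, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
]
selected = select_frames(clip, threshold=0.3)
print(selected)
```

On this toy clip the near-duplicate frames are skipped, so fewer frames would reach a downstream encoder. A learned or attention-driven selector would replace the fixed threshold used here.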
