Efficient Modeling of Future Context for Image Captioning. (arXiv:2207.10897v1 [cs.CV])

July 25, 2022, 1:12 a.m. | Zhengcong Fei, Junshi Huang, Xiaoming Wei, Xiaolin Wei

cs.CV updates on arXiv.org arxiv.org

Existing approaches to image captioning usually generate the sentence
word by word from left to right, conditioned only on local context, namely
the given image and the previously generated words. Many studies have aimed
to make use of global information during decoding, e.g., through iterative
refinement. However, how to incorporate the future context effectively and
efficiently remains under-explored. To address this issue, inspired by the
fact that Non-Autoregressive Image Captioning (NAIC) can leverage two-sided
relations with a modified mask …
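
The contrast the abstract draws between left-to-right decoding and two-sided context can be illustrated with attention masks. The sketch below is a generic illustration only: the excerpt is truncated before it describes the paper's actual modified mask, so nothing here should be read as the authors' method. The function names `causal_mask`, `bidirectional_mask`, and `visible_positions` are hypothetical helpers introduced for this example.

```python
def causal_mask(n):
    """Left-to-right mask: position i may attend to positions 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Two-sided mask: every position may attend to every other position."""
    return [[True] * n for _ in range(n)]


def visible_positions(mask, i):
    """Return the indices that position i is allowed to attend to."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


if __name__ == "__main__":
    n = 4  # assumed caption length, chosen only for illustration
    # An autoregressive decoder sees only the history at position 1.
    print(visible_positions(causal_mask(n), 1))         # [0, 1]
    # A non-autoregressive decoder can also see the future positions.
    print(visible_positions(bidirectional_mask(n), 1))  # [0, 1, 2, 3]
```

Under the causal mask, the word at position 1 depends only on positions 0 and 1, which is the "local context" constraint the abstract mentions; under the full mask, the same position also sees positions 2 and 3, i.e. the future context.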


