March 6, 2024, 5:45 a.m. | Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Qing Wang

cs.CV updates on arXiv.org

arXiv:2403.02581v1 Announce Type: new
Abstract: Visual entailment (VE) is a multimodal reasoning task over image-sentence pairs, in which the premise is given by an image and the hypothesis is described by a sentence. The goal is to predict whether the image semantically entails the sentence. VE systems have been widely adopted in many downstream tasks. Metamorphic testing is the most common technique for testing AI algorithms, but applying it to VE systems poses a significant challenge. Existing approaches either only consider perturbations on single …

