Data Curation for Image Captioning with Text-to-Image Generative Models. (arXiv:2305.03610v1 [cs.CV]) | allainews.com

May 8, 2023, 12:45 a.m. | Wenyan Li, Jonas F. Lotz, Chen Qiu, Desmond Elliott

cs.CL updates on arXiv.org arxiv.org

Recent advances in image captioning are mainly driven by large-scale
vision-language pretraining, relying heavily on computational resources and
increasingly large multimodal datasets. Instead of scaling up pretraining data,
we ask whether it is possible to improve performance by improving the quality
of the samples in existing datasets. We pursue this question through two
approaches to data curation: one that assumes that some examples should be
avoided due to mismatches between the image and caption, and one that assumes
that the …

arxiv captioning computational curation data data curation datasets generative generative models image language multimodal performance quality resources scale scaling scaling up text text-to-image vision

More from arxiv.org / cs.CL updates on arXiv.org

LLMs for Science: Usage for Code Generation and Data Analysis 13 hours ago | arxiv.org

abstract analysis arxiv become +26

VAL: Interactive Task Learning with GPT Dialog Parsing 13 hours ago | arxiv.org

abstract acquisition arxiv box +22

Convergences and Divergences between Automatic Assessment and Human Evaluation: Insights from Comparing ChatGPT-Generated Translation and … 13 hours ago | arxiv.org

abstract arxiv assessment automated +23

Some things are more CRINGE than others: Iterative Preference Optimization with the Pairwise Cringe Loss 13 hours ago | arxiv.org

abstract arxiv binary cs.ai +13

DBCopilot: Scaling Natural Language Querying to Massive Databases 13 hours ago | arxiv.org

abstract advances arxiv challenges +31

ARN: Analogical Reasoning on Narratives 13 hours ago | arxiv.org

abstract analogy arxiv cognitive +17

Applying BioBERT to Extract Germline Gene-Disease Associations for Building a Knowledge Graph from the Biomedical … 13 hours ago | arxiv.org

abstract arxiv biomedical building +24

Learning the meanings of function words from grounded language using a visual question answering model 13 hours ago | arxiv.org

abstract acquisition arxiv children +17

RETVec: Resilient and Efficient Text Vectorizer 13 hours ago | arxiv.org

arxiv cs.ai cs.cl resilient +2

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

View on ai-jobs.net

Data Science Specialist

@ Telstra | Telstra ICC Bengaluru

View on ai-jobs.net

Senior Staff Engineer, Machine Learning

@ Nagarro | Remote, India

View on ai-jobs.net