Oct. 13, 2022, 4:53 p.m. | Google AI (noreply@blogger.com)

Google AI Blog ai.googleblog.com

Posted by Ashish Thapliyal, Software Engineer, and Jordi Pont-Tuset, Research Scientist, Google Research

Image captioning is the machine learning task of automatically generating a fluent natural language description for a given image. This task is important for improving accessibility for visually impaired users and is a core task in multimodal research encompassing both vision and language modeling.

However, datasets for image captioning are primarily available in English. Beyond that, there are only a few datasets covering a limited number of …

datasets emnlp images multimodal learning reference

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Associate Data Engineer

@ Nominet | Oxford/ Hybrid, GB

Data Science Senior Associate

@ JPMorgan Chase & Co. | Bengaluru, Karnataka, India