Modular visual question answering via code generation
Google AI Blog (ai.googleblog.com)
Visual question answering (VQA) is a machine learning task that requires a model to answer a question about an image or a set of images. Conventional VQA approaches need a large amount of labeled training data consisting of thousands of human-annotated question-answer pairs associated with images. In recent years, advances in large-scale pre-training have led to the development of VQA methods that …
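The modular approach the title refers to has a code-generating language model write a short program that composes pretrained vision primitives to answer a question, instead of training a monolithic VQA model on thousands of annotated pairs. The sketch below illustrates the idea only; the primitive names (`query`, `get_objects`) are hypothetical stand-ins, stubbed with canned outputs so the example runs without any real vision models.

```python
def query(image, question):
    # Hypothetical stand-in for a pretrained single-image VQA model.
    return image["facts"].get(question, "unknown")

def get_objects(image):
    # Hypothetical stand-in for an object detector.
    return image["objects"]

# A program a code-generating LLM might emit for the multi-image question:
# "Are there more cats in the left image than in the right image?"
def generated_program(left_image, right_image):
    left_cats = sum(1 for obj in get_objects(left_image) if obj == "cat")
    right_cats = sum(1 for obj in get_objects(right_image) if obj == "cat")
    return "yes" if left_cats > right_cats else "no"

# Dummy "images" standing in for real pixels plus model outputs.
left = {"objects": ["cat", "cat", "dog"], "facts": {}}
right = {"objects": ["cat"], "facts": {}}
print(generated_program(left, right))  # prints "yes"
```

Because the reasoning lives in generated code rather than in model weights, the same frozen vision primitives can be recombined for counting, comparison, or multi-image questions without task-specific training data.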