Sept. 15, 2022, 7:16 p.m. | Google AI (noreply@blogger.com)

Google AI Blog ai.googleblog.com

Posted by Xi Chen and Xiao Wang, Software Engineers, Google Research

Advanced language models (e.g., GPT, GLaM, PaLM and T5) have demonstrated diverse capabilities and achieved impressive results across tasks and languages by scaling up their number of parameters. Vision-language (VL) models can benefit from similar scaling to address many tasks, such as image captioning, visual question answering (VQA), object recognition, and in-context optical-character-recognition (OCR). Increasing the success rates for these practical tasks is important …

computer vision image language machine learning multimodal learning scaling
