Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark. (arXiv:2202.06767v3 [cs.CV] UPDATED)
June 20, 2022, 1:13 a.m. | Jiaxi Gu, Xiaojun Meng, Guansong Lu, Lu Hou, Minzhe Niu, Xiaodan Liang, Lewei Yao, Runhui Huang, Wei Zhang, Xin Jiang, Chunjing Xu, Hang Xu
Vision-Language Pre-training (VLP) models have shown remarkable performance
on various downstream tasks. Their success heavily relies on the scale of
pre-trained cross-modal datasets. However, the lack of large-scale datasets and
benchmarks in Chinese hinders the development of Chinese VLP models and broader
multilingual applications. In this work, we release a large-scale Chinese
cross-modal dataset named Wukong, which contains 100 million Chinese image-text
pairs collected from the web. Wukong aims to benchmark different multi-modal
pre-training methods to facilitate the VLP research …
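The abstract notes that VLP success hinges on large corpora of matched image-text pairs. A common way such pairs are used is contrastive pre-training in the CLIP style, where matched pairs are pulled together and mismatched pairs pushed apart within a batch. The sketch below is illustrative only, not the paper's method; the embeddings, batch size, and temperature are hypothetical.

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    Row i of `img_emb` and row i of `txt_emb` are assumed to be a matched
    pair, as with web-collected image-text data.
    """
    # L2-normalize so dot products are cosine similarities.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix

    # Cross-entropy with the diagonal (matched pair) as the target,
    # averaged over both the image->text and text->image directions.
    def xent(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        logprobs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logprobs))

    return 0.5 * (xent(logits) + xent(logits.T))

# Toy batch: 4 matched pairs of 8-dim embeddings. Identical image and
# text embeddings give a maximally aligned batch, so the loss is small.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss = clip_style_loss(emb, emb.copy())
```

Benchmarks like the one released here are what make it possible to compare such pre-training objectives on Chinese data at scale.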