CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
June 17, 2024, 4:41 a.m. | Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang
cs.CL updates on arXiv.org
Abstract: Recent vision-language models have achieved tremendous advances. However, their computational costs are also escalating dramatically, making model acceleration exceedingly critical. To pursue more efficient vision-language Transformers, this paper introduces Cross-Guided Ensemble of Tokens (CrossGET), a general acceleration framework for vision-language Transformers. This framework adaptively combines tokens in real-time during inference, significantly reducing computational costs while maintaining high performance. CrossGET features two primary innovations: 1) Cross-Guided Matching and Ensemble. CrossGET leverages cross-modal guided token matching and …
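To make the idea of combining tokens during inference concrete, here is a minimal sketch of generic similarity-based token merging. It is not CrossGET's cross-guided matching, which the truncated abstract does not specify; it assumes plain cosine similarity within a single sequence and a size-weighted average as the merge rule. The names `cosine` and `merge_tokens` and the parameter `r` (number of merges) are hypothetical, chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def merge_tokens(tokens, r):
    """Greedily merge the most similar pair of tokens, up to r times.

    Assumption: similarity is plain cosine similarity and a merged token is
    the size-weighted mean of its members. No cross-modal guidance is used.
    """
    toks = [list(t) for t in tokens]
    sizes = [1] * len(toks)  # how many original tokens each entry represents
    for _ in range(min(r, len(toks) - 1)):
        best = None
        for i in range(len(toks)):
            for j in range(i + 1, len(toks)):
                s = cosine(toks[i], toks[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        total = sizes[i] + sizes[j]
        # Size-weighted average keeps earlier merges from being under-counted.
        toks[i] = [(sizes[i] * x + sizes[j] * y) / total
                   for x, y in zip(toks[i], toks[j])]
        sizes[i] = total
        del toks[j]
        del sizes[j]
    return toks


tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
merged = merge_tokens(tokens, r=1)  # the first two tokens are the closest pair
```

Each merge shortens the sequence by one token, which is where the compute savings in later Transformer layers would come from. The pairwise search here is quadratic and only meant to show the mechanics.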