LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model
April 3, 2024, 4:46 a.m. | Musashi Hinck, Matthew L. Olson, David Cobbley, Shao-Yen Tseng, Vasudev Lal
cs.CL updates on arXiv.org arxiv.org
Abstract: We train a suite of multimodal foundation models (MMFMs) using the popular LLaVA framework with the recently released Gemma family of large language models (LLMs). Of particular interest is the 2B-parameter Gemma model, which provides opportunities to construct capable small-scale MMFMs. In line with findings from other papers in this space, we test the effect of ablating three design features: pretraining the connector, utilizing a more powerful image backbone, and increasing the size of …
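In LLaVA-style models such as the one described here, the "connector" mentioned in the ablations is a small projection module that maps image-encoder patch features into the language model's embedding space, so visual tokens can be fed to the LLM alongside text tokens. Below is a minimal, hedged sketch of such a connector in PyTorch; the two-layer MLP shape follows common LLaVA-style setups, and the dimensions (1024 for the vision backbone, 2048 for a 2B-scale LLM) are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of a LLaVA-style vision-language connector.
# Not the authors' code; dimensions are illustrative assumptions.
import torch
import torch.nn as nn


class VisionLanguageConnector(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 2048):
        super().__init__()
        # Two-layer MLP projector from image-feature space to LLM embedding space.
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim) from the image backbone
        # returns:        (batch, num_patches, llm_dim) visual "tokens" for the LLM
        return self.proj(patch_features)


if __name__ == "__main__":
    # Example: project 576 hypothetical patch features into the LLM's space.
    connector = VisionLanguageConnector()
    dummy_patches = torch.randn(1, 576, 1024)
    visual_tokens = connector(dummy_patches)
    print(visual_tokens.shape)  # torch.Size([1, 576, 2048])
```

"Pretraining the connector" in the abstract refers to training this projection on image-caption data before full instruction tuning; swapping the image backbone or scaling the LLM changes vision_dim or llm_dim accordingly.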