Feb. 12, 2024, 5:46 a.m. | Yichen Zhu Minjie Zhu Ning Liu Zhicai Ou Xiaofeng Mou Jian Tang

cs.CV updates on arXiv.org arxiv.org

In this paper, we introduce LLaVA-$\phi$ (LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model, Phi-2, to facilitate multi-modal dialogues. LLaVA-Phi marks a notable advancement in the realm of compact multi-modal models. It demonstrates that even smaller language models, with as few as 2.7B parameters, can effectively engage in intricate dialogues that integrate both textual and visual elements, provided they are trained with high-quality corpora. Our model delivers commendable performance on publicly available …

cs.cl cs.cv

Doctoral Researcher (m/f/div) in Automated Processing of Bioimages

@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena

Research Scholar (Technical Research)

@ Centre for the Governance of AI | Hybrid; Oxford, UK

HPC Engineer (x/f/m) - DACH

@ Meshcapade GmbH | Remote, Germany

ETL Developer

@ Gainwell Technologies | Bengaluru, KA, IN, 560100

Medical Radiation Technologist, Breast Imaging

@ University Health Network | Toronto, ON, Canada

Data Scientist

@ PayPal | USA - Texas - Austin - Corp - Alterra Pkwy