Microsoft Open-Sources 13 Billion Parameter Language and Vision Chatbot LLaVA
InfoQ - AI, ML & Data Engineering www.infoq.com
Researchers from Microsoft, the University of Wisconsin–Madison, and Columbia University have open-sourced Large Language and Vision Assistant (LLaVA). LLaVA is based on a CLIP image encoder and a LLaMA language decoder, is fine-tuned on a synthetic instruction-following dataset, and achieved state-of-the-art accuracy on the ScienceQA benchmark.
By Anthony Alford