Oct. 23, 2023, 3:30 p.m. | Venelin Valkov

Venelin Valkov www.youtube.com

LLaVA, a Large Multimodal Model (LMM), lets you have image-based conversations. Similar to GPT-4V but without the price tag, LLaVA is free and open source. In this video, we'll explore the original model and then level up with the newer, improved LLaVA 1.5. We'll set up a Google Colab notebook and put LLaVA to the test by running prompts for different tasks (OCR, image understanding, Q&A over images, etc.). What kind of results do we get?

Project …

