Feb. 15, 2024, 2:08 a.m. | Furkan Gözükara

DEV Community dev.to

SOTA Image Captioning Model Kosmos-2 Added to Our Image Captioning Scripts Arsenal


You can download them here: https://www.patreon.com/posts/90744385


The batch image captioning models we have right now are as follows:



  • CogVLM with 4-bit, 8-bit, or 16-bit quantization

  • LLaVA, including the 34B variant, with 4-bit, 8-bit, or 16-bit quantization

  • BLIP-2 Models

  • CLIP Vision Models

  • Kosmos-2 Model
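
To give a feel for why the 4-bit, 8-bit, and 16-bit options in the list above matter, here is a rough back-of-the-envelope sketch. It estimates the size of the model weights only (parameter count times bits per weight); activations, the vision encoder, and framework overhead are ignored, so real VRAM usage will be higher. The function name `approx_weight_memory_gib` is hypothetical and is not part of the author's scripts.

```python
def approx_weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Rough size of the model weights alone, in GiB.

    Assumption: every parameter is stored at the same precision and
    nothing else (activations, KV cache, overhead) is counted.
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Example: a 34-billion-parameter model at each precision from the list.
    for bits in (16, 8, 4):
        size = approx_weight_memory_gib(34e9, bits)
        print(f"34B parameters at {bits}-bit: ~{size:.1f} GiB of weights")
```

Halving the bits per weight halves this estimate, which is why a large model that does not fit in GPU memory at 16-bit may fit at 4-bit.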


Kosmos-2 supports both single-image and batch captioning. I also did some research to find a good prompt.
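
As an illustration of what single and batch captioning with Kosmos-2 can look like, here is a minimal sketch using the Hugging Face `transformers` library. It is not the author's script. The checkpoint name `microsoft/kosmos-2-patch14-224` and the prompt `<grounding>An image of` are taken from the public model card and are assumptions here, not the prompt the author settled on. The helper names `make_kosmos2_captioner` and `caption_folder` are hypothetical.

```python
from pathlib import Path
from typing import Callable, Dict

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def make_kosmos2_captioner(
    model_id: str = "microsoft/kosmos-2-patch14-224",  # assumed checkpoint
    prompt: str = "<grounding>An image of",            # model-card prompt, an assumption
    max_new_tokens: int = 128,
) -> Callable[[Path], str]:
    """Load Kosmos-2 once and return a function that captions one image."""
    # Imported lazily so the folder helper below works without these packages.
    from PIL import Image
    from transformers import AutoProcessor, Kosmos2ForConditionalGeneration

    processor = AutoProcessor.from_pretrained(model_id)
    model = Kosmos2ForConditionalGeneration.from_pretrained(model_id)

    def caption(image_path: Path) -> str:
        image = Image.open(image_path).convert("RGB")
        inputs = processor(text=prompt, images=image, return_tensors="pt")
        generated_ids = model.generate(
            pixel_values=inputs["pixel_values"],
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            image_embeds=None,
            image_embeds_position_mask=inputs["image_embeds_position_mask"],
            use_cache=True,
            max_new_tokens=max_new_tokens,
        )
        text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
        # Strips the grounding tags; the entity boxes are not needed for captions.
        cleaned, _entities = processor.post_process_generation(text)
        return cleaned

    return caption


def caption_folder(folder, caption_fn: Callable[[Path], str]) -> Dict[str, str]:
    """Caption every image in a folder and write a .txt file next to each one."""
    results: Dict[str, str] = {}
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        text = caption_fn(image_path)
        image_path.with_suffix(".txt").write_text(text, encoding="utf-8")
        results[image_path.name] = text
    return results
```

Single-image use would be `make_kosmos2_captioner()(Path("photo.jpg"))`, and batch use would be `caption_folder("images", make_kosmos2_captioner())`. Writing one `.txt` file per image is a common convention for training datasets, chosen here as an assumption about the intended output.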


1 click to install both …

