April 24, 2024, 4:45 a.m. | Rémi Kazmierczak, Éloïse Berthier, Goran Frehse, Gianni Franchi

cs.CV updates on arXiv.org arxiv.org

arXiv:2312.00110v2 Announce Type: replace
Abstract: In this paper, we introduce an explainable algorithm built on a multi-modal foundation model that performs fast and explainable image classification. Drawing inspiration from CLIP-based Concept Bottleneck Models (CBMs), our method creates a latent space where each neuron is linked to a specific word. Observing that this latent space can be modeled with simple distributions, we use a Mixture of Gaussians (MoG) formalism to enhance the interpretability of this latent space. Then, we introduce CLIP-QDA, …
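The abstract's idea of modeling a concept-activation latent space with per-class Gaussians and classifying via a quadratic discriminant can be sketched as follows. This is a minimal illustration, not the paper's implementation: the synthetic features below merely stand in for CLIP concept activations (in CLIP-QDA each dimension would correspond to one concept word), and the class count, dimensionality, and class means are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for CLIP concept activations: in CLIP-QDA each
# of the D dimensions would be the similarity to one concept word.
# Here we simply draw class-conditional Gaussians, mirroring the
# assumption that the latent space is well modeled by a Mixture of
# Gaussians (one component per class).
D, N = 5, 200
class_means = {0: np.full(D, -1.0), 1: np.full(D, 1.0)}
X = np.vstack([rng.normal(class_means[c], 1.0, size=(N, D)) for c in (0, 1)])
y = np.repeat([0, 1], N)

def fit_qda(X, y):
    """Estimate one Gaussian (mean, covariance) per class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False)
        params[c] = (mu,
                     np.linalg.inv(cov),          # precision matrix
                     np.linalg.slogdet(cov)[1],   # log|cov|
                     np.log(len(Xc) / len(X)))    # log class prior
    return params

def predict_qda(params, x):
    """Pick the class with the highest quadratic discriminant score."""
    scores = {}
    for c, (mu, prec, logdet, logprior) in params.items():
        d = x - mu
        scores[c] = -0.5 * (d @ prec @ d) - 0.5 * logdet + logprior
    return max(scores, key=scores.get)

params = fit_qda(X, y)
preds = np.array([predict_qda(params, x) for x in X])
accuracy = (preds == y).mean()
```

Because the class Gaussians are fitted in closed form, classification is a single quadratic evaluation per class, which is one way such a model can be both fast and interpretable: each discriminant term can be traced back to individual concept dimensions.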

