April 24, 2024, 4:45 a.m. | Rémi Kazmierczak, Eloïse Berthier, Goran Frehse, Gianni Franchi

cs.CV updates on arXiv.org

arXiv:2312.00110v2 Announce Type: replace
Abstract: In this paper, we introduce an explainable algorithm, built on a multi-modal foundation model, that performs fast and explainable image classification. Drawing inspiration from CLIP-based Concept Bottleneck Models (CBMs), our method creates a latent space in which each neuron is linked to a specific word. Observing that this latent space can be modeled with simple distributions, we use a Mixture of Gaussians (MoG) formalism to enhance its interpretability. Then, we introduce CLIP-QDA, …
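
The abstract is truncated, but the pipeline it outlines can be illustrated: project an image into a concept space via CLIP similarities to concept words, model each class in that space with a Gaussian, and classify with quadratic discriminant analysis (the "QDA" in CLIP-QDA). Below is a minimal sketch under that interpretation; the concept words, dimensions, and synthetic stand-in embeddings are all hypothetical, and scikit-learn's QuadraticDiscriminantAnalysis is used in place of the paper's own implementation.

```python
# Hypothetical sketch of the CLIP-QDA idea described in the abstract:
# (1) map an image into a "concept space" via CLIP similarities to
#     concept words, (2) model each class in that space with a Gaussian
#     (a Mixture of Gaussians over classes), (3) classify with QDA.
# All names and the synthetic data below are illustrative assumptions.

import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(0)

# Stand-ins for CLIP embeddings: in the real pipeline these would come
# from a CLIP image encoder and a CLIP text encoder applied to concept words.
n_images, n_classes, embed_dim = 600, 3, 512
concept_words = ["striped", "furry", "metallic", "round", "winged"]

image_embeds = rng.normal(size=(n_images, embed_dim))
concept_embeds = rng.normal(size=(len(concept_words), embed_dim))
labels = rng.integers(0, n_classes, size=n_images)
# Inject class-dependent structure so QDA has something to learn.
image_embeds += 2.0 * concept_embeds[labels % len(concept_words)]

def to_concept_space(img, txt):
    """Cosine similarity between image and concept-word embeddings."""
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    return img @ txt.T  # shape: (n_images, n_concepts)

Z = to_concept_space(image_embeds, concept_embeds)

# QDA fits one Gaussian per class in concept space -- a mixture of
# Gaussians over the labels -- and classifies via Bayes' rule.
qda = QuadraticDiscriminantAnalysis(store_covariance=True)
qda.fit(Z, labels)
print("train accuracy:", qda.score(Z, labels))
# Each coordinate of Z is tied to a human-readable concept word, so the
# per-class Gaussian parameters can be read as concept-level statistics.
```

Because every axis of the concept space corresponds to a word, the fitted means and covariances are directly interpretable, which is the explainability angle the abstract emphasizes.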
