April 12, 2024, 2:55 a.m. | /u/IllustriousSir_007 | r/machinelearningnews

Can pre-computed embeddings obtained from the teacher model be used to train the student model in knowledge distillation?

This project extends CLIP for efficient knowledge distillation by using pre-computed embeddings as teachers. Typical knowledge distillation frameworks require running forward passes through the teacher model, which is often prohibitive when the teacher has billions or trillions of parameters. Using only the teacher's embeddings to guide distillation can yield significant computational savings. A minimal sketch of the idea is shown below.

GitHub: https://github.com/lnairGT/CLIP-Distillation
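To make the idea concrete, here is a minimal sketch (not the repo's actual API) of embedding-based distillation: the teacher embeddings are computed once offline, cached to disk, and the student is trained to match them, so the teacher is never run during training. `StudentEncoder`, the file names, and the hyperparameters are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Assumed to be pre-computed offline once with the teacher (e.g., a CLIP image encoder):
# images.pt        -> (N, 3, 224, 224) preprocessed images
# teacher_embs.pt  -> (N, D) teacher embeddings for those images
images = torch.load("images.pt")
teacher_embs = torch.load("teacher_embs.pt")
loader = DataLoader(TensorDataset(images, teacher_embs), batch_size=256, shuffle=True)

class StudentEncoder(nn.Module):
    """Small stand-in encoder; in practice this is the compact student model."""
    def __init__(self, embed_dim):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, x):
        return self.proj(self.backbone(x))

student = StudentEncoder(embed_dim=teacher_embs.shape[1])
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

for epoch in range(10):
    for imgs, t_emb in loader:
        s_emb = F.normalize(student(imgs), dim=-1)
        t_emb = F.normalize(t_emb, dim=-1)
        # Match the teacher's embedding space: minimize 1 - cosine similarity.
        # No teacher forward pass happens anywhere in this loop.
        loss = (1 - (s_emb * t_emb).sum(dim=-1)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The one-time cost of caching the embeddings replaces a teacher forward pass on every training step, which is where the savings come from with very large teachers; the actual project may use a different loss or student architecture.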

