Aug. 23, 2022, 1:15 a.m. | Zhendong Yang, Zhe Li, Yuan Gong, Tianke Zhang, Shanshan Lao, Chun Yuan, Yu Li

cs.CV updates on arXiv.org

Knowledge Distillation (KD) has developed extensively and boosted various tasks. The classical KD method adds the KD loss to the original cross-entropy (CE) loss. We decompose the KD loss to explore its relation to the CE loss. Surprisingly, we find it can be regarded as a combination of the CE loss and an extra loss that has an identical form to the CE loss. However, we notice that the extra loss forces the student's relative probability to learn the …

arxiv cross-entropy cv distillation entropy knowledge
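The abstract refers to the classical KD setup in which a distillation loss is added to the standard cross-entropy loss on ground-truth labels. The sketch below illustrates that combined objective under a common assumption: a Hinton-style temperature-scaled KL term. The function name `classical_kd_loss` and the hyperparameters `T` and `alpha` are illustrative, not from the paper.

```python
import torch
import torch.nn.functional as F

def classical_kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Classical KD objective: CE on hard labels plus a temperature-scaled
    KL distillation term (a common Hinton-style formulation)."""
    # Standard cross-entropy with the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    # KL divergence between the softened teacher and student distributions;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Weighted sum of the two terms.
    return (1.0 - alpha) * ce + alpha * kd

# Example usage with random logits for a batch of 8 samples, 100 classes.
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = classical_kd_loss(student_logits, teacher_logits, labels)
```

The paper's decomposition argument starts from a loss of this shape and rewrites the distillation term so that part of it matches the form of the CE loss itself.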
