Web: http://arxiv.org/abs/2205.01915

May 5, 2022, 1:12 a.m. | Han-Jia Ye, Su Lu, De-Chuan Zhan

cs.LG updates on arXiv.org

The knowledge of a well-trained deep neural network (a.k.a. the "teacher") is
valuable for learning similar tasks. Knowledge distillation extracts knowledge
from the teacher and integrates it into the target model (a.k.a. the
"student"), which expands the student's knowledge and improves its learning
efficacy. Instead of requiring the teacher to be trained on the same task as the
student, we borrow the knowledge from a teacher trained on a general label
space -- in this "Generalized Knowledge Distillation (GKD)", the classes …
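
For readers unfamiliar with the mechanism the abstract refers to, below is a minimal sketch of a standard (Hinton-style) knowledge-distillation loss, in which the student matches the teacher's softened output distribution alongside the usual supervised loss. This is an illustration of plain distillation, not the paper's GKD variant; the temperature T, weight alpha, and function name are assumptions made for the example.

# Minimal sketch of a standard knowledge-distillation loss (illustrative only).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: KL divergence between the student's and teacher's
    # temperature-softened distributions, rescaled by T^2 as is customary.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Example usage: a batch of 8 samples over 10 classes.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)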

