We propose a response-based method of knowledge distillation (KD) for the
head pose estimation problem. A student model trained with the proposed KD
achieves better results than its teacher model, which is atypical for
response-based methods. Our method consists of two stages. In the first stage,
we train a base neural network (NN) that has one regression head and four
regression-via-classification (RvC) heads. We build a convolutional ensemble
over the base NN using offsets of face bounding …
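
As a rough illustration of the two ingredients named above, the following sketch shows how an RvC head's class scores might be decoded into an angle, and how a response-based distillation loss might compare student and teacher outputs. The bin range, bin width, function names, and the squared-error loss are assumptions chosen for illustration; they are not taken from the method described here.

```python
import numpy as np


def make_bin_centers(lo=-99.0, hi=99.0, width=3.0):
    """Centers of equal-width angle bins covering [lo, hi] degrees (assumed layout)."""
    edges = np.arange(lo, hi + width, width)
    return (edges[:-1] + edges[1:]) / 2.0


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def rvc_decode(logits, centers):
    """Regression via classification: expected angle under the bin distribution."""
    return softmax(logits) @ centers


def response_kd_loss(student_angles, teacher_angles):
    """Response-based KD (assumed form): mean squared error between the
    student's predicted angles and the teacher's predicted angles."""
    diff = np.asarray(student_angles) - np.asarray(teacher_angles)
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    centers = make_bin_centers()
    rng = np.random.default_rng(0)
    teacher_logits = rng.normal(size=(4, centers.size))  # stand-in for ensemble output
    student_logits = rng.normal(size=(4, centers.size))  # stand-in for student output
    loss = response_kd_loss(rvc_decode(student_logits, centers),
                            rvc_decode(teacher_logits, centers))
    print(f"distillation loss: {loss:.3f}")
```

In this sketch the student only sees the teacher's final predictions, which is what distinguishes a response-based scheme from feature-based or relation-based distillation.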

