
Journal Article

Collaborative multi-knowledge distillation under the influence of softmax regression representation

Huang, Dailin; Chang, Zhaobin; Chen, Kangping; Zhao, Hong

2024

Overview

Knowledge distillation transfers knowledge from a powerful yet cumbersome teacher model to a student model with fewer parameters, thereby achieving model compression. Existing knowledge distillation methods have focused mainly on the knowledge-transfer task and on the selection of distillation locations, which makes the resulting models harder to interpret, and few works have examined the role of the teacher classifier in distillation. In this study, we propose a novel collaborative multi-knowledge distillation under the influence of softmax regression representation. First, we propose a stage-wise logit knowledge distillation, where the teacher classifier is used as an auxiliary structure to align the features of the student and teacher models. By leveraging the teacher classifier, the student features are aligned with the teacher features in the logit space, eliminating the need for a complex feature projector that requires extensive computation to match the features of the teacher and student models. Second, considering the teacher classifier's adaptability to classification features, we introduce a stage-wise feature knowledge distillation. This mechanism maps the features of the student model to a latent space with the same dimensions as the features of the teacher model, guiding the student's features to align with the teacher's final features using a mean squared error (MSE) loss. Finally, we propose a pseudo-teacher knowledge distillation loss to optimize the modeling of the deformation relationship between the student and teacher features. This loss provides additional gradient information for optimizing the parameters of the feature projector. Extensive experiments on the CIFAR-100 and ImageNet datasets demonstrate the superiority of the proposed model over state-of-the-art methods. The code is available at https://github.com/chenKP/CMKD.git
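The two stage-wise losses described in the overview can be illustrated with a small sketch. This is not the authors' implementation (that is in the repository linked above), and the abstract does not give the exact formulas. The sketch assumes a plain linear map as the feature projector, a linear teacher classifier, a KL-divergence logit loss on temperature-softened softmax outputs, and the conventional T² scaling from standard knowledge distillation. All function names, the temperature value, and the toy numbers are hypothetical.

```python
import math


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def stage_losses(student_feat, teacher_feat, projector, teacher_classifier,
                 temperature=4.0):
    """Return (logit_loss, feature_loss) for one student stage.

    Assumption: the student feature is first mapped to the teacher's
    feature dimension by a linear projector, then both features are
    passed through the same (frozen) teacher classifier.
    """
    proj_w, proj_b = projector
    cls_w, cls_b = teacher_classifier

    mapped = linear(proj_w, proj_b, student_feat)       # student -> teacher space
    student_logits = linear(cls_w, cls_b, mapped)       # reuse teacher classifier
    teacher_logits = linear(cls_w, cls_b, teacher_feat)

    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # Logit-space alignment; T^2 is the usual KD scaling, assumed here.
    logit_loss = temperature ** 2 * kl_divergence(p_teacher, p_student)
    # Feature-space alignment with the teacher's final features.
    feature_loss = mse(mapped, teacher_feat)
    return logit_loss, feature_loss


# Toy example: 2-d student feature, 3-d teacher feature, 2 classes.
projector = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
teacher_classifier = ([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0])
logit_loss, feature_loss = stage_losses(
    [1.0, 2.0], [1.0, 2.0, 4.0], projector, teacher_classifier)
print(logit_loss, feature_loss)
```

In a real training loop the two losses would be summed over stages and combined with the task loss. The weighting between them, and the form of the pseudo-teacher loss, are not specified in the abstract and are left out here.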


Publisher

Springer Berlin Heidelberg

Subject

Computer Communication Networks / Computer Graphics / Computer Science / Cryptology / Data Storage Representation / Multimedia Information Systems / Operating Systems