Distilling Knowledge
Paper title: Distilling the Knowledge in a Neural Network
Authors: Hinton G, Vinyals O, Dean J.
Published: NIPS 2014 Deep Learning Workshop
Distillation
The student is trained on class probabilities "softened" by a temperature $T$:
$$ q_i = \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)} $$
Setting $T = 1$ recovers the standard softmax; raising $T$ produces a softer probability distribution over classes, exposing the teacher's "dark knowledge" about inter-class similarity.
Further Reading
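The temperature-scaled softmax, and the weighted soft/hard loss from the Hinton et al. paper, can be sketched in NumPy as follows. The function names and the choices of `T` and `alpha` are illustrative, not from the original notes; the $T^2$ factor on the soft-target term follows the paper's recommendation.

```python
import numpy as np

def softened_softmax(z, T=1.0):
    """Temperature-scaled softmax: q_i = exp(z_i/T) / sum_j exp(z_j/T)."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()           # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.7):
    """Distillation loss sketch: soft-target cross-entropy plus hard-label
    cross-entropy. `T` and `alpha` are hypothetical hyperparameter choices."""
    p_teacher = softened_softmax(teacher_logits, T)
    p_student = softened_softmax(student_logits, T)
    soft = -np.sum(p_teacher * np.log(p_student))            # match teacher
    hard = -np.log(softened_softmax(student_logits, 1.0)[true_label])
    # Multiplying by T^2 keeps soft-gradient magnitudes comparable across T,
    # as recommended in the paper.
    return alpha * (T ** 2) * soft + (1 - alpha) * hard

logits = np.array([5.0, 2.0, 0.5])
print(softened_softmax(logits, T=1.0))  # peaked distribution
print(softened_softmax(logits, T=4.0))  # softer distribution
```

A quick sanity check: both outputs sum to 1, and the `T=4` distribution assigns noticeably more mass to the non-argmax classes than the `T=1` one.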
Papers on the mechanism of distillation
Hinton et al., NeurIPS 2019: When Does Label Smoothing Help?
ICLR 2021: Is Label Smoothing Truly Incompatible with Knowledge Distillation? An Empirical Study (with a Zhihu commentary)
Trends
Multi-teacher and multi-student distillation
Knowledge representation, dataset distillation, contrastive learning:
Attention Transfer paper: https://arxiv.org/abs/1612.03928
Knowledge distillation for multimodal models, knowledge graphs, and large pretrained models
Paper Commentaries
Knowledge distillation in deep learning and its applications
Knowledge Distillation: Principles, Algorithms, Applications
Zhihu: "A Classic, Briefly Explained" walkthrough of Knowledge Distillation (Hinton et al.'s classic paper)