Adversarial Distillation via Attention Helps Enhance Accuracy and Robustness

Ruicheng Niu

Zhihong Liang

Yuxiang Huang

Yuanyuan Ma 

Linyi Zheng

College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, 650224, China

Abstract

Lightweight neural networks are widely deployed in resource-constrained environments such as mobile devices and edge computing. However, they often struggle to balance accuracy and robustness, particularly under adversarial attack. This limitation poses significant risks in safety-critical applications such as autonomous driving and healthcare, where both high performance and reliability are essential. To address this challenge, we propose Attention Distillation Enhancing Robustness (ADER), a novel adversarial distillation framework that integrates self-attention mechanisms with a dual-teacher strategy. Unlike conventional single-teacher methods, ADER simultaneously distills knowledge from a clean teacher and an adversarially trained teacher. It further incorporates cross-domain attention maps as auxiliary supervision to guide the student model's spatial focus during training. This design enables the student to capture both discriminative and robust features effectively. Extensive experiments on the Canadian Institute for Advanced Research (CIFAR)-10 and CIFAR-100 datasets demonstrate that ADER consistently outperforms state-of-the-art adversarial training and distillation methods. The proposed method achieves substantial improvements in both clean accuracy and adversarial robustness, highlighting its potential for the secure and efficient deployment of lightweight models.
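The abstract does not spell out the training objective, so the following is a minimal PyTorch sketch of how a dual-teacher adversarial distillation loss with attention-map supervision of the kind described above might be composed. All names here (`ader_loss`, `attention_map`, the weights `alpha`, `beta`, `gamma`, and the temperature `T`) and the specific form of the attention map are illustrative assumptions, not the authors' published formulation.

```python
import torch
import torch.nn.functional as F

def attention_map(feat):
    # Spatial attention from a feature tensor of shape (B, C, H, W):
    # channel-wise mean of squared activations, L2-normalized per sample.
    # This is one common construction; the paper's maps may differ.
    att = feat.pow(2).mean(dim=1)              # (B, H, W)
    return F.normalize(att.flatten(1), dim=1)  # (B, H*W)

def ader_loss(student_logits, clean_t_logits, robust_t_logits,
              student_att, clean_t_att, robust_t_att,
              labels, T=4.0, alpha=0.5, beta=1.0, gamma=100.0):
    # Hard-label cross-entropy on the student's own predictions.
    ce = F.cross_entropy(student_logits, labels)

    # Temperature-scaled KL distillation from both teachers:
    # one trained on clean data, one adversarially trained.
    log_p = F.log_softmax(student_logits / T, dim=1)
    kd_clean = F.kl_div(log_p, F.softmax(clean_t_logits / T, dim=1),
                        reduction="batchmean") * T * T
    kd_robust = F.kl_div(log_p, F.softmax(robust_t_logits / T, dim=1),
                         reduction="batchmean") * T * T

    # Attention alignment: pull the student's spatial attention toward
    # both teachers' maps (a simple MSE on the normalized maps).
    att_loss = (F.mse_loss(student_att, clean_t_att) +
                F.mse_loss(student_att, robust_t_att))

    return ce + alpha * kd_clean + beta * kd_robust + gamma * att_loss
```

In this sketch the student would be fed adversarial examples while the clean teacher sees the corresponding clean inputs, so the two KL terms transfer discriminative and robust knowledge respectively; the attention term supplies the cross-domain spatial supervision the abstract refers to.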