With Great Training Comes Great Vulnerability: Practical Attacks against Transfer Learning
Ben Y. Zhao
Proceedings of the 27th USENIX Security Symposium (USENIX Security 2018)
Transfer learning is a powerful approach that allows users to quickly build accurate deep-learning (Student) models
by "learning" from centralized (Teacher) models pretrained with large datasets, e.g. Google's InceptionV3. We
hypothesize that the centralization of model training increases the vulnerability of Student models to
misclassification attacks that leverage knowledge of publicly accessible Teacher models. In this paper, we describe our efforts to understand
and experimentally validate such attacks in the context of image recognition. We identify techniques that allow
attackers to associate Student models with their Teacher counterparts, and launch highly effective
misclassification attacks on black-box Student models. We validate these attacks on widely used Teacher models in the wild.
Finally, we propose and evaluate multiple approaches for defense, including a neuron-distance technique that
successfully defends against these attacks while also obfuscating the link between Teacher and Student models.
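
As a rough illustration of the Teacher/Student setup described above, the sketch below freezes a tiny hand-written "Teacher" layer and trains only a new output layer for the "Student". It is a toy under stated assumptions, not the paper's models, attack, or defense: the weights in TEACHER_WEIGHTS, the two-dimensional task, and all function names are hypothetical and chosen only to keep the example self-contained.

```python
import math

# Hypothetical frozen "Teacher" layer: 4 hidden units over 2 inputs.
# In real transfer learning these weights would come from a large
# pretrained model; here they are hand-picked for illustration.
TEACHER_WEIGHTS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, -1.0],
    [-1.0, 1.0],
]


def teacher_features(x, teacher_weights):
    """Apply the frozen Teacher layer, ReLU(W x). Weights are never updated."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)))
        for row in teacher_weights
    ]


def sigmoid(z):
    """Logistic function, with the input clipped to avoid overflow."""
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_student_head(samples, labels, teacher_weights, epochs=200, lr=0.1):
    """Train only the Student's new output layer on frozen Teacher features."""
    head = [0.0] * len(teacher_weights)
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = teacher_features(x, teacher_weights)
            pred = sigmoid(sum(h * f for h, f in zip(head, feats)) + bias)
            err = pred - y  # gradient of the log loss w.r.t. the logit
            head = [h - lr * err * f for h, f in zip(head, feats)]
            bias -= lr * err
    return head, bias


def student_predict(x, teacher_weights, head, bias):
    """Student = frozen Teacher layer followed by the trained output layer."""
    feats = teacher_features(x, teacher_weights)
    return sigmoid(sum(h * f for h, f in zip(head, feats)) + bias)


# Toy Student task (an assumption): label is 1 when x[0] > x[1], else 0.
SAMPLES = [[2.0, 1.0], [3.0, 0.0], [1.0, 0.2], [0.0, 1.0], [1.0, 3.0], [0.5, 2.0]]
LABELS = [1, 1, 1, 0, 0, 0]

head, bias = train_student_head(SAMPLES, LABELS, TEACHER_WEIGHTS)
for x, y in zip(SAMPLES, LABELS):
    print(x, y, round(student_predict(x, TEACHER_WEIGHTS, head, bias), 3))
```

Because the Teacher layer is reused unchanged, anyone who knows the Teacher also knows the Student's internal feature computation, which is the kind of shared structure the abstract identifies as a source of vulnerability.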