Preprint Article, Version 1 (this version is not peer-reviewed)

Mixed Perturbation: Generating Directionally Diverse Perturbations for Adversarial Training

Version 1 : Received: 1 October 2024 / Approved: 1 October 2024 / Online: 1 October 2024 (14:00:01 CEST)

How to cite: Hyun, C.; Park, H. Mixed Perturbation: Generating Directionally Diverse Perturbations for Adversarial Training. Preprints 2024, 2024100073. https://doi.org/10.20944/preprints202410.0073.v1

Abstract

The adversarial vulnerability of deep learning models is a critical issue that must be addressed to ensure the safe commercialization of AI technologies. Although numerous adversarial defense methods have been studied from various perspectives, most still provide limited robustness, and even the relatively trusted adversarial training is no exception. To develop more reliable defenses, continued research into the properties and causes of adversarial vulnerability is essential. In this study, we focus on a hypothesis regarding the existence of adversarial examples: that adversarial examples correspond to low-probability "pockets" in the data manifold. Assuming this hypothesis holds, we propose a perturbation-generation method, "mixed perturbation" (MP), which aims to discover diverse pocket samples from a defensive perspective. The proposed method generates perturbations by leveraging information from both the main task and auxiliary tasks in multi-task learning scenarios, combining them through a random weighted summation. The resulting mixed perturbation is intended to preserve the primary direction of the main-task perturbation, so that the model's main-task recognition performance improves, while introducing variability in the perturbation directions. We then use these perturbations for adversarial training to form a more robust decision boundary. Through experiments and analyses on five benchmark datasets, we validate the effectiveness of the proposed method.
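To make the mechanism described in the abstract concrete, the following is a minimal PyTorch-style sketch of how a mixed perturbation could be formed from main-task and auxiliary-task gradients and combined through a random weighted summation. The two-headed model, the FGSM-style sign gradients, the uniform random weight biased toward the main task, and the epsilon value are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def mixed_perturbation(model, x, y_main, y_aux, eps=8 / 255, main_weight_low=0.5):
    """Hypothetical sketch of one mixed-perturbation (MP) step.

    Input gradients of the main-task and auxiliary-task losses are turned
    into sign perturbations and mixed with a random convex weight that
    keeps the main-task direction dominant.
    """
    x = x.clone().detach().requires_grad_(True)
    out_main, out_aux = model(x)  # assumed: a model with a main head and an auxiliary head
    loss_main = F.cross_entropy(out_main, y_main)
    loss_aux = F.cross_entropy(out_aux, y_aux)

    # Input gradients for each task; retain the graph for the second call.
    grad_main, = torch.autograd.grad(loss_main, x, retain_graph=True)
    grad_aux, = torch.autograd.grad(loss_aux, x)

    # Random per-sample weight biased toward the main task (assumed scheme),
    # so the main-task direction stays dominant while directions still vary.
    alpha = torch.empty(x.size(0), 1, 1, 1, device=x.device).uniform_(main_weight_low, 1.0)
    delta = eps * (alpha * grad_main.sign() + (1.0 - alpha) * grad_aux.sign())

    # Adversarial example clipped back to the valid input range.
    return (x + delta).clamp(0.0, 1.0).detach()
```

In an adversarial training loop, such examples would simply replace (or augment) the clean batch before the usual forward and backward pass; the randomness of the mixing weight is what supplies directional diversity across epochs.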

Keywords

adversarial robustness; adversarial training; adversarial perturbations; evasion attack; multi-task learning

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
