publications
2025
- Seal Your Backdoor with Variational Defense
  Ivan Sabolić, Matej Grcić, and Siniša Šegvić
  International Conference on Computer Vision, 2025
  We propose VIBE, a model-agnostic framework that trains classifiers resilient to backdoor attacks. The key concept behind our approach is to treat malicious inputs and corrupted labels from the training dataset as observed random variables, while the actual clean labels are latent. VIBE then recovers the corresponding latent clean label posterior through variational inference. The resulting training procedure follows the expectation-maximization (EM) algorithm. The E-step infers the clean pseudolabels by solving an entropy-regularized optimal transport problem, while the M-step updates the classifier parameters via gradient descent. Being modular, VIBE can seamlessly integrate with recent advancements in self-supervised representation learning, which enhance its ability to resist backdoor attacks. We experimentally validate the method's effectiveness against contemporary backdoor attacks on standard datasets, a large-scale setup with 1k classes, and a dataset poisoned with multiple attacks. VIBE consistently outperforms previous defenses across all tested scenarios.
  @article{sabolic2025seal,
    title   = {Seal Your Backdoor with Variational Defense},
    author  = {Saboli{\'c}, Ivan and Grci{\'c}, Matej and {\v{S}}egvi{\'c}, Sini{\v{s}}a},
    journal = {International Conference on Computer Vision},
    year    = {2025},
  }
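A minimal, illustrative sketch of the E-step/M-step alternation the abstract describes, run on a toy linear classifier over placeholder features. The Sinkhorn routine, the uniform class prior, the hyperparameters, and all names (sinkhorn_pseudolabels, m_step) are assumptions made for illustration; this is not the authors' implementation.

import numpy as np

def sinkhorn_pseudolabels(log_probs, eps=0.05, n_iters=50):
    # E-step sketch: recover soft clean-label posteriors by solving an
    # entropy-regularized optimal transport problem with Sinkhorn scaling.
    # Rows are samples, columns are classes; a uniform class prior is assumed.
    n, k = log_probs.shape
    q = np.exp(log_probs / eps)                  # Gibbs kernel of the cost -log p(y|x)
    q /= q.sum()
    row, col = np.full(n, 1.0 / n), np.full(k, 1.0 / k)
    for _ in range(n_iters):
        q *= (row / q.sum(axis=1))[:, None]      # match the per-sample marginals
        q *= (col / q.sum(axis=0))[None, :]      # match the assumed class marginals
    return q / q.sum(axis=1, keepdims=True)      # per-sample pseudolabel distributions

def m_step(weights, feats, pseudo, lr=0.1):
    # M-step sketch: one gradient-descent update of a linear classifier
    # trained with cross-entropy against the soft pseudolabels.
    logits = feats @ weights
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    grad = feats.T @ (probs - pseudo) / len(feats)
    return weights - lr * grad

# Toy EM loop on random stand-in features (illustration only).
rng = np.random.default_rng(0)
feats = rng.normal(size=(256, 32))
weights = rng.normal(scale=0.01, size=(32, 10))
for _ in range(20):
    logits = feats @ weights
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pseudo = sinkhorn_pseudolabels(log_probs)    # E-step
    weights = m_step(weights, feats, pseudo)     # M-step

In the paper the classifier is a deep network trained on self-supervised representations; the linear model here only keeps the sketch self-contained.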
2024
- Backdoor Defense through Self-Supervised and Generative Learning
  Ivan Sabolić, Ivan Grubišić, and Siniša Šegvić
  In 35th British Machine Vision Conference 2024, BMVC 2024, Glasgow, UK, November 25-28, 2024
  Backdoor attacks change a small portion of training data by introducing hand-crafted triggers and rewiring the corresponding labels towards a desired target class. Training on such data injects a backdoor which causes malicious inference in selected test samples. Most defenses mitigate such attacks through various modifications of the discriminative learning procedure. In contrast, this paper explores an approach based on generative modelling of per-class distributions in a self-supervised representation space. Interestingly, these representations get either preserved or heavily disturbed under recent backdoor attacks. In both cases, we find that per-class generative models allow us to detect poisoned data and cleanse the dataset. Experiments show that training on the cleansed dataset greatly reduces the attack success rate and retains the accuracy on benign inputs.
  @inproceedings{Sabolic_2024_BMVC,
    author    = {Sabolic, Ivan and Grubišić, Ivan and Šegvić, Siniša},
    title     = {Backdoor Defense through Self-Supervised and Generative Learning},
    booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
    publisher = {BMVA},
    year      = {2024},
    url       = {https://papers.bmvc2024.org/0346.pdf},
  }
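A rough sketch of the detect-and-cleanse idea the abstract describes. The paper builds per-class generative models in a self-supervised representation space; the density model below is a class-conditional Gaussian used purely as a stand-in, and the cutoff quantile, the frozen-encoder assumption, and the function names are all illustrative rather than taken from the paper.

import numpy as np

def fit_class_gaussians(feats, labels, reg=1e-3):
    # Fit one Gaussian per (possibly corrupted) class label in a frozen
    # self-supervised feature space.  Stand-in for the per-class generative
    # models in the paper; the actual density model may differ.
    models = {}
    for c in np.unique(labels):
        x = feats[labels == c]
        mu = x.mean(axis=0)
        cov = np.cov(x, rowvar=False) + reg * np.eye(x.shape[1])
        models[c] = (mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])
    return models

def class_loglik(models, feats, labels):
    # Log-likelihood of every sample under the model of its own label.
    scores = np.empty(len(feats))
    for i, (f, c) in enumerate(zip(feats, labels)):
        mu, prec, logdet = models[c]
        d = f - mu
        scores[i] = -0.5 * (d @ prec @ d + logdet + len(d) * np.log(2 * np.pi))
    return scores

def cleanse(feats, labels, quantile=0.05):
    # Flag the lowest-likelihood fraction of each class as suspected poison
    # and return a boolean keep-mask (the 5% cutoff is an assumption).
    models = fit_class_gaussians(feats, labels)
    scores = class_loglik(models, feats, labels)
    keep = np.ones(len(feats), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        thr = np.quantile(scores[idx], quantile)
        keep[idx[scores[idx] < thr]] = False
    return keep

Retraining the classifier only on the samples kept by such a mask corresponds to the cleansing step the abstract refers to.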
2023
- Computational Color Constancy-Based Backdoor Attacks
  Donik Vršnak, Ivan Sabolić, Marko Subašić, and Sven Lončarić
  In 2023 International Symposium on Image and Signal Processing and Analysis (ISPA), 2023
  Deep neural networks (DNNs) have become an integral part of many computer vision tasks. However, training complex neural networks requires a large amount of computational resources. Therefore, many users outsource training to third parties. This introduces an attack vector for backdoor attacks. In such attacks, the neural network behaves as expected for benign inputs but acts maliciously when a backdoor trigger is present in the input. Triggers are small, preferably stealthy additions to the input. However, most of these triggers are based on the additive model, i.e., the trigger is simply added onto the image. Furthermore, optimized triggers are artificial, which means that it is difficult or impossible to reproduce them in the real world, making them impractical to use in a real-world setting. In this work, we present a novel way of trigger injection for the classification problem. It is based on the von Kries model for image color correction, a frequently used component in image processing pipelines. Our trigger uses a multiplicative rather than an additive model. First, this makes the injection harder to detect by defensive methods. Second, the trigger is based on the real-world phenomenon of changing illumination. Finally, it can be made harder to spot by a human observer when compared to some additive triggers. We test the performance of our attack strategy against various defense methods on several frequently used datasets and achieve excellent results. Furthermore, we show that the malicious behavior of models trained on artificially colored images can be activated in real-world scenarios, further increasing the usefulness of our attack strategy.
  @inproceedings{10278694,
    author    = {Vršnak, Donik and Sabolić, Ivan and Subašić, Marko and Lončarić, Sven},
    booktitle = {2023 International Symposium on Image and Signal Processing and Analysis (ISPA)},
    title     = {Computational Color Constancy-Based Backdoor Attacks},
    year      = {2023},
    pages     = {1-6},
    keywords  = {Training;Computer vision;Additives;Image color analysis;Computational modeling;Pipelines;Lighting},
  }
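A minimal sketch of what a von Kries-style multiplicative trigger could look like: each RGB channel is rescaled by a fixed gain, emulating a change of scene illumination, and a small fraction of the training samples is relabeled to the attacker's target class. The specific gains, the poisoning rate, and the function names are illustrative assumptions and are not taken from the paper.

import numpy as np

def von_kries_trigger(image, gains=(1.25, 1.0, 0.8)):
    # Multiplicative (diagonal von Kries) trigger: rescale each RGB channel,
    # mimicking a change of scene illumination.  `image` is assumed to be an
    # HxWx3 float array in [0, 1]; the gain values are illustrative.
    g = np.asarray(gains, dtype=np.float32).reshape(1, 1, 3)
    return np.clip(image.astype(np.float32) * g, 0.0, 1.0)

def poison(images, labels, target_class=0, rate=0.01, seed=0):
    # Apply the trigger to a small fraction of the training set and rewire
    # the corresponding labels towards the target class (rate is an assumption).
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=max(1, int(rate * len(images))), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = von_kries_trigger(images[i])
        labels[i] = target_class
    return images, labels

Because the change is a global per-channel scaling rather than an added patch, it follows the multiplicative model that the abstract contrasts with additive triggers.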