Detecting Adversarial Attacks

We'll be diving into adversarial attack techniques, specifically white-box attacks. In contrast to black-box attacks, white-box techniques require full access to the model (architecture, weights, gradients, etc.). To explore the differences between white-box and black-box attacks further, check out the Pentesting Fundamentals room. If you want to learn more about AI and machine learning (ML) attacks, check out the Introduction to threats room. Most adversarial attacks work by adding small, carefully crafted noise to an input so that the model outputs the wrong prediction, while the perturbed input remains nearly identical to the original. We'll be looking at three examples, including the Fast Gradient Sign Method (FGSM) and the Basic Iterative Method (BIM).
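To make the idea of "small, crafted noise" concrete, here is a minimal sketch of a single FGSM step in plain Python. It is not taken from the room's notebook: the toy logistic-regression model, its `weights` and `bias`, the input `x`, and the `epsilon` value are all hypothetical, chosen only so that the effect is easy to see. Because this is a white-box attack, the sketch reads the model's weights directly to compute the gradient of the loss with respect to the input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y_true, epsilon):
    """One FGSM step: x_adv = x + epsilon * sign(dLoss/dx).

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to input feature i is (p - y_true) * w_i.
    """
    p = predict(weights, bias, x)
    grad = [(p - y_true) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input (assumptions for illustration only).
weights = [2.0, -3.0, 1.0]
bias = 0.0
x = [0.5, 0.2, 0.1]
y_true = 1
epsilon = 0.2

x_adv = fgsm(weights, bias, x, y_true, epsilon)
clean_p = predict(weights, bias, x)
adv_p = predict(weights, bias, x_adv)

print("clean input:", x, "-> P(class 1) =", round(clean_p, 3))
print("adversarial input:", x_adv, "-> P(class 1) =", round(adv_p, 3))
```

With these assumed values, each feature moves by at most `epsilon`, yet the predicted probability for the true class drops from above 0.5 to below 0.5, so the predicted label flips. Real attacks against neural networks follow the same recipe but obtain the gradient through automatic differentiation rather than a closed-form expression.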

Learning Prerequisites

Python proficiency is not necessary to complete the room, but it will allow you to explore the practical exercises more deeply.
Familiarity with Jupyter Notebook.

Learning Objectives

Understand adversarial attacks in AI and Machine Learning.
Identify adversarial attacks, specifically the Fast Gradient Sign Method (FGSM).
Learn examples of detection mechanisms for adversarial attacks.

