Detecting Adversarial Attacks

Premium room

Learn how to identify and analyse adversarial attacks.

Difficulty: medium

Estimated time: 60 min



We'll be diving into adversarial attack techniques, specifically white-box attacks. In contrast to black-box attacks, white-box techniques require full access to the model (architecture, weights, gradients, etc.). To explore the differences between white-box and black-box attacks further, check out the Pentesting Fundamentals room. If you want to learn more about AI and machine learning (ML) attacks, check out the Introduction to threats room.

Most adversarial attacks work by adding small, carefully crafted noise to an input so that the model outputs the wrong prediction, while the modified input remains nearly identical to the original. We'll be looking at three examples, including the Fast Gradient Sign Method (FGSM) and the Basic Iterative Method (BIM).
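To make the idea of "small, crafted noise" concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM) and its iterative variant, BIM, against a toy logistic-regression model. The model, its weights, the sample input, and the helper names (`predict_proba`, `loss_gradient_wrt_input`, `fgsm`, `bim`) are all assumptions made for illustration and are not taken from the room. A real white-box attack would obtain the gradient from the target model through a deep-learning framework rather than from a hand-derived formula.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(np.dot(w, x) + b)

def loss_gradient_wrt_input(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input.

    For logistic regression this has the closed form (p - y) * w.
    White-box access is what makes this gradient available to the attacker.
    """
    return (predict_proba(w, b, x) - y) * w

def fgsm(w, b, x, y, epsilon):
    """Single step: move every feature by epsilon in the direction that raises the loss."""
    return x + epsilon * np.sign(loss_gradient_wrt_input(w, b, x, y))

def bim(w, b, x, y, epsilon, alpha, steps):
    """Repeated small FGSM steps, clipped so the total change stays within epsilon."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(loss_gradient_wrt_input(w, b, x_adv, y))
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv

# Hypothetical model parameters and input, chosen only for illustration.
w = np.array([2.0, -3.0, 1.0])
b = 0.5
x = np.array([0.5, 0.2, 0.1])
y = 1.0  # true label

x_fgsm = fgsm(w, b, x, y, epsilon=0.3)
x_bim = bim(w, b, x, y, epsilon=0.3, alpha=0.1, steps=5)

print("clean prediction:", predict_proba(w, b, x))
print("FGSM prediction: ", predict_proba(w, b, x_fgsm))
print("BIM prediction:  ", predict_proba(w, b, x_bim))
print("max change per feature:", np.max(np.abs(x_fgsm - x)))
```

In this toy setup no feature moves by more than `epsilon`, yet the predicted class changes. Because the model is linear, BIM ends at the same point as FGSM here; the two usually differ on non-linear models. Attacks on images typically also clip the result to the valid pixel range, which this sketch omits.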

Learning Prerequisites

Learning Objectives

  • Understand adversarial attacks in AI and Machine Learning.
  • Identify adversarial attacks, specifically the Fast Gradient Sign Method (FGSM).
  • Learn examples of detection mechanisms for adversarial attacks.