Adversarial Attacks

[ædˈvɜrsəriəl əˈtæks]
Ethics & Safety
Last updated: December 9, 2024

Definition

Intentional manipulations of AI systems designed to cause incorrect outputs or behaviors through specially crafted inputs.

Detailed Explanation

Adversarial attacks exploit vulnerabilities in AI models by introducing carefully designed perturbations to input data that cause the model to make mistakes while appearing normal to human observers. These attacks can be white-box (crafted with full knowledge of the model's architecture and parameters) or black-box (without such access), and they can target either the training phase (data poisoning) or the inference phase (evasion attacks).
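
To make the idea concrete, here is a minimal sketch of one well-known white-box attack, the Fast Gradient Sign Method (FGSM), written in PyTorch. It assumes you already have a trained classifier `model`, an input tensor `x` scaled to [0, 1], and its true `label`; the `epsilon` parameter controls how large the perturbation is. This is an illustrative sketch, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon=0.03):
    """White-box FGSM: nudge x in the direction that increases the model's loss."""
    x_adv = x.clone().detach().requires_grad_(True)

    # Compute the loss of the current prediction against the true label.
    loss = F.cross_entropy(model(x_adv), label)
    loss.backward()

    # Step by epsilon in the sign of the input gradient, then clamp back
    # to the valid pixel range so the image still looks normal to a human.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

With a small epsilon, the perturbed input is typically indistinguishable from the original to a human observer, yet it can flip the model's prediction, which is exactly the failure mode adversarial attacks exploit.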

Use Cases

Testing autonomous vehicle perception systems
Evaluating facial recognition security
Assessing medical diagnosis model reliability
Protecting against fraud in financial systems

Related Terms