The Core Challenge

AI systems face a unique threat landscape that combines traditional cybersecurity risks with novel attack vectors exploiting the mechanics of machine learning itself. Nation-state actors and sophisticated criminal groups are actively developing capabilities to attack AI systems.

Key Concepts

Adversarial attacks: Carefully crafted inputs designed to cause AI systems to fail or produce incorrect outputs.
Data poisoning: Corrupting training data to embed malicious behaviour into AI models.
Model extraction: Attacks that steal the AI model itself or sensitive information embedded within it.
Prompt injection: Manipulating AI systems through malicious inputs that override intended behaviour.
AI supply chain security: Protecting against vulnerabilities in pre-trained models, datasets, and tools from external sources.

Warning Signs

Watch for these indicators of AI security vulnerability:

  • AI systems are only tested for accuracy, not adversarial resilience
  • No threat modelling has been done for AI-specific attack vectors
  • Provenance of pre-trained models and datasets is unknown
  • Incident response plans don't address AI-specific scenarios
  • Input validation doesn't include anomaly detection for adversarial patterns
  • Security assessments treat AI like traditional software

Questions to Ask in AI Project Reviews

  • "What adversarial testing has been done? What attack vectors were tested?"
  • "Where do the pre-trained models and training data come from? What validation was done?"
  • "What input validation exists to detect and reject potentially adversarial inputs?"

Questions to Ask in Governance Discussions

  • "What's our visibility into the security status of AI components—models, data, tools?"
  • "Are incident response plans updated for AI-specific attack scenarios?"
  • "What AI security expertise do we have access to?"

Questions to Ask in Strategy Sessions

  • "How does AI change our threat landscape? Are we investing appropriately?"
  • "What's our exposure to AI supply chain risks?"
  • "What would happen if an adversary successfully compromised one of our AI systems?"

Reflection Prompts

  1. Your awareness: Do you understand how AI systems can be attacked differently from traditional systems?
  2. Your organisation's posture: If attackers targeted your AI systems, how confident are you in your defences?
  3. Your role: What could you do to ensure AI security receives appropriate attention?

Good Practice Checklist

  • AI-specific threat modelling is part of security assessment
  • Adversarial testing occurs before deployment
  • Red teams include AI attack expertise
  • Supply chain security covers models, data, and tools
  • Input validation includes adversarial pattern detection
  • Incident response is prepared for AI-specific scenarios

Quick Reference

Element      | Question to Ask                           | Red Flag
Threat model | What AI attacks are we defending against? | Only traditional cyber considered
Testing      | What adversarial testing was done?        | Only accuracy testing
Supply chain | Where do AI components come from?         | Unknown provenance
Detection    | How are attacks detected?                 | No AI-specific monitoring
Response     | What's the plan if AI is compromised?     | No specific procedures

AI Attack Types Explained

Evasion attacks: Inputs crafted to fool the AI (e.g., modified images causing misclassification). Defence: adversarial testing and hardening.
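A minimal sketch of what basic adversarial (evasion) testing can look like in practice, assuming a PyTorch image classifier; the model, data, and epsilon value are illustrative placeholders rather than anything prescribed by this guidance:

```python
# FGSM-style evasion test sketch (assumes a PyTorch classifier; `model`,
# `images`, and `labels` are hypothetical inputs supplied by the caller).
import torch
import torch.nn.functional as F

def fgsm_evasion_rate(model, images, labels, epsilon=0.03):
    """Return the fraction of correctly classified inputs that flip under
    a small FGSM perturbation -- a crude measure of adversarial resilience."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()

    # Nudge each pixel a small step in the direction that increases the loss.
    adversarial = (images + epsilon * images.grad.sign()).clamp(0, 1).detach()

    with torch.no_grad():
        clean_pred = model(images.detach()).argmax(dim=1)
        adv_pred = model(adversarial).argmax(dim=1)

    was_correct = clean_pred == labels
    flipped = was_correct & (adv_pred != labels)
    return flipped.sum().item() / max(was_correct.sum().item(), 1)
```

A high flip rate on such a simple test is a strong signal that only accuracy, not resilience, has been evaluated.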

Poisoning attacks: Corrupting training data to embed bad behaviour. Defence: data provenance and integrity monitoring.
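One simple provenance control is to verify training files against an approved hash manifest before every training run. The file layout and manifest format below are illustrative assumptions, not a prescribed standard:

```python
# Data-integrity sketch: verify dataset files against a signed-off hash
# manifest before training.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_dataset(data_dir: str, manifest_path: str) -> list[str]:
    """Return the names of files whose hashes differ from the recorded manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [
        name for name, expected in manifest.items()
        if sha256_of(Path(data_dir) / name) != expected
    ]

# Usage: refuse to train if anything has drifted since the data was approved.
# if verify_dataset("data/train", "data/manifest.json"):
#     raise RuntimeError("Training data integrity check failed")
```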

Model stealing: Extracting the model or sensitive data through queries. Defence: access controls and query monitoring.
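Query monitoring can start as something as simple as tracking per-client request volume over a sliding window and throttling outliers; the thresholds and client identifiers in this sketch are illustrative assumptions:

```python
# Query-monitoring sketch: flag clients whose request volume looks like
# systematic model extraction rather than normal use.
import time
from collections import defaultdict, deque

class QueryMonitor:
    def __init__(self, max_per_window=1000, window_seconds=3600):
        self.max_per_window = max_per_window
        self.window = window_seconds
        self.history = defaultdict(deque)  # client_id -> query timestamps

    def record(self, client_id: str) -> bool:
        """Record one query; return True if the client should be throttled."""
        now = time.time()
        q = self.history[client_id]
        q.append(now)
        # Drop timestamps that have fallen outside the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_per_window
```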

Prompt injection: Hijacking AI behaviour through malicious inputs. Defence: input validation and output filtering.
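A minimal sketch of layered input screening and output filtering; the patterns and redaction rules are illustrative assumptions and would not, on their own, constitute a complete defence:

```python
# Prompt-injection filtering sketch: screen inputs for known injection
# phrasing and redact credential-shaped strings from outputs.
import re

INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"disregard (the )?system prompt",
    r"you are now",  # crude role-override check
]

SECRET_PATTERN = re.compile(r"(api[_-]?key|password)\s*[:=]\s*\S+", re.IGNORECASE)

def screen_input(user_text: str) -> bool:
    """Return True if the input matches known injection phrasing."""
    lowered = user_text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def filter_output(model_text: str) -> str:
    """Redact obvious credential-shaped strings before returning output."""
    return SECRET_PATTERN.sub("[REDACTED]", model_text)
```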

NCSC guidance: The UK's National Cyber Security Centre has published guidance on AI security, including the jointly authored Guidelines for Secure AI System Development. Your security team should be familiar with it.