The Core Challenge

AI systems face a unique threat landscape that combines traditional cybersecurity risks with novel attack vectors exploiting the mechanics of machine learning itself. Nation-state actors and sophisticated criminal groups are actively developing capabilities to attack AI systems.

Key Concepts

Adversarial attacks: Carefully crafted inputs designed to cause AI systems to fail or produce incorrect outputs.
Data poisoning: Corrupting training data to embed malicious behaviour into AI models.
Model extraction: Attacks that steal the AI model itself or sensitive information embedded within it.
Prompt injection: Manipulating AI systems through malicious inputs that override intended behaviour.
AI supply chain security: Protecting against vulnerabilities in pre-trained models, datasets, and tools from external sources.

Warning Signs

Watch for these indicators of AI security vulnerability:

  • AI systems are only tested for accuracy, not adversarial resilience
  • No threat modelling has been done for AI-specific attack vectors
  • Provenance of pre-trained models and datasets is unknown
  • Incident response plans don't address AI-specific scenarios
  • Input validation doesn't include anomaly detection for adversarial patterns
  • Security assessments treat AI like traditional software

Questions to Ask in AI Project Reviews

  • "What adversarial testing has been done? What attack vectors were tested?"
  • "Where do the pre-trained models and training data come from? What validation was done?"
  • "What input validation exists to detect and reject potentially adversarial inputs?"

Questions to Ask in Governance Discussions

  • "What's our visibility into the security status of AI components—models, data, tools?"
  • "Are incident response plans updated for AI-specific attack scenarios?"
  • "What AI security expertise do we have access to?"

Questions to Ask in Strategy Sessions

  • "How does AI change our threat landscape? Are we investing appropriately?"
  • "What's our exposure to AI supply chain risks?"
  • "What would happen if an adversary successfully compromised one of our AI systems?"

Reflection Prompts

  1. Your awareness: Do you understand how AI systems can be attacked differently from traditional systems?
  2. Your organisation's posture: If attackers targeted your AI systems, how confident are you in your defences?
  3. Your role: What could you do to ensure AI security receives appropriate attention?

Good Practice Checklist

  • AI-specific threat modelling is part of security assessment
  • Adversarial testing occurs before deployment
  • Red teams include AI attack expertise
  • Supply chain security covers models, data, and tools
  • Input validation includes adversarial pattern detection
  • Incident response is prepared for AI-specific scenarios

Quick Reference

Element      | Question to Ask                           | Red Flag
Threat model | What AI attacks are we defending against? | Only traditional cyber considered
Testing      | What adversarial testing was done?        | Only accuracy testing
Supply chain | Where do AI components come from?         | Unknown provenance
Detection    | How are attacks detected?                 | No AI-specific monitoring
Response     | What's the plan if AI is compromised?     | No specific procedures

AI Attack Types Explained

Evasion attacks: Inputs crafted to fool the AI (e.g., modified images causing misclassification). Defence: adversarial testing and hardening.
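A minimal sketch of what basic adversarial (evasion) testing can look like in practice, assuming a PyTorch image classifier; the model, data, and epsilon value are illustrative placeholders rather than anything prescribed by this guidance:

```python
# FGSM-style evasion test sketch (assumes a PyTorch classifier; `model`,
# `images`, and `labels` are hypothetical inputs supplied by the caller).
import torch
import torch.nn.functional as F

def fgsm_evasion_rate(model, images, labels, epsilon=0.03):
    """Return the fraction of correctly classified inputs that flip under
    a small FGSM perturbation -- a crude measure of adversarial resilience."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()

    # Nudge each pixel a small step in the direction that increases the loss.
    adversarial = (images + epsilon * images.grad.sign()).clamp(0, 1).detach()

    with torch.no_grad():
        clean_pred = model(images.detach()).argmax(dim=1)
        adv_pred = model(adversarial).argmax(dim=1)

    was_correct = clean_pred == labels
    flipped = was_correct & (adv_pred != labels)
    return flipped.sum().item() / max(was_correct.sum().item(), 1)
```

A high flip rate on such a simple test is a strong signal that only accuracy, not resilience, has been evaluated.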

Poisoning attacks: Corrupting training data to embed bad behaviour. Defence: data provenance and integrity monitoring.
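One simple provenance control is to verify training files against an approved hash manifest before every training run. The file layout and manifest format below are illustrative assumptions, not a prescribed standard:

```python
# Data-integrity sketch: verify dataset files against a signed-off hash
# manifest before training.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_dataset(data_dir: str, manifest_path: str) -> list[str]:
    """Return the names of files whose hashes differ from the recorded manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [
        name for name, expected in manifest.items()
        if sha256_of(Path(data_dir) / name) != expected
    ]

# Usage: refuse to train if anything has drifted since the data was approved.
# if verify_dataset("data/train", "data/manifest.json"):
#     raise RuntimeError("Training data integrity check failed")
```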

Model stealing: Extracting the model or sensitive data through queries. Defence: access controls and query monitoring.
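Query monitoring can start as something as simple as tracking per-client request volume over a sliding window and throttling outliers; the thresholds and client identifiers in this sketch are illustrative assumptions:

```python
# Query-monitoring sketch: flag clients whose request volume looks like
# systematic model extraction rather than normal use.
import time
from collections import defaultdict, deque

class QueryMonitor:
    def __init__(self, max_per_window=1000, window_seconds=3600):
        self.max_per_window = max_per_window
        self.window = window_seconds
        self.history = defaultdict(deque)  # client_id -> query timestamps

    def record(self, client_id: str) -> bool:
        """Record one query; return True if the client should be throttled."""
        now = time.time()
        q = self.history[client_id]
        q.append(now)
        # Drop timestamps that have fallen outside the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_per_window
```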

Prompt injection: Hijacking AI behaviour through malicious inputs. Defence: input validation and output filtering.
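A minimal sketch of layered input screening and output filtering; the patterns and redaction rules are illustrative assumptions and would not, on their own, constitute a complete defence:

```python
# Prompt-injection filtering sketch: screen inputs for known injection
# phrasing and redact credential-shaped strings from outputs.
import re

INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"disregard (the )?system prompt",
    r"you are now",  # crude role-override check
]

SECRET_PATTERN = re.compile(r"(api[_-]?key|password)\s*[:=]\s*\S+", re.IGNORECASE)

def screen_input(user_text: str) -> bool:
    """Return True if the input matches known injection phrasing."""
    lowered = user_text.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def filter_output(model_text: str) -> str:
    """Redact obvious credential-shaped strings before returning output."""
    return SECRET_PATTERN.sub("[REDACTED]", model_text)
```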

NCSC guidance: The UK's National Cyber Security Centre has published guidance on AI security, including the jointly authored Guidelines for Secure AI System Development. Your security team should be familiar with it.