The Core Challenge
AI systems face a threat landscape that combines traditional cybersecurity risks with novel attack vectors exploiting how machine learning models learn from data and respond to inputs. Nation-state actors and sophisticated criminal groups are actively developing capabilities to attack AI systems.
Key Concepts
| Term | Definition |
|---|---|
| Adversarial attacks | Carefully crafted inputs designed to cause AI systems to fail or produce incorrect outputs. |
| Data poisoning | Corrupting training data to embed malicious behaviour into AI models. |
| Model extraction | Attacks that steal the AI model itself or sensitive information embedded within it. |
| Prompt injection | Manipulating AI systems through malicious inputs that override intended behaviour. |
| AI supply chain security | Protecting against vulnerabilities in pre-trained models, datasets, and tools from external sources. |
Warning Signs
Watch for these indicators of AI security vulnerability:
- AI systems are only tested for accuracy, not adversarial resilience
- No threat modelling has been done for AI-specific attack vectors
- Provenance of pre-trained models and datasets is unknown
- Incident response plans don't address AI-specific scenarios
- Input validation doesn't include anomaly detection for adversarial patterns
- Security assessments treat AI like traditional software
Questions to Ask in AI Project Reviews
- "What adversarial testing has been done? What attack vectors were tested?"
- "Where do the pre-trained models and training data come from? What validation was done?"
- "What input validation exists to detect and reject potentially adversarial inputs?"
Questions to Ask in Governance Discussions
- "What's our visibility into the security status of AI components—models, data, tools?"
- "Are incident response plans updated for AI-specific attack scenarios?"
- "What AI security expertise do we have access to?"
Questions to Ask in Strategy Sessions
- "How does AI change our threat landscape? Are we investing appropriately?"
- "What's our exposure to AI supply chain risks?"
- "What would happen if an adversary successfully compromised one of our AI systems?"
Reflection Prompts
- Your awareness: Do you understand how AI systems can be attacked differently from traditional systems?
- Your organisation's posture: If attackers targeted your AI systems, how confident are you that your defences would hold?
- Your role: What could you do to ensure AI security receives appropriate attention?
Good Practice Checklist
- AI-specific threat modelling is part of security assessment
- Adversarial testing occurs before deployment
- Red teams include AI attack expertise
- Supply chain security covers models, data, and tools
- Input validation includes adversarial pattern detection
- Incident response is prepared for AI-specific scenarios
Quick Reference
| Element | Question to Ask | Red Flag |
|---|---|---|
| Threat model | What AI attacks are we defending against? | Only traditional cyber considered |
| Testing | What adversarial testing was done? | Only accuracy testing |
| Supply chain | Where do AI components come from? | Unknown provenance |
| Detection | How are attacks detected? | No AI-specific monitoring |
| Response | What's the plan if AI is compromised? | No specific procedures |
AI Attack Types Explained
Evasion attacks: Inputs crafted to fool the AI (e.g., modified images causing misclassification). Defence: adversarial testing and hardening.
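To make "adversarial testing" concrete, the sketch below shows one common probe, the fast gradient sign method (FGSM), which perturbs an input just enough to flip a classifier's prediction. It assumes a PyTorch model; the model, inputs, and epsilon value are illustrative placeholders, not a prescribed tool or threshold.

```python
# Minimal FGSM sketch: perturb an input in the direction that increases the
# model's loss, then check whether the prediction changes.
import torch
import torch.nn as nn

def fgsm_perturb(model, x, label, epsilon=0.03):
    """Return a copy of x nudged by epsilon in the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(x_adv), label)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

# Usage (model, x, and label are placeholders for a trained classifier and a batch):
# x_adv = fgsm_perturb(model, x, label)
# flipped = (model(x).argmax(1) != model(x_adv).argmax(1)).float().mean()
# A high "flipped" rate on small perturbations indicates poor adversarial resilience.
```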
Poisoning attacks: Corrupting training data to embed bad behaviour. Defence: data provenance and integrity monitoring.
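One simple building block for data provenance and integrity monitoring is a manifest of cryptographic digests taken when a dataset is approved, checked again before every training run. The sketch below is a minimal illustration; file paths and the manifest name are assumptions.

```python
# Record SHA-256 digests of approved training files, then flag any file that
# later changes or goes missing before it is used for training.
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def build_manifest(data_dir: str, manifest_path: str = "data_manifest.json") -> None:
    manifest = {str(p): sha256_of(p)
                for p in sorted(Path(data_dir).rglob("*")) if p.is_file()}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))

def verify_manifest(manifest_path: str = "data_manifest.json") -> list[str]:
    """Return the paths that are missing or whose contents have changed."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [p for p, digest in manifest.items()
            if not Path(p).is_file() or sha256_of(Path(p)) != digest]
```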
Model stealing: Extracting the model or sensitive data through queries. Defence: access controls and query monitoring.
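Query monitoring can start with something as simple as per-client rate tracking, since extraction attacks typically require large volumes of systematic queries. The sketch below assumes a prediction API with identifiable clients; the window and threshold are illustrative and would need tuning to normal usage.

```python
# Track per-client query volume in a sliding window and flag clients whose
# request rate looks consistent with systematic model extraction.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 3600
MAX_QUERIES_PER_WINDOW = 1000  # assumption: tune against legitimate usage patterns

_history: dict[str, deque] = defaultdict(deque)

def record_and_check(client_id: str, now: float | None = None) -> bool:
    """Record one query; return True if this client should be flagged for review."""
    now = time.time() if now is None else now
    q = _history[client_id]
    q.append(now)
    while q and now - q[0] > WINDOW_SECONDS:  # drop timestamps outside the window
        q.popleft()
    return len(q) > MAX_QUERIES_PER_WINDOW
```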
Prompt injection: Hijacking AI behaviour through malicious inputs. Defence: input validation and output filtering.
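As a rough illustration of input validation and output filtering around a language model call, the sketch below screens inputs against known injection phrasings and redacts configured secrets from responses. The patterns and the call_model() function are hypothetical, and pattern matching alone does not stop prompt injection; it is a first-line screen to pair with broader controls.

```python
# First-line input screening and output filtering around an LLM call.
import re

SUSPECT_INPUT_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"reveal .*(system prompt|hidden instructions)",
]

def screen_input(user_text: str) -> bool:
    """Return True if the input matches a known injection pattern."""
    return any(re.search(p, user_text, re.IGNORECASE) for p in SUSPECT_INPUT_PATTERNS)

def filter_output(response: str, secrets: list[str]) -> str:
    """Redact any configured secret that leaks into the model's response."""
    for s in secrets:
        response = response.replace(s, "[REDACTED]")
    return response

# Usage (call_model is a placeholder for whatever LLM client is in use):
# if screen_input(user_text):
#     reject_and_log(user_text)
# else:
#     answer = filter_output(call_model(user_text), secrets=["example-api-key"])
```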
NCSC guidance: The National Cyber Security Centre has published guidance on AI security. Your security team should be familiar with it.