What We Target
Organisations are shipping AI systems quickly — LLM-powered chatbots, RAG pipelines, automated decision engines — and often without the security controls that traditional IT systems get as a matter of course. The attack surface is genuinely different. Prompt injection doesn't look like SQL injection. Training data poisoning doesn't show up in a port scan. Most standard pen testers haven't seen it before.
We assess the full lifecycle of your AI deployments — from training data and model architecture through production APIs and end-user interactions — looking for exploitable vulnerabilities, data leakage, compliance gaps, and bias risks before they turn into incidents.
LLM Integrations
ChatGPT, Claude, Gemini, and custom LLM deployments — prompt injection testing, output filter bypass, system prompt extraction, and sensitive data leakage via generation.
RAG Pipelines
Retrieval-Augmented Generation systems — indirect prompt injection via poisoned documents, vector database access controls, and cross-tenant information boundary testing.
ML Model Deployments
Production ML models — model inversion attacks, membership inference, input fuzzing, model extraction via API probing, and supply chain integrity of third-party model weights.
Automated Decision Systems
AI-driven decisioning in HR, finance, lending, and operations — output fairness testing, explainability gaps, and EU AI Act compliance obligations for high-risk systems.
AI-Specific Security Testing
Standard penetration testing doesn't cover AI-specific attack classes. Prompt injection, training data poisoning, model extraction, and inference attacks require testing methodologies that understand how these systems actually work — how they process context, retrieve documents, generate outputs, and expose state through repeated API interactions.
- Prompt injection testing — direct injection into user inputs, indirect injection via retrieved documents in RAG pipelines, and multi-turn manipulation to erode system prompt constraints
- Training data poisoning assessment — training data integrity, fine-tuning data provenance, and RAG corpus contamination (inserting adversarial documents to manipulate model outputs)
- Model inversion and extraction attacks — can an attacker reconstruct training data or proprietary model logic through repeated API queries? We test it
- API security testing — rate limiting, authentication, authorisation, and input validation on model-serving endpoints (including Burp Suite-based fuzzing of API parameters)
- Supply chain security — third-party model provenance, dependency integrity, and container security for model-serving infrastructure
- Output security testing — PII leakage in generated outputs, cross-tenant data exposure in multi-tenant deployments, and generation of harmful or policy-violating content
- Jailbreak and guardrail bypass — systematic testing of content safety filters and system prompt protections to find bypasses before users do
Data Protection & Regulatory Alignment
AI systems process and generate data in ways that don't map cleanly onto traditional data protection frameworks. The EU AI Act, GDPR, and sector regulators impose real obligations — transparency requirements, data subject rights, risk tier classifications. We audit against the actual regulatory text, not a generic checklist.
- Training data provenance — sourcing documentation, consent basis, and lawful processing assessment
- PII handling — detection of personal data in training sets, RAG corpora, model outputs, and logs
- GDPR compliance — data subject rights (access, erasure, rectification) applied to AI-processed data
- EU AI Act classification — risk tier assessment and corresponding obligation mapping
- Data retention and minimisation — assessing whether AI systems retain data beyond stated purposes
- Cross-border data transfer — identifying where AI processing occurs and applicable transfer mechanisms
- Transparency and disclosure — assessment of user-facing AI transparency notices and consent mechanisms
Fairness Testing & Ethical Assessment
Biased AI outputs aren't just an ethics problem — they're a legal liability. Automated decision systems in HR, lending, insurance, and customer service attract regulatory scrutiny and litigation. We test for exploitable bias systematically, using the same adversarial mindset applied to security testing.
- Output bias testing — systematic evaluation across protected characteristics (race, gender, age, disability, religion)
- Decision fairness analysis — statistical parity, equalised odds, and disparate impact assessment for automated decisions
- Explainability evaluation — assessing whether AI decisions can be meaningfully explained to affected individuals
- Documentation for regulatory readiness — conformity assessments, impact assessments, and technical documentation aligned to EU AI Act requirements
- Benchmark testing — comparing model outputs against fairness benchmarks and industry standards
- Remediation guidance — specific recommendations for mitigating identified bias with minimal impact on model performance
What We Deliver
Every assessment closes with a deliverable package built for both technical teams and executive leadership — enough detail for internal governance and enough context for board-level reporting.
AI Risk Register
All identified vulnerabilities, privacy risks, and bias findings with severity ratings and exploitability context.
Compliance Gap Analysis
Current state mapped against GDPR, EU AI Act, and relevant sector-specific requirements.
Remediation Roadmap
Prioritised recommendations with implementation guidance and estimated effort.
Executive Summary
Non-technical findings overview with clear risk posture and recommendations — board-ready.
Technical Appendices
Testing methodology, evidence, and reproduction steps for all findings.