The zero-trust security model is foundational for traditional IT: assume breach, verify everything, grant least privilege. It works well for networks, applications, and infrastructure. But when your security team tries to apply zero-trust to AI systems, something breaks. And usually, it breaks spectacularly.
The problem is this: zero-trust assumes you know what "normal" looks like. For traditional systems, normal is deterministic—a user either should have access to a file or shouldn't. For AI systems, normal is probabilistic. A model's behaviour is inherently uncertain. You can't implement zero-trust for something you don't understand yet.
Why Zero-Trust Breaks for AI Systems
Zero-trust has three core pillars: verify identity, enforce least privilege, and assume breach. On the surface, these sound right for AI.
But the assumption embedded in zero-trust is that you can definitively answer: "Is this normal?" For traditional systems, you can. For AI systems, you cannot—not initially.
Consider a recommendation model in a retail bank. Zero-trust says: log all model inputs and outputs, verify they're legitimate, and restrict data access to the minimum needed. But what does "minimum needed" mean? The model needs customer transaction history to make recommendations. It also needs market data. Where's the line?
More fundamentally: what does "legitimate" output look like? If the model recommends an unusual investment product to a customer, is that anomalous (potential breach) or just normal model behaviour? Without baseline understanding, you can't know.
"Zero-trust assumes you know what normal is. For most AI systems, you don't. Not yet. That's the gap."
The Baseline Trust Model
Before you implement zero-trust for AI, you need to establish what normal looks like. This is the baseline trust model.
A baseline trust model answers these questions:
- What are typical input distributions for this model?
- What are typical output distributions?
- What data does the model actually need?
- What confidence scores are normal?
- How does the model behave on edge cases?
- What's the model's error distribution?
Building this requires observation. You run the model in a normal operating environment, log everything, and collect statistics. Only after you have months of baseline data can you detect anomalies with confidence.
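As a minimal sketch of what that statistical collection might produce, here is a baseline builder using only the standard library. The field names (`confidence_scores`, `latency_p95_ms`, and so on) are illustrative assumptions, not a prescribed schema, and the toy data stands in for months of real logs:

```python
import statistics

def build_baseline(confidence_scores, latencies_ms):
    """Summarise logged model behaviour into a baseline profile.

    Both arguments are lists of floats collected during the
    observation period; the field names are illustrative.
    """
    p95_index = int(0.95 * (len(latencies_ms) - 1))
    return {
        "confidence_mean": statistics.mean(confidence_scores),
        "confidence_stdev": statistics.stdev(confidence_scores),
        "latency_p95_ms": sorted(latencies_ms)[p95_index],
    }

# Toy data standing in for months of logged inference calls.
baseline = build_baseline(
    confidence_scores=[0.91, 0.88, 0.95, 0.90, 0.87],
    latencies_ms=[40.0, 42.0, 55.0, 38.0, 41.0],
)
```

In practice the profile would cover every question in the list above (input distributions, error rates, edge-case behaviour), but the shape is the same: summary statistics computed over observed logs.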
The Three-Phase Approach
Phase 1: Establish Baseline (Months 1-3)
Deploy the model to production with comprehensive logging but minimal restrictions.
- Log all inputs: every query, every parameter, every data request
- Log all outputs: predictions, confidence scores, latency
- Log all data access: which databases the model queried and what was returned
- Capture metadata: user identity, timestamp, model version
During this phase, security is intentionally loose. You're gathering signal. The model operates with generous data access. You don't restrict query patterns. The goal is to see how the model naturally behaves.
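A minimal sketch of what "log everything" can look like in practice: one structured JSON record per model call, covering inputs, outputs, data access, and metadata. All field names here are illustrative assumptions, not a standard:

```python
import datetime
import json
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
audit_log = logging.getLogger("model-audit")

def log_inference(user_id, model_version, inputs, output, confidence, data_sources):
    """Emit one structured audit record per model call.

    The point is that inputs, outputs, data access, and metadata
    all land in a single queryable record for Phase 2 analysis.
    """
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user_id": user_id,
        "model_version": model_version,
        "inputs": inputs,
        "output": output,
        "confidence": confidence,
        "data_sources": data_sources,  # databases/tables touched for this call
    }
    audit_log.info(json.dumps(record))
    return record
```

Emitting JSON lines keeps the records trivially queryable later, which is what Phase 2 depends on.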
Phase 2: Define Normal (Months 3-6)
Analyse baseline data to establish statistical models of normal behaviour.
- Input profiling: Distribution of input feature values, correlations, anomalies
- Output profiling: Expected confidence distribution, error rates, decision boundaries
- Data flow mapping: Which databases does the model actually query? Which features actually matter?
- Latency and resource profiling: Memory usage, computation time, infrastructure load
- User pattern analysis: Who queries the model? How frequently? What types of requests?
This is detective work. You're building a statistical model of "normal" that you can use to detect "abnormal".
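As one hedged example of the input-profiling step, a per-feature "normal" band can be defined as mean ± k standard deviations over the baseline data. The k = 3 threshold is a common starting point, not a universal rule, and the function name is illustrative:

```python
import statistics

def feature_profile(values, k=3.0):
    """Define a 'normal' band for one numeric input feature as
    mean +/- k standard deviations over the baseline period."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return {"mean": mu, "stdev": sd, "low": mu - k * sd, "high": mu + k * sd}

# Toy baseline values for one feature (e.g. transaction amount).
profile = feature_profile([10.0, 12.0, 11.0, 9.0, 13.0])
```

Real input profiling would also capture correlations and categorical frequencies, but simple per-feature bands like this are often the first usable signal.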
Phase 3: Implement Zero-Trust Controls (Month 6+)
Now you lock things down based on what you learned.
- Least privilege data access: Restrict model to only the databases and tables it actually queries
- Input validation: Reject requests that deviate significantly from normal input distributions
- Anomaly detection: Alert on confidence scores, output distributions, or latency that deviate from baseline
- Query rate limiting: Based on normal user patterns, limit queries per user, per hour, per day
- Output filtering: Block outputs that violate policies or contain PII
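The input-validation and anomaly-detection controls above can be sketched directly against the Phase 2 profiles. Both functions and the k = 3 threshold are illustrative assumptions, a starting point rather than a complete policy engine:

```python
def validate_input(value, profile):
    """Input validation: accept only values inside the baseline band
    learned in Phase 2 (profile carries 'low' and 'high' bounds)."""
    return profile["low"] <= value <= profile["high"]

def confidence_anomaly(score, baseline_mean, baseline_stdev, k=3.0):
    """Anomaly detection: flag confidence scores more than k standard
    deviations from the baseline mean."""
    return abs(score - baseline_mean) > k * baseline_stdev
```

The key design point is that both checks are parameterised by observed baselines, not by constants the security team guessed up front.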
Real-World Case Study: Australian Financial Services
A major Australian bank deployed an AI credit decisioning model. The security team, familiar with zero-trust, wanted to implement it immediately: log everything, restrict data access, flag anomalies.
Problem: within two weeks, the model had effectively stopped working. The zero-trust implementation was blocking legitimate queries because the security team didn't understand the model's data requirements.
The model needed to query: customer transaction history (last 24 months), credit bureau data, market indicators, and internal risk models. But the security team didn't know which of these were critical. They restricted access to the most sensitive database (customer transactions) immediately.
Model accuracy dropped 15%. Business users complained. The project nearly failed.
What should have happened:
- Month 1-2: Deploy model with full data access, log everything
- Month 2: Analyze logs. Find that customer transactions are used in 87% of decisions. Credit bureau data in 92%. Market indicators in 31%
- Month 3: Implement least privilege: grant full access to frequently-used databases, limited access to others
- Month 4+: Layer zero-trust controls on top of this foundation
The bank eventually followed this approach and successfully deployed zero-trust controls by month 4. But it lost two months by starting from the wrong assumption.
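The month-2 analysis in the corrected timeline amounts to counting which data sources appear in the audit logs. A minimal sketch is below; the sample records are invented for illustration and do not reproduce the bank's actual 87%/92%/31% figures:

```python
from collections import Counter

def data_source_usage(audit_records):
    """Fraction of logged decisions that touched each data source."""
    total = len(audit_records)
    hits = Counter(src for rec in audit_records for src in rec["data_sources"])
    return {src: n / total for src, n in hits.items()}

# Invented sample; a real analysis runs over months of audit logs.
records = [
    {"data_sources": ["txn_history", "credit_bureau"]},
    {"data_sources": ["credit_bureau"]},
    {"data_sources": ["txn_history", "market_indicators"]},
    {"data_sources": ["txn_history", "credit_bureau"]},
]
usage = data_source_usage(records)
```

Usage fractions like these are what turn "which databases are critical?" from a guess into a measurement.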
Mapping Data Flows
A critical part of establishing baseline trust is understanding data flows. Where does data come from? Where does it go?
For an AI model, this includes:
- Input sources: APIs, databases, user uploads, real-time feeds
- Computational resources: CPU, GPU, memory, network bandwidth
- Data access: The databases, tables, and fields the model actually queries
- Output destinations: APIs, dashboards, reports, downstream systems
- Dependencies: Other models, feature stores, reference data
Map all of these before you lock things down. Then, zero-trust controls can be applied surgically: restrict access to what's necessary, monitor for deviations, and escalate on anomalies.
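One lightweight way to record such a map is a simple structure per model. The field names below mirror the list above and are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

@dataclass
class DataFlowMap:
    """One record per deployed model, capturing where data enters,
    what the model touches, and where outputs go."""
    input_sources: list        # APIs, databases, uploads, real-time feeds
    data_access: dict          # database -> tables/fields actually queried
    output_destinations: list  # APIs, dashboards, downstream systems
    dependencies: list = field(default_factory=list)  # other models, feature stores

# Hypothetical example entry for a credit model.
credit_model_flows = DataFlowMap(
    input_sources=["loan-application-api"],
    data_access={"customer_db": ["transactions.amount", "transactions.date"]},
    output_destinations=["decision-api", "risk-dashboard"],
)
```

Even a map this simple gives the security team something concrete to restrict against, rather than restricting blind.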
Practical Implementation
Here's a practical checklist for implementing baseline trust + zero-trust for AI:
- Deploy model with comprehensive logging (3 months)
- Analyse baseline data to define normal (3 months)
- Implement least-privilege data access based on analysis
- Deploy anomaly detection on inputs, outputs, and performance metrics
- Implement query rate limiting based on normal user patterns
- Set up alerts for deviations from baseline
- Monthly reviews: update baseline models as data distribution shifts
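As a sketch of the rate-limiting item, here is a sliding-window limiter whose per-user cap is meant to come from observed baseline rates (for instance, the baseline p99 queries per hour) rather than a guessed constant. The class and its parameters are illustrative:

```python
import time
from collections import defaultdict, deque

class BaselineRateLimiter:
    """Sliding-window limiter; max_per_window should be derived from
    the Phase 2 analysis of normal per-user query rates."""

    def __init__(self, max_per_window, window_seconds=3600):
        self.max = max_per_window
        self.window = window_seconds
        self.calls = defaultdict(deque)  # user_id -> timestamps in window

    def allow(self, user_id, now=None):
        """Return True and record the call if the user is under their cap."""
        now = time.monotonic() if now is None else now
        q = self.calls[user_id]
        while q and now - q[0] > self.window:
            q.popleft()  # drop calls that have aged out of the window
        if len(q) >= self.max:
            return False
        q.append(now)
        return True
```

The `now` parameter exists so the limiter can be tested deterministically; production callers would omit it.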
Key Takeaways
- Zero-trust requires understanding what "normal" is. For AI, you must establish baseline first
- Baseline trust models are built through observation and statistical analysis, not assumptions
- Three-phase approach: establish baseline (loose), define normal (analysis), implement zero-trust (tight)
- Data flow mapping is critical before applying zero-trust controls
- Least-privilege access should be based on actual model requirements, not security team assumptions
- Anomaly detection is more effective than static rules for AI systems