Your organisation has probably never approved these AI tools, but someone in accounting is using ChatGPT to draft reports. Someone in HR is using an AI tool to write job descriptions. Someone in product is using Claude for code generation. And someone in finance is feeding sensitive spreadsheets to a cloud LLM for analysis.
This is shadow AI, and it's the governance problem that your security team is probably ignoring.
Shadow AI is shadow IT with an added dimension. When employees use unsanctioned cloud applications, you face data leakage risks and compliance headaches. When they use AI tools with sensitive data, you face the same risks plus the attack surface specific to LLMs: training-data retention, model memorisation, and prompt injection.
The Scale of the Problem
A 2026 survey of Australian organisations found that 78% of employees use generative AI tools for work, but only 23% of organisations have formal policies governing their use. The gap is enormous.
And it's growing. As AI tools become easier to use and free tiers proliferate, adoption keeps accelerating. By the time your organisation finalises its AI governance policy, shadow AI usage will likely have grown again.
The risks are real:
- Data leakage: Employees pasting sensitive information into cloud AI tools that train on user data
- Compliance violations: Customer data fed to AI tools violates GDPR, Privacy Act, financial services regulations
- Intellectual property exposure: Proprietary code, strategies, and customer lists exposed to AI tools whose training practices are opaque
- Uncontrolled training contribution: Sensitive data fed to commercial AI tools can be absorbed into those models' training sets, where it may later surface for other users
"Shadow AI isn't a security problem tomorrow. It's a security problem today, and you probably don't even know the scope of it."
Detection: Finding What You Don't Know You Have
Before you can control shadow AI, you need to know it exists.
Network-Based Detection
- Monitor outbound traffic for connections to known AI tools (OpenAI, Anthropic, Cohere, etc.)
- Scan DNS queries for AI tool domains
- Analyse outbound traffic patterns—AI tools have characteristic bandwidth signatures
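The first two bullets can be sketched as a simple filter over DNS logs. A minimal sketch, assuming a space-separated log format; the domain list is illustrative, not exhaustive:

```python
# Flag DNS queries to known AI tool domains (illustrative list only).
AI_DOMAINS = {
    "api.openai.com",
    "chat.openai.com",
    "api.anthropic.com",
    "claude.ai",
    "api.cohere.com",
}

def flag_ai_queries(dns_log_lines):
    """Return (client_ip, domain) pairs for queries matching AI tool domains.

    Assumed log format: "<timestamp> <client_ip> query: <domain>".
    """
    hits = []
    for line in dns_log_lines:
        parts = line.split()
        if len(parts) >= 4 and parts[2] == "query:":
            client, domain = parts[1], parts[3]
            if domain in AI_DOMAINS or any(domain.endswith("." + d) for d in AI_DOMAINS):
                hits.append((client, domain))
    return hits

sample = [
    "2026-03-01T09:12:03 10.0.4.17 query: chat.openai.com",
    "2026-03-01T09:12:05 10.0.4.22 query: intranet.example.com",
    "2026-03-01T09:13:40 10.0.7.90 query: claude.ai",
]
print(flag_ai_queries(sample))
```

In practice this runs against your resolver's query logs or proxy logs, and the domain list needs regular maintenance as new AI tools appear.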
Endpoint-Based Detection
- Browser history analysis: which AI tools are being accessed
- Clipboard monitoring: what's being copied to/from AI tools
- Application inventory: which tools are installed
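Browser history is often the quickest endpoint signal. A sketch assuming a Chrome-style History database (a SQLite file with a `urls` table holding `url` and `visit_count`); a real deployment would collect this via an MDM or EDR agent. Demonstrated here against an in-memory stand-in:

```python
import sqlite3

AI_HOSTS = ("chat.openai.com", "claude.ai", "gemini.google.com")

def ai_visits(conn):
    """Return (url, visit_count) rows whose URL matches a known AI host."""
    rows = conn.execute("SELECT url, visit_count FROM urls").fetchall()
    return [(u, c) for u, c in rows if any(h in u for h in AI_HOSTS)]

# Demo against an in-memory table shaped like Chrome's History `urls` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (url TEXT, visit_count INTEGER)")
conn.executemany("INSERT INTO urls VALUES (?, ?)", [
    ("https://chat.openai.com/c/abc123", 41),
    ("https://intranet.example.com/wiki", 12),
    ("https://claude.ai/chat/xyz", 7),
])
print(ai_visits(conn))
```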
User Surveys and Interviews
- Ask teams directly: "What AI tools do you use?" (Often, the honest answer is surprising)
- Interview power users in high-risk departments (finance, legal, R&D)
- Incentivise disclosure: "Tell us what you're using so we can help you use it safely"
Log Analysis
- Email: scan for attachments being sent to AI tools or forwarded to suspicious addresses
- VPN logs: which external addresses are frequently accessed
- API logs: unusual API calls or integrations with external services
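VPN and proxy logs lend themselves to simple frequency analysis: count connections per destination and flag AI endpoints that recur. A sketch with a hypothetical record shape and an illustrative endpoint list:

```python
from collections import Counter

AI_ENDPOINTS = {"api.openai.com", "api.anthropic.com", "api.cohere.com"}

def frequent_ai_destinations(vpn_records, threshold=2):
    """Flag AI endpoints appearing at or above the threshold.

    Records are assumed to be (user, destination_host) pairs extracted
    from VPN or proxy logs.
    """
    counts = Counter(host for _user, host in vpn_records)
    return {h: c for h, c in counts.items() if h in AI_ENDPOINTS and c >= threshold}

records = [
    ("u1", "api.openai.com"),
    ("u2", "api.openai.com"),
    ("u1", "vpn.example.com"),
    ("u3", "api.anthropic.com"),
]
print(frequent_ai_destinations(records))
```

Raising or lowering the threshold trades false positives against missed one-off usage; a single paste of customer data is still a breach.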
The Data Leakage Vectors
Why is shadow AI a data security problem?
Training Data Usage
Many commercial AI tools reserve the right to use conversations for training unless users opt out. When an employee pastes customer data into a consumer-tier ChatGPT account, that data may enter OpenAI's training pipeline and become part of future models.
For organisations handling PII or confidential customer information, this is a catastrophic breach of privacy and compliance obligations.
Cross-Customer Contamination
When data from your organisation is added to an LLM's training set, it can inadvertently surface for other users of the same model, including competitors. A training-data extraction attack could deliberately pull your sensitive data out for another user.
Model Memorization
As we discussed earlier, LLMs can memorise training data. Sensitive information fed to cloud AI tools may be memorised and, in some cases, extracted by anyone querying the model.
Building an AI Acceptable Use Policy
You need governance, not just restrictions. Here's what an effective AI policy includes:
1. Approved Tools and Platforms
- Maintain a whitelist of approved AI tools
- For each tool, document: data handling practices, privacy commitments, vendor risk assessment
- Approval process: teams can request tools; security team evaluates and approves
- Categories: unrestricted tools (for non-sensitive work), restricted tools (internal only), banned tools
2. Data Classification Requirements
- What data can be sent to which tools?
- Customer PII: generally banned from cloud AI tools
- Internal strategies: restricted to approved internal tools only
- Public data: generally safe for any tool
- Require employees to assess data classification before using AI tools
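That classification check becomes enforceable, rather than aspirational, when it's encoded as a policy table a proxy or pre-flight script can consult. All tool names and classifications below are hypothetical, following the unrestricted/restricted/banned scheme above:

```python
# Hypothetical policy: which tool categories each data classification may use.
PERMITTED = {
    "public":       {"unrestricted", "restricted"},
    "internal":     {"restricted"},  # approved internal tools only
    "customer_pii": set(),           # banned from AI tools by default
}

# Hypothetical tool registry mapping tool -> policy category.
TOOL_CATEGORY = {
    "public_chatbot": "unrestricted",
    "approved_llm":   "restricted",
    "banned_tool":    "banned",
}

def is_allowed(data_class, tool):
    """Pre-flight check before data is sent to an AI tool."""
    return TOOL_CATEGORY.get(tool) in PERMITTED.get(data_class, set())

print(is_allowed("public", "public_chatbot"))      # True
print(is_allowed("customer_pii", "approved_llm"))  # False
```

Unknown tools and unknown classifications both fail closed, which is the behaviour you want from a default-deny policy.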
3. Contractual Requirements
- Demand Data Processing Agreements from AI tool vendors
- Require opt-outs: "Do not train on my data"
- Demand transparency: vendors must disclose training practices
- Liability clauses: vendors liable if data is breached or misused
4. Usage Guidelines
- Don't paste sensitive data into public AI tools
- Don't upload files containing PII
- Don't use AI tools for creative work that might incorporate training data from competitors
- Review AI outputs before using them in customer-facing contexts (models can hallucinate)
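The first two guidelines can be backed by a pre-flight screen before text leaves the organisation. A deliberately crude sketch: the patterns are illustrative only, and a production DLP tool would cover far more identifier types:

```python
import re

# Crude, illustrative patterns; real DLP needs far broader coverage.
PII_PATTERNS = {
    "email":    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "tfn_like": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{3}\b"),  # Australian TFN-shaped
}

def pii_findings(text):
    """Return {pattern_name: matches} for anything PII-shaped in the text."""
    hits = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits

prompt = "Summarise this complaint from jane.doe@example.com, TFN 123 456 789."
print(pii_findings(prompt))
```

A non-empty result should block the request or route it to an approved internal tool rather than a public one.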
5. Monitoring and Compliance
- Continuous monitoring for shadow AI usage
- Regular audits: sample employee usage, verify compliance
- Incident response: if policy violation detected, immediate escalation
- Training: annual refresher on AI acceptable use
6. Exceptions and Request Process
- Teams may request exceptions for specific use cases
- Formal review: security, compliance, privacy teams assess risk
- Mitigations: if approved, implement controls (data pseudonymisation, contractual terms, etc.)
Practical Implementation Tips
- Start with discovery: Don't build a policy in a vacuum. Survey teams and understand current usage patterns.
- Be pragmatic: A restrictive policy that everyone violates is worse than a balanced policy with buy-in.
- Provide alternatives: If you ban ChatGPT, offer an approved alternative.
- Focus on high-risk areas: Legal, finance, and R&D are highest priority. Marketing can wait.
- Communicate the why: Employees need to understand why AI governance matters. Data breaches are not abstract.
Key Takeaways
- Shadow AI adoption is rampant in most organisations and largely invisible to security teams
- The risk isn't just data leakage—it's compliance violations, IP exposure, and model poisoning
- Effective detection requires network monitoring, endpoint analysis, and user engagement
- AI acceptable use policies must balance security with enabling innovation
- Contractual requirements with AI vendors are critical to protecting sensitive data
- Governance is more effective than prohibition