
A Real WebSocket Hijack Hit an AI Agent Framework. Here's What We Learned About Defense-in-Depth.

Drakon Systems · 6 min read

[Figure: Layered security architecture for AI agents]

The Vulnerability Nobody Expected

Last week, a critical vulnerability was disclosed in OpenClaw — one of the more capable open-source AI agent frameworks. The issue? WebSocket brute-force hijacking on the localhost gateway.

The gateway — the nerve centre that connects your AI agent to messaging surfaces, tools, and the outside world — was using predictable authentication tokens. An attacker on the same network could brute-force the WebSocket connection and inject arbitrary commands into the agent's session.

Think about what that means. Your AI agent has access to your emails, your files, your APIs, maybe your smart home or your financial systems. Someone connects to the gateway, and they are you.

The fix landed in v2026.2.25 with cryptographically strong token generation. If you're running OpenClaw, update immediately.
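"Cryptographically strong token generation" is straightforward to get right in most languages. As a hedged sketch (the function name is ours, not OpenClaw's), here is what it looks like in Python using the standard library's `secrets` module, which draws from the OS CSPRNG:

```python
import secrets

def generate_gateway_token() -> str:
    """Generate an unguessable gateway auth token.

    32 bytes from the OS CSPRNG gives 256 bits of entropy,
    putting brute-force guessing far out of reach.
    """
    return secrets.token_urlsafe(32)
```

The contrast with a predictable token (a timestamp, a counter, a default string) is the entire vulnerability: anything an attacker can enumerate, an attacker will enumerate.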

But this incident exposed something far more important than a single vulnerability.

The Layer Problem in AI Agent Security

Here's the uncomfortable truth: most AI agent deployments have zero defense-in-depth.

Traditional software security thinks in layers:

  • Network Security — Firewalls, network segmentation
  • Transport Security — TLS, authentication tokens ← The WebSocket fix lives here
  • Application Security — Input validation, access controls
  • Data Security — Encryption, PII protection, audit logging

But AI agents? Most operators patch one layer and call it done. The OpenClaw fix secured the transport layer — excellent. But what happens when the next vulnerability isn't at the transport layer?

What if it's:

  • A prompt injection hidden in an email your agent reads?
  • A malicious webhook payload that tricks your agent into exfiltrating data?
  • A compromised sub-agent in a multi-agent workflow that escalates privileges?

Patching the front door doesn't help when the attack comes through the mail slot.

What We Built (and Why)

At Drakon Systems, we run AI agents in production — not demos, not proofs-of-concept. Real agents handling real school administration, real business operations, real financial data. That means we couldn't afford to rely on a single security layer.

Here's the architecture we've developed, and how each layer would have helped during a WebSocket hijack scenario — even before the framework patched it.

1. Instruction Gateway Control

Every external input your agent processes is a potential attack vector. Emails, API responses, webhook payloads, even documents uploaded by users.

Our Instruction Gateway scans all inbound content for instruction-like patterns before the agent ever sees it. Things like "ignore previous instructions", encoded payloads, or social engineering attempts get flagged and blocked.

During a hijack? The attacker connects to the gateway, but their injected commands hit the instruction scanner first. Suspicious patterns get blocked before reaching the agent's reasoning loop.
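A minimal version of this kind of inbound scanner can be sketched with plain pattern matching. The patterns below are illustrative, not our production set; a real scanner uses a broader, continuously maintained list:

```python
import re

# Hypothetical patterns for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.I),
    re.compile(r"disregard\s+(your|the)\s+system\s+prompt", re.I),
    re.compile(r"[A-Za-z0-9+/]{60,}={0,2}"),  # long base64-like encoded payloads
]

def scan_inbound(content: str) -> bool:
    """Return True if content matches an instruction-like attack pattern."""
    return any(p.search(content) for p in INJECTION_PATTERNS)
```

Flagged content is blocked (or quarantined for human review) before it ever enters the agent's reasoning loop.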

2. Action Gating

Your agent should not have a blank cheque for external actions. We separate "thinking" (reading files, searching, organising) from "acting" (sending emails, making API calls, posting publicly).

Every external action passes through an allowlist. If the target isn't pre-approved, the action is blocked and the owner is alerted.

During a hijack? Even if the attacker sends "email all contacts with this payload" — the action gate blocks it because the target isn't in the allowlist. The blast radius drops to near zero.
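The allowlist check itself is deliberately boring code. A sketch of the email path (names and the allowlist contents are hypothetical):

```python
# Hypothetical pre-approved recipients; real deployments load these from config.
ALLOWED_EMAIL_RECIPIENTS = {"ops@example.com", "admin@example.com"}

class ActionBlocked(Exception):
    """Raised when an external action targets an unapproved destination."""

def gate_send_email(recipient: str, alert_owner) -> None:
    """Deny-by-default: block any send to a recipient not on the allowlist."""
    if recipient not in ALLOWED_EMAIL_RECIPIENTS:
        alert_owner(f"Blocked email to unapproved recipient: {recipient}")
        raise ActionBlocked(recipient)
    # ...proceed with the actual send here...
```

The key design choice is deny-by-default: the agent never decides whether a target is safe; the gate does, and the owner hears about every refusal.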

3. PII Protection

AI agents process sensitive data. Student records, financial details, personal information. Hard-coded rules prevent specific data categories from ever appearing in output, regardless of what the agent is asked to do.

This isn't just good security practice — in the UK, it's GDPR compliance. Our agents handle school data but are physically incapable of outputting individual pupil records. Aggregates only, every time.
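One way to enforce a rule like this is a redaction pass on all output, applied after the model but before anything leaves the system. The patterns below are illustrative placeholders (a 13-character pupil-number shape and a date-of-birth shape), not our production rules:

```python
import re

# Hypothetical PII shapes for illustration only.
PII_PATTERNS = [
    re.compile(r"\b[A-Z]\d{12}\b"),        # e.g. 13-char unique pupil numbers
    re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),  # dates of birth
]

def redact_output(text: str) -> str:
    """Strip PII-shaped substrings from agent output, unconditionally."""
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Because the redaction runs outside the model, no prompt, however cleverly injected, can talk its way past it.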

4. Sub-Agent Sandboxing

If you're running multi-agent workflows — and you should be, they're powerful — each sub-agent should inherit a security context but never escalate beyond it. A sub-agent spawned for a specific task shouldn't be able to access secrets, send emails, or reach APIs it doesn't need.

During a hijack? Even if the attacker spawns sub-agents, they inherit restricted permissions. No lateral movement, no privilege escalation.
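The core mechanism is simple: a child's permission set is the intersection of what it requests and what its parent holds, so escalation is structurally impossible. A sketch (class and permission names are ours, for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SecurityContext:
    """Immutable permission set; children can only narrow it."""
    permissions: frozenset

    def spawn_child(self, requested: set) -> "SecurityContext":
        # Intersection: a child never holds a permission its parent lacks.
        return SecurityContext(self.permissions & frozenset(requested))

parent = SecurityContext(frozenset({"read_files", "search"}))
child = parent.spawn_child({"read_files", "send_email"})  # send_email is dropped
```

The context is frozen on purpose: there is no API for widening it after creation, only for narrowing it at spawn time.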

5. Audit Everything

Every external action hits an append-only log. Not just for security — for debugging, for compliance, for understanding what your agent actually does when you're not watching.

During a hijack? The audit trail captures every command the attacker sent, making forensics straightforward and providing evidence for incident response.
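At its simplest, the log is one JSON line per external action, written in append mode. This sketch (function name ours) shows the shape; note that true append-only guarantees need OS- or storage-level enforcement on top of this, since a local attacker with file access could otherwise rewrite history:

```python
import json
import time

def append_audit(log_path: str, actor: str, action: str, detail: str) -> None:
    """Append one JSON line per external action.

    Opening in 'a' mode means existing entries are never rewritten by
    this code path; tamper-resistance requires OS-level protection too.
    """
    entry = {"ts": time.time(), "actor": actor, "action": action, "detail": detail}
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```

JSON Lines keeps the log trivially greppable and streamable, which matters at 3 a.m. during an incident.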

The Honest Take

Could ShieldCortex or Iron Dome have prevented the WebSocket brute-force connection? No. That was a transport-layer flaw in the framework itself, and it needed a framework-level fix.

But would they have limited the damage once an attacker connected? Absolutely. That's the entire point of defense-in-depth — you assume every layer will eventually be breached, and you build the next layer to contain the blast radius.

The security industry learned this decades ago for traditional software. The AI agent ecosystem is still catching up.

What You Can Do Today

Whether you use our tools or build your own, the principles are the same:

  1. Update your framework. Patch the transport layer first — it's the easiest win.
  2. Audit your agent's access. List every tool, API, and system your agent can touch. Minimise that list aggressively.
  3. Scan inbound content. Even basic pattern matching for injection attempts catches a surprising number of automated attacks.
  4. Gate external actions. Your agent should need approval before sending, not apologise after.
  5. Log everything. You can't secure what you can't see.

Our Open-Source Tools

We've open-sourced the security patterns we use in production:

  • Iron Dome — A security framework for AI agents that implements instruction scanning, action gating, PII protection, and sub-agent sandboxing as an OpenClaw skill.
  • ShieldCortex — The broader project building production-grade security tooling for AI agents, including memory integrity protection.

Both are free, both are open source. But even if you never use them — start thinking about agent security in layers. The frameworks will keep improving their transport security. The question is: what's protecting your agent when the next vulnerability isn't at the transport layer?


We build AI infrastructure and security tooling at Drakon Systems. If you're running AI agents in production and want to talk security architecture, get in touch.
