ClawHavoc, supply chain attacks, and building a safe skill ecosystem
1,184Malicious Skills
6Attack Types
5Mitigations
The ClawHavoc Incident
First Major Supply Chain Attack on AI Agent Skills
Security researchers discovered 1,184 malicious skills on ClawHub, the community registry for OpenClaw skills. The attack, dubbed ClawHavoc, demonstrated full attack chains from skill installation to data exfiltration.
Malicious skills mimicked popular legitimate skills with typosquatted names. Thousands of developers installed them before detection. The incident exposed fundamental gaps in the trust model for shared AI agent skills.
Attack timeline:
Attackers published skills with names similar to popular ones
Skills contained hidden instructions in markdown comments
Agent loaded the skill and followed hidden instructions
MCP tools were called with malicious parameters
Credentials and code were exfiltrated via HTTP requests
Detection took weeks due to no scanning infrastructure
Attack Surface
Why Skills Are Vulnerable
Skills are executable instructions that an AI agent follows with the same authority as user commands. Community registries are open-submission. There is no built-in sandboxing or permission scoping in most implementations.
ExecutableOpen RegistryNo SandboxTrust Gap
Scale of Risk
By the Numbers
1,184
malicious skills found
6
distinct attack categories
1000s
of downloads before detection
1. Prompt Injection
Hidden Instructions
Malicious instructions embedded in skill body (often in HTML comments or zero-width characters). Agent executes attacker commands believing they are part of the skill procedure.
SEVERITY: CRITICAL
2. Tool Poisoning
Legitimate Tools, Malicious Use
Skills misuse MCP tools with malicious parameters. The tool call itself is legitimate (e.g., HTTP request), but the destination or payload is attacker-controlled.
SEVERITY: CRITICAL
3. Malware Delivery
Scripts That Bite
Scripts in the skills scripts/ directory that download and execute malware. Leverages the agent's system access and user permissions to install payloads.
SEVERITY: HIGH
4. Credential Leakage
Secrets Stolen Silently
Skills that log, transmit, or expose environment variables and API keys. Exfiltration often hidden in seemingly innocent tool calls or network requests.
SEVERITY: HIGH
Attack Flow
How a Malicious Skill Executes
5. Untrusted Content
Remote Fetch Risks
Skills that fetch content from remote URLs without validation. Enables TOCTOU (time-of-check-time-of-use) attacks where content changes between review and execution.
SEVERITY: MEDIUM
6. Toxic Flows
Innocent Steps, Harmful Result
Each individual step looks legitimate. But chained together, they form a destructive sequence. Hardest to detect because no single action triggers alerts.
Static analysis tool that scans SKILL.md files for known prompt injection patterns, suspicious tool calls, and credential access.
Skill Signing
Cryptographic verification of skill authorship. Like GPG signing for git commits. Ensures a skill has not been tampered with after publication.
Sandboxed Execution
Run skill scripts in isolated containers with no network access by default. Must explicitly declare network, filesystem, and tool permissions.
Permission Boundaries
Skills declare which MCP tools they need. Agent enforces least-privilege. A deploy skill cannot access email tools. A formatting skill cannot make HTTP requests.
No single layer is sufficient. The combination of static analysis, cryptographic integrity, runtime isolation, permission scoping, and community review creates a layered defense that is much harder to bypass than any individual measure.
05 — Security & Trust · AI Agent Skills · See also: 01 Helicopter · 02 Spec Deep Dive · 03 Progressive Disclosure · 04 Knowledge StackAI Agent Skills