Capsule Blog

We Analyzed 206,435 AI Agent Skills. Here's What We Found.

Bar Kaduri
June 23, 2026

Agent skills are rapidly becoming the fastest-growing software supply chain in AI, and arguably one of the least governed.

A "skill" is essentially a lightweight package of instructions that teaches an AI agent how to perform a specific task. By installing a skill, an agent can instantly learn a deployment workflow, a code review process, an incident response procedure, or even just how to format its output. The concept is undeniably powerful because it allows teams to package expertise once and share it seamlessly across agents instead of rebuilding it from scratch.

However, as part of Capsule’s State of AI Agent Security research, we collected and analyzed 206,435 publicly available agent skills from GitHub and major skill registries. Our analysis revealed an ecosystem outpacing its own security controls. While we found traditional malware, the bigger threat lies in widespread access to sensitive credentials, thousands of skills capable of silent data exfiltration, and a near-total lack of guardrails governing what these skills can actually do once installed.

Agent skills are the new software supply chain. But unlike traditional packages, they influence behavior rather than just executing code. That distinction creates a fundamentally new security challenge that most existing controls were never designed to address.

The Fastest-Growing Layer in the Agent Stack

The appeal of agent skills is straightforward: they offer instant, plug-and-play expertise.

Since Anthropic formalized the Agent Skills specification in December 2025, adoption has skyrocketed. Major players like OpenAI, Microsoft, GitHub, Atlassian, Cursor, and Figma quickly embraced compatible implementations. By April 2026, we observed roughly 800,000 skill files on public GitHub repositories and additional skills distributed through dedicated registries.

To understand how frictionless this distribution is, look at a popular community skill called "Caveman". Its premise is incredibly simple:

"Why use many token when few token do trick."

Once installed, the agent immediately begins communicating in a highly compressed style while preserving technical accuracy. It became widely adopted simply because it saved tokens and improved efficiency for common workflows. The installation takes seconds, and the value is immediate.

Unfortunately, the exact characteristics that make skills so useful also make them incredibly difficult to govern.

The Anatomy of a Skill: Memory Injection, Not Just Data

To understand the security risk, we have to look at how a skill is actually structured. A skill is not just a loose collection of text. Under the standard specification, a skill is typically packaged as a single, highly structured Markdown (.md) file. This file dictates exactly how the agent should operate using a formalized shape:

  • The Frontmatter (Manifest): A metadata block at the top of the file (usually YAML) that defines the skill's name, the APIs it needs to access, and the specific permissions it requests.
  • System Instructions: Designated Markdown headers containing the raw natural language prompts that dictate how the agent should behave.
  • Executable Tools: Fenced code blocks (often Python or JavaScript) embedded directly within the file that the agent can trigger to interact with local files or external services.

When an agent loads this .md file, it does not treat it as external reference material. In a traditional Retrieval-Augmented Generation (RAG) setup, an agent searches a database, reads a fact, and uses it to answer a question.

Skills operate completely differently. They function very similarly to memory injection.

The system instructions and tool definitions from the Markdown file are injected directly into the agent's active context window or core system prompt. The agent does not "read" the skill, it absorbs it as a fundamental operating directive. Once loaded, the skill's instructions carry the same weight as the developer's original programming.

This mechanism is exactly why skill-based attacks are so effective. The agent implicitly trusts this injected context, allowing a structured text document to completely overwrite its behavioral boundaries and security constraints.

Two Campaigns That Highlight the Threat

During our analysis, two active campaigns, ClawHavoc and 26medias, highlighted how this trust model is being actively abused, albeit in completely different ways.

ClawHavoc: Malware Disguised as Workflow

Disclosed by Koi Security in February 2026, ClawHavoc relied heavily on social engineering. During normal agent execution, malicious skills presented what appeared to be a legitimate dependency installation prompt. Once approved, the agent executed a base64-encoded reverse shell, contacting attacker-controlled infrastructure to download the Atomic macOS Stealer. The payload specifically targeted credentials, browser sessions, cryptocurrency wallets, and macOS Keychain data.

The attackers meticulously designed the operation to look legitimate, using coordinated GitHub accounts and automated registry mirroring to embed themselves in the standard distribution process. Alarmingly, months after public disclosure, hundreds of ClawHavoc skills remained publicly accessible and their infrastructure was still active.

26medias: The Natural Language Payload

The second campaign represents a far more profound shift in the threat landscape: no malware was required.

A publisher known as "26medias" distributed skills containing natural language instructions that directed agents to:

  • Store cryptocurrency private keys in plaintext.
  • Purchase attacker-controlled tokens.
  • Route payments through attacker-controlled wallets.

There was no exploit, no obfuscated code, and no suspicious shell commands. Because of how memory injection works, the instruction itself was the payload. Traditional security tools are designed to identify dangerous code; they are largely blind to seemingly legitimate instructions encouraging an agent to perform harmful actions.

The Lethal Trifecta: Dangerous Capability Combinations

While active attacks are concerning, our most significant finding was the widespread presence of risky capability combinations within otherwise legitimate skills. We evaluated every skill for the permissions and actions it enabled. Individually, most of these capabilities appear harmless. Combined, they create a massive attack surface.

Security researcher Simon Willison describes the most dangerous combination of agent capabilities as the "Lethal Trifecta": the ability to access sensitive data, execute actions, and communicate externally, in the same agent. This combination creates the classic, most persistent form of data leakage path in AI agents. Google later operationalized a similar concept through its "Rule of Two", recommending agents be limited to no more than two of these capability classes simultaneously.

When all three exist together, an agent can exfiltrate information with almost zero resistance. Yet, nearly one in ten skills we analyzed provided the complete trifecta.

Key Findings from Our Analysis:

  • 23.7% (48,984 skills) accessed sensitive local stores, such as SSH keys, cloud credentials, browser cookies, or OS credential vaults.
  • 21.7% combined code execution with credential access.
  • 14.2% combined credential handling with outbound network communication.
  • 9.5% (19,618 skills) combined code execution, access to sensitive data, and external communication capabilities.

Most users installing these skills have no idea what permissions they are granting.

The Governance Gap

The skill ecosystem isn't just suffering from a malware problem; it has a severe governance problem. Out of the 206,435 skills analyzed, only 44 passed all five baseline security checks used in our analysis.

Security Control Adoption Rates

  • Capability declarations: 14.2%
  • Sandboxing or containment: 4.7%
  • Human approval requirements: 3.6%
  • Dependency version pinning: 0.5%

Even more concerning, nearly 80% of skills failed the three most fundamental controls simultaneously: they lacked declared capabilities, checkpoints, and sandboxing.

Because many legitimate skills already request broad permissions, execute shell commands, and communicate externally, dangerous behavior often appears completely normal. Malicious skills easily blend into the background noise.

Securing the Agent Runtime with Capsule

Most software supply chain security focuses heavily on static code analysis. While that works for traditional packages, agent skills introduce a different paradigm where risk stems from instructions. Static analysis cannot reliably determine how an AI agent will behave when injected memory, tool permissions, and organizational data interact. The critical moment in agent security is execution. That is where instructions become actions, credentials are accessed, and data moves.

To adapt, security teams must prioritize visibility and control over the runtime behavior of AI agents. This is exactly what we built Capsule to solve.

Capsule delivers comprehensive security tailored specifically for the AI agent stack:

  • Complete Stack Visibility: We provide a real-time, comprehensive inventory of your entire agentic stack. Capsule automatically maps every agent, skill, tool, and harness operating within your environment so you always know exactly what is deployed.
  • Deep Skill Analysis: We go far beyond static scanning. Capsule actively analyzes skills to identify hidden risks and dangerous capability combinations, like the Lethal Trifecta, before they can be exploited.
  • Active Runtime Protection: Most importantly, Capsule secures the critical moment of execution. We monitor behavior continuously and have the ability to stop agents from going rogue in runtime. If an agent attempts an unauthorized action or tries to exfiltrate data, Capsule intervenes immediately to block the threat, before it happens.

The goal is not to bottleneck the adoption of AI agents. Skills are one of the most valuable productivity developments in recent years. Capsule introduces the same level of governance to agent skills that organizations already apply to cloud infrastructure and traditional software packages.

Because agent skills have become a new software supply chain operating through context injection instead of code compilation, they require an entirely new approach to security. Capsule delivers exactly that.

Research Methodology: Capsule Security collected and analyzed 206,435 publicly available AI agent skills from GitHub repositories and major skill registries during April 2026 as part of the State of AI Agent Security research initiative.

Read more articles

Article

Mitigating the Agentic AI Threat: What Security Leadership Needs to Prioritize

The theoretical phase of agentic AI security is over—the attack surface is real and the incidents are documented. This post breaks down the defensive architecture taking shape in response: Meta's Agents Rule of Two, deterministic enforcement hooks, identity governance for non-human agents, and the questions security leaders need to be asking right now.

Bar Kaduri
June 16, 2026
Article

OWASP State of Agentic AI Security and Governance 2026: What Changed, and What It Means

A year after the first edition, plausible agentic AI threats now carry CVEs and real incidents. What changed in the OWASP State of Agentic AI Security and Governance 2026.

Bar Kaduri
May 31, 2026
Article

Every agent needs a "stop". We're standardizing it.

The industry standardized how agents talk, but never how to stop one mid-action. Capsule is helping change that through the Agent Control Standard, with hooks.security as the developer-facing companion.

Bar Kaduri
May 27, 2026
Research

The Agentic AI Threat Landscape Has Crossed a Threshold

The security risks of AI agents are no longer theoretical. This blog examines the active threat landscape facing agentic AI in 2026, from prompt injection and supply chain attacks against MCP and skill registries to the governance gap created by vibe coding and Shadow AI.

Bar Kaduri
May 24, 2026
Article

The Rise of Guardian Agents: Securing the Agentic AI Ecosystem

Guardian agents are emerging as a critical security layer for the agentic AI era. As enterprises adopt AI agents that execute tools, handle sensitive data, and operate inside real workflows, human approval loops no longer scale. Guardian agents solve this by supervising other agents in real time: monitoring actions, enforcing policy, and blocking risky behavior before execution.

Lidan Hazout
May 7, 2026
Research

CurseChain: How Hidden README Comments Trick Cursor Into Stealing - and Spreading - Your SSH Keys

Capsule found two Cursor IDE vulnerabilities that let hidden prompt-injection instructions in referenced files steal developers’ SSH keys and contaminate future unrelated projects, causing zero-click or one-click exfiltration even when the attacker ships no malicious code.

Bar Kaduri
April 29, 2026
Research

The State of AI Agent Security 2026

Capsule Security’s State of AI Agent Security 2026 report is the largest independent audit of AI agents to date, showing that the ecosystem is rapidly shipping publicly exposed, weakly guarded, highly connected agents with recurring misconfigurations, near-absent runtime controls, widespread prompt-injection risk, expanding supply-chain exposure, and active malicious campaigns still propagating through agent skill and tool registries.

Bar Kaduri
April 27, 2026
News

Capsule Security Raises $7M to Prevent AI Agents from Going Rogue in Runtime: Intent is the New Perimeter

Capsule is launching a runtime security platform for the agentic AI era, built to monitor and stop autonomous agents that can bypass traditional guardrails, misuse legitimate access, and create a new class of enterprise security risk.

Naor Paz
April 13, 2026
Article

Why MCP Gateways are a Bad Idea (and What to Do Instead)

MCP gateways secure only one protocol and create blind spots, while runtime hooks plus approved MCP registries secure the full agent runtime where real risk lives.

Lidan Hazout
April 12, 2026
Article

ClawGuard: Open Source Security for the Agentic Era

ClawGuard was built to stop dangerous agent behavior at the intent level before execution, and NVIDIA’s NemoClaw reinforces that need by securing the runtime environment from the infrastructure side.

Lidan Hazout
April 12, 2026
Research

PipeLeak: The Lead That Stole Your Database - Exploiting Salesforce Agentforce With Indirect Prompt Injection

Capsule research team discover a critical prompt injection vulnerability in Salesforce Agentforce that allows attackers to exfiltrate CRM data through a simple lead from a form submission. No authentication required.

Bar Kaduri
April 9, 2026
Research

ShareLeak: Taking the Wheel of Microsoft’s Copilot Studio (CVE-2026-21520)

The Capsule research team discovered a high severity indirect prompt injection vulnerability in Microsoft Copilot Studio that enables attackers to exfiltrate sensitive data through external SharePoint form.

Bar Kaduri
April 9, 2026