Capsule Blog

Your AI Agent Inventory is Lying to You: The Rise of the "Inline Agent"

Guy Bidkar
July 1, 2026

Picture the security review going well. Your team has a clean inventory of AI agents: each one is properly declared in the AWS Bedrock console, Azure AI Foundry, or Google Cloud Vertex AI. Each agent has a unique ID, attached guardrails, and reviewed action groups.

Then someone pulls the model-invocation logs and notices a steady drumbeat of calls to a Claude model, thousands a day, from an IAM role nobody can quite place. The system prompt is the same every time. The calls cycle through a small, repeating set of tools. Something is running in a loop.

It plans, it calls tools, it acts on the results. Make no mistake: it's an agent. It just never went through the front door. It has no ID, no entry in any management console, and no row in your meticulously crafted inventory.

What are Inline Agents?

That phantom entity is what we call an Inline Agent. And once you know what to look for, you realize they are everywhere.

An inline agent could be a LangChain script running in a cron job. It could be a developer running Claude Code against your Bedrock endpoint. It might be a "quick prototype" that quietly morphed into an unsanctioned production workflow.

Inline agents are the shadow IT of the agentic era. Except this time, nobody needed to sneak out and buy an unsanctioned SaaS account with a corporate credit card. They just needed permission to call an API, such as the bedrock:InvokeModel IAM action.

1. Two doors to the same model

Every major agent platform gives you two ways to reach a model, and the gap between them is the whole story.

The First Door: The Declared Agent

You only have to define it once: after selecting a model, drafting the instructions, attaching your tools, and bolting on guardrails, you simply give it a name. In return, the platform assigns it a durable identity and persists its configuration until you change or delete it.

On AWS Bedrock, that identity relies on a specific architecture:

  • The Core Identity: An Agent ID and a primary ARN (e.g., arn:aws:bedrock:...:agent/ABCDE12345), plus an alias ARN for the exact version you are invoking.
  • The Attachments: Action groups (usually backed by Lambda), knowledge bases, and guardrails all hang directly off that core resource.
  • The Execution: You trigger it using InvokeAgent by passing the agent ID, the alias ID, and a session ID of your choosing. Bedrock takes care of the rest, maintaining the conversation state entirely on the server.

(Figure: How Bedrock Agents works)

The shape repeats across vendors; only the nouns change:

  • Azure AI Foundry - agents are first-class, with an Entra Agent ID for identity and SDK objects like asst_… and thread_… for the agent and its conversation.
  • Vertex AI Agent Engine - a ReasoningEngine resource named projects/.../reasoningEngines/<id>, with managed sessions and memory.

Regardless of the vendor's terminology, the core concept remains the same: a durable, server-side resource with a distinct identity that you can instantly inventory by querying the platform.
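To make "instantly inventory by querying the platform" concrete, here is a minimal sketch for the Bedrock case. It assumes a boto3 `bedrock-agent` client and its `list_agents` call; the helper name `iter_declared_agents` is ours, not part of any SDK.

```python
from typing import Any, Dict, Iterator


def iter_declared_agents(client: Any, page_size: int = 50) -> Iterator[Dict[str, Any]]:
    """Yield one summary dict per declared Bedrock agent, following pagination.

    `client` is assumed to behave like boto3.client("bedrock-agent"):
    list_agents() returns {"agentSummaries": [...], "nextToken": "..."}.
    """
    kwargs: Dict[str, Any] = {"maxResults": page_size}
    while True:
        page = client.list_agents(**kwargs)
        yield from page.get("agentSummaries", [])
        token = page.get("nextToken")
        if not token:
            break
        # Ask for the next page on the following iteration.
        kwargs["nextToken"] = token


# Usage against a real account (requires boto3 and AWS credentials):
#   import boto3
#   for agent in iter_declared_agents(boto3.client("bedrock-agent")):
#       print(agent["agentId"], agent["agentName"], agent["agentStatus"])
```

Everything this loop returns went through the first door. Nothing that went through the second door will ever show up in it.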

The Second Door: Raw Inference

You are hitting the exact same model, but without the management wrapper. On Bedrock, this means making calls to InvokeModel or Converse via the runtime endpoint. You send your messages, a system prompt, inference configurations, and perhaps a toolConfig, and you receive a completion in return. Crucially, there is no agent resource created, no persistent ID assigned, and no server-side session maintained. (Interestingly, AWS explicitly acknowledges this pattern with their InvokeInlineAgent API, which lets you assemble an agent dynamically at call time, passing the model, instructions, and guardrails directly in the request body without persisting a single thing.)

Here is what your inventory misses: functionally, the code operating behind that second door is absolutely an agent. It relies on a system prompt to define its behavior. It utilizes tools, whether declared formally or executed through a hand-rolled loop. It plans, maintains conversational context, and takes action. The only difference between this shadow entity and the declared agent sitting next to it is that nobody formally introduced it to the platform.
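For a sense of how little code that takes, here is a hedged sketch of a hand-rolled tool loop on top of Converse. It assumes a client shaped like boto3's `bedrock-runtime` client; `run_inline_agent`, `tool_specs`, and `tool_impls` are illustrative names, and a real script would add error handling and streaming.

```python
from typing import Any, Callable, Dict, List


def run_inline_agent(
    client: Any,
    model_id: str,
    system_prompt: str,
    tool_specs: List[Dict[str, Any]],
    tool_impls: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]],
    user_text: str,
    max_turns: int = 5,
) -> str:
    """Plan, call tools, act on results: an agent with no agent resource."""
    messages: List[Dict[str, Any]] = [
        {"role": "user", "content": [{"text": user_text}]}
    ]
    for _ in range(max_turns):
        response = client.converse(
            modelId=model_id,
            system=[{"text": system_prompt}],
            messages=messages,  # the full history is resent on every turn
            toolConfig={"tools": [{"toolSpec": spec} for spec in tool_specs]},
        )
        assistant_msg = response["output"]["message"]
        messages.append(assistant_msg)
        if response.get("stopReason") != "tool_use":
            # The model is done: return whatever text it produced.
            return "".join(b.get("text", "") for b in assistant_msg["content"])
        # Run every requested tool locally and hand the results back.
        results = []
        for block in assistant_msg["content"]:
            if "toolUse" in block:
                call = block["toolUse"]
                output = tool_impls[call["name"]](call["input"])
                results.append({
                    "toolResult": {
                        "toolUseId": call["toolUseId"],
                        "content": [{"json": output}],
                    }
                })
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")
```

Note what never happens here: no agent is created, no ID is issued, and the only "session" is the `messages` list living in the script's memory.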

Ultimately, if your inventory relies strictly on what the platform willingly admits to knowing, you have a massive blind spot, one exactly the size of every agent built outside a console. To fix this, we have to reconstruct these phantom agents, piecing together their identities, tools, and sessions from the only evidence they leave behind: raw logs.

Which brings us to the hard part.

2. Tracing and Composing an Inline Agent

Reconstructing a phantom agent means recovering its identity, its tool belt, and its sessions from raw logs alone. And that is exactly where things get messy.

Take AWS Bedrock telemetry, for example. If you start your hunt in CloudTrail, a raw InvokeModel call just looks like a generic management event. Sure, you get the caller, the source IP, the user agent, and the model ID. But it is a complete black box:

{
  "eventSource": "bedrock.amazonaws.com",
  "eventName": "InvokeModel",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.42",
  "userAgent": "Boto3/1.34.162 md/Botocore#1.34.162 lang/python#3.12.0",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::123456789012:assumed-role/data-pipeline-role/i-0abc123",
    "sessionContext": { ... }
  },
  "requestParameters": {
    "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0"
  },
  "responseElements": null
}

Trimmed for readability. Note what isn't here.

requestParameters carries only a modelId. It tells you absolutely nothing about the system instructions, the prompts, or what the agent was actually trying to do. CloudTrail will happily tell you that a principal hit a model four thousand times yesterday; it will not tell you what the agent is or what it was doing.
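As a rough illustration of how far CloudTrail alone gets you, the sketch below counts runtime calls per role and model. It assumes events are already parsed into dicts shaped like the sample above, and that the runtime operations appear under the event names listed in `RUNTIME_EVENTS`; the helper names are ours.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Assumed event names for Bedrock runtime inference calls.
RUNTIME_EVENTS = {
    "InvokeModel",
    "InvokeModelWithResponseStream",
    "Converse",
    "ConverseStream",
}


def role_of(arn: str) -> str:
    """Reduce an STS assumed-role ARN to its role name; leave other ARNs as-is."""
    resource = arn.split(":", 5)[-1]
    parts = resource.split("/")
    if parts[0] == "assumed-role" and len(parts) >= 2:
        return parts[1]
    return arn


def count_invocations(events: Iterable[Dict[str, Any]]) -> Counter:
    """Count model invocations keyed by (role, modelId)."""
    counts: Counter = Counter()
    for event in events:
        if event.get("eventSource") != "bedrock.amazonaws.com":
            continue
        if event.get("eventName") not in RUNTIME_EVENTS:
            continue
        arn = (event.get("userIdentity") or {}).get("arn", "unknown")
        model = (event.get("requestParameters") or {}).get("modelId", "unknown")
        counts[(role_of(arn), model)] += 1
    return counts
```

The result answers "who called which model, and how often". It cannot answer "what was it told to do", which is the question an inventory actually needs.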

To pry that box open, you have to enable model invocation logging (a separate setting, off by default). Once you do, you finally get the request and response bodies:

{
  "schemaType": "ModelInvocationLog",
  "operation": "Converse",
  "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
  "identity": {
    "arn": "arn:aws:sts::123456789012:assumed-role/data-pipeline-role/i-0abc123"
  },
  "input": {
    "inputBodyJson": {
      "system": [
        { "text": "You are InvoiceBot. Extract line items from the attached invoice and call the `lookup_vendor` tool to validate each vendor before returning JSON." }
      ],
      "messages": [ { "role": "user", "content": [ { "text": "…" } ] } ],
      "toolConfig": {
        "tools": [ { "toolSpec": { "name": "lookup_vendor" } } ]
      }
    }
  },
  "output": { ... }
}

This is much better. You can read the system block to see the agent's core directives, and parse the toolConfig to see exactly which levers it is allowed to pull.
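A minimal sketch of that parsing step, assuming records shaped like the Converse sample above (a list of text blocks under `system`, tools under `toolConfig.tools`). `AgentTraits` and `extract_traits` are illustrative names; other request formats would need their own branches.

```python
from typing import Any, Dict, NamedTuple, Tuple


class AgentTraits(NamedTuple):
    caller_arn: str
    model_id: str
    system_prompt: str
    tool_names: Tuple[str, ...]


def extract_traits(record: Dict[str, Any]) -> AgentTraits:
    """Pull the caller, model, system prompt, and tool names out of one log record."""
    body = (record.get("input") or {}).get("inputBodyJson") or {}
    system = body.get("system") or []
    if isinstance(system, str):
        # Some native request bodies carry the system prompt as a plain string.
        prompt = system
    else:
        prompt = "\n".join(block.get("text", "") for block in system)
    tools = (body.get("toolConfig") or {}).get("tools") or []
    names = tuple(sorted(t["toolSpec"]["name"] for t in tools if "toolSpec" in t))
    return AgentTraits(
        caller_arn=(record.get("identity") or {}).get("arn", "unknown"),
        model_id=record.get("modelId", "unknown"),
        system_prompt=prompt,
        tool_names=names,
    )
```

Run over the sample record, this yields the InvoiceBot prompt and a single tool, `lookup_vendor`: the first two columns of an inventory row.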

But there is a catch. Look at the identity story: identity.arn is a generic STS assumed-role ARN - a role plus a session name. There is no neat little agent ID, and usually no application session ID either. If this inline agent is being driven by a framework like LangChain, the whole concept of a "session" is a mirage. It is literally just the entire message history being resubmitted to the API at every turn.
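That resubmission habit is also a clue. If each turn resends the whole history, then one request's `messages` list should be a strict prefix of the next request's list in the same conversation. The sketch below stitches sessions on that assumption; `stitch_sessions` is a hypothetical helper, and it expects the requests in chronological order.

```python
import json
from typing import Any, Dict, List


def _canonical(message: Dict[str, Any]) -> str:
    """Serialize a message deterministically so two copies compare equal."""
    return json.dumps(message, sort_keys=True)


def stitch_sessions(requests: List[List[Dict[str, Any]]]) -> List[List[int]]:
    """Group request indices into inferred sessions by history-prefix matching."""
    sessions: List[List[int]] = []
    latest: List[List[str]] = []  # canonical history of each session's last request
    for idx, messages in enumerate(requests):
        keys = [_canonical(m) for m in messages]
        best = None
        for s, prev in enumerate(latest):
            # A continuation strictly extends an earlier history.
            if len(prev) < len(keys) and keys[: len(prev)] == prev:
                if best is None or len(prev) > len(latest[best]):
                    best = s
        if best is None:
            sessions.append([idx])
            latest.append(keys)
        else:
            sessions[best].append(idx)
            latest[best] = keys
    return sessions
```

This heuristic has limits: frameworks that truncate or summarize history break the prefix relation, and two conversations that open with an identical first message cannot be told apart until they diverge.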

You are staring at a massive mountain of disjointed log records. How do you actually figure out which ones belong to the exact same phantom agent? You have to triangulate the data using a combination of behavioral signals:

  • Group by Caller Principal: This is a solid starting point. Unfortunately, it falls apart the second multiple agents share a single IAM role or when role-session names are generic or auto-generated (an instance ID, for example) and say nothing about the workload.
  • Fingerprint the System Prompt: A stable system prompt is essentially an agent's DNA written out in plain text. If two separate invocations carry nearly identical core instructions, you are almost certainly looking at the same logical entity.
  • Watch Tool Patterns Over Time: Agents are creatures of habit. They tend to reach for the same small set of tools in highly predictable sequences. Clustering these requests by toolConfig is a brilliant way to untangle multiple agents hiding behind the exact same model and IAM role.
  • Lean on IP and User Agent: Never underestimate the basics. The User-Agent string is a cheap and highly effective way to distinguish automated SDK traffic from generic HTTP clients, providing quick clues about both identity and security posture.
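Each of these signals can be turned into a comparable key. The sketch below is one way to do it, using only the standard library. The normalization rules and the User-Agent substrings are assumptions chosen for illustration and would need tuning against real traffic.

```python
import hashlib
import re
from difflib import SequenceMatcher
from typing import FrozenSet, Iterable


def normalize_prompt(prompt: str) -> str:
    """Lowercase, mask digits, and collapse whitespace so trivial edits don't matter."""
    text = prompt.lower()
    text = re.sub(r"\d+", "#", text)  # dates, counters, and IDs vary per call
    return re.sub(r"\s+", " ", text).strip()


def prompt_fingerprint(prompt: str) -> str:
    """Short stable hash of the normalized prompt, for exact grouping."""
    digest = hashlib.sha256(normalize_prompt(prompt).encode("utf-8"))
    return digest.hexdigest()[:16]


def prompt_similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1], for prompts that are nearly identical."""
    return SequenceMatcher(None, normalize_prompt(a), normalize_prompt(b)).ratio()


def tool_signature(tool_names: Iterable[str]) -> FrozenSet[str]:
    """Order-independent key for the set of tools offered in a request."""
    return frozenset(tool_names)


def classify_user_agent(user_agent: str) -> str:
    """Coarse client bucket; the substrings checked here are illustrative."""
    ua = user_agent.lower()
    if "aws-cli" in ua:
        return "aws-cli"
    if "boto3" in ua or "botocore" in ua or "aws-sdk" in ua:
        return "aws-sdk"
    if any(token in ua for token in ("curl", "python-requests", "postman")):
        return "generic-http"
    return "unknown"
```

The exact hash is cheap enough to group millions of records, while the fuzzy ratio catches prompts that differ by a sentence or a templated value.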

In isolation, none of these signals is enough to build a solid case. The real magic happens when you weave them all together. You let a weak IAM principal signal, a highly specific prompt fingerprint, and a consistent tool pattern reinforce one another until the agent's true footprint finally comes into focus.
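One simple way to weave signals together is a weighted score followed by greedy clustering. The sketch below is an illustration under stated assumptions, not a description of any product's algorithm: the weights, the threshold, and the `Observation` fields are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import FrozenSet, List


@dataclass(frozen=True)
class Observation:
    role: str
    system_prompt: str
    tools: FrozenSet[str]
    user_agent: str


# Hypothetical weights: the prompt is the strongest signal, the role the weakest.
WEIGHTS = {"role": 0.2, "prompt": 0.4, "tools": 0.3, "user_agent": 0.1}


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap of two tool sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def same_agent_score(x: Observation, y: Observation) -> float:
    """Weighted agreement in [0, 1] across all four signals."""
    prompt_sim = SequenceMatcher(None, x.system_prompt, y.system_prompt).ratio()
    return (
        WEIGHTS["role"] * (x.role == y.role)
        + WEIGHTS["prompt"] * prompt_sim
        + WEIGHTS["tools"] * jaccard(x.tools, y.tools)
        + WEIGHTS["user_agent"] * (x.user_agent == y.user_agent)
    )


def cluster(observations: List[Observation], threshold: float = 0.75) -> List[List[Observation]]:
    """Greedy clustering: join the first cluster whose seed scores above threshold."""
    clusters: List[List[Observation]] = []
    for obs in observations:
        for members in clusters:
            if same_agent_score(obs, members[0]) >= threshold:
                members.append(obs)
                break
        else:
            clusters.append([obs])
    return clusters
```

With these weights, two callers that share a role and a User-Agent but offer disjoint tool sets can score at most 0.7, so a shared IAM role alone never merges two agents.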

3. Seeing Both Kinds of Agent

This level of forensic reconstruction is exactly what Capsule does across every major integration, well beyond just Bedrock.

We grab the declared agents the easy way by querying the platform's official lists. Then we hunt down the inline agents the hard way. We fuzzy-match their actions, correlate their caller identities, and track their session paths until those scattered logs snap together into a single cohesive profile.

The output is a single pane of glass where a reconstructed inline agent sits directly next to a declared one. When a CISO asks for an agent inventory, the difference between "declared" and "inferred" is entirely irrelevant. They all need to be visible. They all need the exact same posture mapping and alerting.

This correlation does more than just find agents. It actively categorizes the risk. A steady stream of traffic from one user running Claude Code is just a personal productivity tool. A headless loop running on a cron job and hitting internal APIs is a completely different beast. That is a script someone promoted to production. They look identical in raw logs, but their security profiles are worlds apart.

Best of all, you do not have to change the agent to get this visibility. No SDKs to import, no sidecars to manage, and zero code changes. Capsule runs entirely on existing platform telemetry. That means it works perfectly on the massive agent you deployed last year and on the quick prototype someone spun up this morning.

Declared or inline, personal or production. If it operates like an agent in your environment, it needs to be in your inventory. That is the bottom line.

Want to see what we'd reconstruct in your environment? Get a demo today.
