Your AI Agent Inventory is Lying to You: The Rise of the "Inline Agent"

Guy Bidkar

July 1, 2026

Picture the security review going well. Your team has a clean inventory of AI agents: each one is properly declared in the AWS Bedrock console, Azure AI Foundry, or Google Cloud Vertex AI. Each agent has a unique ID, attached guardrails, and reviewed action groups.

Then someone pulls the model-invocation logs and notices a steady drumbeat of calls to a Claude model, thousands a day, from an IAM role nobody can quite place. It has the same system prompt every time. It executes a small, repeating set of tool calls. It runs in a loop.

It plans, it calls tools, it acts on the results. Make no mistake: it's an agent. It just never went through the front door. It has no ID, no entry in any management console, and no row in your meticulously crafted inventory.

What are Inline Agents?

That phantom entity is what we call an Inline Agent. And once you know what to look for, you realize they are everywhere.

An inline agent could be a LangChain script running in a cron job. It could be a developer running Claude Code against your Bedrock endpoint. It might be a "quick prototype" that quietly morphed into an unsanctioned production workflow.

Inline agents are the shadow IT of the agentic era. Except this time, nobody needed to sneak out and buy an unsanctioned SaaS account with a corporate credit card. They just needed permission to call an inference API, such as bedrock:InvokeModel.

1. Two doors to the same model

Every major agent platform gives you two ways to reach a model, and the gap between them is the whole story.

The First Door: The Declared Agent

You define it once: select a model, draft the instructions, attach your tools, bolt on guardrails, and give it a name. In return, the platform assigns it a durable identity and stores its configuration until you change or delete it.

On AWS Bedrock, that identity relies on a specific architecture:

The Core Identity: An Agent ID and a primary ARN (e.g., arn:aws:bedrock:...:agent/ABCDE12345), plus an alias ARN for the exact version you are invoking.
The Attachments: Action groups (usually backed by Lambda), knowledge bases, and guardrails all hang directly off that core resource.
The Execution: You trigger it using InvokeAgent by passing the agent ID, the alias, and a custom session ID. Bedrock takes care of the rest, maintaining the conversation state entirely on the server.

[Figure: How Bedrock Agents works]

The shape repeats across vendors; only the nouns change:

Azure AI Foundry - agents are first-class, with an Entra Agent ID for identity and SDK objects like asst_… and thread_… for the agent and its conversation.
Vertex AI Agent Engine - a ReasoningEngine resource named projects/.../reasoningEngines/<id>, with managed sessions and memory.

Regardless of the vendor's terminology, the core concept remains the same: a durable, server-side resource with a distinct identity that you can instantly inventory by querying the platform.
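To make "inventory by querying the platform" concrete, here is a minimal sketch for the Bedrock case. It assumes a boto3 `bedrock-agent` client and its `list_agents` call; the helper name `iter_declared_agents` is ours, not part of any SDK.

```python
from typing import Any, Iterator


def iter_declared_agents(client: Any) -> Iterator[dict]:
    """Yield one summary dict per declared Bedrock agent, following pagination.

    `client` is assumed to be boto3.client("bedrock-agent"); any object with a
    compatible list_agents(**kwargs) method works.
    """
    kwargs: dict = {"maxResults": 100}
    while True:
        page = client.list_agents(**kwargs)
        yield from page.get("agentSummaries", [])
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   for agent in iter_declared_agents(boto3.client("bedrock-agent")):
#       print(agent["agentId"], agent["agentName"])
```

Everything this loop returns is a declared agent by definition. Nothing that only ever calls the runtime endpoint will show up in it.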

The Second Door: Raw Inference

Here you hit the exact same model, but without the management wrapper. On Bedrock, this means calling InvokeModel or Converse via the runtime endpoint. You send your messages, a system prompt, an inference configuration, and perhaps a toolConfig, and you receive a completion in return. Crucially, no agent resource is created, no persistent ID is assigned, and no server-side session is maintained. (Interestingly, AWS explicitly acknowledges this pattern with its InvokeInlineAgent API, which lets you assemble an agent dynamically at call time, passing the model, instructions, and guardrails directly in the request body without persisting a single thing.)

Here is what your inventory misses: functionally, the code operating behind that second door is absolutely an agent. It relies on a system prompt to define its behavior. It utilizes tools, whether declared formally or executed through a hand-rolled loop. It plans, maintains conversational context, and takes action. The only difference between this shadow entity and the declared agent sitting next to it is that nobody formally introduced it to the platform.
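A hand-rolled loop of this kind can be surprisingly small. The sketch below is illustrative only: it assumes a boto3 `bedrock-runtime` client and the Converse request and response shapes, and the system prompt, the `lookup_vendor` handler, and the function name `run_inline_agent` are hypothetical.

```python
from typing import Any, Callable, Dict, List

# Hypothetical configuration for illustration.
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"
SYSTEM = [{"text": "You validate vendors. Call lookup_vendor before answering."}]
TOOL_CONFIG = {
    "tools": [{
        "toolSpec": {
            "name": "lookup_vendor",
            "description": "Look up a vendor record by name.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
            }},
        }
    }]
}


def run_inline_agent(client: Any, user_text: str,
                     handlers: Dict[str, Callable[[dict], dict]],
                     max_turns: int = 5) -> str:
    """Drive a plan/act loop over Converse; all state lives in `messages`."""
    messages: List[dict] = [{"role": "user", "content": [{"text": user_text}]}]
    for _ in range(max_turns):
        # The full history is resent on every turn: no server-side session exists.
        response = client.converse(modelId=MODEL_ID, system=SYSTEM,
                                   messages=messages, toolConfig=TOOL_CONFIG)
        reply = response["output"]["message"]
        messages.append(reply)
        if response.get("stopReason") != "tool_use":
            return "".join(block.get("text", "") for block in reply["content"])
        results = []
        for block in reply["content"]:
            if "toolUse" in block:
                call = block["toolUse"]
                output = handlers[call["name"]](call["input"])
                results.append({"toolResult": {
                    "toolUseId": call["toolUseId"],
                    "content": [{"json": output}],
                }})
        messages.append({"role": "user", "content": results})
    raise RuntimeError("agent did not finish within max_turns")
```

Roughly forty lines, one IAM permission, and no platform resource of any kind: that is all it takes to stand up something that plans, calls tools, and acts on the results.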

Ultimately, if your inventory relies strictly on what the platform willingly admits to knowing, you have a massive blind spot, one exactly the size of every agent built outside a console. To fix this, we have to reconstruct these phantom agents, piecing together their identities, tools, and sessions from the only evidence they leave behind: raw logs.

Which brings us to the hard part.

2. Tracing and Composing an Inline Agent

That reconstruction has to recover each phantom agent's identity, tool belt, and sessions from raw logs alone. And that is exactly where things get messy.

Take AWS Bedrock telemetry, for example. If you start your hunt in CloudTrail, a raw InvokeModel call just looks like a generic management event. Sure, you get the caller, the source IP, the user agent, and the model ID. But it is a complete black box:

{
  "eventSource": "bedrock.amazonaws.com",
  "eventName": "InvokeModel",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.42",
  "userAgent": "Boto3/1.34.162 md/Botocore#1.34.162 lang/python#3.12.0",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::123456789012:assumed-role/data-pipeline-role/i-0abc123",
    "sessionContext": { ... }
  },
  "requestParameters": {
    "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0"
  },
  "responseElements": null
}

Trimmed for readability. Note what isn't here.

requestParameters carries only a modelId. It tells you absolutely nothing about the system instructions, the prompts, or what the agent was actually trying to do. CloudTrail will happily tell you that a principal hit a model four thousand times yesterday; it will not tell you what the agent is or what it was doing.
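Even so, CloudTrail is enough for a first pass at "who is calling what, and how often". A minimal sketch, assuming the events have already been exported as dicts shaped like the sample above; the helper names are ours, and the set of event names is an assumption based on the Bedrock runtime API names.

```python
from collections import Counter
from typing import Iterable

# Runtime inference calls we care about (assumed to match the API names).
INFERENCE_EVENTS = {"InvokeModel", "InvokeModelWithResponseStream",
                    "Converse", "ConverseStream"}


def role_of(arn: str) -> str:
    """Reduce a caller ARN to 'account:role/name', dropping the STS session name."""
    parts = arn.split(":", 5)
    account, resource = parts[4], parts[5]
    pieces = resource.split("/")
    if pieces[0] == "assumed-role" and len(pieces) >= 2:
        return f"{account}:role/{pieces[1]}"
    return f"{account}:{resource}"


def count_invocations(events: Iterable[dict]) -> Counter:
    """Count inference calls per (caller role, model ID)."""
    counts: Counter = Counter()
    for event in events:
        if event.get("eventSource") != "bedrock.amazonaws.com":
            continue
        if event.get("eventName") not in INFERENCE_EVENTS:
            continue
        arn = (event.get("userIdentity") or {}).get("arn")
        model = (event.get("requestParameters") or {}).get("modelId")
        if arn and model:
            counts[(role_of(arn), model)] += 1
    return counts
```

A role that shows up here with thousands of calls a day and no matching declared agent is a lead, nothing more. The counts say where to look, not what is running.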

To pry that box open, you have to enable model invocation logging (a separate setting, off by default). Once you do, you finally get the request and response bodies:

{
  "schemaType": "ModelInvocationLog",
  "operation": "Converse",
  "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
  "identity": {
    "arn": "arn:aws:sts::123456789012:assumed-role/data-pipeline-role/i-0abc123"
  },
  "input": {
    "inputBodyJson": {
      "system": [
        { "text": "You are InvoiceBot. Extract line items from the attached invoice and call the `lookup_vendor` tool to validate each vendor before returning JSON." }
      ],
      "messages": [ { "role": "user", "content": [ { "text": "…" } ] } ],
      "toolConfig": {
        "tools": [ { "toolSpec": { "name": "lookup_vendor" } } ]
      }
    }
  },
  "output": { ... }
}

This is much better. You can read the system block to see the agent's core directives, and parse the toolConfig to see exactly which levers it is allowed to pull.
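Pulling those two fields out of a record is straightforward. The sketch below assumes the Converse-shaped `inputBodyJson` shown in the sample; records produced by other operations may carry a model-specific body and would need their own handling. The function name `extract_profile` is hypothetical.

```python
def extract_profile(record: dict) -> dict:
    """Summarize one ModelInvocationLog record: who called, with what prompt and tools."""
    body = (record.get("input") or {}).get("inputBodyJson") or {}
    system_blocks = body.get("system") or []
    if not isinstance(system_blocks, list):
        system_blocks = []  # non-Converse body shape: out of scope for this sketch
    system_prompt = "\n".join(
        block["text"] for block in system_blocks
        if isinstance(block, dict) and "text" in block
    )
    tools = (body.get("toolConfig") or {}).get("tools") or []
    tool_names = sorted(t["toolSpec"]["name"] for t in tools if "toolSpec" in t)
    return {
        "principal": (record.get("identity") or {}).get("arn"),
        "model": record.get("modelId"),
        "operation": record.get("operation"),
        "system_prompt": system_prompt,
        "tools": tool_names,
    }
```

Run over a day of logs, this turns each opaque invocation into a small row of evidence: principal, model, prompt, and tool names.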

But there is a catch. Look at the identity story: identity.arn is a generic STS assumed-role ARN - a role plus a session name. There is no neat little agent ID, and usually no application session ID either. If this inline agent is being driven by a framework like LangChain, the whole concept of a "session" is a mirage. It is literally just the entire message history being resubmitted to the API at every turn.
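That resubmission habit is itself a usable signal: if each turn resends the whole history, then one invocation's message list is a prefix of the next one's. Here is a rough sketch of stitching sessions on that basis. It is our own heuristic, not a documented technique, and it assumes invocations arrive in chronological order and that the client does not truncate or summarize history.

```python
import json
from typing import List


def _key(message: dict) -> str:
    """Stable string form of one message, used for prefix comparison."""
    return json.dumps(message, sort_keys=True)


def stitch_sessions(invocations: List[List[dict]]) -> List[List[int]]:
    """Group invocation indexes into sessions by message-history prefix.

    An invocation joins the session whose latest history is the longest
    strict prefix of its own messages; otherwise it starts a new session.
    """
    sessions: List[dict] = []  # each: {"last": [message keys], "members": [indexes]}
    for index, messages in enumerate(invocations):
        keys = [_key(m) for m in messages]
        best = None
        for session in sessions:
            last = session["last"]
            if len(last) < len(keys) and keys[:len(last)] == last:
                if best is None or len(last) > len(best["last"]):
                    best = session
        if best is None:
            sessions.append({"last": keys, "members": [index]})
        else:
            best["last"] = keys
            best["members"].append(index)
    return [session["members"] for session in sessions]
```

The obvious weakness: two unrelated conversations that open with the same first message can be merged, and a retried request starts a new session. Treat the output as a hint to combine with the other signals, not as ground truth.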

You are staring at a massive mountain of disjointed log records. How do you actually figure out which ones belong to the exact same phantom agent? You have to triangulate the data using a combination of behavioral signals:

Group by Caller Principal: This is a solid starting point. Unfortunately, it falls apart the second multiple agents share a single IAM role, or when role-session names are generic or auto-generated and carry no hint of the application behind them.
Fingerprint the System Prompt: A stable system prompt is essentially an agent's DNA written out in plain text. If two separate invocations carry nearly identical core instructions, you are almost certainly looking at the same logical entity.
Watch Tool Patterns Over Time: Agents are creatures of habit. They tend to reach for the same small set of tools in highly predictable sequences. Clustering these requests by toolConfig is a brilliant way to untangle multiple agents hiding behind the exact same model and IAM role.
Lean on IP and User Agent: Never underestimate the basics. Both are recorded in CloudTrail, and the User-Agent string is a cheap and effective way to distinguish automated SDK traffic from generic HTTP clients, giving you quick clues about what kind of client is behind the calls.
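Two of these signals, the prompt fingerprint and the tool pattern, reduce to small pure functions. The sketch below is one possible approach, not a prescribed one: the normalization rules (lowercasing, collapsing digits and whitespace) are assumptions chosen so that prompts differing only in a date or counter hash the same, and all function names are hypothetical.

```python
import hashlib
import re
from difflib import SequenceMatcher
from typing import Tuple


def normalize_prompt(text: str) -> str:
    """Lowercase, collapse digit runs (dates, counters, IDs) and whitespace."""
    text = re.sub(r"\d+", "0", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def prompt_fingerprint(text: str) -> str:
    """Short stable hash of the normalized system prompt (exact-match grouping)."""
    return hashlib.sha256(normalize_prompt(text).encode("utf-8")).hexdigest()[:16]


def prompt_similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1] for prompts that were edited slightly."""
    return SequenceMatcher(None, normalize_prompt(a), normalize_prompt(b)).ratio()


def tool_signature(tool_config: dict) -> Tuple[str, ...]:
    """Order-independent tuple of tool names from a Converse-style toolConfig."""
    names = {t["toolSpec"]["name"] for t in tool_config.get("tools", []) if "toolSpec" in t}
    return tuple(sorted(names))
```

The hash gives cheap exact grouping, and the similarity score catches the prompt that gained one sentence last Tuesday. Which threshold counts as "nearly identical" is a tuning decision for your own data.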

In isolation, none of these signals is enough to build a solid case. The real magic happens when you weave them all together. You let a weak IAM principal signal, a highly specific prompt fingerprint, and a consistent tool pattern reinforce one another until the agent's true footprint finally comes into focus.
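As a toy illustration of signals reinforcing one another, the sketch below scores pairs of invocations and clusters them greedily. The weights, the threshold, and the `Invocation` record are all hypothetical values picked for the example; a real system would tune them and would likely use something sturdier than first-match clustering.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical weights: the prompt is the strongest signal, the role the weakest.
WEIGHTS = {"role": 0.2, "prompt": 0.5, "tools": 0.3}
THRESHOLD = 0.6


@dataclass(frozen=True)
class Invocation:
    role: str               # caller role with the session name stripped
    prompt_fp: str          # fingerprint of the system prompt
    tools: FrozenSet[str]   # tool names from toolConfig


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap of two tool sets; two empty sets count as no evidence (0.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_agent_score(a: Invocation, b: Invocation) -> float:
    """Weighted agreement between two invocations, in [0, 1]."""
    return (WEIGHTS["role"] * (a.role == b.role)
            + WEIGHTS["prompt"] * (a.prompt_fp == b.prompt_fp)
            + WEIGHTS["tools"] * jaccard(a.tools, b.tools))


def cluster(invocations: List[Invocation]) -> List[List[Invocation]]:
    """Greedy clustering: join the first cluster whose seed scores above threshold."""
    clusters: List[List[Invocation]] = []
    for inv in invocations:
        for members in clusters:
            if same_agent_score(inv, members[0]) >= THRESHOLD:
                members.append(inv)
                break
        else:
            clusters.append([inv])
    return clusters
```

With these weights, a shared role alone (0.2) never merges two invocations, while a matching prompt plus overlapping tools (0.8) does even across different roles. That mirrors the point above: the weak signal only counts when stronger ones back it up.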


3. Seeing Both Kinds of Agent

This level of forensic reconstruction is exactly what Capsule does across every major integration, well beyond just Bedrock.

We grab the declared agents the easy way by querying the platform's official lists. Then we hunt down the inline agents the hard way. We fuzzy-match their actions, correlate their caller identities, and track their session paths until those scattered logs snap together into a single cohesive profile.

The output is a single pane of glass where a reconstructed inline agent sits directly next to a declared one. When a CISO asks for an agent inventory, the difference between "declared" and "inferred" is entirely irrelevant. They all need to be visible. They all need the exact same posture mapping and alerting.

This correlation does more than just find agents. It actively categorizes the risk. A steady stream of traffic from one user running Claude Code is just a personal productivity tool. A headless loop running on a cron job and hitting internal APIs is a completely different beast. That is a script someone promoted to production. In any single log record the two look nearly identical, but their security profiles are worlds apart.
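One signal that could help tell those two cases apart is timing: a scheduled job tends to fire at near-constant intervals, while a person at a keyboard is bursty. The sketch below is our own illustration of that idea, not a description of how Capsule classifies traffic, and the 0.1 cutoff is an arbitrary assumption.

```python
from statistics import mean, pstdev
from typing import List


def timing_regularity(timestamps: List[float]) -> float:
    """Coefficient of variation of the gaps between invocations (lower = more regular)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        raise ValueError("need at least three timestamps that are not all identical")
    return pstdev(gaps) / mean(gaps)


def looks_scheduled(timestamps: List[float], max_cv: float = 0.1) -> bool:
    """Heuristic: near-constant spacing suggests a cron-style headless loop."""
    return timing_regularity(timestamps) <= max_cv
```

In practice you would apply this to session start times per reconstructed agent rather than to every call, since a single agent run produces its own burst of tool-calling turns.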

Best of all, you do not have to change the agent to get this visibility. No SDKs to import, no sidecars to manage, and zero code changes. Capsule runs entirely on existing platform telemetry. That means it works the same way on the massive agent you deployed last year and on the quick prototype someone spun up this morning.

Declared or inline, personal or production. If it operates like an agent in your environment, it needs to be in your inventory. That is the bottom line.

Want to see what we'd reconstruct in your environment? Get a demo today.

