Amazon Bedrock AgentCore: Deploy and Operate AI Agents at Scale

The complete guide to Amazon Bedrock AgentCore -- the managed platform for building, deploying, and operating AI agents securely at any scale. From the Runtime, Memory, Gateway, Identity, and Observability primitives to building agents with the Strands SDK and LangGraph, MCP Gateway wiring, AgentCore Evaluations, Agent Registry governance, Spring AI for the JVM, A2A protocol support, cost optimization, cloud platform comparison, and enterprise security with IAM and VPC isolation.

1. AgentCore Primitives

Amazon Bedrock AgentCore is an agentic platform that reached general availability on October 13, 2025. It provides managed infrastructure for building, deploying, and operating AI agents securely at scale using any framework and any foundation model. The platform is organized around five core primitives that can be used independently or composed together.

CORE

Runtime

A secure, serverless, purpose-built hosting environment for deploying and running AI agents or tools. Transforms any local agent into a cloud-native deployment with a few lines of code. Auto-scales from zero to thousands of concurrent sessions with per-second billing. Supports both MCP and A2A protocol servers, per-user session isolation, and runs containers on port 9000 by default.

CORE

Memory

Fully managed, serverless memory primitive for session and long-term storage. Agents access Memory to persist conversation context, store investigation knowledge for future incidents, and personalize user experiences across sessions. Supports both short-term session memory and long-term semantic memory with automatic extraction and retrieval.
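
The store-and-recall flow above can be sketched as follows. The event shape is an illustration, and the commented-out MemoryClient calls are assumptions (they would require AWS credentials and a created memory store), not the verified SDK surface:

```python
# memory_sketch.py -- hedged sketch of the Memory flow; field names are illustrative.
def build_event(actor_id, session_id, user_text, assistant_text):
    """Package one conversational turn for short-term Memory storage."""
    return {
        "actorId": actor_id,
        "sessionId": session_id,
        "messages": [
            {"role": "USER", "text": user_text},
            {"role": "ASSISTANT", "text": assistant_text},
        ],
    }

event = build_event("user-42", "sess-001", "Reset my password", "Done -- check your inbox.")

# Hypothetical persistence and recall (names assumed, not verified):
# from bedrock_agentcore.memory import MemoryClient
# client = MemoryClient(region_name="us-east-1")
# client.create_event(memory_id="mem-abc", **event)   # short-term: store the turn
# client.retrieve_memories(memory_id="mem-abc",       # long-term: semantic recall
#                          namespace="/users/user-42",
#                          query="password issues")
print(len(event["messages"]))
```

Short-term storage captures raw turns per session; long-term recall later retrieves extracted memories by semantic query across sessions.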

CORE

Gateway

Provides secure access to infrastructure APIs and external tools through MCP tool integration. Groups multiple task-specific MCP servers behind a single, manageable interface. Handles authentication, authorization, and request signing (SigV4) automatically. Supports OAuth 2.0 Authorization Code flow through AgentCore Identity for third-party service access.

CORE

Identity

Centralized capability for managing agent identities and securing credentials. Agents authenticate to AWS services and third-party tools either on behalf of users or as themselves with pre-authorized consent. Supports SigV4, standardized OAuth 2.0, and API keys. Credentials encrypted using customer-managed AWS KMS keys.

CORE

Observability

Captures detailed metrics and traces in CloudWatch for monitoring and debugging. Provides end-to-end visibility into agent execution, tool invocations, latency breakdowns, and error rates. Integrates with existing AWS monitoring workflows and dashboards for unified operational awareness.

2. Build a First Agent with Strands SDK / LangGraph

AgentCore Runtime works with any framework that produces an HTTP server or provides an A2A AgentExecutor, including Strands Agents, LangGraph, Google ADK, OpenAI Agents SDK, and custom solutions. The Strands Agents SDK -- an open-source, model-driven framework from AWS -- is the most tightly integrated option, requiring just a few lines of code to go from local prototype to cloud-deployed agent.

Strands Agents (Python)

# Install dependencies
pip install strands-agents strands-agents-tools bedrock-agentcore

# agent.py -- minimal Strands agent
from strands import Agent
from strands.models.bedrock import BedrockModel
from strands_tools import calculator, web_search

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    temperature=0.3,
    streaming=True
)

agent = Agent(
    model=model,
    tools=[calculator, web_search],
    system_prompt="You are a helpful research assistant."
)

response = agent("What is the current population of Tokyo?")
print(response)

Deploy to AgentCore Runtime

# deploy.py -- wrap agent for AgentCore Runtime
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent
from strands.models.bedrock import BedrockModel

app = BedrockAgentCoreApp()

@app.entrypoint
def invoke(payload):
    """Handle an incoming agent request."""
    model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")
    agent = Agent(model=model, tools=[])
    response = agent(payload.get("prompt", ""))
    return {"output": str(response)}

if __name__ == "__main__":
    app.run()

LangGraph Alternative

LangGraph agents deploy identically. The AgentCore Runtime Python SDK provides a lightweight HTTP wrapper that exposes any agent function as an AgentCore-compatible endpoint. The Runtime passes JSON-RPC payloads directly to the container without modification, so any framework producing a compliant HTTP server works out of the box.
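
As a framework-free illustration of that contract, here is a stdlib-only sketch of an agent container exposing an invocation route and a health probe. The `/invocations` and `/ping` paths are assumptions for this sketch -- take the exact paths and port AgentCore expects from the official Runtime docs, and swap `run_agent` for a real LangGraph or Strands call:

```python
# server_contract.py -- illustrative custom-container sketch (paths are assumptions)
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(prompt: str) -> str:
    """Stand-in for any framework's agent call (Strands, LangGraph, ...)."""
    return f"echo: {prompt}"

class AgentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/ping":                         # liveness probe
            self._reply(200, {"status": "healthy"})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        if self.path == "/invocations":                  # one agent turn
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length) or b"{}")
            self._reply(200, {"output": run_agent(body.get("prompt", ""))})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, code, obj):
        data = json.dumps(obj).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 9000), AgentHandler).serve_forever()
```

Any container satisfying this kind of request/response contract deploys to Runtime regardless of the framework inside it.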

3. MCP Gateway Wiring

AgentCore Gateway transforms how agents connect to external tools. Instead of each agent managing its own MCP server connections, Gateway centralizes tool access behind a single managed interface. You define MCP servers as targets when creating a gateway, and the SynchronizeGatewayTargets API performs protocol handshakes and indexes available tools automatically.

Supported MCP protocol versions include 2025-03-26, 2025-06-18, and 2025-11-25 (minimum required for gateway creation). The Gateway signs requests to MCP servers using SigV4 with the gateway service role credentials. For third-party services requiring OAuth, Gateway integrates with AgentCore Identity to handle the Authorization Code flow.

# Wire an MCP server to AgentCore Gateway (AWS CLI)
aws bedrock-agentcore-control create-gateway \
  --name "tools-gateway" \
  --protocol-type MCP \
  --role-arn "arn:aws:iam::123456789012:role/GatewayServiceRole" \
  --authorizer-type AWS_IAM

# Attach the MCP server as a gateway target
aws bedrock-agentcore-control create-gateway-target \
  --gateway-identifier "gw-abc123" \
  --name "github-tools" \
  --target-configuration '{
    "mcp": {
      "mcpServer": {
        "endpoint": "https://mcp.github.internal/sse"
      }
    }
  }'

# Synchronize targets to discover available tools
aws bedrock-agentcore-control synchronize-gateway-targets \
  --gateway-identifier "gw-abc123"

As of February 2026, Amazon Bedrock supports server-side tool execution through AgentCore Gateway integrated with the Responses API. This enables models to call Gateway tools directly without client-side orchestration -- the model decides which tool to call, Gateway executes it, and results flow back automatically. This is ideal for reducing latency and simplifying agent architectures where the client does not need to inspect or modify tool calls.
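
To see what server-side execution absorbs, here is a sketch of the classic client-orchestrated tool loop it replaces. The names (`model_call`, `execute_tool`) are generic stand-ins, not Bedrock API calls:

```python
# Client-side tool orchestration: the relay loop that moves server-side
# when Gateway executes tools on the model's behalf.
def client_side_loop(model_call, execute_tool, prompt):
    """Run the relay: model proposes a tool, client executes it, repeat."""
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = model_call(messages)
        if "tool_call" not in reply:                 # model produced a final answer
            return reply["content"]
        result = execute_tool(reply["tool_call"])    # client does the tool work
        messages.append({"role": "tool", "content": result})
```

With Gateway-integrated server-side execution, this loop runs inside the service: the client sends one request and receives the final answer.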

4. AgentCore Evaluations -- Production Quality Gates

AgentCore Evaluations became generally available on March 31, 2026. The service provides automated quality assessment for AI agents through two evaluation modes: online evaluation continuously monitors production traffic by sampling and scoring live traces, while on-demand evaluation enables programmatic testing that integrates into CI/CD pipelines and interactive development workflows.

Teams can evaluate agents using 13 built-in evaluators covering response quality, safety, task completion, and tool usage. Custom evaluators support two approaches: LLM-based evaluation using configurable prompts and model selection, or code-based evaluation through Python or JavaScript functions hosted on Lambda. This flexibility allows teams to enforce domain-specific quality standards beyond generic metrics.

EVAL

Online Evaluation

Continuously monitors agent performance in production by sampling and scoring live traces. Detects quality degradation in real time without requiring dedicated test suites. Configure sampling rates and scoring thresholds to balance cost and coverage.

EVAL

On-Demand Evaluation

Programmatic evaluation for CI/CD integration and regression testing. Run evaluations against test datasets before deploying agent changes to production. Supports batch execution with configurable parallelism for fast feedback loops.

EVAL

Custom Evaluators

Build evaluators using LLM-based prompts (choose any Bedrock model) or code-based logic (Python/JavaScript on Lambda). Enforce domain-specific quality standards, compliance rules, and business-logic validation beyond built-in metrics.
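
As a sketch of the code-based option, here is a Lambda-style handler that scores a trace with plain Python. The event field (`output`) and the return shape (`score`) are illustrative assumptions, not the documented evaluator contract:

```python
# custom_evaluator.py -- hypothetical code-based evaluator; schema assumed.
def handler(event, context=None):
    """Score 1.0 if the agent's answer is non-empty and free of refusal
    boilerplate -- a stand-in for real domain-specific validation."""
    answer = str(event.get("output", "")).strip()
    refusal_markers = ("I cannot help", "I'm unable to")
    passed = bool(answer) and not any(m in answer for m in refusal_markers)
    return {"score": 1.0 if passed else 0.0}

print(handler({"output": "The invoice total is $412.50."}))  # {'score': 1.0}
```

The same shape works for compliance rules: replace the refusal check with whatever business-logic validation your domain requires.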

AgentCore Evaluations is available in nine AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland).

5. Agent Registry Governance

AWS Agent Registry entered public preview in April 2026. It is a private, governed catalog and discovery layer for agents, tools, skills, MCP servers, and custom resources within an organization. The registry solves the duplication problem in enterprises running hundreds of agents: teams can discover existing capabilities instead of rebuilding them.

Discovery uses hybrid search combining keyword matching and semantic understanding. Short queries use keyword matching, while longer natural-language queries also surface conceptually related results. Resources can be registered manually through the console or API, or via URL-based discovery that automatically retrieves metadata (tool schemas, capability descriptions) from a live MCP server or agent endpoint.
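
One standard way to fuse a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF). Whether Agent Registry uses RRF specifically is not stated, so treat this as a generic illustration of hybrid-search score fusion:

```python
# rrf_sketch.py -- generic hybrid-search fusion, not the registry's actual algorithm.
def rrf(rankings, k=60):
    """Fuse several ranked result lists into one, reciprocal-rank-fusion style."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword  = ["billing-agent", "invoice-tool", "ledger-mcp"]      # lexical hits
semantic = ["invoice-tool", "expense-agent", "billing-agent"]   # vector hits
print(rrf([keyword, semantic])[0])  # invoice-tool
```

A result that ranks well on both lists ("invoice-tool") beats one that tops only a single list, which is exactly the behavior hybrid search is after.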

The registry includes approval workflows for publishing new agents, CloudTrail audit trails for compliance, and access control through IAM policies. It can be accessed via the AgentCore Console UI, AWS CLI/SDK, or as an MCP server that developers query directly from their IDEs. Preview availability covers five regions: US West (Oregon), Asia Pacific (Tokyo, Sydney), Europe (Ireland), and US East (N. Virginia).

6. Spring AI AgentCore SDK for JVM

The Spring AI AgentCore SDK brings Amazon Bedrock AgentCore capabilities into the Java/JVM ecosystem through familiar Spring patterns: annotations, auto-configuration, and composable advisors. A single @AgentCoreInvocation annotation transforms any Spring bean method into an AgentCore-compatible endpoint with automatic serialization, streaming detection, and response formatting.

The SDK supports AgentCore Runtime for fully managed deployment, but individual modules (Memory, Browser, Code Interpreter) work standalone in applications running on Amazon EKS, ECS, or any infrastructure. MCP annotations are first-class in Spring AI 2.0, with Spring-specific MCP transport implementations built into the core project.

// Spring AI AgentCore -- annotation-driven agent endpoint
@Service
public class ResearchAgent {

    private final ChatClient chatClient;
    private final Advisor memoryAdvisor;
    private final Advisor toolAdvisor;

    public ResearchAgent(ChatClient.Builder builder, Advisor memoryAdvisor, Advisor toolAdvisor) {
        this.chatClient = builder.build();
        this.memoryAdvisor = memoryAdvisor;
        this.toolAdvisor = toolAdvisor;
    }

    @AgentCoreInvocation
    public Flux<String> research(AgentRequest request) {
        // Spring AI advisor chain: Memory -> Tools -> Model
        return chatClient.prompt()
            .advisors(memoryAdvisor, toolAdvisor)
            .user(request.getInput())
            .stream()
            .content();
    }
}

Spring Boot 4.0 (November 2025) combined with Spring AI 2.0 (Milestone 3, March 2026) provides a production-grade, Spring-native path to building AI applications for JVM shops. This eliminates the Python detour that many Java-focused enterprises previously required for agent development.

7. A2A Protocol Support via AgentCore Runtime

AgentCore Runtime added first-class Agent-to-Agent (A2A) protocol support in October 2025. With this addition, agents built on different frameworks -- Strands, LangGraph, OpenAI Agents SDK, Google ADK, Claude Agents SDK -- can discover peers, share capabilities, and coordinate actions across platforms using standardized communication.

When configured for A2A, AgentCore expects containers to run stateless, streamable HTTP servers on port 9000 at the root path. The service passes JSON-RPC payloads from the InvokeAgentRuntime API directly to the A2A container without modification, maintaining protocol transparency. Session isolation is automatic via the X-Amzn-Bedrock-AgentCore-Runtime-Session-Id header.
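
For illustration, the kind of payload passed through looks like the A2A protocol's `message/send` request. The field names below follow the public A2A spec, not an AWS-specific schema:

```python
# a2a_payload.py -- minimal A2A message/send JSON-RPC request (per the A2A spec).
import uuid

def a2a_send_payload(text):
    """Build a minimal A2A message/send request for one user turn."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "messageId": str(uuid.uuid4()),
            }
        },
    }

payload = a2a_send_payload("Summarize open incidents")
print(payload["method"])  # message/send
```

Because the Runtime forwards this body unmodified, any A2A-compliant server behind port 9000 can parse it with its framework's standard tooling.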

Authentication for A2A calls supports OAuth 2.0 and IAM (SigV4), enabling secure, identity-aware communication between agents. A2A servers can run inside a VPC and use PrivateLink for private connectivity, ensuring that inter-agent traffic never traverses the public internet. Combined with MCP support, AgentCore Runtime enables agents to both expose tools (MCP) and collaborate with peers (A2A) from a single deployment.

8. Cost and Scaling Patterns

AgentCore uses modular, consumption-based pricing with no upfront commitments or minimum fees. Each primitive is billed independently, and you pay only for what you use. Billing is calculated per second, using actual CPU consumption and peak memory consumed up to that second, with a 1-second minimum.

Primitive   | Billing Unit               | Rate
Runtime     | vCPU-hour + GB-hour        | $0.0895/vCPU-hr, $0.00945/GB-hr
Gateway     | Authorization requests     | Per request during agent execution
Identity    | Auth requests              | Per authentication event
Memory      | Records stored + retrieved | Per record operation
Evaluations | Evaluation tokens          | Per token processed
Policy      | Policy evaluations         | Per policy check

Runtime's consumption-based billing captures the "I/O wait is free" benefit -- while an agent waits for an LLM response or external API call, CPU consumption drops to near zero and billing reflects that. This delivers substantial cost efficiency compared to traditional always-on compute options. For bursty workloads, auto-scaling from zero eliminates idle costs entirely.
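
A back-of-envelope model makes the effect concrete. The rates come from the table above; the exact metering formula is AWS's, so treat this as an approximation:

```python
# runtime_cost.py -- rough per-session cost model for AgentCore Runtime billing.
def runtime_cost(cpu_seconds, peak_gb, duration_seconds,
                 vcpu_hr_rate=0.0895, gb_hr_rate=0.00945):
    """Approximate cost: CPU billed on actual consumption, memory billed
    on peak GB over the session's wall-clock duration."""
    cpu_cost = (cpu_seconds / 3600) * vcpu_hr_rate
    mem_cost = (peak_gb * duration_seconds / 3600) * gb_hr_rate
    return cpu_cost + mem_cost

# A 100-second session with only 3 s of real CPU (the rest is LLM I/O wait), 1 GB peak:
print(f"${runtime_cost(3, 1, 100):.6f}")  # $0.000337 -- fractions of a cent
```

Had the same session been billed for 100 s of CPU (always-on style), the CPU component alone would be over 30x larger, which is the "I/O wait is free" advantage in numbers.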

For cost optimization: use smaller model IDs for simple routing agents (Haiku for orchestration, Sonnet for reasoning), batch tool calls where possible, and leverage Memory to avoid redundant LLM calls for previously resolved queries. Monitor costs through CloudWatch billing metrics and set budget alarms to catch unexpected spikes early.

9. Bedrock AgentCore vs Azure AI Foundry vs Vertex AI

The three major cloud providers each offer managed agent platforms with different strengths. The selection primarily depends on which cloud and business systems you already use -- switching costs are highest at the integration layer, not the agent framework layer.

Capability     | AWS Bedrock AgentCore                                | Azure AI Foundry                       | Google Vertex AI
GA Date        | Oct 2025                                             | Dec 2025                               | Agent Engine 2025
Agent Runtime  | Serverless, per-second billing, auto-scale from zero | Azure Container Apps based             | Managed infrastructure with session support
MCP Support    | Gateway with centralized MCP wiring                  | Via Semantic Kernel plugins            | Native MCP tool integration
A2A Protocol   | First-class Runtime support                          | Via Microsoft Agent Framework          | Originated A2A; native support
Evaluations    | GA Mar 2026 -- online + on-demand                    | Azure AI Evaluation SDK                | Vertex AI Evaluation
Security Model | IAM + VPC + PrivateLink + KMS                        | Azure AD + VNet + Key Vault            | IAM + VPC-SC + CMEK
Ecosystem Fit  | AWS-native orgs, multi-framework flexibility         | Microsoft 365, Teams, SharePoint shops | GCP-first teams, Gemini-centric

All three platforms support both MCP and A2A protocols, enabling interoperability across clouds. AgentCore's key differentiator is its framework-agnostic Runtime -- any agent framework that produces an HTTP server deploys without modification. Azure AI Foundry's advantage is deep Microsoft 365 integration with 1,400+ action connectors. Vertex AI excels when teams are already invested in GCP and want tight Gemini model integration.

10. Security: IAM Roles, VPC Isolation, Encryption

AgentCore deploys with enterprise-grade security by default. Every user session runs in its own protected environment with complete isolation to prevent data leakage. The platform integrates with the full AWS security stack: IAM for access control, VPC for network isolation, KMS for encryption, and CloudTrail for audit logging.

IAM and Access Control

AgentCore supports IAM condition keys including bedrock-agentcore:subnets and bedrock-agentcore:securityGroups for fine-grained network policy enforcement. Control Plane API operations (CreateAgentRuntime, UpdateAgentRuntime, CreateCodeInterpreter, CreateBrowser) support these condition keys. AgentCore Policy (GA March 2026) provides centralized, fine-grained controls for agent-tool interactions, with security teams able to define tool access and input validation rules using natural language.
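
A sketch of an identity-based policy using those condition keys to pin new runtimes to approved subnets. The subnet IDs are placeholders, and the multivalued-key operator semantics (`ForAllValues`) should be verified against the IAM documentation before use:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "RequireApprovedSubnets",
    "Effect": "Allow",
    "Action": "bedrock-agentcore:CreateAgentRuntime",
    "Resource": "*",
    "Condition": {
      "ForAllValues:StringEquals": {
        "bedrock-agentcore:subnets": ["subnet-0abc1234", "subnet-0def5678"]
      }
    }
  }]
}
```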

VPC Isolation and PrivateLink

By default, agent requests to AgentCore Gateway traverse the public internet. With interface VPC endpoints, organizations route communications through the AWS internal network backbone, delivering enhanced security, reduced latency, and improved compliance alignment. A2A servers run inside VPCs with PrivateLink for private inter-agent communication that never touches the public internet.

Encryption

Customer data and metadata stored in DynamoDB and S3 are encrypted with either a service key or a customer-managed KMS key. AgentCore Identity encrypts all stored credentials using customer-managed KMS keys. Data never leaves your AWS account and is never used to train models. All API calls are logged in CloudTrail for compliance auditing.

Key Features at a Glance

AGENT

Framework Agnostic

Deploy agents built with Strands, LangGraph, Google ADK, OpenAI Agents SDK, Claude Agent SDK, or custom frameworks. Any HTTP server on port 9000 works with Runtime.

TOOL

Centralized MCP Gateway

Group multiple MCP servers behind a single managed interface. Automatic tool discovery, protocol handshakes, and SigV4 request signing.

SCALE

Scale to Zero

Per-second billing with auto-scaling from zero. No idle costs. I/O wait periods incur near-zero CPU charges. Handles burst traffic without pre-provisioning.

GOV

Agent Registry

Private catalog with semantic search for discovering existing agents, tools, and MCP servers across the organization. Approval workflows and CloudTrail audit trails.

EVAL

Built-in Evaluations

13 built-in evaluators plus custom LLM-based and code-based evaluators. Online monitoring of production traffic and on-demand CI/CD integration.

SEC

Enterprise Security

Session isolation, IAM condition keys, VPC endpoints with PrivateLink, customer-managed KMS encryption, and Policy for natural-language tool access rules.

Production Outcomes

Zero idle cost

Runtime scales from zero and bills per second of actual CPU use. I/O wait periods (LLM inference, API calls) incur near-zero charges, making bursty agent workloads significantly cheaper than always-on compute.

Any framework

Deploy agents built with Strands, LangGraph, Claude Agent SDK, or custom Python/Java frameworks to the same Runtime. No vendor lock-in at the agent framework layer.

MCP + A2A unified

A single AgentCore deployment supports both MCP tool servers and A2A agent-to-agent communication, enabling agents that both use tools and collaborate with peers.

9 AWS Regions

AgentCore Evaluations available across nine regions. Runtime and Gateway support continues expanding, with VPC, PrivateLink, and CloudFormation available in all GA regions.

11. Managed Harness, AgentCore CLI, and Memory Enhancements

The Managed Harness (preview, April 2026) provides a pre-built runtime environment that removes the need to write HTTP server code entirely. Instead of wrapping your agent in an HTTP handler, you provide the agent function and Managed Harness handles HTTP serving, health checks, session management, and graceful shutdown automatically. This reduces the deployment surface from a Dockerfile + server code to a single agent function and a manifest file. Managed Harness currently supports Python and Java agents via the Strands and Spring AI SDKs.

The AgentCore CLI with CDK integration enables infrastructure-as-code workflows for AgentCore deployments. Define Runtime endpoints, Gateway configurations, Memory stores, and evaluation pipelines as CDK constructs, then deploy with cdk deploy. The CLI includes agentcore init for project scaffolding, agentcore deploy for direct Runtime deployment, and agentcore logs for real-time log streaming from running agents.

Filesystem Persistence (preview) allows agents to persist files across invocations within a session -- scratch pads, intermediate computation results, downloaded artifacts. Files are stored in session-scoped ephemeral storage that survives across turns but is cleaned up when the session ends.

Episodic Memory extends the existing Memory primitive with structured event storage: agents can record decision points, tool outcomes, and reasoning traces as episodes that are retrievable by semantic query. This enables agents to learn from past interactions and avoid repeating failed approaches.

The coding assistant integration lets agents spawn sandboxed code execution environments for writing, testing, and iterating on code as part of their workflow.
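
The episodic pattern can be illustrated with a toy in-memory store; keyword matching stands in for the real semantic retrieval, and nothing here is the actual Episodic Memory API:

```python
# episodic_sketch.py -- toy record/recall loop for the episodic pattern.
episodes = []

def record_episode(task, approach, outcome):
    """Store one decision point with its outcome."""
    episodes.append({"task": task, "approach": approach, "outcome": outcome})

def recall(query):
    """Naive keyword recall standing in for semantic retrieval."""
    words = set(query.lower().split())
    return [e for e in episodes if words & set(e["task"].lower().split())]

record_episode("restart payment service", "kill -9 then start", "failed: data corruption")
record_episode("restart payment service", "graceful systemctl restart", "succeeded")
print([e["outcome"] for e in recall("restart payment")])
```

Before retrying a task, the agent recalls matching episodes and can steer away from the approach that previously failed.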

Related Technologies