Amazon Bedrock AgentCore: Deploy and Operate AI Agents at Scale
The complete guide to Amazon Bedrock AgentCore -- the managed platform for building, deploying, and operating AI agents securely at any scale. From Runtime, Memory, Gateway, and Identity primitives to building agents with Strands SDK and LangGraph, MCP Gateway wiring, AgentCore Evaluations, Agent Registry governance, Spring AI for JVM, A2A protocol support, cost optimization, cloud platform comparison, and enterprise security with IAM and VPC isolation.
1. AgentCore Primitives
Amazon Bedrock AgentCore is an agentic platform that reached general availability on October 13, 2025. It provides managed infrastructure for building, deploying, and operating AI agents securely at scale using any framework and any foundation model. The platform is organized around five core primitives that can be used independently or composed together.
Runtime
A secure, serverless, purpose-built hosting environment for deploying and running AI agents or tools. Transforms any local agent code into a cloud-native deployment with a few lines of code. Auto-scales from zero to thousands of concurrent sessions with per-second billing. Supports MCP and A2A protocol servers with per-user session isolation; plain HTTP agent containers serve the Runtime contract on port 8080, while A2A servers use port 9000.
Memory
Fully managed, serverless memory primitive for session and long-term storage. Agents access Memory to persist conversation context, store investigation knowledge for future incidents, and personalize user experiences across sessions. Supports both short-term session memory and long-term semantic memory with automatic extraction and retrieval.
Gateway
Provides secure access to infrastructure APIs and external tools through MCP tool integration. Groups multiple task-specific MCP servers behind a single, manageable interface. Handles authentication, authorization, and request signing (SigV4) automatically. Supports OAuth 2.0 Authorization Code flow through AgentCore Identity for third-party service access.
Identity
Centralized capability for managing agent identities and securing credentials. Agents authenticate to AWS services and third-party tools either on behalf of users or as themselves with pre-authorized consent. Supports SigV4, standardized OAuth 2.0, and API keys. Credentials encrypted using customer-managed AWS KMS keys.
Observability
Captures detailed metrics and traces in CloudWatch for monitoring and debugging. Provides end-to-end visibility into agent execution, tool invocations, latency breakdowns, and error rates. Integrates with existing AWS monitoring workflows and dashboards for unified operational awareness.
2. Build a First Agent with Strands SDK / LangGraph
AgentCore Runtime works with any framework that produces an HTTP server or provides an A2A AgentExecutor, including Strands Agents, LangGraph, Google ADK, OpenAI Agents SDK, and custom solutions. The Strands Agents SDK -- an open-source, model-driven framework from AWS -- is the most tightly integrated option, requiring just a few lines of code to go from local prototype to cloud-deployed agent.
Strands Agents (Python)
# Install dependencies
pip install strands-agents strands-agents-tools bedrock-agentcore

# agent.py -- minimal Strands agent
from strands import Agent
from strands.models.bedrock import BedrockModel
from strands_tools import calculator, web_search

model = BedrockModel(
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",
    temperature=0.3,
    streaming=True,
)

agent = Agent(
    model=model,
    tools=[calculator, web_search],
    system_prompt="You are a helpful research assistant.",
)

response = agent("What is the current population of Tokyo?")
print(response)
Deploy to AgentCore Runtime
# deploy.py -- wrap the agent for AgentCore Runtime
from strands import Agent
from strands.models.bedrock import BedrockModel
from bedrock_agentcore.runtime import BedrockAgentCoreApp

app = BedrockAgentCoreApp()

@app.entrypoint
def agent_handler(payload):
    """Handle incoming agent requests."""
    model = BedrockModel(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")
    agent = Agent(model=model, tools=[])
    response = agent(payload.get("prompt", ""))
    return {"output": str(response)}

if __name__ == "__main__":
    app.run()  # serves the Runtime HTTP contract locally
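Once deployed, the agent is called through the InvokeAgentRuntime data-plane API. The sketch below builds the request arguments under the assumption that the handler reads a "prompt" field; the ARN is a placeholder, and the payload schema is whatever your own handler expects.

```python
import json
import uuid

def build_invoke_args(runtime_arn: str, prompt: str) -> dict:
    """Arguments for an InvokeAgentRuntime data-plane call."""
    return {
        "agentRuntimeArn": runtime_arn,
        "runtimeSessionId": str(uuid.uuid4()),  # drives per-session isolation
        "payload": json.dumps({"prompt": prompt}),
    }

args = build_invoke_args(
    "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/example",
    "What is the current population of Tokyo?",
)

# Live call (requires AWS credentials and a deployed runtime):
# import boto3
# client = boto3.client("bedrock-agentcore")
# response = client.invoke_agent_runtime(**args)
```

Reusing the same runtimeSessionId across calls keeps a user inside one isolated session; generating a fresh UUID starts a new one.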
LangGraph Alternative
LangGraph agents deploy identically. The AgentCore Runtime Python SDK provides a lightweight HTTP wrapper that exposes any agent function as an AgentCore-compatible endpoint. The Runtime forwards invocation payloads directly to the container without modification, so any framework producing a compliant HTTP server works out of the box.
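For frameworks without an SDK wrapper, you can implement the Runtime HTTP contract directly. This is a minimal standard-library sketch, assuming the commonly documented contract of POST /invocations for requests and GET /ping for health checks; the stub `run_agent` stands in for a LangGraph `graph.invoke()` call.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(prompt: str) -> str:
    """Stub agent -- swap in a LangGraph graph.invoke() call here."""
    return f"echo: {prompt}"

def handle_invocation(payload: dict) -> dict:
    """Map an invocation payload to a response document."""
    return {"output": run_agent(payload.get("prompt", ""))}

class AgentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Health-check endpoint probed by the Runtime
        if self.path == "/ping":
            self._send(200, {"status": "healthy"})
        else:
            self._send(404, {"error": "not found"})

    def do_POST(self):
        # Invocation endpoint: JSON request in, JSON response out
        if self.path == "/invocations":
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            self._send(200, handle_invocation(payload))
        else:
            self._send(404, {"error": "not found"})

    def _send(self, code: int, body: dict) -> None:
        data = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

# To serve for real:
#   HTTPServer(("0.0.0.0", 8080), AgentHandler).serve_forever()
```

Because the handler logic lives in a plain function, the same `handle_invocation` can be unit-tested without starting the server.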
3. MCP Gateway Wiring
AgentCore Gateway transforms how agents connect to external tools. Instead of each agent managing its own MCP server connections, Gateway centralizes tool access behind a single managed interface. You define MCP servers as targets when creating a gateway, and the SynchronizeGatewayTargets API performs protocol handshakes and indexes available tools automatically.
Supported MCP protocol versions include 2025-03-26 (the minimum required for gateway targets), 2025-06-18, and 2025-11-25. The Gateway signs requests to MCP servers using SigV4 with the gateway service role credentials. For third-party services requiring OAuth, Gateway integrates with AgentCore Identity to handle the Authorization Code flow.
# Wire an MCP server to AgentCore Gateway (AWS CLI)
# Target schema shown is illustrative -- check the CreateGateway API reference
aws bedrock-agentcore-control create-gateway \
  --name "tools-gateway" \
  --protocol-type MCP \
  --targets '[{
    "name": "github-tools",
    "type": "MCP_SERVER",
    "endpoint": "https://mcp.github.internal/sse",
    "authConfig": {
      "type": "IAM",
      "roleArn": "arn:aws:iam::123456789012:role/GatewayMCPRole"
    }
  }]'

# Synchronize targets to discover available tools
aws bedrock-agentcore-control synchronize-gateway-targets \
  --gateway-identifier "gw-abc123"
As of February 2026, Amazon Bedrock supports server-side tool execution through AgentCore Gateway integrated with the Responses API. This enables models to call Gateway tools directly without client-side orchestration -- the model decides which tool to call, Gateway executes it, and results flow back automatically. This is ideal for reducing latency and simplifying agent architectures where the client does not need to inspect or modify tool calls.
4. AgentCore Evaluations -- Production Quality Gates
AgentCore Evaluations became generally available on March 31, 2026. The service provides automated quality assessment for AI agents through two evaluation modes: online evaluation continuously monitors production traffic by sampling and scoring live traces, while on-demand evaluation enables programmatic testing that integrates into CI/CD pipelines and interactive development workflows.
Teams can evaluate agents using 13 built-in evaluators covering response quality, safety, task completion, and tool usage. Custom evaluators support two approaches: LLM-based evaluation using configurable prompts and model selection, or code-based evaluation through Python or JavaScript functions hosted on Lambda. This flexibility allows teams to enforce domain-specific quality standards beyond generic metrics.
Online Evaluation
Continuously monitors agent performance in production by sampling and scoring live traces. Detects quality degradation in real time without requiring dedicated test suites. Configure sampling rates and scoring thresholds to balance cost and coverage.
On-Demand Evaluation
Programmatic evaluation for CI/CD integration and regression testing. Run evaluations against test datasets before deploying agent changes to production. Supports batch execution with configurable parallelism for fast feedback loops.
Custom Evaluators
Build evaluators using LLM-based prompts (choose any Bedrock model) or code-based logic (Python/JavaScript on Lambda). Enforce domain-specific quality standards, compliance rules, and business-logic validation beyond built-in metrics.
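A code-based evaluator is just a function that scores a trace. The sketch below shows one as a Lambda-style handler; the event shape (a trace carrying output text and tool invocations) and the score/explanation response format are assumptions for illustration, not the documented Evaluations contract.

```python
def handler(event, context=None):
    """Score 1.0 when the agent produced output and no tool call failed.

    Assumed event shape:
      {"trace": {"output": str, "toolInvocations": [{"name": ..., "error": ...}]}}
    """
    trace = event.get("trace", {})
    produced_output = bool(trace.get("output", "").strip())
    failed_tools = [
        t for t in trace.get("toolInvocations", []) if t.get("error")
    ]
    score = 1.0 if produced_output and not failed_tools else 0.0
    return {
        "score": score,
        "explanation": (
            "output present and all tool calls succeeded"
            if score
            else "missing output or failed tool call"
        ),
    }
```

Keeping the scoring logic deterministic like this makes the evaluator cheap enough to run against every sampled production trace, reserving LLM-based evaluators for nuanced quality judgments.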
AgentCore Evaluations is available in nine AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland).
5. Agent Registry Governance
AWS Agent Registry entered public preview in April 2026. It is a private, governed catalog and discovery layer for agents, tools, skills, MCP servers, and custom resources within an organization. The registry solves the duplication problem in enterprises running hundreds of agents: teams can discover existing capabilities instead of rebuilding them.
Discovery uses hybrid search combining keyword matching and semantic understanding. Short queries use keyword matching, while longer natural-language queries also surface conceptually related results. Resources can be registered manually through the console or API, or via URL-based discovery that automatically retrieves metadata (tool schemas, capability descriptions) from a live MCP server or agent endpoint.
The registry includes approval workflows for publishing new agents, CloudTrail audit trails for compliance, and access control through IAM policies. It can be accessed via the AgentCore Console UI, AWS CLI/SDK, or as an MCP server that developers query directly from their IDEs. Preview availability covers five regions: US West (Oregon), Asia Pacific (Tokyo, Sydney), Europe (Ireland), and US East (N. Virginia).
6. Spring AI AgentCore SDK for JVM
The Spring AI AgentCore SDK brings Amazon Bedrock AgentCore capabilities into the Java/JVM ecosystem through familiar Spring patterns: annotations, auto-configuration, and composable advisors. A single @AgentCoreInvocation annotation transforms any Spring bean method into an AgentCore-compatible endpoint with automatic serialization, streaming detection, and response formatting.
The SDK supports AgentCore Runtime for fully managed deployment, but individual modules (Memory, Browser, Code Interpreter) work standalone in applications running on Amazon EKS, ECS, or any infrastructure. MCP annotations are first-class in Spring AI 2.0, with Spring-specific MCP transport implementations built into the core project.
// Spring AI AgentCore -- annotation-driven agent endpoint
@Service
public class ResearchAgent {

    private final ChatClient chatClient;
    private final Advisor memoryAdvisor;
    private final Advisor toolAdvisor;

    public ResearchAgent(ChatClient.Builder builder,
                         Advisor memoryAdvisor, Advisor toolAdvisor) {
        this.chatClient = builder.build();
        this.memoryAdvisor = memoryAdvisor;
        this.toolAdvisor = toolAdvisor;
    }

    @AgentCoreInvocation
    public Flux<String> research(AgentRequest request) {
        // Spring AI advisor chain: Memory -> Tools -> Model
        return chatClient.prompt()
                .advisors(memoryAdvisor, toolAdvisor)
                .user(request.getInput())
                .stream()
                .content();
    }
}
Spring Boot 4.0 (November 2025) combined with Spring AI 2.0 (Milestone 3, March 2026) provides a production-grade, Spring-native path to building AI applications for JVM shops. This eliminates the Python detour that many Java-focused enterprises previously required for agent development.
7. A2A Protocol Support via AgentCore Runtime
AgentCore Runtime added first-class Agent-to-Agent (A2A) protocol support in October 2025. With this addition, agents built on different frameworks -- Strands, LangGraph, OpenAI Agents SDK, Google ADK, Claude Agent SDK -- can discover peers, share capabilities, and coordinate actions across platforms using standardized communication.
When configured for A2A, AgentCore expects containers to run stateless, streamable HTTP servers on port 9000 at the root path. The service passes JSON-RPC payloads from the InvokeAgentRuntime API directly to the A2A container without modification, maintaining protocol transparency. Session isolation is automatic via the X-Amzn-Bedrock-AgentCore-Runtime-Session-Id header.
Authentication for A2A calls supports OAuth 2.0 and IAM (SigV4), enabling secure, identity-aware communication between agents. A2A servers can run inside a VPC and use PrivateLink for private connectivity, ensuring that inter-agent traffic never traverses the public internet. Combined with MCP support, AgentCore Runtime enables agents to both expose tools (MCP) and collaborate with peers (A2A) from a single deployment.
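To make the passthrough concrete, here is a sketch of the JSON-RPC envelope an A2A client would POST to the Runtime, which forwards it to the container unchanged. The message/send structure follows the public A2A specification; the session header name comes from the text above, but treat field details as illustrative.

```python
import json
import uuid

def build_message_send(text: str) -> dict:
    """Build an A2A message/send JSON-RPC request envelope."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                "messageId": str(uuid.uuid4()),
            }
        },
    }

request = build_message_send("Summarize open incidents")
body = json.dumps(request)

# Session isolation rides on an HTTP header, not the JSON-RPC body:
headers = {
    "Content-Type": "application/json",
    "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id": str(uuid.uuid4()),
}
```

Because session identity travels in the header, the A2A payload stays protocol-pure and any spec-compliant A2A server can run in the container unmodified.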
8. Cost and Scaling Patterns
AgentCore uses modular, consumption-based pricing with no upfront commitments or minimum fees. Each primitive is billed independently, and you pay only for what you use. Billing is calculated per second, using actual CPU consumption and peak memory consumed up to that second, with a 1-second minimum.
| Primitive | Billing Unit | Rate |
|---|---|---|
| Runtime | vCPU-hour + GB-hour | $0.0895/vCPU-hr, $0.00945/GB-hr |
| Gateway | Authorization requests | Per request during agent execution |
| Identity | Auth requests | Per authentication event |
| Memory | Records stored + retrieved | Per record operation |
| Evaluations | Evaluation tokens | Per token processed |
| Policy | Policy evaluations | Per policy check |
Runtime's consumption-based billing captures the "I/O wait is free" benefit -- while an agent waits for an LLM response or external API call, CPU consumption drops to near zero and billing reflects that. This delivers substantial cost efficiency compared to traditional always-on compute options. For bursty workloads, auto-scaling from zero eliminates idle costs entirely.
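The billing model above can be sketched as a back-of-the-envelope estimator using the listed rates ($0.0895/vCPU-hour, $0.00945/GB-hour, 1-second minimum). Real invoices depend on measured per-second CPU and peak memory; this only illustrates the formula and the "I/O wait is free" effect.

```python
VCPU_RATE_PER_SEC = 0.0895 / 3600   # $ per vCPU-second
GB_RATE_PER_SEC = 0.00945 / 3600    # $ per GB-second

def runtime_cost(active_cpu_seconds: float, vcpus: float,
                 peak_memory_gb: float, wall_clock_seconds: float) -> float:
    """Estimate session cost: CPU billed only while consumed,
    memory billed at peak for the session duration (assumed)."""
    billed_cpu = max(active_cpu_seconds, 1.0)   # 1-second minimum
    billed_mem = max(wall_clock_seconds, 1.0)
    return (billed_cpu * vcpus * VCPU_RATE_PER_SEC
            + billed_mem * peak_memory_gb * GB_RATE_PER_SEC)

# A 60 s session where the agent computes for 5 s and waits on an LLM
# for the remaining 55 s: CPU is billed for ~5 s, memory for all 60 s.
cost = runtime_cost(active_cpu_seconds=5, vcpus=1,
                    peak_memory_gb=2, wall_clock_seconds=60)
```

Running the example shows the cost is a small fraction of what a full 60 seconds of billed CPU would be, which is exactly the benefit for I/O-heavy agent workloads.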
For cost optimization: use smaller model IDs for simple routing agents (Haiku for orchestration, Sonnet for reasoning), batch tool calls where possible, and leverage Memory to avoid redundant LLM calls for previously resolved queries. Monitor costs through CloudWatch billing metrics and set budget alarms to catch unexpected spikes early.
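The "smaller models for routing" tip can be implemented as a simple escalation gate. This is a hedged sketch: the model IDs and the keyword heuristic are assumptions for illustration, and a production router might classify the turn with the cheap model itself.

```python
# Cheap, fast model for orchestration/routing turns (assumed ID)
ORCHESTRATION_MODEL = "us.anthropic.claude-3-5-haiku-20241022-v1:0"
# Stronger model reserved for multi-step reasoning (assumed ID)
REASONING_MODEL = "us.anthropic.claude-sonnet-4-20250514-v1:0"

REASONING_HINTS = ("analyze", "compare", "plan", "debug", "prove")

def pick_model(turn: str) -> str:
    """Escalate only when the turn looks like it needs deeper reasoning."""
    lowered = turn.lower()
    if any(hint in lowered for hint in REASONING_HINTS):
        return REASONING_MODEL
    return ORCHESTRATION_MODEL
```

Since Runtime bills CPU, not tokens, this routing saves on the Bedrock inference bill rather than the AgentCore bill, but the two optimizations compound for busy agents.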
9. Bedrock AgentCore vs Azure AI Foundry vs Vertex AI
The three major cloud providers each offer managed agent platforms with different strengths. The selection primarily depends on which cloud and business systems you already use -- switching costs are highest at the integration layer, not the agent framework layer.
| Capability | AWS Bedrock AgentCore | Azure AI Foundry | Google Vertex AI |
|---|---|---|---|
| GA Date | Oct 2025 | Dec 2025 | Agent Engine 2025 |
| Agent Runtime | Serverless, per-second billing, auto-scale from zero | Azure Container Apps based | Managed infrastructure with session support |
| MCP Support | Gateway with centralized MCP wiring | Via Semantic Kernel plugins | Native MCP tool integration |
| A2A Protocol | First-class Runtime support | Via Microsoft Agent Framework | Originated A2A; native support |
| Evaluations | GA Mar 2026 -- online + on-demand | Azure AI Evaluation SDK | Vertex AI Evaluation |
| Security Model | IAM + VPC + PrivateLink + KMS | Azure AD + VNet + Key Vault | IAM + VPC-SC + CMEK |
| Ecosystem Fit | AWS-native orgs, multi-framework flexibility | Microsoft 365, Teams, SharePoint shops | GCP-first teams, Gemini-centric |
All three platforms support both MCP and A2A protocols, enabling interoperability across clouds. AgentCore's key differentiator is its framework-agnostic Runtime -- any agent framework that produces an HTTP server deploys without modification. Azure AI Foundry's advantage is deep Microsoft 365 integration with 1,400+ action connectors. Vertex AI excels when teams are already invested in GCP and want tight Gemini model integration.
10. Security: IAM Roles, VPC Isolation, Encryption
AgentCore deploys with enterprise-grade security by default. Every user session runs in its own protected environment with complete isolation to prevent data leakage. The platform integrates with the full AWS security stack: IAM for access control, VPC for network isolation, KMS for encryption, and CloudTrail for audit logging.
IAM and Access Control
AgentCore supports IAM condition keys including bedrock-agentcore:subnets and bedrock-agentcore:securityGroups for fine-grained network policy enforcement. Control Plane API operations (CreateAgentRuntime, UpdateAgentRuntime, CreateCodeInterpreter, CreateBrowser) support these condition keys. AgentCore Policy (GA March 2026) provides centralized, fine-grained controls for agent-tool interactions, with security teams able to define tool access and input validation rules using natural language.
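A policy using the condition keys named above might pin new runtimes to approved subnets. This is a hedged sketch: the subnet IDs are placeholders, and the set-operator semantics for these multivalued keys should be verified against the IAM condition-key reference before use.

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "RequireApprovedSubnets",
    "Effect": "Deny",
    "Action": [
      "bedrock-agentcore:CreateAgentRuntime",
      "bedrock-agentcore:UpdateAgentRuntime"
    ],
    "Resource": "*",
    "Condition": {
      "ForAnyValue:StringNotEquals": {
        "bedrock-agentcore:subnets": ["subnet-0abc1234", "subnet-0def5678"]
      }
    }
  }]
}
```

The explicit Deny triggers whenever any requested subnet falls outside the approved list, so it composes safely with broader Allow statements elsewhere in the account.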
VPC Isolation and PrivateLink
By default, agent requests to AgentCore Gateway traverse the public internet. With interface VPC endpoints, organizations route communications through the AWS internal network backbone, delivering enhanced security, reduced latency, and improved compliance alignment. A2A servers run inside VPCs with PrivateLink for private inter-agent communication that never touches the public internet.
Encryption
Customer data and metadata stored in DynamoDB and S3 are encrypted with either a service key or a customer-managed KMS key. AgentCore Identity encrypts all stored credentials using customer-managed KMS keys. Data never leaves your AWS account and is never used to train models. All API calls are logged in CloudTrail for compliance auditing.
Key Features at a Glance
Framework Agnostic
Deploy agents built with Strands, LangGraph, Google ADK, OpenAI Agents SDK, Claude Agent SDK, or custom frameworks. Any container that implements the Runtime HTTP contract works.
Centralized MCP Gateway
Group multiple MCP servers behind a single managed interface. Automatic tool discovery, protocol handshakes, and SigV4 request signing.
Scale to Zero
Per-second billing with auto-scaling from zero. No idle costs. I/O wait periods incur near-zero CPU charges. Handles burst traffic without pre-provisioning.
Agent Registry
Private catalog with semantic search for discovering existing agents, tools, and MCP servers across the organization. Approval workflows and CloudTrail audit trails.
Built-in Evaluations
13 built-in evaluators plus custom LLM-based and code-based evaluators. Online monitoring of production traffic and on-demand CI/CD integration.
Enterprise Security
Session isolation, IAM condition keys, VPC endpoints with PrivateLink, customer-managed KMS encryption, and Policy for natural-language tool access rules.
Production Outcomes
Runtime scales from zero and bills per second of actual CPU use. I/O wait periods (LLM inference, API calls) incur near-zero charges, making bursty agent workloads significantly cheaper than always-on compute.
Deploy agents built with Strands, LangGraph, Claude Agent SDK, or custom Python/Java frameworks to the same Runtime. No vendor lock-in at the agent framework layer.
A single AgentCore deployment supports both MCP tool servers and A2A agent-to-agent communication, enabling agents that both use tools and collaborate with peers.
AgentCore Evaluations available across nine regions. Runtime and Gateway support continues expanding, with VPC, PrivateLink, and CloudFormation available in all GA regions.
11. Managed Harness, AgentCore CLI, and Memory Enhancements
The Managed Harness (preview, April 2026) provides a pre-built runtime environment that removes the need to write HTTP server code entirely. Instead of wrapping your agent in an HTTP handler, you provide the agent function and Managed Harness handles HTTP serving, health checks, session management, and graceful shutdown automatically. This reduces the deployment surface from a Dockerfile + server code to a single agent function and a manifest file. Managed Harness currently supports Python and Java agents via the Strands and Spring AI SDKs.
The AgentCore CLI with CDK integration enables infrastructure-as-code workflows for AgentCore deployments. Define Runtime endpoints, Gateway configurations, Memory stores, and evaluation pipelines as CDK constructs, then deploy with cdk deploy. The CLI includes agentcore init for project scaffolding, agentcore deploy for direct Runtime deployment, and agentcore logs for real-time log streaming from running agents.
Filesystem Persistence (preview) allows agents to persist files across invocations within a session -- scratch pads, intermediate computation results, downloaded artifacts. Files are stored in session-scoped ephemeral storage that survives across turns but is cleaned up when the session ends.
Episodic Memory extends the existing Memory primitive with structured event storage: agents can record decision points, tool outcomes, and reasoning traces as episodes that are retrievable by semantic query. This enables agents to learn from past interactions and avoid repeating failed approaches.
The coding assistant integration lets agents spawn sandboxed code execution environments for writing, testing, and iterating on code as part of their workflow.