Agents for Amazon Bedrock turn a foundation model into a goal-driven orchestrator: it plans the steps required to complete a user request, calls the tools (Lambda functions) and knowledge bases you give it, keeps track of session state, and returns a final answer with a complete reasoning trace. The runtime handles the planning loop, prompt construction, and tool-use protocol, so the application code only has to define what the agent can do — not how the loop works.
An agent is configured with: a foundation model, an instruction (the system prompt), zero or more action groups (groups of callable tools), and zero or more knowledge bases. At invocation time the runtime executes a ReAct-style planning loop: the model reads the user goal, emits a rationale, chooses a tool call or a knowledge-base query, observes the result, and either iterates or returns a final answer.
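A minimal sketch of creating one with boto3, reusing the IDs and model name used throughout this piece (the role ARN is a placeholder; versioning and aliases are covered just below):

import boto3

bedrock_agent = boto3.client("bedrock-agent", region_name="us-west-2")

# Model + instruction + an IAM role that Bedrock assumes on the agent's behalf
agent = bedrock_agent.create_agent(
    agentName="support-bot",
    foundationModel="anthropic.claude-opus-4-7",
    instruction="You are a helpful customer support assistant.",
    agentResourceRoleArn="arn:aws:iam::111111111111:role/AgentRole",
    idleSessionTTLInSeconds=1800,
)
agent_id = agent["agent"]["agentId"]

# Compile the DRAFT version, then pin a callable alias to it
bedrock_agent.prepare_agent(agentId=agent_id)
bedrock_agent.create_agent_alias(agentId=agent_id, agentAliasName="PROD")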
Every iteration is captured in a trace: rationale, model input, model output, tool input, tool output, and any KB retrievals. Traces are how you debug and audit agent behavior in production.
Agents are versioned: you develop against the mutable DRAFT version, then create aliases (e.g. PROD, STAGING) that point to a specific agent version, similar to Lambda aliases. Conversation state is keyed by sessionId; the runtime carries history across InvokeAgent calls in the same session.
An action group is a set of related tools the agent can invoke. There are two ways to declare them: an OpenAPI 3.0 schema (richer typing, good for APIs you already document) or a function-detail list (faster to author).
import boto3
bedrock_agent = boto3.client("bedrock-agent", region_name="us-west-2")
bedrock_agent.create_agent_action_group(
agentId="AGENT123ABC",
agentVersion="DRAFT",
actionGroupName="OrderTools",
actionGroupExecutor={
"lambda": "arn:aws:lambda:us-west-2:111111111111:function:order-tools"
},
functionSchema={
"functions": [
{
"name": "get_order_status",
"description": "Look up the shipping status of a customer order by ID.",
"parameters": {
"order_id": {
"type": "string",
"description": "The customer-facing order ID, e.g. A-482.",
"required": True,
}
},
},
{
"name": "cancel_order",
"description": "Cancel an order if it has not yet shipped.",
"parameters": {
"order_id": {"type": "string", "required": True},
"reason": {"type": "string", "required": False},
},
},
]
},
)
OpenAPI is the better fit when the tools are existing REST endpoints — the schema doubles as both the agent contract and the public API spec.
openapi: 3.0.0
info:
title: Order Tools
version: "1.0"
paths:
/orders/{order_id}/status:
get:
summary: Get shipping status of an order
operationId: getOrderStatus
parameters:
- name: order_id
in: path
required: true
schema: { type: string }
responses:
"200":
description: OK
content:
application/json:
schema:
type: object
properties:
order_id: { type: string }
status: { type: string }
eta: { type: string }
Reference the schema in the action group:
bedrock_agent.create_agent_action_group(
agentId="AGENT123ABC",
agentVersion="DRAFT",
actionGroupName="OrderApi",
actionGroupExecutor={"lambda": "arn:aws:lambda:us-west-2:111111111111:function:order-tools"},
apiSchema={"s3": {
"s3BucketName": "my-agent-schemas",
"s3ObjectKey": "order-tools/openapi.yaml",
}},
)
Bedrock invokes the Lambda with a structured event that names the action group, the function (or path/method for OpenAPI), and the parameters. The handler dispatches and returns a payload Bedrock feeds back to the model.
import json

def lambda_handler(event, context):
    # Bedrock names the action group and the function (or apiPath/httpMethod
    # for OpenAPI-backed groups); parameters arrive as a list of name/value pairs
    action_group = event["actionGroup"]
    function = event.get("function") or event.get("apiPath")
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    if function == "get_order_status":
        result = lookup_order(params["order_id"])
    elif function == "cancel_order":
        result = cancel_order(params["order_id"], params.get("reason"))
    else:
        result = {"error": f"unknown function {function}"}
    # Response shape for a function-schema action group ("TEXT" is the content
    # key for function schemas; OpenAPI-backed groups instead echo apiPath,
    # httpMethod, and an httpStatusCode, with media-type keys like application/json)
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": action_group,
            "function": function,
            "functionResponse": {
                "responseBody": {"TEXT": {"body": json.dumps(result)}},
            },
        },
    }
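For orientation, the inbound event for a function-schema call looks roughly like this. It is an abridged sketch built around the fields the handler above reads; treat it as illustrative, not an exact capture:

{
    "messageVersion": "1.0",
    "actionGroup": "OrderTools",
    "function": "get_order_status",
    "parameters": [
        {"name": "order_id", "type": "string", "value": "A-482"}
    ],
    "sessionId": "…",
    "sessionAttributes": {},
    "promptSessionAttributes": {},
    "inputText": "Where is order A-482?"
}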
Once a Knowledge Base is created (see Bedrock Knowledge Bases), associate it with the agent. The runtime then exposes it as an implicit retrieval tool — the agent decides when to query it based on the description you provide.
bedrock_agent.associate_agent_knowledge_base(
agentId="AGENT123ABC",
agentVersion="DRAFT",
knowledgeBaseId="KB1234ABCD",
description=(
"Internal HR policies, including PTO, parental leave, expense reimbursement, "
"and remote-work guidelines. Use for any employee-policy question."
),
knowledgeBaseState="ENABLED",
)
bedrock_agent.prepare_agent(agentId="AGENT123ABC")
The description is the most important field — it's what the planner reads when deciding whether to retrieve. Be specific about scope and recency.
Use the bedrock-agent-runtime client. The response is a streamed event collection; concatenate the chunks for the final answer.
import boto3, uuid
runtime = boto3.client("bedrock-agent-runtime", region_name="us-west-2")
resp = runtime.invoke_agent(
agentId="AGENT123ABC",
agentAliasId="PROD",
sessionId=str(uuid.uuid4()),
inputText="Where is order A-482, and can you cancel it if it hasn't shipped?",
enableTrace=True,
)
answer = ""
for event in resp["completion"]:
if "chunk" in event:
answer += event["chunk"]["bytes"].decode("utf-8")
elif "trace" in event:
# See section 7 for trace handling
pass
print(answer)
By default, the agent calls your Lambda directly. In return-of-control (ROC) mode it instead returns the tool-use request to the caller, who runs it locally and posts the result back. ROC is the right choice when the tool needs caller-side context the Lambda cannot have — interactive UIs, on-device data, user-scoped credentials, or long-running operations that exceed the Lambda timeout.
Configure the action group with customControl=RETURN_CONTROL instead of a Lambda ARN:
bedrock_agent.create_agent_action_group(
agentId="AGENT123ABC",
agentVersion="DRAFT",
actionGroupName="ClientSideTools",
actionGroupExecutor={"customControl": "RETURN_CONTROL"},
functionSchema={"functions": [
{
"name": "open_calendar",
"description": "Open the user's local calendar to a specific date.",
"parameters": {"date": {"type": "string", "required": True}},
}
]},
)
The runtime then emits a returnControl event the caller must satisfy before continuing the session.
resp = runtime.invoke_agent(
agentId="AGENT123ABC", agentAliasId="PROD",
sessionId=session_id, inputText="Open my calendar for next Monday.",
)
invocation_id = None
function_name = None
params = None
for event in resp["completion"]:
if "returnControl" in event:
rc = event["returnControl"]
invocation_id = rc["invocationId"]
invoc = rc["invocationInputs"][0]["functionInvocationInput"]
function_name = invoc["function"]
params = {p["name"]: p["value"] for p in invoc["parameters"]}
# Continue only if the stream actually emitted a returnControl event
if invocation_id is not None:
    # Run the tool locally (e.g. open the calendar UI)
    local_result = run_local_tool(function_name, params)
    # Post the result back into the same session; no inputText is needed,
    # the sessionState payload resumes the paused invocation
    resume = runtime.invoke_agent(
        agentId="AGENT123ABC", agentAliasId="PROD",
        sessionId=session_id,
        sessionState={
            "invocationId": invocation_id,
            "returnControlInvocationResults": [{
                "functionResult": {
                    "actionGroup": "ClientSideTools",
                    "function": function_name,
                    "responseBody": {"TEXT": {"body": local_result}},
                }
            }],
        },
    )
    # Drain the continuation stream for the agent's final answer
    answer = "".join(
        e["chunk"]["bytes"].decode("utf-8") for e in resume["completion"] if "chunk" in e
    )
Use sessionState to pass per-session context (user ID, locale, account tier) that should be visible to every tool call without being injected into the user-facing transcript.
runtime.invoke_agent(
agentId="AGENT123ABC", agentAliasId="PROD",
sessionId=session_id,
inputText="Summarize my open tickets.",
sessionState={
"sessionAttributes": {"user_id": "u-9921", "tier": "enterprise"},
"promptSessionAttributes": {"current_date": "2026-04-25"},
},
)
sessionAttributes are passed to Lambdas in the event payload. promptSessionAttributes are interpolated into the agent's prompt template (use them for things like the current date that the model itself needs to see).
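On the Lambda side those attributes arrive in the event payload. A minimal sketch (open_tickets_for is a hypothetical helper):

def lambda_handler(event, context):
    # sessionAttributes ride along with every tool invocation in the session
    attrs = event.get("sessionAttributes", {})
    user_id = attrs.get("user_id")   # "u-9921"
    tier = attrs.get("tier")         # "enterprise"
    result = open_tickets_for(user_id, tier)  # hypothetical helper
    # ...wrap result in the usual functionResponse payload shown earlier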
Enable agent memory to persist a summary of past sessions for a user across calls. The agent uses the summary to maintain continuity — useful for assistants that talk to the same user repeatedly.
bedrock_agent.update_agent(
agentId="AGENT123ABC",
agentName="support-bot",
foundationModel="anthropic.claude-opus-4-7",
instruction="You are a helpful customer support assistant.",
agentResourceRoleArn="arn:aws:iam::111111111111:role/AgentRole",
memoryConfiguration={
"enabledMemoryTypes": ["SESSION_SUMMARY"],
"storageDays": 30,
},
)
# At invoke time, scope memory to a specific user
runtime.invoke_agent(
agentId="AGENT123ABC", agentAliasId="PROD",
memoryId="user-9921",
sessionId=session_id,
inputText="What did we agree on last time?",
)
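Stored summaries can also be read back directly with GetAgentMemory; a sketch, assuming the SESSION_SUMMARY configuration above (response field names as I understand the API):

memory = runtime.get_agent_memory(
    agentId="AGENT123ABC",
    agentAliasId="PROD",
    memoryId="user-9921",
    memoryType="SESSION_SUMMARY",
)
# Each entry is a model-written summary of one past session
for item in memory.get("memoryContents", []):
    print(item["sessionSummary"]["summaryText"])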
The agent's planning loop is driven by four prompt templates: PRE_PROCESSING, ORCHESTRATION, KNOWLEDGE_BASE_RESPONSE_GENERATION, and POST_PROCESSING. Override any of them to tighten formatting, change the persona, or disable a step.
bedrock_agent.update_agent(
agentId="AGENT123ABC",
agentName="support-bot",
foundationModel="anthropic.claude-opus-4-7",
instruction="You are a helpful customer support assistant.",
agentResourceRoleArn="arn:aws:iam::111111111111:role/AgentRole",
promptOverrideConfiguration={
"promptConfigurations": [{
"promptType": "PRE_PROCESSING",
"promptCreationMode": "OVERRIDDEN",
"promptState": "DISABLED", # skip the pre-processing step entirely
}, {
"promptType": "ORCHESTRATION",
"promptCreationMode": "OVERRIDDEN",
"basePromptTemplate": open("prompts/orchestration.txt").read(),
"inferenceConfiguration": {
"temperature": 0.0, "topP": 1.0, "maximumLength": 2048,
},
}],
},
)
With enableTrace=True the runtime emits a structured trace event for every step. Persist traces to CloudWatch or S3 for debugging and audit.
import json
resp = runtime.invoke_agent(
agentId="AGENT123ABC", agentAliasId="PROD",
sessionId=session_id,
inputText="Where is order A-482?",
enableTrace=True,
)
for event in resp["completion"]:
if "trace" not in event:
continue
t = event["trace"]["trace"]
if "orchestrationTrace" in t:
ot = t["orchestrationTrace"]
if "rationale" in ot:
print("RATIONALE:", ot["rationale"]["text"])
if "invocationInput" in ot:
inv = ot["invocationInput"]
if "actionGroupInvocationInput" in inv:
ag = inv["actionGroupInvocationInput"]
print(f"TOOL CALL: {ag['actionGroupName']}.{ag.get('function')} {ag.get('parameters')}")
if "knowledgeBaseLookupInput" in inv:
kb = inv["knowledgeBaseLookupInput"]
print(f"KB LOOKUP: kb={kb['knowledgeBaseId']} q={kb['text']}")
if "observation" in ot:
print("OBS:", json.dumps(ot["observation"])[:200])
The most useful field in production is rationale.text — it shows what the model is thinking before each tool call, which makes debugging "wrong tool chosen" or "no tool chosen" failures tractable.
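A sketch of the persistence side, writing each trace event to a CloudWatch Logs stream per session (the log group name is illustrative and assumed to exist):

import time, json, boto3

logs = boto3.client("logs", region_name="us-west-2")
LOG_GROUP = "/bedrock/agent-traces"  # illustrative; create it out of band

def persist_trace(session_id, trace_event):
    # One stream per session keeps a conversation's steps together
    try:
        logs.create_log_stream(logGroupName=LOG_GROUP, logStreamName=session_id)
    except logs.exceptions.ResourceAlreadyExistsException:
        pass
    logs.put_log_events(
        logGroupName=LOG_GROUP,
        logStreamName=session_id,
        logEvents=[{
            "timestamp": int(time.time() * 1000),
            "message": json.dumps(trace_event, default=str),
        }],
    )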
Resources:
SupportAgent:
Type: AWS::Bedrock::Agent
Properties:
AgentName: support-bot
FoundationModel: anthropic.claude-opus-4-7
Instruction: |
You are a helpful customer support assistant. Use the OrderTools action group
for order lookups and the HR knowledge base for policy questions.
AgentResourceRoleArn: !GetAtt AgentRole.Arn
IdleSessionTTLInSeconds: 1800
ActionGroups:
- ActionGroupName: OrderTools
ActionGroupExecutor:
Lambda: !GetAtt OrderToolsFn.Arn
FunctionSchema:
Functions:
- Name: get_order_status
Description: Look up shipping status by order ID.
Parameters:
order_id: { Type: string, Required: true }
SupportAgentAlias:
Type: AWS::Bedrock::AgentAlias
Properties:
AgentId: !Ref SupportAgent
AgentAliasName: PROD
resource "aws_bedrockagent_agent" "support" {
agent_name = "support-bot"
foundation_model = "anthropic.claude-opus-4-7"
instruction = "You are a helpful customer support assistant."
agent_resource_role_arn = aws_iam_role.agent.arn
idle_session_ttl_in_seconds = 1800
}
resource "aws_bedrockagent_agent_action_group" "order_tools" {
agent_id = aws_bedrockagent_agent.support.id
agent_version = "DRAFT"
action_group_name = "OrderTools"
action_group_executor {
lambda = aws_lambda_function.order_tools.arn
}
function_schema {
member_functions {
functions {
name = "get_order_status"
description = "Look up shipping status by order ID."
parameters {
map_block_key = "order_id"
type = "string"
required = true
}
}
}
}
}
resource "aws_bedrockagent_agent_alias" "prod" {
agent_id = aws_bedrockagent_agent.support.id
agent_alias_name = "PROD"
}
A common hybrid: a Step Functions state machine drives the overall business process, and one of its states invokes a Bedrock Agent to handle the open-ended "talk to the customer" portion.
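A sketch of the Lambda behind that conversational Task state. InvokeAgent streams its response, so a thin wrapper that drains the stream and returns a plain string is the simplest integration (names and event fields are illustrative):

import boto3

runtime = boto3.client("bedrock-agent-runtime")

def lambda_handler(event, context):
    # Step Functions passes the session ID and customer message in the state input
    resp = runtime.invoke_agent(
        agentId="AGENT123ABC",
        agentAliasId="PROD",
        sessionId=event["sessionId"],
        inputText=event["customerMessage"],
    )
    # Drain the stream so the state machine receives a plain string
    answer = "".join(
        e["chunk"]["bytes"].decode("utf-8") for e in resp["completion"] if "chunk" in e
    )
    return {"agentAnswer": answer}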
A Bedrock Agent is a managed orchestration layer that wraps a foundation model with goal-oriented planning, tool calls (action groups), retrieval (knowledge bases), and session memory. The planning loop is ReAct-style: the model reads the user goal, decides whether to call a tool or query a KB, observes the result, then either iterates or returns a final answer. AWS handles the loop, retries, and trace logging — you only define the tools and the instruction prompt.
An action group is a set of callable functions exposed to the agent, defined either by an OpenAPI 3.0 schema or a function-detail schema. Each action is backed by a Lambda function (or by return-of-control on the client side). When the agent decides to call an action, Bedrock invokes the Lambda with the parsed parameters, captures the response, and feeds it back into the next reasoning step. Keep action descriptions and parameter docs precise — the agent's tool selection is only as good as those strings.
Return-of-control (ROC) lets the agent pause and hand the tool invocation back to the caller instead of running a Lambda. The InvokeAgent response contains the chosen action and parameters; your client executes it (often in a different VPC, on-prem, or in an existing microservice) and replies in the next turn with the result. Use ROC when the tool already exists outside AWS, when latency-sensitive code shouldn't take a Lambda cold start, or when the caller has security context (a user's OAuth token) that shouldn't be handed to a Lambda role.
Enable trace in InvokeAgent — the response stream includes preProcessingTrace, orchestrationTrace, and postProcessingTrace with the model's rationale, the chosen action, and the observation at every step. Common fixes: tighten action descriptions, add few-shot examples to the agent instruction, narrow parameter schemas, or use advanced prompt overrides to inject custom orchestration prompts. For systematic evaluation, replay a fixture set of user queries through the agent and assert on which action group was called.
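A sketch of that replay harness, reusing the runtime client from earlier; the fixture set and expected action groups are hypothetical:

import uuid

FIXTURES = [
    ("Where is order A-482?", "OrderTools"),
    ("How many PTO days do I get?", None),  # expect a KB lookup, no tool call
]

def called_action_groups(question):
    resp = runtime.invoke_agent(
        agentId="AGENT123ABC", agentAliasId="PROD",
        sessionId=str(uuid.uuid4()), inputText=question, enableTrace=True,
    )
    called = set()
    for event in resp["completion"]:
        ot = event.get("trace", {}).get("trace", {}).get("orchestrationTrace", {})
        inv = ot.get("invocationInput", {})
        if "actionGroupInvocationInput" in inv:
            called.add(inv["actionGroupInvocationInput"]["actionGroupName"])
    return called

for question, expected in FIXTURES:
    groups = called_action_groups(question)
    assert (expected in groups) if expected else not groups, (question, groups)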
Step Functions wins when the workflow is deterministic — fixed branching, parallel fan-out, retries with backoff, human approval steps, long-running waits — and the model is one of many participants rather than the orchestrator. LangGraph wins when you need fine-grained control of node transitions, multi-agent supervisor topologies, custom memory stores, or portability across providers (OpenAI + Bedrock + local). Bedrock Agents wins when the workflow genuinely needs LLM-driven planning and you want the lowest code footprint.
Use agent versions (immutable snapshots) and aliases (mutable pointers) — applications call the alias, so promotion is a metadata flip and rollback is instantaneous. Wire the agent into a CI pipeline that runs a regression eval suite (golden traces, tool-selection accuracy, hallucination rate via LLM-as-judge) against a candidate version before swapping the PROD alias. Stream traces to CloudWatch Logs and ship to OpenSearch or Datadog for per-turn latency, tool-call distribution, and error budgets. Pair with a Bedrock Guardrail attached at the agent level so policy is enforced even if the prompt regresses.
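The promotion flip itself is one call, sketched here; the alias ID and version number are illustrative:

bedrock_agent.update_agent_alias(
    agentId="AGENT123ABC",
    agentAliasId="ALIAS123XY",          # existing PROD alias ID
    agentAliasName="PROD",
    routingConfiguration=[{"agentVersion": "4"}],  # candidate that passed evals
)
# Rollback is the same call with the previous version number.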