Amazon Bedrock AgentCore Evaluations
Automated quality assessment for AI agents, including continuous evaluation of production traces and on-demand evaluation workflows, with built-in evaluators and support for custom evaluators.
- Lifecycle
- GA stage - Region/Partition-limited
- Availability / regions
- US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland)
- Typical use cases
- Continuous quality scoring on live agent traffic; regression testing for agent changes in CI/CD; safety and task-completion benchmarking; tool-usage and behavior verification.
- Cost signal
- $$$$: evaluation pipelines over production traces, combined with model-based scoring, can be compute- and data-intensive.
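
The CI/CD regression-testing use case listed above can be pictured as a simple score gate. The following is a minimal sketch only: it does not call any AgentCore API, and the function name `regression_gate`, the evaluator names, the score values, and the `max_drop` threshold are all hypothetical, assuming evaluator scores have already been exported as numbers between 0 and 1.

```python
from statistics import mean


def regression_gate(baseline, candidate, max_drop=0.05):
    """Return evaluators whose mean score fell by more than max_drop.

    baseline / candidate: dict mapping a (hypothetical) evaluator name
    to a list of per-trace scores. A missing or empty candidate entry
    is reported with a value of None.
    """
    failures = {}
    for name, base_scores in baseline.items():
        cand_scores = candidate.get(name)
        if not cand_scores:
            failures[name] = None  # no candidate scores to compare
            continue
        drop = mean(base_scores) - mean(cand_scores)
        if drop > max_drop:
            failures[name] = round(drop, 4)
    return failures


# Illustrative scores only; not real evaluation output.
baseline = {"task_completion": [0.9, 0.8, 1.0], "tool_usage": [0.7, 0.9]}
candidate = {"task_completion": [0.9, 0.9, 0.9], "tool_usage": [0.6, 0.7]}

failed = regression_gate(baseline, candidate)
print(failed)
```

In a pipeline, a non-empty result would typically be turned into a non-zero exit code so the agent change is blocked until the regression is reviewed.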