AWS releases open-source toolkit for AI agent evaluation

AWS ML Blog · June 11, 2026 at 03:49 PM

Amazon Web Services has launched Agent-EvalKit, an open-source toolkit for systematically evaluating AI agents. Licensed under Apache 2.0, it integrates with AI coding assistants such as Claude Code, Kiro CLI, and Kilo Code. The blog post walks through Agent-EvalKit's six-phase evaluation process, using a travel research agent built with the Strands Agents SDK and Amazon Bedrock as the worked example. The stated aim is to give developers infrastructure for assessing the performance and reliability of the agents they build.
