Anthropic's Claude Code Splits Agent Tasks for Reliability
VentureBeat
Anthropic's Claude Code introduces a '/goals' feature that separates an AI agent's task execution from its evaluation, addressing a common failure mode in which agents prematurely declare a task complete. The approach uses a dedicated smaller model, such as Haiku, to check after each step by the primary agent whether the defined goal has actually been met, preventing the agent from stopping before all objectives are achieved and improving reliability on complex coding tasks. Unlike competing tools, which require developers to build custom evaluation logic, Claude Code enables the evaluator by default, simplifying adoption for enterprises that want more auditable and observable AI systems.
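The pattern described, a primary agent that acts and a separate, cheaper model that independently verifies the goal after every step, can be sketched as a simple loop. The sketch below is a hypothetical illustration, not Anthropic's actual implementation: `run_agent_step` and `evaluate_goal` are placeholder names standing in for calls to the primary model and the evaluator model.

```python
# Hypothetical sketch of an executor/evaluator split, assuming a
# primary agent model and a separate, smaller verifier model.
# This is NOT Anthropic's implementation; both functions below are
# placeholders for real model calls.

from dataclasses import dataclass, field


@dataclass
class GoalState:
    goal: str                                   # the user-defined goal
    history: list = field(default_factory=list)  # steps taken so far


def run_agent_step(state: GoalState) -> str:
    """Placeholder for the primary agent performing one unit of work."""
    step = f"work item {len(state.history) + 1}"
    state.history.append(step)
    return step


def evaluate_goal(state: GoalState) -> bool:
    """Placeholder for the smaller evaluator model.

    In the design described above, this is an independent check: the
    agent cannot declare success on its own; the verifier decides.
    """
    return len(state.history) >= 3  # toy completion criterion


def agent_loop(goal: str, max_steps: int = 10) -> GoalState:
    state = GoalState(goal=goal)
    for _ in range(max_steps):
        run_agent_step(state)
        # Evaluation runs after every step, so premature "done"
        # claims by the agent never terminate the loop early.
        if evaluate_goal(state):
            break
    return state


if __name__ == "__main__":
    final = agent_loop("refactor the payment module")
    print(f"completed after {len(final.history)} steps")
```

The key design choice is that the stopping condition lives entirely in the evaluator, which also makes each step's verdict a natural audit point for the observability the article mentions.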
Tags
ai
product
Original Source
VentureBeat — venturebeat.com