Opus 4 and Sonnet 4 set new benchmarks in agentic coding, with Claude Code and Agent SDK completing the developer stack.
Product Announcement
Opus 4: world's best coding model (SWE-bench 72.5%, Terminal-bench 43.2%), able to sustain multi-hour agentic tasks. Sonnet 4: a significant upgrade over Sonnet 3.7. Claude Code became generally available (GA), with GitHub Actions, VS Code, and JetBrains integrations. New API capabilities: code execution tool, MCP connector, Files API, and prompt caching for up to 1 hour.
A framework for building autonomous agents that can use tools, take actions, and sustain reasoning over multiple hours. The SDK provides orchestration, error handling, and safeguards for agentic workflows, enabling developers to build systems where Claude can work independently toward goals with minimal human intervention.
A tiered model family optimized for different use cases: Haiku for speed and efficiency, Sonnet for balanced performance, Opus for maximum capability. This lets developers pick the model that fits their cost-performance tradeoff instead of relying on a single one-size-fits-all model, which matches how other model providers structure their offerings.
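The tier tradeoff described above can be sketched as a small selection routine. This is a minimal illustration, not an official API: the `Tier` class, the `choose_tier` function, and the relative cost and capability scores are all hypothetical stand-ins chosen for the example, not published figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    relative_cost: int        # hypothetical scale: 1 = cheapest
    relative_capability: int  # hypothetical scale: 1 = least capable


# Illustrative ordering only; the numbers are assumptions, not real pricing.
TIERS = [
    Tier("haiku", 1, 1),
    Tier("sonnet", 2, 2),
    Tier("opus", 3, 3),
]


def choose_tier(min_capability: int, max_cost: int) -> Optional[str]:
    """Return the cheapest tier that meets the capability floor
    without exceeding the cost ceiling, or None if no tier fits."""
    candidates = [
        t for t in TIERS
        if t.relative_capability >= min_capability
        and t.relative_cost <= max_cost
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.relative_cost).name
```

Picking the cheapest tier that clears the capability floor is one reasonable policy; a latency-sensitive application might instead rank candidates by speed.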
System design centered on autonomous agents that can perceive their environment, make decisions, execute actions, and learn from outcomes. Unlike traditional request-response interfaces, agentic architecture enables sustained, multi-step problem-solving with the model maintaining state and pursuing goals over time.
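The contrast with a single request-response call can be sketched as a loop that keeps state across steps. This is a toy illustration under stated assumptions: `AgentState`, `decide`, `act`, and `run_agent` are hypothetical names, and `decide` is a hard-coded rule standing in for a model call. The "environment" is just an integer the agent tries to move toward a goal value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentState:
    goal: int                 # target value the agent pursues
    value: int = 0            # current state of the toy environment
    history: List[str] = field(default_factory=list)  # actions taken so far


def decide(state: AgentState) -> str:
    """Stand-in for a model call: pick the next action from the observed state."""
    if state.value == state.goal:
        return "stop"
    return "increment" if state.value < state.goal else "decrement"


def act(state: AgentState, action: str) -> None:
    """Execute the action against the environment and record it."""
    if action == "increment":
        state.value += 1
    elif action == "decrement":
        state.value -= 1
    state.history.append(action)


def run_agent(goal: int, max_steps: int = 10) -> AgentState:
    """Observe, decide, and act repeatedly until the goal is met
    or the step budget (a simple safeguard) runs out."""
    state = AgentState(goal=goal)
    for _ in range(max_steps):
        action = decide(state)
        if action == "stop":
            break
        act(state, action)
    return state
```

The step budget is the simplest form of safeguard: it bounds how long the agent can run without a human checking in, regardless of whether the goal was reached.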
Integrating many tools (code execution, file operations, web access, MCP connections) into a single coherent agent framework. The challenge is not just giving models tool access but also managing complex tool interactions, handling errors, and ensuring the model uses tools appropriately in pursuit of its goals.
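One way to picture the error-handling side of tool integration is a registry that routes every tool call through a single dispatch point and turns failures into structured results the agent can react to. This is a hedged sketch, not how any particular SDK works: `ToolRegistry`, its methods, and the `add` and `divide` tools are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict


class ToolRegistry:
    """Hypothetical registry that dispatches tool calls by name."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        """Run a tool and always return a structured result.
        Unknown tools and raised exceptions become error results
        instead of crashing the agent loop."""
        tool = self._tools.get(name)
        if tool is None:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": tool(**kwargs)}
        except Exception as exc:
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


# Example tools, chosen only to demonstrate success and failure paths.
registry = ToolRegistry()
registry.register("add", lambda a, b: a + b)
registry.register("divide", lambda a, b: a / b)
```

Returning errors as data rather than raising them lets the calling loop feed the failure back to the model, which can then retry with different arguments or choose another tool.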