Doubled context to 100K tokens and added code generation, narrowing the gap with GPT-4.
Product Announcement: Major model upgrade with improved coding, math, and reasoning. Extended context window to 100K tokens. Made available via claude.ai and API. Classified as ASL-2 under Anthropic's safety framework.
The ability to process 100,000 tokens of input text, roughly 75,000 words. This was achieved through improved attention mechanisms and was a genuine engineering breakthrough. GPT-4 had only 8K tokens (later extended to 32K), making Claude's long context a unique value proposition that enabled entirely new applications, such as analyzing codebases and legal documents end-to-end.
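To give a sense of scale, here is a minimal sketch of how a developer might estimate whether a document fits in a 100K-token window. It assumes the common rough heuristic of about 4 characters per token for English text; real tokenizers (including Anthropic's) count differently, and the function names here are illustrative, not part of any API.

```python
# Rough illustration of what a 100K-token context window holds.
# Assumes the ~4 characters-per-token heuristic for English text;
# actual tokenizer counts will differ.

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits_in_window(text: str, window: int = 100_000) -> bool:
    """Check a text against a context budget (default: Claude 2's 100K)."""
    return estimate_tokens(text) <= window

# A 75,000-word text at ~5 characters per word (including spaces)
# is about 375K characters, i.e. roughly 94K tokens by this heuristic --
# about the size of Claude 2's window.
book = "word " * 75_000
print(estimate_tokens(book))   # 93750
print(fits_in_window(book))    # True
```

Under this heuristic, an entire book-length document fits in a single prompt, which is what made end-to-end analysis of long legal documents and codebases feasible without chunking.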
Claude 2 matched or exceeded GPT-4 on multiple standard benchmarks while introducing the longer context window. However, the context window mattered more than raw benchmark performance: it opened use cases unavailable to competitors and became the primary selling point to developers.
Claude 2 showed improved safety calibration over Claude 1 (less over-refusal) while simultaneously improving capabilities. This validated Anthropic's thesis that safety and capability are not fundamentally opposed: the model was both more helpful and safer than its predecessor, suggesting that proper alignment training can improve a product on both dimensions.