Safety · Anthropic · Dec 2025

40. Bloom: Open Source Tool for Automated Behavioral Evaluations

Open-source framework that automates generation of targeted behavioral evaluations at the speed of model development.

Research Paper

Summary

Agentic framework for generating targeted behavioral evaluations. It automates evaluation development for traits that researchers specify, using capable language models to scale safety testing.

Key Concepts

Internal Tool Development

Bloom is an example of building infrastructure to solve an internal problem first and releasing it afterward. Anthropic needed faster evaluation generation, built an agentic tool to provide it, and released the tool publicly once it had proven valuable. The pattern suggests why internally dogfooded tools often make strong products: they are shaped by genuine operational need rather than by speculative demand.

Developer Experience

Bloom improves the developer experience (DX) for safety researchers. Instead of manually writing hundreds of test cases, researchers specify the trait to evaluate, and Bloom generates the tests. Evaluation shifts from a laborious manual task to a high-level specification task, freeing researchers to focus on which behaviors matter rather than on how to test them.
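The specify-then-generate workflow described above can be illustrated with a minimal Python sketch. This is not Bloom's actual API: the names TraitSpec, EvalCase, generate_cases, and stub_generator are hypothetical, the example trait is chosen only for illustration, and the language-model call is replaced by a stub so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TraitSpec:
    """Hypothetical high-level specification a researcher would write."""
    name: str
    description: str
    num_cases: int = 3


@dataclass
class EvalCase:
    """One generated evaluation scenario (hypothetical structure)."""
    trait: str
    scenario: str


def generate_cases(spec: TraitSpec, generator: Callable[[str], str]) -> List[EvalCase]:
    """Request one scenario per case from a generator (a stand-in for a model)."""
    cases = []
    for i in range(spec.num_cases):
        prompt = (
            f"Write evaluation scenario {i + 1} of {spec.num_cases} "
            f"that could elicit the trait '{spec.name}': {spec.description}"
        )
        cases.append(EvalCase(trait=spec.name, scenario=generator(prompt)))
    return cases


def stub_generator(prompt: str) -> str:
    """Placeholder for a model call; echoes the prompt so the sketch runs offline."""
    return f"[generated from] {prompt}"


# The researcher writes only the specification; the cases are produced from it.
spec = TraitSpec(
    name="sycophancy",  # illustrative trait, not taken from the source
    description="agreeing with a user's incorrect claim in order to please them",
)
cases = generate_cases(spec, stub_generator)
for case in cases:
    print(case.scenario)
```

In a real pipeline the stub would presumably be replaced by a call to a language model, and the generated cases would be reviewed before use; both points are assumptions rather than details from the source.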

AI-Assisted Engineering Workflows

Bloom exemplifies the use of AI to accelerate engineering tasks. Advanced language models generate the evaluation test cases, while humans specify intent at a higher level. The aim is to amplify engineers rather than replace them. The same pattern applies to code generation, documentation, testing, and other engineering domains where AI can handle low-level synthesis while humans provide high-level direction.

Connections

Influenced by

30. Natural Emergent Misalignment from Reward Hacking in Production RL

Jul 2025

Influences

58. Petri 2.0: Automated Behavioral Auditing at Scale

Jan 2026

61. A3: Automated Alignment Agent

Mar 2026