Reasoning models get tools
Product Announcement
Released o3 and o4-mini, next-generation reasoning models with tool use capabilities. o3 achieved state-of-the-art results on ARC-AGI and other complex benchmarks, while o4-mini delivered strong reasoning at lower cost and was the best-performing model on AIME 2024 and 2025.
o3: The most capable reasoning model at release, with native tool use integrated into the reasoning loop. It could call tools mid-thought, use the results to update its reasoning, and iterate.
o4-mini: Optimized for speed and cost. Despite being smaller, it achieved best-in-class performance on the AIME 2024 and 2025 mathematics competitions. It also outperformed o3-mini on non-STEM tasks.
The critical advance was that reasoning and tool use were integrated: the model does not just think and then use tools. It thinks with tools, interleaving reasoning steps with code execution and web search.
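The interleaving described above can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not how o3 or o4-mini are implemented: `run_interleaved`, `toy_policy`, `ToolCall`, `FinalAnswer`, and the `calc` tool are all hypothetical names, and the "model" is a hard-coded stand-in. The point it shows is only the shape of the loop: each tool result is appended to the running transcript before the next step is chosen, so later steps can depend on it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class ToolCall:
    name: str       # which tool to invoke
    argument: str   # plain-text argument passed to the tool


@dataclass
class FinalAnswer:
    text: str


Step = Union[ToolCall, FinalAnswer]


def run_interleaved(
    policy: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 8,
) -> Tuple[str, List[str]]:
    """Alternate between choosing a step and running tools until an answer appears."""
    transcript = [f"question: {question}"]
    for _ in range(max_steps):
        step = policy(transcript)  # the next step sees every earlier tool result
        if isinstance(step, FinalAnswer):
            transcript.append(f"answer: {step.text}")
            return step.text, transcript
        result = tools[step.name](step.argument)
        transcript.append(f"tool {step.name}({step.argument!r}) -> {result}")
    raise RuntimeError("no final answer within max_steps")


def toy_policy(transcript: List[str]) -> Step:
    """Hypothetical stand-in for a model: call a tool once, then answer from its result."""
    tool_lines = [line for line in transcript if line.startswith("tool ")]
    if not tool_lines:
        return ToolCall("calc", "17 * 23")
    observed = tool_lines[-1].split(" -> ")[1]
    return FinalAnswer(f"17 * 23 = {observed}")


def calc(expression: str) -> str:
    """Tiny arithmetic tool (assumption): supports only 'a * b' and 'a + b' on integers."""
    left, op, right = expression.split()
    a, b = int(left), int(right)
    return str(a * b if op == "*" else a + b)


if __name__ == "__main__":
    answer, transcript = run_interleaved(toy_policy, {"calc": calc}, "What is 17 * 23?")
    print("\n".join(transcript))
```

A "think, then use tools" design would instead fix the full plan before any tool runs. In this sketch the final answer is built from the tool's output, which is the dependency the paragraph above describes.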