
© 2026 Silvia Seceleanu

Models · OpenAI · Mar 2026

38. GPT-5.4

Native computer use meets frontier reasoning

Product Announcement
Summary

OpenAI released GPT-5.4 with native computer-use capabilities, a 1M-token context window, and state-of-the-art agentic performance. It is the first general-purpose model to surpass human performance on OSWorld-Verified (75% vs 72.4%), and it matches or exceeds professionals in 83% of GDPval occupational comparisons.

Key Concepts

First general-purpose model with native computer-use capabilities for agentic workflows

In Codex and the API, GPT-5.4 is the first general-purpose model released with native, state-of-the-art computer-use capabilities. Agents can operate computers, navigate applications, fill in forms, and carry out complex workflows across software environments, a capability previously limited to specialized models.

Surpasses human performance on OSWorld-Verified (75% vs 72.4%) and matches or exceeds professionals in 83% of GDPval comparisons

On OSWorld-Verified, which tests real-world computer interaction, GPT-5.4 achieves a 75.0% success rate, surpassing the human baseline of 72.4% by 2.6 percentage points. On GDPval, which evaluates professional knowledge work across 44 occupations, it matches or exceeds industry professionals in 83.0% of comparisons.

1M-token context with improved token efficiency and an upfront thinking plan

GPT-5.4 supports up to 1 million tokens of context, allowing agents to plan, execute, and verify tasks across long horizons. It adds an upfront thinking plan that allows midcourse adjustments, and it solves problems with significantly fewer tokens than GPT-5.2.

Most factual model yet: 33% fewer false claims and 18% fewer erroneous responses vs GPT-5.2

GPT-5.4 is OpenAI's most factual model to date: compared to GPT-5.2, individual claims are 33% less likely to be false and full responses are 18% less likely to contain errors. This represents a meaningful step toward reducing hallucinations in production use.
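
The 33% and 18% figures above are relative reductions, so the absolute error rate they imply depends on the GPT-5.2 baseline, which this summary does not report. As a minimal sketch of the arithmetic, the snippet below applies each reduction to a baseline rate. The baseline values (10% of claims false, 20% of responses containing an error) and the function name `reduced_rate` are hypothetical, chosen purely for illustration; they are not measured figures.

```python
def reduced_rate(baseline_rate: float, relative_reduction: float) -> float:
    """Return the error rate left after applying a relative reduction.

    Both arguments are fractions in [0, 1]; for example, 0.33 means
    "33% less likely".
    """
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline_rate * (1.0 - relative_reduction)


# HYPOTHETICAL baselines for illustration only; the source gives no
# absolute error rates for GPT-5.2.
hypothetical_claim_baseline = 0.10     # assumed share of false claims
hypothetical_response_baseline = 0.20  # assumed share of responses with an error

# Relative reductions as stated in the summary.
claim_rate = reduced_rate(hypothetical_claim_baseline, 0.33)
response_rate = reduced_rate(hypothetical_response_baseline, 0.18)

print(f"False-claim rate under the assumed baseline: {claim_rate:.1%}")
print(f"Erroneous-response rate under the assumed baseline: {response_rate:.1%}")
```

Under these assumed baselines the rates would fall to roughly 6.7% and 16.4%. With different baselines the absolute numbers change, while the relative reductions stay at 33% and 18%.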

Connections

Influenced by
31. Introducing Operator
Jan 2025
37. GPT-5.2 / Codex
Dec 2025