MODEL

GPT-5.4

model · topic-note · openai

Overview

GPT-5.4 is OpenAI’s latest flagship model line, the fourth incremental release within the fifth major generation. The family includes both thinking and production-optimized variants, along with lightweight alternatives (Mini and Nano). GPT-5.4 improves factual accuracy, context handling, and token efficiency while expanding pricing options for cost-sensitive applications.

Timeline

  • 2026-03-09-AI-Digest - GPT-5.4 model line introduced and initial benchmarks released
  • 2026-03-16-AI-Digest - Thinking variant capabilities and performance analysis
  • 2026-03-18-AI-Digest - Factual accuracy improvements and evaluation results
  • 2026-03-23-AI-Digest - Mini variant benchmarking and lightweight model performance
  • 2026-03-24-AI-Digest - SWE-bench and software engineering evaluation updates
  • 2026-03-27-AI-Digest - Nano variant pricing and efficiency announcements
  • 2026-04-04-AI-Digest - GPT-5.4 Thinking variant scores 75.0% on OSWorld-Verified, surpassing human-level desktop task automation; Responses API extended with a shell tool, an agent execution loop, and hosted container workspaces.
  • 2026-04-05-AI-Digest - GPT-5.4 Thinking result confirmed at 75.0% on OSWorld-Verified, above the 72.4% human baseline.
  • 2026-04-06-AI-Digest - Referenced in vibe coding and agentic workflow discussions alongside Codex.
  • 2026-04-09-AI-Digest - GPT-5.4 ties Gemini 3.1 Pro Preview at the top of the Artificial Analysis Intelligence Index v4.0 with a score of 57, ahead of Claude Opus 4.6 (53) and Meta’s newly launched Muse Spark (52), confirming OpenAI’s continued frontier benchmark leadership even as commercial run-rate trails Anthropic.
  • 2026-04-13-AI-Digest - OpenAI replaces o1-mini with o3-mini as the default reasoning model (3x faster) and launches Flex Compute pricing for o3 with a 30% off-peak discount, signaling inference cost pressure even for flagship reasoning models.
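The agent execution loop mentioned in the Responses API updates above can be illustrated with a minimal sketch. This is not the actual OpenAI API: the `run_model` stub, the message format, and the tool schema are all assumptions standing in for real Responses API calls.

```python
import subprocess

def run_model(messages):
    """Stub standing in for a GPT-5.4 Responses API call (hypothetical).
    Returns a shell tool call on the first turn, then a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "tool": "shell", "command": "echo hello"}
    return {"type": "final", "text": "Command ran successfully."}

def agent_loop(task, max_steps=5):
    """Minimal agent execution loop: call the model, run any requested
    shell command in a subprocess, and feed the output back."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = run_model(messages)
        if reply["type"] == "final":
            return reply["text"]
        # Execute the model-requested command (a hosted container
        # workspace would sandbox this in practice).
        result = subprocess.run(
            reply["command"], shell=True, capture_output=True, text=True
        )
        messages.append({"role": "tool", "content": result.stdout})
    return "Step limit reached."

print(agent_loop("Say hello via the shell"))
```

The loop terminates either when the model emits a final answer or when the step budget is exhausted, which is the usual safeguard against runaway tool-calling.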

Model Variants

GPT-5.4 Base/Thinking

  • Context window - 1M tokens
  • Reasoning capability - Advanced planning and step-by-step inference
  • Factual accuracy - 33% fewer factual errors versus predecessors

GPT-5.4 Mini

  • SWE-Bench Pro - 54.4% performance score
  • Target use case - Balanced performance and cost for general applications
  • Cost profile - Mainstream pricing tier

GPT-5.4 Nano

  • Input pricing - $0.20 per million tokens
  • Output pricing - $1.25 per million tokens
  • Use case - Ultra-low-cost inference for high-volume applications
  • Trade-off - Reduced model capability for minimal latency and maximum throughput
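Given the listed Nano pricing ($0.20 input / $1.25 output per million tokens), per-request cost is a simple linear formula. The helper below is an illustrative sketch based on those two published numbers, not an official billing tool.

```python
NANO_INPUT_PER_M = 0.20   # USD per 1M input tokens (from the specs above)
NANO_OUTPUT_PER_M = 1.25  # USD per 1M output tokens

def nano_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated GPT-5.4 Nano cost in USD for a single request."""
    return (input_tokens * NANO_INPUT_PER_M
            + output_tokens * NANO_OUTPUT_PER_M) / 1_000_000

# A request with 10k prompt tokens and 2k completion tokens:
print(round(nano_cost(10_000, 2_000), 6))  # → 0.0045
```

At these rates a million such requests would cost about $4,500, which is the kind of arithmetic behind the "high-volume applications" positioning.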

Key Specs & Benchmarks

Token Efficiency

  • Token reduction - 47% fewer tokens required than predecessor models
  • Implication - Improved inference speed and reduced API costs for equivalent tasks
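At fixed per-token pricing, a 47% token reduction translates directly into proportional cost and latency savings. A quick back-of-the-envelope check, using illustrative numbers rather than measured figures:

```python
def tokens_after_reduction(predecessor_tokens: int,
                           reduction: float = 0.47) -> int:
    """Tokens GPT-5.4 would use for a task that consumed
    `predecessor_tokens` on a predecessor model, per the 47% figure."""
    return round(predecessor_tokens * (1 - reduction))

# A task that previously consumed 20,000 tokens:
print(tokens_after_reduction(20_000))  # → 10600
```

Since API billing is per token, the same 47% factor applies to the cost of the reduced portion of the traffic.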

Context Handling

  • Maximum context - 1M tokens across all variants
  • Practical advantage - Support for full-document analysis and extended reasoning
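Even with a 1M-token window, it is worth checking that a document plus the expected output fits before sending it. The sketch below uses the common ~4-characters-per-token heuristic, which is an assumption; an exact count requires the model's actual tokenizer.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, per the specs above

def fits_in_context(text: str, reserved_output: int = 8_192) -> bool:
    """Rough feasibility check: estimate input tokens with the
    ~4 chars/token heuristic and reserve room for the reply."""
    estimated_tokens = len(text) // 4
    return estimated_tokens + reserved_output <= CONTEXT_WINDOW

# ~500k characters ≈ 125k estimated tokens, well under 1M:
print(fits_in_context("word " * 100_000))  # → True
```

For full-document analysis workloads, a check like this decides between a single long-context call and a chunked pipeline.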

Accuracy Improvements

  • Factual errors - 33% reduction across benchmark suites
  • Reasoning quality - Improvement through thinking variant

Pricing Tiers

GPT-5.4 introduces stratified pricing to serve diverse market segments:

  • Pro/Thinking variant - Premium tier for complex reasoning
  • Base variant - Standard production tier
  • Mini - Mid-tier efficiency option
  • Nano - Cost-optimized tier with $0.20/$1.25 token pricing