MODEL

GPT-Realtime-Whisper

model, topic-note, openai, voice

Overview

GPT-Realtime-Whisper is OpenAI's streaming speech-to-text (STT) model, released to the API in May 2026 as part of a three-model real-time voice release alongside GPT-Realtime-2 and GPT-Realtime-Translate.

Timeline

  • 2026-05-11-AI-Digest: OpenAI releases GPT-Realtime-Whisper to the API, priced at $0.017/minute, the lowest price of the three real-time voice models. It provides streaming STT, enabling live transcription at a per-minute rate that competes directly with cloud-based transcription services. Billing is by the minute, matching GPT-Realtime-Translate's utility-bandwidth model. The model is positioned as the production-grade streaming transcription tier beneath GPT-Realtime-2's full-reasoning voice surface.

Key Developments

  1. $0.017/Minute Streaming STT: Most cost-efficient tier in OpenAI’s new real-time voice lineup; entry point for applications needing live transcription without the full reasoning overhead of GPT-Realtime-2.
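To make the per-minute price concrete, here is a minimal cost-estimate sketch. Only the $0.017/minute rate comes from this note. The function name `estimate_cost` and the `round_up_to_minute` flag are hypothetical, and whether partial minutes are prorated or billed as whole minutes is an assumption that should be checked against OpenAI's pricing documentation.

```python
import math

# Rate quoted in this note (USD per minute of audio).
RATE_USD_PER_MINUTE = 0.017


def estimate_cost(audio_seconds: float, round_up_to_minute: bool = False) -> float:
    """Estimate transcription cost in USD for a given audio duration.

    round_up_to_minute is an assumption, not confirmed billing behavior:
    when True, a partial minute is charged as a whole minute.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = audio_seconds / 60
    if round_up_to_minute:
        minutes = math.ceil(minutes)
    # Round to 4 decimal places to hide floating-point noise.
    return round(minutes * RATE_USD_PER_MINUTE, 4)


if __name__ == "__main__":
    print(estimate_cost(90))                           # 1.5 minutes, prorated
    print(estimate_cost(90, round_up_to_minute=True))  # billed as 2 minutes
    print(estimate_cost(3600))                         # one hour of audio
```

Under these assumptions, one hour of continuous transcription comes to roughly $1.02, which is the figure to compare against other transcription services' hourly rates.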

See also: OpenAI, GPT-Realtime-2, GPT-Realtime-Translate, MOC - Developer Tools.