MODEL
GPT-Realtime-Whisper
model, topic-note, openai, voice
Overview
GPT-Realtime-Whisper is OpenAI's streaming speech-to-text (STT) model, released to the API in May 2026 as part of a three-model real-time voice release alongside GPT-Realtime-2 and GPT-Realtime-Translate.
Timeline
- 2026-05-11-AI-Digest: OpenAI releases GPT-Realtime-Whisper to the API at $0.017/minute, the lowest price of the three real-time voice models. It provides streaming STT for live transcription at a per-minute rate that competes directly with cloud-based transcription services. Billing is per minute, matching GPT-Realtime-Translate's utility-bandwidth model. The model is positioned as the production-grade streaming transcription tier beneath GPT-Realtime-2's full-reasoning voice surface.
Key Developments
- $0.017/Minute Streaming STT: Most cost-efficient tier in OpenAI’s new real-time voice lineup; entry point for applications needing live transcription without the full reasoning overhead of GPT-Realtime-2.
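To make the per-minute rate concrete, here is a minimal cost-estimation sketch. Only the $0.017/minute figure comes from this note. The function name `estimate_cost_usd` and the `round_up_minutes` switch are hypothetical, and the note does not say whether partial minutes are billed pro rata or rounded up, so both behaviours are shown as assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP
import math

# Rate taken from this note; everything else below is illustrative.
PRICE_PER_MINUTE_USD = Decimal("0.017")


def estimate_cost_usd(audio_seconds: float, round_up_minutes: bool = False) -> Decimal:
    """Estimate transcription cost for a given audio duration.

    round_up_minutes is a hypothetical switch: the note does not state
    whether partial minutes are billed pro rata or rounded up.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    # Convert via str() so float inputs do not carry binary rounding noise.
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    if round_up_minutes:
        minutes = Decimal(math.ceil(minutes))
    cost = minutes * PRICE_PER_MINUTE_USD
    # Keep four decimal places so sub-cent amounts stay visible.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
```

Under these assumptions, one hour of audio (`estimate_cost_usd(3600)`) comes to $1.02, and a 90-second clip comes to $0.0255 pro rata or $0.034 if minutes are rounded up.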
Related
See also: OpenAI, GPT-Realtime-2, GPT-Realtime-Translate, MOC - Developer Tools.