OpenAI Rolls Out New Realtime Voice Models in API
OpenAI launches realtime voice models in the API for reasoning, translation and transcription, building on GPT-4o realtime audio capabilities.
OpenAI has released new realtime voice models through its API that reason over speech, translate languages and transcribe audio with greater accuracy.
The updates build on GPT-4o's realtime audio capabilities, which the company has refined for low-latency voice interactions over the past year. Developers can now build agents that handle complex spoken tasks without separate pipelines for speech-to-text and text-to-speech.
The models target enterprise use cases in customer support, multilingual meetings and voice-first applications, narrowing the gap between text and spoken intelligence in OpenAI's voice model API.
OpenAI
@OpenAI
Leading AI research organization developing transformative technologies like ChatGPT while pursuing beneficial artificial general intelligence.