OpenAI has launched a public beta of the Realtime API, an API that lets paid developers build low-latency, multimodal experiences combining text and speech in their apps.
Launched October 1, the Realtime API, similar to OpenAI's ChatGPT Advanced Voice Mode, supports natural speech-to-speech conversations using the preset voices the API already supports. OpenAI is also introducing audio input and output in the Chat Completions API to cover use cases that don't need the Realtime API's low-latency benefits. Developers can pass text or audio inputs into GPT-4o and have the model respond with text, audio, or both.
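The audio support in the Chat Completions API can be sketched as a request payload that sends audio in and asks for text and audio back. This is a minimal sketch based on the parameters OpenAI announced at launch; the model name (`gpt-4o-audio-preview`), the `modalities` and `audio` fields, and the `input_audio` content type are assumptions to verify against the current API reference.

```python
import base64
import json


def build_audio_chat_request(prompt: str, wav_bytes: bytes) -> dict:
    """Build a Chat Completions payload with audio input and
    both text and audio requested as output."""
    return {
        "model": "gpt-4o-audio-preview",  # assumed audio-capable model name
        "modalities": ["text", "audio"],  # ask for text and audio in the reply
        "audio": {"voice": "alloy", "format": "wav"},  # one of the preset voices
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "input_audio",
                        "input_audio": {
                            # audio is sent base64-encoded in the JSON body
                            "data": base64.b64encode(wav_bytes).decode("ascii"),
                            "format": "wav",
                        },
                    },
                ],
            }
        ],
    }


payload = build_audio_chat_request("What is said in this clip?", b"\x00\x01")
print(json.dumps(payload)[:60])
```

A real call would POST this payload to the Chat Completions endpoint with an API key; the response would then carry text and/or base64 audio depending on the requested modalities.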
With the Realtime API and the audio support in the Chat Completions API, developers no longer have to chain together multiple models to power voice experiences. They can build natural conversational experiences with just one API call, OpenAI said. Previously, creating a similar voice experience required developers to transcribe audio with an automatic speech recognition model such as Whisper, pass the text to a text model for inference or reasoning, and play back the model's output using a text-to-speech model. This approach often lost emotion, emphasis, and accents, and added latency.
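The single-call flow described above runs over one persistent WebSocket rather than three separate model calls. The sketch below builds the client events for one voice turn; the URL, event names (`session.update`, `input_audio_buffer.append`, `response.create`), and session fields follow OpenAI's beta announcement but are assumptions to check against the current Realtime API documentation.

```python
import base64
import json

# Assumed beta endpoint; the model is selected via the query string.
REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"


def make_events(pcm_chunk: bytes) -> list[dict]:
    """Build the client events for one voice turn: configure the session,
    stream an audio chunk, then ask the model to respond."""
    return [
        {
            "type": "session.update",
            "session": {"modalities": ["text", "audio"], "voice": "alloy"},
        },
        {
            "type": "input_audio_buffer.append",
            # audio chunks travel base64-encoded inside JSON events
            "audio": base64.b64encode(pcm_chunk).decode("ascii"),
        },
        {"type": "response.create"},
    ]


# A real client would open the WebSocket (e.g. with the `websockets` package),
# send json.dumps(event) for each event, and play the audio deltas streamed back.
events = make_events(b"\x00\x01\x02\x03")
print(events[0]["type"], events[-1]["type"])
```

Because the model hears the raw audio directly, the emotion, emphasis, and accents that the old transcription step discarded stay available to it.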