Thursday, July 4, 2024

OpenAI unveils voice cloning AI model Voice Engine


Not content to disrupt merely text generation, imagery, and video with its various AI models, ChatGPT-maker OpenAI is also getting into the last major form of legacy digital media: audio. Specifically, voice cloning.

The company today is announcing its latest AI model, “Voice Engine,” which it says has been in development since 2022 and currently powers OpenAI’s text-to-speech API and the new ChatGPT Voice and Read Aloud features unveiled earlier this month.

As it turns out, the model can also perform voice cloning. Here’s how it works: a human speaker records a 15-second clip of their voice through a phone or computer microphone, and OpenAI’s Voice Engine generates “natural-sounding speech that closely resembles the original speaker,” which can then be used to speak aloud any text that a human user types in.

Huge implications for spoken audio market

The tech clearly has huge implications for people who record themselves speaking often, be they podcasters, voice-over artists, spoken word performers, audiobook and advertising narrators, gamers, streamers, customer service agents, salespersons, or members of many other occupations and disciplines.

It also puts pressure on other companies dedicated to this kind of tech, such as well-funded AI startup ElevenLabs, Captions, Meta, WellSaid Labs, MyShell, and others.

OpenAI further highlights Voice Engine’s capability to provide support for non-verbal individuals, offering them unique, non-robotic voices, and to aid therapeutic and educational programs for those with speech impairments or learning needs.

Initial use cases

OpenAI said in its blog post announcing Voice Engine today that so far, it has only made the tech available to a “small group of trusted partners.” Among those highlighted and named are:

  1. Age of Learning, an education technology company, uses Voice Engine and GPT-4 to generate pre-scripted and real-time personalized voice content, expanding learning support and interactivity for a diverse student audience.
  2. HeyGen, an AI visual storytelling platform that enables creators and businesses to translate their content into multiple languages, employs Voice Engine for video translation, creating custom human-like avatars with multilingual voices and preserving the original speaker’s accent to reach a global audience.
  3. Dimagi, a software company making tools for community health workers, uses Voice Engine and GPT-4 to provide interactive feedback in various languages for those workers, improving essential service delivery in remote settings.
  4. Livox, an AI app for Augmentative and Alternative Communication (AAC) devices used by those with speech and hearing difficulties, integrates Voice Engine to provide unique, non-robotic voices across languages for non-verbal individuals.
  5. The Norman Prince Neurosciences Institute at Lifespan, a nonprofit medical and teaching organization at Brown University dedicated to helping those with neurological diseases and disorders, is using Voice Engine to help those with speech impairments use an AI version of their voice. Two doctors there, Rohaid Ali and pediatric neurosurgeon Konstantina Svokos, have already successfully restored a brain tumor patient’s speech using an audio sample from one of her school project videos.

The company uploaded to its blog, and emailed to VentureBeat under embargo, several audio samples showing the tech’s humanlike speaking capabilities. For example, here’s the original “source voice” of Lifespan’s patient:

And here’s the cloned voice using OpenAI Voice Engine:

Limited user base by design

Yet for now, the tech is limited. As with its powerful, highly realistic and vivid video generation AI model Sora, OpenAI is not currently allowing the public to use Voice Engine. Instead, today OpenAI is merely sharing the existence of the tool and “preliminary insights and results from a small-scale preview” with “a small group of trusted partners” who’ve been given access.

As OpenAI states in its blog post today announcing the tech:

“We’re taking a cautious and informed approach to a broader release due to the potential for synthetic voice misuse. We hope to start a dialogue on the responsible deployment of synthetic voices and how society can adapt to these new capabilities. Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale.”

The cautious, slow-and-steady, limited-access approach to releasing Voice Engine makes sense, especially in light of U.S. President Joseph R. Biden’s recent call to “ban AI voice impersonation.”

Central to OpenAI’s deployment strategy is a stringent adherence to safety and ethical guidelines. Partners involved in testing Voice Engine are bound by usage policies that prohibit unauthorized impersonation and require informed consent from voice donors.

Additionally, OpenAI has implemented safety measures such as watermarking and proactive monitoring to ensure the technology’s responsible use.
