Saturday, March 30, 2024

OpenAI’s voice cloning AI model only needs a 15-second sample to work

OpenAI is offering limited access to a text-to-voice generation platform it developed called Voice Engine, which can create a synthetic voice based on a 15-second clip of someone’s voice. The AI-generated voice can read out text prompts on command in the same language as the speaker or in a number of other languages. “These small scale deployments are helping to inform our approach, safeguards, and thinking about how Voice Engine could be used for good across various industries,” OpenAI said in its blog post.

Companies with access include the education technology company Age of Learning, visual storytelling platform HeyGen, frontline health software maker Dimagi, AI communication app creator Livox, and health system Lifespan.

In these samples posted by OpenAI, you can hear what Age of Learning has been doing with the technology to generate pre-scripted voice-over content, as well as to read out “real-time, personalized responses” to students written by GPT-4.

First, the reference audio in English:

And here are three AI-generated audio clips based on that sample:

OpenAI said it began developing Voice Engine in late 2022 and that the technology has already powered preset voices for the text-to-speech API and ChatGPT’s Read Aloud feature. In an interview with TechCrunch, Jeff Harris, a member of OpenAI’s product team for Voice Engine, said the model was trained on “a mix of licensed and publicly available data.” OpenAI told the publication the model will only be available to about 10 developers.

AI text-to-audio generation is an area of generative AI that’s continuing to evolve. While most efforts focus on instrumental or natural sounds, fewer have focused on voice generation, partly because of the questions OpenAI cited. Some names in the space include companies like Podcastle and ElevenLabs, which provide AI voice cloning technology and tools the Vergecast explored last year.

According to OpenAI, its partners agreed to abide by its usage policies, which say they will not use voice generation to impersonate people or organizations without their consent. It also requires the partners to get the “explicit and informed consent” of the original speaker, to not build ways for individual users to create their own voices, and to disclose to listeners that the voices are AI-generated. OpenAI also added watermarking to the audio clips to trace their origin and actively monitors how the audio is used.

OpenAI suggested several steps that it thinks could limit the risks around tools like these, including phasing out voice-based authentication to access bank accounts, policies to protect the use of people’s voices in AI, greater education on AI deepfakes, and development of tracking systems for AI content.
