OpenAI built a voice cloning tool, but you can't use it... yet

As deepfakes proliferate, OpenAI is refining its voice cloning technology while stressing responsible use.

OpenAI is previewing Voice Engine today, a new extension of its text-to-speech API. The tool lets users create a synthetic voice from a 15-second voice sample. It is not yet available to the public: OpenAI says it is taking time to ensure responsible deployment and to address potential misuse.

Jeff Harris, a member of OpenAI’s product staff, states, “We are focused on understanding the risks associated with this technology and implementing safeguards to mitigate those risks.”

Model Development

The AI model behind Voice Engine has been in use for some time. It already powers voice features in ChatGPT and is used by Spotify, among others. OpenAI has not disclosed the exact training data, saying only that the model was trained on a mix of licensed and publicly available data.

OpenAI faces legal challenges alleging that it infringed copyright by training its AI models on copyrighted content. The company has licensing agreements with certain providers and allows artists to remove their work from training datasets. It maintains that fair use protects its practices.

Voice Synthesis

Surprisingly, Voice Engine is not fine-tuned on individual user data. Instead, the model combines a diffusion process with a transformer and generates speech directly from a short audio sample and the text to be read, so no custom model is built for each speaker. OpenAI claims this approach produces higher-quality speech than competing tools.
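To make the difference between fine-tuning and conditioning concrete, here is a toy Python sketch. It is not OpenAI's implementation and makes no claim about how Voice Engine works internally. The names `embed_speaker`, `embed_text` and `generate` are hypothetical, the "audio" is just a short list of numbers, and a simple noise-to-target refinement loop stands in for a real diffusion model (there is no transformer here at all). The only point it illustrates is that the reference clip is an input to generation, and that no parameters are updated for a new speaker.

```python
import random

def embed_speaker(sample):
    """Toy speaker encoder: reduce a reference clip to (mean, spread)."""
    mean = sum(sample) / len(sample)
    spread = (sum((x - mean) ** 2 for x in sample) / len(sample)) ** 0.5
    return mean, spread

def embed_text(text):
    """Toy text encoder: map each character to a value in [0, 1)."""
    return [(ord(ch) % 32) / 32.0 for ch in text]

def generate(text, speaker_sample, steps=20, seed=0):
    """Refine random noise toward a target shaped by both conditioning inputs.

    Nothing is trained or updated here: the speaker clip is only an input.
    """
    rng = random.Random(seed)
    mean, spread = embed_speaker(speaker_sample)
    # The target depends on the text (what to say) and the speaker (how it sounds).
    target = [mean + spread * (2.0 * v - 1.0) for v in embed_text(text)]
    x = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    for t in range(steps):
        rate = 1.0 / (steps - t)  # grows each step; the final step lands on the target
        x = [xi + rate * (ti - xi) for xi, ti in zip(x, target)]
    return x

# Hypothetical stand-ins for two short reference clips.
voice_a = [0.1, 0.3, 0.2, 0.4]
voice_b = [0.5, 0.9, 0.1, 0.7]

out_a = generate("hello", voice_a)
out_b = generate("hello", voice_b)
print(len(out_a), out_a != out_b)
```

The same `generate` function serves both speakers. Swapping the reference clip changes the output without any per-speaker training step, which is the property the article describes.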

Official pricing has not been announced, though Voice Engine is expected to be priced competitively for synthetic voice generation. For now, it offers no controls for customizing the generated voice.

Voice Talent Perspective

OpenAI’s technology could disrupt the voice acting industry by offering a cheaper synthetic alternative. While some platforms attempt to balance voice actors' rights with AI integration, OpenAI emphasizes responsible usage and requires explicit consent from the person whose voice is cloned.

Ethical Considerations

Concerns around misuse of voice cloning technology, including generating deepfakes for malicious purposes, are real. OpenAI is taking proactive steps to prevent misuse by watermarking cloned voices and limiting access to a select group of developers initially. The company remains committed to releasing the technology safely.

Future Plans

Depending on how the pilot phase goes, OpenAI may expand the availability of Voice Engine. The company is exploring security measures such as verifying users by having them read a text aloud. It says that learning about and mitigating safety risks remain its top priorities.
