You’ve probably heard of deepfakes for images and videos, those strangely realistic clips created with AI. Now Meta (formerly known as Facebook) has developed a new AI model called Voicebox that is all about audio. It’s like a supercharged text-to-speech system that can create synthetic voices from a simple text prompt.
What is Voicebox?
At its core, Voicebox is an AI model that creates synthetic voices based on simple text prompts. In other words, you give it text and it will read it aloud in a human-sounding voice. It’s similar to the text-to-speech feature you might use on your phone or computer, but it takes things to a whole new level.
One thing that sets Voicebox apart is its ability to reproduce a specific voice style from a very short audio sample – we’re talking as little as two seconds! This means you could potentially have a synthetic voice that sounds like your favorite celebrity or even like you. It’s almost like having an on-demand voice actor, ready to say whatever you want in the voice style of your choice.
Competing AI voice models
Speechify and ElevenLabs are also players in the text-to-speech game. Speechify is an application that turns any text into audio. It can read books, articles, notes, emails, PDFs, pictures, and web pages aloud. Speechify also claims to offer voice cloning, voice editing, and voice sampling features. Speechify offers hundreds of free timeless audiobooks, has a desktop app, and is designed to help people with reading difficulties.
ElevenLabs, on the other hand, is a startup that uses AI to generate synthetic voices with contextual emotions and natural language understanding. It offers a platform to create and customize high-quality spoken audio in any voice and style for various industries, such as video games, animation, personal digital assistants, education, entertainment, advertising and podcasting. It also has a tool for detecting synthetic voices and verifying their authenticity. ElevenLabs works with actors who provide voice samples and get paid when their voice clones are used, and it relies on proprietary deep learning models to generate its AI speech.
They’re both pretty cool, but they don’t quite have the same versatility as Voicebox, which can imitate real voices from a few seconds of audio. It’s like comparing a Swiss army knife to a few really good spoons. They all have their uses, but one is definitely more versatile.
The power of Voicebox
But it’s not just about creating fake voices. Voicebox can also tidy up your audio by removing distracting background noise – say, a barking dog while you’re trying to record. And it’s not just English. This AI also speaks French, Spanish, German, Polish and Portuguese and can even translate passages from one language to another while keeping the same style of voice.
Meta Voicebox: breakthrough or threat?
Unfortunately, or fortunately, depending on where you stand on AI, Meta doesn’t plan to release Voicebox to the public just yet. This makes people wonder whether the company is trying to avoid certain potential problems. For example, AI voice technology can be misused in harassment campaigns. Or it could be that Meta has future plans to make money from this model.
The source of Voicebox’s huge training data
One interesting thing about Voicebox is that it was trained on a ton of data – over 60,000 hours of speech from English audiobooks and another 50,000 hours from multilingual audiobooks. Meta says they used public domain audiobooks as their primary source of data, but they also used other sources such as podcasts, speeches, and radio shows. However, there are some challenges and limitations associated with using public domain audiobooks, such as quality, consistency, alignment, and speaker identity. Meta claims that they have addressed some of these issues with their data processing and model design.
The double-edged sword of technology
The rise of AI voices is a bit of a tricky subject, especially for voice actors and, more recently, writers. They fear that companies will use AI to synthesize their voices without paying them. The audiobook market has grown a lot and companies are always looking to cut costs. This could therefore become another problem for voice professionals.
Make no mistake though; it’s not just about jobs. There are real concerns about deepfake voices being used in scams. For example, there was a case where a synthetic voice impersonating a CEO was used in a major heist. There’s also the fear that fake voices could be used to defeat voice biometric systems, which are used for things like online banking.
You see, as cool as this technology may seem, there is a darker side. Imagine you receive a call from your boss asking you to transfer a huge sum of money to close an account. You do what you’re told because, well, he’s your boss. Except it wasn’t him. That’s right; it was a deepfake voice created using AI that sounded like your boss. Wild, isn’t it? But this is not a movie plot; it really happened! It was one of the first times a fake voice was used in a heist, and it left law enforcement and AI experts scratching their heads.
And it’s not just heists. Deepfake voices can be used to trick systems that rely on voice recognition. We’re talking about things like online banking, which can use your voice as a form of identification. If criminals can create a convincing fake version of your voice, they could potentially gain access to your accounts. It’s a bit like forging a signature but with your voice instead.
Countering the threat of deepfakes
So while we marvel at the amazing things technology can do, it’s also important to be aware of the potential risks and stay one step ahead. It’s like a high-tech game of cat and mouse, with AI experts and companies working hard to spot and stop these fake voices before they can do any harm.
Luckily, there are people trying to combat the potential misuse of deepfake voices. For example, some countries have started passing laws to regulate deepfakes. Additionally, there are projects such as the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof), where scientists and engineers work on ways to counter spoofed voice attacks.
Kurt’s main takeaways
We are in an age where technology is evolving at breakneck speed and changing the way we work, communicate and even hear things. While the potential of AI like Meta’s Voicebox is undoubtedly exciting, it’s clear that we also need to tread carefully. There’s a fine line between innovation and invasion, a balance we’re all trying to find.
With all these advances and potential risks, what do you think of the future of AI and deepfake technology? Do you see it as a boon or a curse? Let us know by writing to us at Cyberguy.com/Contact
For more of my security alerts, subscribe to my free CyberGuy Report newsletter by going to Cyberguy.com/Newsletter
Copyright 2023 CyberGuy.com. All rights reserved.