Beware Deepfake Voice Scams

Your work phone rings. You answer to be greeted by the unmistakable voice of your CEO. She asks you to wire over some cash, so you do it right away. After all, why would you second-guess the boss?

A new threat technique may soon give you pause for thought. Welcome to the world of deepfake audio, in which scammers use AI software to clone the voices of key figures within organizations. From here, the fraudsters can hoodwink a company’s minions into doing their bidding.

Here’s what we know so far, and what you can do to minimize the risks.


Is voice cloning a real threat yet?

At this stage, deepfake voice scamming is still an emerging threat rather than an established one. The Wall Street Journal recently reported on a scam worth more than $200,000 that illustrates what we could soon be up against. Here’s what happened…

The CEO of an unnamed UK-based energy firm received a telephone call from someone he thought was his boss, the chief of the firm’s German parent company.
The caller told the UK CEO to transfer the equivalent of $243,000 to a Hungarian supplier within the hour.
The fraudster behind this incident appears to have used AI-based software to mimic the German CEO’s voice over the phone. The UK CEO recognized his boss’ slight German accent and the distinct cadences in his voice.
After the money was transferred to the bank account in Hungary, it was subsequently sent on to Mexico before being distributed to other locations.

Pindrop, a company specializing in detecting voice fraud, says that it has come across around a dozen instances of deepfake audio scams so far.

How does it work?

The technology behind this type of scam is advancing quickly.

For instance, in July last year, the BBC reported findings from the security firm Symantec. At the time, Symantec said it had seen three cases of deepfake audio being used to trick senior executives.

But as the article pointed out, mounting such a scam would require hours of good-quality audio: things like corporate videos, media appearances and conference keynote presentations. That’s because you generally need lots of data in the form of speech extracts to train and hone a convincing voice model. This scamming technique was doable in theory, but was both expensive and time-consuming to put into practice.

Shortly after this, however, news came through of an open-source GitHub project that enables anyone to effectively clone a voice from as little as five seconds of sample audio. You just have to input a short voice sample and the model can deliver text-to-speech utterances right away.

The results are reportedly pretty impressive with just a single short sample. However, the more audio you can feed into the model, the better. If your CEO appears in even a very short YouTube video, it raises the possibility of someone creating an entirely convincing scamming tool.

Voice cloning is becoming ever more accessible, so it will come as little surprise if it is used a lot more frequently as a means of attack.

Tips for staying safe?

Voice cloning is basically a high-tech variant of ‘vishing’ - i.e. where fraudsters use phone calls to scam people into handing over money or revealing personal information. It’s just that in a classic vishing attempt, the fraudster usually pretends to be from another company, and relies on you having no knowledge of the voice of the person they are pretending to be.

Voice cloning, by contrast, relies on familiarity to lull you into a false sense of security: you recognise the voice, so you do what it says.

Other than telling your senior managers to take a vow of online silence, there’s not a lot you can do to prevent your organization from being targeted. That said, there are certain procedures you can follow to stop any scam attempt in its tracks.

Multi-factor authorization. Scams like these work on a certain assumption: that all it takes is a single voice message from one individual, and a particular action will be carried out. For sensitive actions such as financial transactions and the handing over of information, you should have set processes in place. These processes should be made up of more than one step, ideally involving different channels. In the case of a big transaction, for instance, you should stipulate that requests over the telephone must always be backed up by an email message. It’s much less likely that a fraudster has both cloned a key insider’s voice and hacked their email account.

The rules apply to everyone. Voice cloning fraudsters make a further important assumption: that people will always do what the boss tells them. But of course, if the boss is following the same rules as everyone else, they won’t be getting on the phone to order a cash transfer. Instead, they’ll be following the same procedure as everyone else. An out-of-the-blue call would immediately be flagged as suspicious, no matter how convincing it sounded.

It’s a reminder that for security policies to work, you need buy-in from everyone within an organization, no matter how important!
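To make the two tips above a little more concrete, here is a minimal sketch of how such a rule might look if it were written down as code. Everything in it is an assumption made for illustration: the names TransferRequest and is_approved, the threshold of 10,000 and the requirement of two channels are hypothetical, not taken from any real payment system.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical policy values, chosen purely for illustration.
REQUIRED_CHANNELS = 2          # e.g. a phone call plus an email
SENSITIVE_THRESHOLD = 10_000   # amounts at or above this need extra confirmation


@dataclass
class TransferRequest:
    requester: str
    amount: float
    # Channels on which the request has been independently confirmed.
    confirmed_channels: set[str] = field(default_factory=set)

    def confirm(self, channel: str) -> None:
        # Normalise case so "Phone" and "phone" count as one channel.
        self.confirmed_channels.add(channel.lower())


def is_approved(request: TransferRequest) -> bool:
    """Approve only if enough distinct channels have confirmed the request.

    The requester's seniority is deliberately never consulted: the same
    rule applies to the CEO as to everyone else.
    """
    confirmations = len(request.confirmed_channels)
    if request.amount < SENSITIVE_THRESHOLD:
        return confirmations >= 1
    return confirmations >= REQUIRED_CHANNELS


request = TransferRequest(requester="CEO", amount=243_000)
request.confirm("phone")
print(is_approved(request))   # False: a convincing voice alone is not enough
request.confirm("email")
print(is_approved(request))   # True: confirmed on a second, separate channel
```

The point of the sketch is the shape of the rule rather than the numbers: a single phone call, however familiar the voice, never clears a large transfer on its own, and nothing in the check depends on who is asking.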


Nathan House

Nathan House is the founder and CEO of StationX. He has over 25 years of experience in cyber security, where he has advised some of the largest companies in the world. Nathan is the author of the popular "The Complete Cyber Security Course", which has been taken by over half a million students in 195 countries. He is the winner of the AI "Cyber Security Educator of the Year 2020" award and finalist for Influencer of the year 2022.

Alishia says:

October 20, 2020 at 3:46 pm

This scam seems to be a new challenge for business organizations and can be considered as a negative side of AI.

Prasad says:

October 21, 2020 at 1:59 pm

Nice post! Need to be vigilant to avoid getting trapped.

Andrew says:

October 21, 2020 at 3:33 pm

That means preventing voice deep fakes will favour those organisations which are less autocratic, where subordinates do not automatically jump into “obedience mode” when the big boss makes demands.

Marcos says:

October 28, 2020 at 2:27 am

Thanks for sharing! May it be used to stole bank identity? Some banks use voice to authenticate users. Sounds really bad!

Kyle says:

November 5, 2020 at 1:09 am

Looks like we’re entering an era of “zero-trust communications”, where face, voice, and natural language deepfakes will require digitally signing everything on pre-approved channels.

Imagine having to revamp the modern telephone system to achieve this, or better yet, designing an app that makes keys for users, and makes it as easy as clicking the Phone app to get verification that you and other parties are authenticated and who they say they are.

It could be a Phone app-specific PGP-like system with no configuration by the user needed that integrates with other phone systems used in business applications. Perhaps it could also transfer a nonaudible/high-pitched preamble tone sequence for each call to transfer the key. Spitballing pretty heavily here, but there could be a market for this.
