Threat Vector 4.18.24
Ep 20 | 4.18.24

Defending against Adversarial AI and Deepfakes with Billy Hewlett and Tony Huynh

Transcript

Tony Huynh: In today's digital world where AI can create just about anything, it's wise to just double check before believing everything you see online. [ Music ]

David Moulton: Welcome to "Threat Vector," a podcast where Unit 42 shares unique threat intelligence insights, new threat actor TTPs and real-world case studies. Unit 42 has a global team of threat intelligence experts, incident responders, and proactive security consultants, dedicated to safeguarding our digital world. I'm your host, David Moulton, director of thought leadership for Unit 42. [ Music ] In today's episode, I'm sharing a conversation with two AI experts about adversarial AI and deepfakes. As organizations continue to leverage artificial intelligence to fortify their defenses, malicious actors are leveraging the same technology to breach them, leading to yet another round of the age-old game of cat and mouse. [ Music ] My guests today are Billy Hewlett, Senior Director of AI Research at Palo Alto Networks, and Tony Huynh, Security Engineer specializing in AI and deepfakes. Billy, can you start by introducing yourself?

Billy Hewlett: Sure. Billy Hewlett, Senior Director of AI Research. I lead an AI research group that tries to use AI to solve cybersecurity problems.

David Moulton: How did you get into AI?

Billy Hewlett: You know, I used to build video games. I did AI for video games. When I finished my undergrad in 2000, there were no AI jobs, if you can believe it. And I really wanted to do AI, so I did AI for video games.

David Moulton: Tony, can you introduce yourself?

Tony Huynh: Yeah, Tony Huynh, expertise consultant. I work on the XDR platform. I work with clients and help them learn and mature their use of XDR.

David Moulton: Tony, when did you get into deepfakes?

Tony Huynh: I got into deepfakes before joining Palo Alto, really just in a home lab, running ransomware, testing and researching.

David Moulton: Tony, I'm curious, what got you interested in deepfakes?

Tony Huynh: It was a TV show called America's Got Talent. One of the companies did a deepfake with a live voice, and it was just the wow factor. And then being in security, you know, seeing the wave of CEO deepfakes, that's what started me on it.

David Moulton: Billy and Tony, thanks for telling us a little bit about yourself and how you got into this field. And I guess I'll start with you, Billy. What are some of the potential risks that you see from adversarial AI for cybersecurity?

Billy Hewlett: Yeah. So academically, there's this idea of adversarial AI, and that's how you use AI to defeat AI. So all of our products use artificial intelligence. For example, in filtering, we're trying to take a webpage, we're trying to decide if it's malicious or benign, and we have to do this very, very quickly, and we have to do this at scale. And the only way to do that kind of thing is with artificial intelligence. So, you know, if we look at Cortex, if we look at WildFire, if we look at DNS, if we look at DLP, SaaS, all of the products in some way use artificial intelligence. So then the question is, does the attacker use artificial intelligence to fight against that AI? And that is what adversarial AI is. It's AI versus AI.

David Moulton: So the potential risk there is that the attacker gains the upper hand because they have the better, faster, stronger tools?

Tony Huynh: Right. So, you know, the cat and mouse game between attacker and defender has been going on for a very long time, for as long as cybersecurity has existed. There have been, you know, defenses and attacks against them. And this is sort of the latest wave of attacks where attackers are now using AI themselves in order to perform these attacks.

David Moulton: So is this about strengthening attacker AI or weakening defender AI?

Billy Hewlett: It's a technique for strengthening attacker AI. Imagine that you have very static defenses. And I will say, our defenses are not static. But suppose you have static defenses. Then you can imagine an AI trying to go and figure out where the hole is in those defenses. So it's more effective than perhaps a human expert trying to find that. And perhaps it's cheaper than a human expert for the attacker to try to find a hole in those defenses. Now, our defenses are not stationary. There are many classifiers that retrain every day, because we have so much data. And there are other techniques that we can use to even better protect ourselves against adversarial AI.

David Moulton: Billy, how are adversarial attacks carried out? And can you talk about some of the most common techniques?

Billy Hewlett: Sure. I mean, I think the most common technique is something where you do some sort of gradient descent on the defenses. So imagine that I'm trying to classify malware. What I do is say, okay, this thing is malware, and this is how sure I am it's malware -- 50% sure, 90% sure. And there's some threshold: if I go above that threshold, I'm going to call it malware; if I go below it, I'll call it benign. Now what the attacker can do is take a whole array of malicious and benign files and give them to us, or to whoever they're attacking, and get the answers back. And along with all those answers, they also get numbers, like how sure are you that it's malware.

David Moulton: A kind of confidence score?

Billy Hewlett: A confidence score, exactly. All right, and so imagine these numbers form a space -- imagine just like a 3-D space. What the attacker is trying to do is find a way to make a malware file that is detected as benign. So they have this space and they're looking for the highest hill -- that might be the place where the file gets classified as benign, if you can imagine in this sort of metaphor. So what they're doing is kind of traveling up to find higher spots in this, you know, mountainous terrain. And if you just pick one point and climb to the highest spot, it may not be the highest mountain, so you do that in lots of different places and try to figure it out. You're basically exploring the space, but you have some direction from the investigation you've done of the space. You know some of the character of the space you're attacking, because you've gotten all this information from the defender.
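For readers who want to see the shape of that search, here is a minimal, purely illustrative sketch in Python. The query_defender() function and the feature vector are hypothetical stand-ins; a real attacker would be querying an actual product or service and restricting themselves to edits that keep the malware functional.

```python
# Toy sketch of confidence-score "hill climbing" against a classifier.
# query_defender() is a stand-in for whatever oracle the attacker has.
import math
import random

def query_defender(features):
    """Stand-in for the defender's classifier: returns P(malware)."""
    weights = [0.9, -0.2, 0.7, 0.1, -0.5, 0.6]       # toy fixed model
    score = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))

def neighbors(features, step=0.25):
    """Small one-feature-at-a-time perturbations. A real attacker would
    only use edits that preserve the file's malicious functionality."""
    for i in range(len(features)):
        for delta in (-step, step):
            candidate = list(features)
            candidate[i] += delta
            yield candidate

def hill_climb(start, threshold=0.5, max_queries=500, restarts=5):
    """Greedily walk 'downhill' on the malware-confidence surface, with
    random restarts so one local slope doesn't trap the search."""
    best, best_score = list(start), query_defender(start)
    queries = 1
    for _ in range(restarts):
        current = [x + random.uniform(-0.5, 0.5) for x in start]
        current_score = query_defender(current)
        queries += 1
        improved = True
        while improved and queries < max_queries:
            improved = False
            for cand in neighbors(current):
                score = query_defender(cand)
                queries += 1
                if score < current_score:            # lower score = more "benign"
                    current, current_score = cand, score
                    improved = True
                    break
        if current_score < best_score:
            best, best_score = current, current_score
        if best_score < threshold:                   # crossed the verdict boundary
            break
    return best, best_score, queries

original = [1.0, 0.2, 0.8, 0.1, 0.0, 0.9]            # starts well inside "malware"
evaded, score, used = hill_climb(original)
print(f"P(malware) before: {query_defender(original):.2f}, after: {score:.2f}, queries: {used}")
```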

David Moulton: So how do you do AI recon?

Billy Hewlett: One way is, you know, if you're talking about how you're gathering information, maybe you purchase one of our boxes, or -- you know, there might be some open-source ways of looking at defenses. Or there may be ways where you can ask a certain number of questions and get answers back. The more verbose the answer, the more information it contains, the better you can map out the landscape, and from there, you know, follow the gradients to where you want to go.

David Moulton: So for an attacker, you're looking for the space in the topography that gives you the hold that you need without necessarily giving you the exposure that you don't want.

Billy Hewlett: Right, right. So that's sort of a very 10,000-foot view of what's going on. I think an interesting thing here is, how do you defend against this? So one way, as I sort of alluded to before, is you can have defenses that change a lot. The more you change your defenses, the harder it is for them to game you. Another thing you can do is this thing called "adversarial hardening," which we're now doing in some of our products. And the idea here is, in-house, we'll actually build our own AI that attacks our AI. Right? So we're building AI that attacks against our AI. We hold our defender AI steady and we run an attacker AI against it until the attacker AI has learned enough to get past it. And that gives us examples of, here's how I got past it. We take those examples and feed them back into the original model and say, okay, here are some adversarial attacks that, you know, you couldn't catch. Then we retrain the original model and we go back -- there's an inner loop where we're doing that, and an outer loop. The outer loop goes back and says, okay, here's the new model, see if you can beat this. And then the adversarial AI attacks it again till it comes up with some solution, we feed that back into the original model, and we retrain with these new examples. So this is called "adversarial hardening."
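As a rough illustration of that inner/outer loop, here is a toy sketch using scikit-learn. The data, the linear "defender," and the gradient-based "attacker" are all stand-ins invented for this example, not how any production system is built.

```python
# Minimal sketch of an adversarial-hardening loop: hold the defender steady,
# let an attacker search for evading samples (inner loop), then retrain the
# defender on what got through (outer loop).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n=2000):
    X = rng.normal(size=(n, 8))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)    # 1 = "malware"
    return X, y

def attack(model, X_mal, step=0.2, iters=20):
    """Inner loop: nudge malware samples toward the benign side of the
    decision boundary while staying close to the original sample."""
    X_adv = X_mal.copy()
    for _ in range(iters):
        grad = model.coef_[0]                        # direction of increasing P(malware)
        X_adv -= step * grad / np.linalg.norm(grad)  # walk the other way
        if not (model.predict(X_adv) == 1).any():
            break
    return X_adv[model.predict(X_adv) == 0]          # samples the defender now misses

X, y = make_data()
defender = LogisticRegression().fit(X, y)

for outer_round in range(3):                         # outer loop: retrain and repeat
    evading = attack(defender, X[y == 1])
    print(f"round {outer_round}: attacker found {len(evading)} evading samples")
    if len(evading) == 0:
        break
    # Feed the evasions back in as labeled malware and retrain.
    X = np.vstack([X, evading])
    y = np.concatenate([y, np.ones(len(evading), dtype=int)])
    defender = LogisticRegression().fit(X, y)
```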

David Moulton: Okay, so this sounds like an AI version of red teaming.

Billy Hewlett: Yes, it is an AI version of red teaming, exactly. Of course, all this stuff is done at scale. Huge amounts of data. Huge amounts of attacks. Huge amounts of GPUs to do all this simulation again and again and again. And the end result is you have defenses which are hardened against these sorts of attacks. Now, here's an interesting thing. You may think, what if your adversarial attacks are different than the adversaries' adversarial attacks? Like, the malware authors go off and do their kind of adversarial attack, and you went and did yours, right? So how do you know that you're necessarily protected? And to some extent we don't, but to some extent we can do an experiment. And, you know, this is usually how we do experiments: we try to predict the future. Because we have lots of historical data going back, we can take data from March and try to predict what happens in April, right? So we build a classifier that we train on data from March and then we check how well it does in April. We know exactly what happened in April, because it's March and April of a year ago. So we train on March and we test on April. And then we do the same thing, but we train on March, and using the March data, we do this adversarial training, and then we test on April again. So we're comparing training just on March versus training adversarially on March, and in both cases we're testing against April. And what we found is the model hardened with adversarial examples does better on the April data. So even though we're not sure that our adversarial examples are just like the attackers', it turns out it makes the model a little better anyways. So on one hand we're trying to protect ourselves against adversarial attacks; on the other hand, it's just building a better classifier. We just seem to be more accurate with this adversarial work than we were training without it. So that's a win. We'll take that. Like a double win, you know. We're doing the one thing we were trying to do, and secondly, we also improved the classifier.
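Here is a small sketch of that time-split experiment under the same toy assumptions as before: synthetic "March" and "April" data, a simple linear classifier, and a crude adversarial-example generator. The point is the methodology, train on the past and compare both models on the held-out future month, not the particular numbers this toy produces.

```python
# Sketch of the time-split evaluation: baseline trained on "March" vs. a model
# trained on "March" plus adversarial examples, both tested on "April".
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)

def month(n, drift=0.0):
    """Synthetic monthly feed; 'drift' shifts the malware distribution
    slightly to mimic attackers changing over time."""
    X = rng.normal(size=(n, 8))
    X[: n // 2, 0] += 1.5 + drift                    # first half is malware-ish
    y = np.array([1] * (n // 2) + [0] * (n // 2))
    return X, y

def adversarial_examples(model, X_mal, step=0.3):
    """Crude stand-in attacker: push malware toward the benign side."""
    grad = model.coef_[0]
    return X_mal - step * grad / np.linalg.norm(grad)

X_march, y_march = month(2000)
X_april, y_april = month(2000, drift=-0.4)           # the future looks a bit different

baseline = LogisticRegression().fit(X_march, y_march)

adv = adversarial_examples(baseline, X_march[y_march == 1])
X_hard = np.vstack([X_march, adv])
y_hard = np.concatenate([y_march, np.ones(len(adv), dtype=int)])
hardened = LogisticRegression().fit(X_hard, y_hard)

print("April accuracy, baseline :", accuracy_score(y_april, baseline.predict(X_april)))
print("April accuracy, hardened :", accuracy_score(y_april, hardened.predict(X_april)))
```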

David Moulton: Billy, tell us about the types of adversarial AI attacks that concern you the most.

Billy Hewlett: The attacks that really worry me are these AI-versus-human attacks. We're really, really good at defending our own models. You know, we have lots and lots of work there. There's lots and lots of data. It's much harder to defend humans. I remember I did an internship with Google many years ago, and they said, hackers hack people, right? They hack the computers, they have to get into your system and all that stuff -- and that stuff I think we're really good at. But at some point, the hackers hack people. And attackers have new techniques now for attacking people. We don't have a correspondingly great answer for defenses there.

David Moulton: How do we deal with the fact that we can't upgrade humans as fast as we can upgrade systems? When I interviewed Kyle Wilhoit and Mike Sikorski about AI in the past few months, both shared this concern that AI can now create nearly flawless phishing messages.

Billy Hewlett: I guess the recommendations I would have -- and these go for not just, you know, your workers but your parents, your family members, your kids, right -- is exercise caution. If something doesn't look right, you should follow up on it. Like if you get a strange email from someone you know and it just looks strange, you know, like give them a call, follow up with some other medium, right? I think that's sort of good advice in general in this age where there's more and more attacks.

David Moulton: MFA but in real life, like multifactor authentication, but it's really your brother.

Billy Hewlett: I think that's a good idea.

David Moulton: What examples can you share of AI outwitting humans?

Billy Hewlett: All right, one case that's in the news and one case that's personal. There was an attacker, I think it was a bank in Hong Kong, some organization in Hong Kong. And they used a deepfake to convince someone to transfer $25 million.

David Moulton: That's a significant impact.

Billy Hewlett: And I think that people who have the sort of permissions to transfer that kind of money are well trained and well protected. So that's sort of a shot across the bow. On a personal note, my mom got a voice message from someone purporting to be my sister. What was scary about it was it sounded exactly like her. The attacker's fail, though, was that she was claiming to be my mom's mom. So it was just bizarre -- they didn't get the context right. My mom knew something was up and called my sister: what's going on? And then sort of the game was up. But it's getting more and more prevalent, not just in the organizational sense but also in just going after everyday people.

David Moulton: So let me take it over to you, Tony. Tell me how deepfakes are being used for social engineering attacks. And as an expert and researcher, what are the most viable and maybe the most frightening?

Tony Huynh: It would be CEOs, people in higher positions. The one that Billy mentioned, with the CFO, right -- it was a finance worker, and it was said that there were multiple workers on the call. But imagine, you know, being on that call, your higher-ups in there demanding you send this money. Sure, you get urgent emails, right? Oh, I need you to call me and send me these gift cards. But imagine if, you know, your boss is on a Zoom call with you, yelling at you in his voice, urging you to send money. It's kind of convincing.

David Moulton: So attackers that are able to combine the right details, the right context with deepfake technology, will be tough to defend against. And now we've been talking about deepfakes in the form of video, but are there other forms of adversarial AI generated content that we should be concerned about?

Tony Huynh: Yeah. So we're in the elections right now, and Midjourney has disabled images of the presidential candidates from being generated, and there have been known reports of manipulated images from the campaigns, really. There have also been cases in the past where, you know, people have been blackmailed, supposedly appearing in a computer-generated photo of something they've never actually done.

David Moulton: How are those images created? What sort of tools are out there? And are they hard to get a hold of?

Tony Huynh: No. Some are free. Some you can install locally on your own machine -- Stable Diffusion and Midjourney, for example.

David Moulton: So how do you go about detecting when something like that is generated?

Tony Huynh: It takes an eye for detail, really. Do a good job counting fingers. Another way is you can do a reverse image search.

David Moulton: It kind of reminds me when I was a kid, we'd play "spot the difference" games in magazines. What about dedicated tools to spot generated images?

Tony Huynh: The only one I know of is Google, Google Lens. But it's 50-50.

Billy Hewlett: There are some techniques out there in the academic literature that can work okay on videos, especially like real-time videos. It's quite difficult if, as Tony said, it's just a single image; it's very difficult to say if it's real or generated.

David Moulton: Billy, do you think that this is a call for us to harden society against adversarial AI?

Billy Hewlett: People have to build up their own internal defenses against this, especially people who are higher-risk targets. The defense side I think will catch up eventually. The AI-versus-AI case, I think, you know, it's a fight, but we're working on it; we understand how to do this. But the AI-versus-human case is more tricky. And then it's sort of like, can I get something in the medium between you and the attacker? Can I get something on your cell phone? You know, someone sends you a text message -- will there be a product that can intercept that text message?

Tony Huynh: For the attacker, if your picture's on social media, I can grab that and edit the images and things like that. If you have YouTube videos, podcasts, movies, TikTok, for example, I can pull those down and build a deepfake around that.

David Moulton: There's a lot of talk about deepfakes in politics and fooling people. The swaggy Pope image that was out a while ago definitely fooled some folks. And what I'm wondering is how does this affect enterprises?

Billy Hewlett: I think it definitely does affect enterprises. Just take the example of somebody stealing the $25 million. But that could be anything. That could be getting passwords. That could be, you know, sending gift cards, all kinds of things like that. Or, we need to change the invoice number of the vendor, right? And then next time you pay the vendor, you're not actually paying the vendor. So there's lots of attacks that can go against enterprises out there. What I'm worried about is attackers targeting an individual and working through that individual to get at the bigger business.

Tony Huynh: That's one side of it. But one of the other things is insider threat. Let's say someone joins a call early -- people don't really validate anybody nowadays. Or they join a call with their name on the screen, or even with video on. It might not even be a deepfake; it could just be a person sitting in the call, listening to insider information.

David Moulton: Insiders using AI would definitely be a powerful combination. What are some of the strategies for detecting and mitigating the impact of deepfakes?

Billy Hewlett: I think one thing that is hard to cover up, especially if we're talking about enterprise deepfakes -- this doesn't really help your grandmother that much, but for enterprise deepfakes -- is that we don't have just the deepfake, right? Imagine that you have an email that leads to a Zoom link, that leads to the contents of the Zoom, that leads to a follow-up email asking you to change the routing address of your vendor. In each of these steps, there is a small amount of information that, taken together, adds up -- we don't have just the deepfake. We got this email from someone outside the organization, and they're claiming this is inside information. And you can do analysis along all the steps along the way. That's something people don't really think about: we have all these side channels for determining an attack is going on, in addition to the actual video or the actual audio. And there are techniques around the video and the audio themselves. There's one I've seen that's kind of interesting where you're looking at the blood flow of the face. So if you put, you know, someone else's face on you, you have a static image that you're trying to distort, but the problem is it doesn't get the blood flow of the face quite right. And it turns out that's sort of difficult to do.
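The blood-flow idea Billy mentions is usually called remote photoplethysmography (rPPG): real skin shows a faint periodic color change at the pulse rate, which face swaps often fail to reproduce. The snippet below is a very rough, hypothetical sketch of the signal involved; it assumes you already have aligned face crops for each video frame, and real detectors are far more sophisticated than a single FFT peak check.

```python
# Crude rPPG-style heuristic: does the average green-channel intensity of a
# face show a periodic pulse in the normal heart-rate band (~0.7-4 Hz)?
import numpy as np

def pulse_strength(face_frames, fps=30.0):
    """face_frames: sequence of HxWx3 RGB face crops, one per video frame.
    Returns the fraction of signal energy inside the heart-rate band."""
    signal = np.array([frame[..., 1].mean() for frame in face_frames], dtype=float)
    signal -= signal.mean()                          # remove the DC offset
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)           # ~40-240 beats per minute
    total = spectrum[1:].sum() or 1.0                # skip the DC bin
    return spectrum[band].sum() / total

def looks_like_live_skin(face_frames, fps=30.0, threshold=0.5):
    """Strong energy in the heart-rate band is consistent with real blood
    flow; a flat spectrum is suspicious. Purely illustrative threshold."""
    return pulse_strength(face_frames, fps) >= threshold

# Synthetic demo: a 'real' face pulses faintly at ~1.2 Hz, a 'fake' does not.
t = np.arange(300) / 30.0
real = [np.full((64, 64, 3), 120.0) + 0.5 * np.sin(2 * np.pi * 1.2 * ti) for ti in t]
fake = [np.full((64, 64, 3), 120.0) + 0.5 * np.random.randn() for _ in t]
print("real:", looks_like_live_skin(real), "fake:", looks_like_live_skin(fake))
```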

David Moulton: Tony, what ideas do you have that might be helpful to detect unusual or strange behavior?

Tony Huynh: Billy covered it earlier -- it's more like calling the person up, verifying. And then if you have a phish, report it, even if you just think it could be a phish, right? Your security team is going to review it. So even if you're not sure, better safe than sorry.

David Moulton: You're right, better safe than sorry. How do you see defenses adapting to cope with this?

Billy Hewlett: I feel like the defenses will take in more multimodal information themselves. So they'll take in more information -- who is this person; who are they claiming to be? I'll give you an example in production. So imagine we're trying to figure out if something is just normal bank phishing. Computers are very good at figuring out what's a bank and what's not a bank. They're not very good at figuring out what it looks like. And humans are very good at what looks like a bank, but they're not necessarily good at what actually is a bank. And so what we do is we say, you know, when something looks like a bank -- which we can use artificial intelligence to determine -- and it's not a bank, then that's a problem, right? That's phishing. It's like a smoking gun. So in a similar sense, if someone's trying to make their AI look like President Biden, or whatever it is, and we know that that's not that person, that's a smoking gun, the game is up. And so that might be one way: as the attackers use more context, the defender may be able to use more social context and stuff like that in order to defend.
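A minimal sketch of that "looks like a bank, but isn't the bank" check might look like the following. The brand registry, the domains, and the keyword-based looks_like_brand() stand-in are all made up for illustration; in practice the "looks like" judgment would come from a visual ML model, not a string match.

```python
# Toy version of the looks-like-a-brand vs. is-the-brand phishing check.
from urllib.parse import urlparse

# Hypothetical registry of brands and the domains legitimately allowed to use them.
KNOWN_BRANDS = {
    "examplebank": {"examplebank.com", "login.examplebank.com"},
}

def looks_like_brand(page_text: str):
    """Stand-in for a visual similarity model (logo, layout, colors).
    Here it's just a crude keyword match so the sketch runs end to end."""
    for brand in KNOWN_BRANDS:
        if brand in page_text.lower().replace(" ", ""):
            return brand
    return None

def is_phishing(url: str, page_text: str) -> bool:
    brand = looks_like_brand(page_text)
    if brand is None:
        return False                                 # doesn't imitate anything we track
    host = urlparse(url).hostname or ""
    legitimate = any(host == d or host.endswith("." + d) for d in KNOWN_BRANDS[brand])
    # The smoking gun: it *looks* like the bank but it *isn't* the bank.
    return not legitimate

print(is_phishing("https://examplebank.com/login", "Welcome to ExampleBank"))       # False
print(is_phishing("https://examp1ebank.evil.net/login", "Welcome to ExampleBank"))  # True
```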

Tony Huynh: I know companies are now researching detecting and attributing manipulated media. I think the FTC has put out a contest for who can detect real-time voice cloning, for example. So all that's being researched right now. Detection of deepfakes is out there now, looking at, as Billy mentioned, the blood flow, right? It's being researched, and companies are investing in it right now. In today's digital world where AI can create just about anything, it's wise to just double check before believing everything you see online.

David Moulton: So a healthy digital habit is MFA and zero trust in real life to protect yourself against some of these advances in adversarial AI and deepfakes. Tony, I know we've been talking about video deepfakes, but I'm wondering if you can use some of your tools to give an example of an audio deepfake?

Tony Huynh: Hey, David, Joe Biden here, glad to be on Threat Vector. I hope you like this good voice.

David Moulton: That was pretty good. What else have you got?

Tony Huynh: Hasta la vista. I'll be back.

David Moulton: Tony, those are pretty good. Thanks for giving us that quick demonstration, I really appreciate it. Thank you so much for joining me today on Threat Vector to share your perspectives and some of the things that any listener can do to protect themselves. In the face of these emerging threats, organizations must adopt new tools and execute on the fundamentals of security to protect themselves. From implementing robust AI defenses to educating employees about the dangers of deepfakes, a multipronged approach is essential to mitigate risk effectively. As Billy suggested, people have to build up their own internal defenses, which to me means cultivating a culture of cybersecurity awareness. The convergence of adversarial AI and deepfakes represents a formidable challenge for cybersecurity professionals. By understanding the nature of these threats and implementing proactive measures, organizations can navigate the evolving cybersecurity landscape with resilience and vigilance. That's it for Threat Vector this week. I want to thank our executive producer, Michael Heller, and our content and production teams, which include Sheida Azimi, Sheila Droski, Tanya Wilkins, and Danny Milrad. I edit the show, and Elliott Peltzman mixes the show. We'll be back in two weeks. Until then, stay secure, stay vigilant. Goodbye for now.