Research Saturday 7.1.23
Ep 288 | 7.1.23

The power behind artificial intelligence.


Dave Bittner: Hello, everyone, and welcome to the CyberWire's "Research Saturday." I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems and protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.

Daniel dos Santos: So we were thinking about using generative AI, not just for something that can be used to kind of dupe users into executing malicious code, but how the attackers could use it to actually generate or convert some malicious code and some actual exploits to cause some real damage, more specifically in OT and unmanaged devices.

Dave Bittner: That's Daniel dos Santos, Head of Security Research at Forescout. The research we're discussing today is titled AI-Assisted Attacks Are Coming to OT and Unmanaged Devices. The Time to Prepare is Now.

Dave Bittner: Well, let's walk through this together. I mean, you have this tool, and I think probably for most of our audience, it's a tool that, at the very least, they've played with. How do you approach this? Where do you begin?

Daniel dos Santos: Yeah, so our idea is, we took an exploit which we already had, that we developed for a kind of a proof of concept that we had in the past for an exploit for operational technology, right, for a programmable logic controller, and we wanted to convert that into a different language, right? So the original exploit was written in Python as a short script. We wanted to use that in Go, because we see a lot of malware being written in Go these days, and because, you know, it allows for cross-compilation and to run in different architectures and things like that, so we just wanted to explore Go. The issue is that none of us kind of know Go, so we wanted to explore from the eyes of a complete beginner, let's say, how you would use a generative AI tool to assist in that process. And that actually is interesting, because it simulates something that attackers from time to time have to do, which is taking a public exploit, taking a proof of concept, taking a part of a code that actually exists, and translating that, converting that and bundling that into other malware or other malicious code, to make it more useful, let's say, right? So what we did is we just kind of had a chat with ChatGPT, right, the specific generative AI tool that we used, where we were asking the tool to generate parts of the new code that we wanted it to generate. And in the process, you know, we kind of guided it to where we needed some corrections or things like that, but the whole process took something like 15 minutes, and in the end, we had the working exploit as we intended.

Dave Bittner: Well, tell me about the exploit that you were using in your experiment here. What sort of functionality did it have?

Daniel dos Santos: Yeah, so originally, it was part of a larger attack which we developed, where we wanted to show how adversaries can infiltrate networks from the IoT part of the networks or from IoT devices that are exposed, like IP cameras and network-attached storage devices and things like that, then move laterally to IT computers, servers, workstations, and then from there, move to OT networks and find vulnerable devices like PLCs, and then exploit them. And the exploit in this case is a denial of service, right? It's crashing the device, so that you need to manually reboot it, manually in the sense of like power cycling it, like a hard reboot. So what we did is we took this last component part, the part that crashes the PLC, and we wanted to have a very simple version of it, which was avoiding all the network scanning, avoiding all the parts before and just really focus on the core exploit which is crashing the PLC by sending a malicious packet. So we kind of just -- we had the format of the malicious packet, and we told the tool ChatGPT to proceed to generate the code around it that would create the connection, send it to the specific target, check if the target is alive and things like that.
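
[Editor's sketch: the surrounding code described here, which opens a connection and checks whether a target is alive, is ordinary networking code with no payload of its own. The Python below is a minimal illustration of that kind of check, not the code from the research. The function name is_alive and its parameters are assumptions made for this example.]

```python
import socket


def is_alive(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake;
        # the connection is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

[A call such as is_alive("192.0.2.10", 80) returns a plain boolean. Nothing in the function indicates what a caller intends to do with the answer, which is the dual-use point raised later in the conversation.]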

Dave Bittner: Hm, so help me understand here. Was- the way you're describing it, so this was not a case of, for example, loading in the Python code and saying, "Hey, ChatGPT, convert this for me."

Daniel dos Santos: No, no, it was not exactly like that, because we wanted to change it a little bit, and also because there are some safeguards in tools like ChatGPT. If it identifies that the code is potentially malicious, or, you know, if you're asking, "Convert this malware for me," or something like that, the tool will actually say, you know, "I'm not supposed to generate malicious code," or things like that. So we basically just told the tool to take parts of what we wanted, like, you know, write a scanner in Go for this part of the network, and things like that, and it- yeah, we were driving the parts of the code that we wanted. Of course, we had a part that was actually- you know, some parts of it were like the equivalent of the Python code that we wanted it to translate, but some other parts were just saying, you know, fix this, this thing here, or change this part of the code, and things like that. So it was not just like, put the whole code in, and we get the whole code out. Right? It was a little bit more of a conversation.

Dave Bittner: Yeah. You mentioned that ChatGPT has some safeguards built in. How did you approach this to avoid tripping some of those safeguards?

Daniel dos Santos: Yeah, so in this case, it was really by not mentioning that things are malicious, right, because in this case, it's- it was difficult- it would be difficult for the tool to identify that that's an actual exploit. I mean, you're sending a packet over the network, and the packet could be benign, could be malicious, could be, you know, just checking if a device is alive, so the payload itself wasn't known to be malicious by the tool. So in this case, you can use it without, you know, tripping any of the safeguards, right, because if you mention specific things like write a malware or create an exploit or let me test this vulnerability or things like that, that would become obvious to the tool that you're trying to create malicious code. But when you're just asking for code, right, code that can have a dual purpose, let's say, the tool won't notice that you're doing anything malicious.

Dave Bittner: Now, you say that you and your colleagues were, by your own admission, not experts when it came to Go code. How did you evaluate the code that it spat back at you?

Daniel dos Santos: Yeah. So that's a good question, because there were two. I mean, the test was actually running it on the network, right? But there are two parts to it. The first is making sure that the code like compiles, like runs, so the code makes sense. And the second is that the code does what it's supposed to do, right? So the first part is you take the code that the tool spits out, and you try to run it, and in some cases, there were issues like missing packages and things like that, and then we just asked the tool itself, like, you know, "ChatGPT, I'm getting this error line. What can you tell me? How can I fix this?" And then the tool would realize, oh, yeah, I forgot to add one line, or maybe you forgot to add- to install this specific package on your computer, so please go ahead and do this, this, this and the following, right? It was really very kind of step-by-step instruction on how to run the actual code. And the second part to test that it works, it was really when the code was running, compiling, let's say. We could actually just run it on the lab, the same lab where we tested the original Python exploit, and we saw that things were working.

Dave Bittner: Is there any sense when you look at the code that it generated of how sophisticated or elegant or efficient it is?

Daniel dos Santos: Yeah. So in this case, the code was somewhat simple, right? Like, there was nothing too complex around it. There was another experiment we did, which is not published in this research, per se, but it was looking at other use cases and applications that were more in the medical domain, like trying to write, you know, parsers, for some specific protocols to get some data out and things like that. In that case, it was interesting to see because the code was a little bit more complex, right? It's a parser for some fields in a protocol. It was interesting to see that the tool struggled sometimes with things like regular expressions. But in other cases, it had some very good code in the sense that it follows, you know, standard- in that case, we were using Python, not Go. It follows the standard, you know, Python guidelines for formatting of the code or for the libraries that you would use and things like that. So it's kind of a mixed bag there in the sense that you can get very good code. You can get code that works pretty well, or you can get code that is somewhat in between and works some of the times, doesn't work all the times, right? So you do need- it's not kind of get it and for sure it will be running. Sometimes you need to massage the code a little bit, but you can use oftentimes the tool itself to help you do that.
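
[Editor's sketch: to make the parser discussion concrete, here is a small Python example of extracting fields from a record with a regular expression. The KEY=VALUE;KEY=VALUE record layout, the field names, and the function parse_record are all hypothetical, invented for illustration. They do not correspond to the protocols examined in the unpublished experiment.]

```python
import re
from typing import Dict

# Hypothetical record layout: KEY=VALUE pairs separated by semicolons.
# Keys are upper-case letters, digits, or underscores; values run up to the next ';'.
FIELD_RE = re.compile(r"(?P<key>[A-Z0-9_]+)=(?P<value>[^;]*)")


def parse_record(record: str) -> Dict[str, str]:
    """Extract every KEY=VALUE field from one record into a dictionary."""
    return {
        match.group("key"): match.group("value").strip()
        for match in FIELD_RE.finditer(record)
    }
```

[Even in a pattern this short, the choices about which characters a key or a value may contain decide what gets silently dropped. That may be part of why generated regular expressions need the kind of testing and adjusting described above.]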

Dave Bittner: It strikes me that it's potentially a huge time saver and also takes away that problem of staring at the blank page.

Daniel dos Santos: Yeah, I think the time-saving component is the most important one. It's like, again, we could have learned probably enough Go or looked at examples to write it ourselves, but if the tool is there to help you, and it's not just Go, right? Imagine we would want to do the same thing in Rust or in any other programming language. Or, you know, this thing I mentioned before for the parsers, writing regular expressions can be pretty annoying. If the tool can help you with that, it's a great tool, right? And that's what I mentioned right at the beginning of our conversation, that both the good guys and the bad guys are looking at the tool with the same kind of- different motivations, let's say, or different purposes but the same motivation, right? The motivation is to save time, to gain efficiency, to have more, let's say, return on investment on whatever they're doing. Of course, the purpose in some cases is to detect attacks or to increase business efficiency or whatever good motivation or purpose that can be. But in other cases, it's to launch more attacks, to have attacks launched faster, to have attacks that might go deeper into a network because you have less time crafting specific exploits for specific environments or trying to understand the environment you're in because the tool can help you with that.

Dave Bittner: So based on what you all have gathered here and learned, what are your recommendations for folks out there who are tasked with defending their organizations? How does this inform that process?

Daniel dos Santos: As of now, the attacks themselves, let's say, the tactics, techniques and procedures that the attackers are using, are probably not changing superfast, right? The fact that they're using or they can be using generative AI means that the attacks will come faster, they will come probably at a higher rate, a higher volume, so you can expect an increase in attacks. But that's something that we were already expecting, right? We're already experiencing an increase in attacks. This is just, again, probably exploding the number of attacks that we will foresee. So as of now, the nature of attacks is not changing too much. We do expect that maybe in the future, there can be new types of attacks, you know, mixing things like disinformation, mixing things like again, the phishing campaigns that can become more sophisticated, or, in some specific domains, like I mentioned healthcare in the past, I'd really like to cite this research that was not done by us but it was done by an academic group of researchers a couple of years ago, where they trained a generative adversarial network, again, to create images of patients that had gone through CT scans, and they inserted fake tumors in those images, right, which is very scary, and it's a terrible use of the technology. But it allows you to imagine what kind of potential outcome attackers can get with that. So I would say, for each specific domain, there's probably a different type of attack that in the future can happen that we're not even imagining now. So I would urge, you know, the research community to look at that, people who are domain experts to think about how this technology could be misused in their environment to look at that.
But from a cybersecurity point of view in general, you know, defending against attacks that we are seeing these days, I would also urge the defensive community, the security operations center analysts, the, you know, cyber-defenders out there to look at how to use AI in their day-to-day job to make it more efficient, right? We all know that people are flooded with intrusion detection alerts these days. Maybe generative AI can help you to go through those alerts and see what is actually a threat. We all know that, like, reverse engineering code and understanding malware code is very difficult. There are people out there looking at how to use generative AI to explain reverse engineered code and make that kind of human-understandable, human-readable, and so on and so on, right, even generating code for threat hunting and things like that. So there's lots of opportunities for the defenders as well to use this type of technology for their own purposes.

Dave Bittner: Our thanks to Daniel dos Santos from Forescout for joining us. The research is titled AI-Assisted Attacks Are Coming to OT and Unmanaged Devices. The Time to Prepare is Now. We'll have a link in the show notes.

Dave Bittner: The CyberWire "Research Saturday" podcast is a production of N2K Networks, proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies. This episode was produced by Liz Irvin and Senior Producer Jennifer Eiben. Our mixer is Elliott Peltzman. Our Executive Editor is Peter Kilpe, and I'm Dave Bittner. Thanks for listening.