When “safe” documents aren’t.

Transcript

Dave Bittner: Hello, everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems, and protecting ourselves in our rapidly evolving cyberspace. Thanks for joining us.

Omer Ninburg: PDF engines are something that a lot of companies embed into their applications. And then if you have a vulnerability in one PDF engine, so as a third-party attack, you can compromise a lot of companies and customers just by them integrating those PDF engines inside their applications.

Dave Bittner: That's Omer Ninburg, CTO of Novee Security. The research we're discussing today is titled "From PDF to Pwn." [ Music ] So, before you all brought AI into this process, what did you all do to manually identify the issues inside of some of these PDF viewers? So what made you think there's something deeper here? This is worth pursuing.

Omer Ninburg: When we start to investigate any application, we don't know if there is a vulnerability or not. But you always presume that there is. The mindset of a vulnerability researcher is that there's always another vulnerability, there's still something that nobody else has found before, and if you keep on digging, you'll find it or find traces that will lead you in the right direction. We didn't start this whole process because we thought we were going to find, like, thousands of vulnerabilities, but we wanted to understand the limits of what we can do with AI. And then we just started with the first engine, which was PDFTron by Apryse, because we found that a few of our customers had that engine.

Dave Bittner: Well, let's walk through it together. Can you take us through the story of how you all dug into these and what sorts of things started to be unveiled for you?

Omer Ninburg: So I think the first thing that we found out once we started is that in a lot of the engines -- I'll talk specifically about PDFTron -- the engine itself is embedded in the application as an iframe. What that means is any application that uses that engine in order to render PDFs, for example, needs to communicate with that iframe via postMessage or something like that. And we started to investigate the trust layers between the hosting application, which is unknown because it can be any application, and the PDFTron engine, the embedded JavaScript inside the iframe. Once we tried to understand all the connections between the two, we found interesting postMessages that the parent of the iframe sends to the iframe itself in order to initialize it. So, for example, one of the things that we found is that there's a parameter that's optional, but you can provide it, and it has UI configurations that change the way that the engine displays the rendering application.

So think about it. The PDF engine is a place where you can edit files, add annotations, put comments, add signatures, and things like that. And once we started to investigate all the inputs that are available, and we found something that's undocumented but appeared to change the application itself in a very significant way, we just started to dig deeper and deeper. And this whole thing is obfuscated, minified JavaScript, which is also, like, always nice. And as we dug deeper and deeper, what we found is that some of the inputs that you're able to provide from the external configuration via the postMessage get into a sink that just evaluates JavaScript.

It didn't end there. We still needed to bypass some mechanisms, because we were able to inject JavaScript code into an image tag, but we couldn't supply -- let's say, for SVG, we couldn't supply JavaScript inside the SVG. And then we did something that was also very nice. Inside the SVG, we embedded HTML. Once the DOM processor saw that we were in the context of SVG and then again in the context of HTML, it didn't parse the inner HTML content as SVG, and so all the checks that were in the code, we just bypassed them altogether. And then we, like, found a way to execute JavaScript, which was really nice.
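
[ Editor's note: The trust boundary described above, between a hosting page and an embedded viewer iframe, can be modeled in a few lines. The Python simulation below is not PDFTron's code and does not reproduce the finding. The names `TRUSTED_ORIGINS`, `ALLOWED_CONFIG_KEYS`, `Message`, and `handle_init_message` are hypothetical, as are the origin and config keys. It is only a sketch of one defensive idea: check who sent the message and allowlist configuration keys before any value can travel toward a rendering sink. ]

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical values, chosen for illustration only.
TRUSTED_ORIGINS = {"https://app.example.com"}
ALLOWED_CONFIG_KEYS = {"theme", "language", "readOnly"}


@dataclass(frozen=True)
class Message:
    """Stand-in for a browser MessageEvent: who sent it and what it carries."""
    origin: str
    data: dict[str, Any]


def handle_init_message(msg: Message) -> dict[str, Any]:
    """Return only the configuration the viewer is willing to accept.

    Messages from unknown origins are rejected, and any key that is not
    explicitly allowlisted is dropped, so undocumented options are never
    passed further into the viewer.
    """
    if msg.origin not in TRUSTED_ORIGINS:
        raise PermissionError(f"untrusted origin: {msg.origin}")
    config = msg.data.get("config", {})
    if not isinstance(config, dict):
        raise ValueError("config must be an object")
    return {k: v for k, v in config.items() if k in ALLOWED_CONFIG_KEYS}


accepted = handle_init_message(
    Message(
        origin="https://app.example.com",
        data={"config": {"theme": "dark", "customIcon": "<svg>...</svg>"}},
    )
)
print(accepted)  # the unlisted "customIcon" key is dropped
```

An allowlist is used here rather than a blocklist because, as the conversation notes, the interesting input was an undocumented one that a blocklist would not have anticipated.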

Dave Bittner: Must have been an exciting moment.

Omer Ninburg: Yeah, for sure.

Dave Bittner: Now, in the research, you talk about this problem of scaling. You found these things manually, but then you had to go back and say, "Can we do this again and again at scale?" Why is that so difficult in these dynamic applications?

Omer Ninburg: Scaling is a lot easier when code is just code and programs are static by nature, because then you have a very distinct path from source to sink. That means from a user input into a dangerous function. If it's easy to draw that path, then it's usually easy to create an exploit and prove that it's a vulnerability. But in dynamic applications -- and most single-page applications and modern applications are dynamic by nature -- it's very, very hard to understand how you connect the dots. So in JavaScript, a lot of the time, you have objects, or you have tables, and entries inside of tables that call different entries of different tables and, like, indexes all over the place, and everything is dynamic. And the only place that the actual code flow can be investigated is at runtime. And that's something that was very difficult for, let's say, the old tools to understand, and in order to find vulnerabilities in that case, you actually need to live in runtime. You need to run on real applications that are actually running now, and then you need to understand how to add tracing or instrumentation to the application itself in order to be able to investigate it, or give AI or any code, any program, the tools to investigate in real time. [ Music ]
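
[ Editor's note: The point about table-driven dispatch and runtime tracing can be illustrated with a small sketch. It is written in Python rather than JavaScript, and every name in it (`traced`, `HANDLERS`, `dispatch`, `dangerous_sink`) is hypothetical. It shows why the call target is not visible at the call site, and how a simple instrumentation wrapper recovers the path from source to sink once the program actually runs. It is not a description of Novee's tooling. ]

```python
import functools
from typing import Callable

call_log: list[str] = []  # order in which instrumented functions actually ran


def traced(fn: Callable) -> Callable:
    """Wrap a function so that every call appends its name to call_log."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        call_log.append(fn.__name__)
        return fn(*args, **kwargs)
    return wrapper


@traced
def render_label(value: str) -> str:
    return f"label:{value}"


@traced
def dangerous_sink(value: str) -> str:
    # Stand-in for a function that would evaluate or inject its input.
    return f"EVALUATED:{value}"


# Handlers are looked up by key at runtime, so reading dispatch() alone
# does not reveal which function will be called.
HANDLERS: dict[str, Callable[[str], str]] = {
    "label": render_label,
    "icon": dangerous_sink,
}


@traced
def dispatch(kind: str, value: str) -> str:
    return HANDLERS[kind](value)


user_input = {"kind": "icon", "value": "payload"}  # the "source"
result = dispatch(user_input["kind"], user_input["value"])
print(call_log)  # ['dispatch', 'dangerous_sink']
```

The recorded trace is the evidence a static reading of `dispatch` cannot give: this particular input reached the sink.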

Dave Bittner: We'll be right back. [ Music ] You know, one of the things that caught my eye in the research, when you all were going through the process of teaching your LLM agents, you say that the edge wasn't just running LLMs on code. You describe it as embedding elite researcher instincts. What does that really mean in practice?

Omer Ninburg: Yeah. So in practice, if you look for, again, sinks and sources, as I said before, you'll find tons of potential links and tons of things that you should look for. But somebody who knows how to research vulnerabilities and has done it for years just has instincts about what's more important than the other things. When you go down a specific path, what are the hurdles that you'll probably meet, and how do you get around them? If you're blocked, let's say -- what I explained before with the SVG and then HTML inside it, and then another SVG inside of it -- that's a bypass technique that you need to know about. And in order to create an agent that's able to do that, you need to give it tons of tricks and intuition. And we actually trained our agents on those intuitions, and we try to steer their preferred path toward paths that actually correlate with finding more vulnerabilities.

Dave Bittner: How do you teach an agent to think in terms of these trust boundaries instead of just, like, pattern matching?

Omer Ninburg: That's something that we didn't cover in the blog itself, but we are going to provide another blog that's a lot more AI-related.

Dave Bittner: Can you give us a preview?

Omer Ninburg: Yeah, yeah, for sure, no problem. Like, in order to be able to do something like that, we need data. And in vulnerability research of real applications, data is actually environments. So what we need is hundreds or thousands of environments with vulnerabilities that the agent didn't learn on, because we don't want the environments to be contaminated. They need to be something that the agents don't know. And then we actually just give them the task, let's say, find an XSS in this specific part of the application, and we do that across thousands of applications. And that's our data. And it's not a single action, but an iterative process where the goal, at the end, is to find a vulnerability. And once we have tons of data and traces, we can actually optimize the path and, like, quantify it. What changes can we make in order to make the agent better? And what does "better" mean? It just means that statistically, it finds more vulnerabilities.
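
[ Editor's note: The definition of "better" given above, finding more vulnerabilities across held-out environments, amounts to comparing success rates. The sketch below shows that arithmetic only. The `RunResult` type, the agent names, and the four results are made-up toy data, not figures from the research. ]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    environment: str  # identifier of a held-out vulnerable environment
    agent: str        # which agent variant was run
    found: bool       # True if the run ended with a validated finding


def success_rate(results: list[RunResult], agent: str) -> float:
    """Fraction of an agent's runs that ended with a validated finding."""
    runs = [r for r in results if r.agent == agent]
    if not runs:
        raise ValueError(f"no runs recorded for agent {agent!r}")
    return sum(r.found for r in runs) / len(runs)


# Toy data: both variants are run on the same four environments.
results = [
    RunResult("env-1", "baseline", False),
    RunResult("env-2", "baseline", True),
    RunResult("env-3", "baseline", False),
    RunResult("env-4", "baseline", False),
    RunResult("env-1", "tuned", True),
    RunResult("env-2", "tuned", True),
    RunResult("env-3", "tuned", False),
    RunResult("env-4", "tuned", True),
]

print(success_rate(results, "baseline"))  # 0.25
print(success_rate(results, "tuned"))     # 0.75
```

With only a handful of environments a difference like this could be noise, which is one reason the conversation stresses hundreds or thousands of environments.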

Dave Bittner: You all describe this as a collaborative swarm. You talk about tracer, resolver, and bypass. Can you explain that to us?

Omer Ninburg: You can have a generic agent that does everything, but like most things, if you have something generic, then it's not specialized. So the way we try to tackle that task is we create specialized agents. And we have flows that match the way that researchers research. So, for example, we start by investigating, what are all the sinks? What is the whole attack surface? So sinks are something actually pretty far along, like pretty advanced, but we start by understanding, what is the attack surface that an attacker can have? And after we have that, we can continue and ask, what are all the sinks that are available in the code? We don't have code all the time, but if we have, that's great. And then we try to create hypotheses that connect sources to sinks. Because if you have a connection, then it means you have a vulnerability. But let's say we did that; then we have to optimize on actually creating an exploit. And there's a big difference between a potential source-to-sink connection and an actually working exploit in a live environment. And each of the tasks that I just described requires a bit of a different mindset. So let's say for creating the exploit, what we understood is that what you actually need is a very good coding agent, because creating an exploit is actually creating code. We need to create a POC script that proves that what we've done actually works. But in the previous steps, let's say finding sinks inside of a source file, that's something that you can do statically. But connecting it to dynamic patterns, that's something very hard. So there you need an agent that's very good at instrumentation and reading logs. And each different task requires different skills that the agent needs to have.

Dave Bittner: You all make a claim -- it's a strong claim -- that most AI security tools produce vibes, not actual proof, but you say your approach is different. What distinguishes what you all are doing there?

Omer Ninburg: Yeah, I think that our end product is not an assumption or, like, a claim that we have a vulnerability. In our platform, what we actually strive for and provide to our customers is full validation that we have a vulnerability. And the way that we are able to provide full validation is by actually creating reproducible code that proves something is vulnerable. So let's say if we're talking about IDOR, for example, we prove that from one user context, you can access data from a different user context. And the way to prove that is writing code that logs in, has a user context, sends some kind of request, for example, and then extracts data of a different user. And that's something that's verifiable. And if we do XSS, for example, we prove it by instrumenting the browser and then validating that we were able to spawn an alert box. So the proof that we provide is actually something that you can just take, run, and then you'll say, "Ah, yeah, this makes sense. It does exactly what I would expect it to do, and it's not just a very nice hypothesis."
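
[ Editor's note: The IDOR proof described above (log in as one user, request another user's data, check what comes back) can be sketched against an in-memory stand-in. `FakeApp`, `PatchedApp`, `prove_idor`, and all users and documents are hypothetical. A real proof of concept would issue HTTP requests against a live target, which this sketch deliberately does not do. ]

```python
class FakeApp:
    """In-memory stand-in for a web app with a missing ownership check."""

    def __init__(self) -> None:
        self._passwords = {"alice": "pw-a", "bob": "pw-b"}
        # doc_id -> (owner, body)
        self._documents = {1: ("alice", "alice-invoice"), 2: ("bob", "bob-invoice")}
        self._sessions: dict[str, str] = {}

    def login(self, user: str, password: str) -> str:
        if self._passwords.get(user) != password:
            raise PermissionError("bad credentials")
        token = f"session-{user}"
        self._sessions[token] = user
        return token

    def get_document(self, token: str, doc_id: int) -> str:
        if token not in self._sessions:
            raise PermissionError("not logged in")
        # Bug under test: the owner is never compared with the session user.
        _owner, body = self._documents[doc_id]
        return body


class PatchedApp(FakeApp):
    """Same app with the ownership check added."""

    def get_document(self, token: str, doc_id: int) -> str:
        user = self._sessions.get(token)
        owner, body = self._documents[doc_id]
        if user is None or user != owner:
            raise PermissionError("forbidden")
        return body


def prove_idor(app: FakeApp, user: str, password: str,
               victim_doc_id: int, victim_marker: str) -> bool:
    """Return True only if `user` can read content known to belong to the victim."""
    token = app.login(user, password)
    try:
        body = app.get_document(token, victim_doc_id)
    except PermissionError:
        return False
    return victim_marker in body


print(prove_idor(FakeApp(), "alice", "pw-a", 2, "bob-invoice"))     # True
print(prove_idor(PatchedApp(), "alice", "pw-a", 2, "bob-invoice"))  # False
```

The check on `victim_marker` is what turns a hypothesis into a verifiable result: the script passes only when data known to belong to another user actually comes back.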

Dave Bittner: From a defender's point of view, what should security teams take away from this research?

Omer Ninburg: This research is really about offensive security and how to find vulnerabilities. But I think in today's world, you don't have the luxury of not using a tool like this, or not using AI to look for vulnerabilities yourself inside your applications, even the most niche ones. Because once, you could have thought, why would an attacker look inside some niche place and do whatever -- like, maybe that made sense, okay? It was high effort. But if today there is a tool that can find vulnerabilities that yesterday were impossible to find at scale, and you, as a defender, are not using that tool to discover those before the bad guys, you're going to be in trouble. So from a defender's point of view, I think this research and other researchers' work in the same field just proves that defenders must move a lot quicker than before, because it's just easier now to automate and scale everything. [ Music ]

Dave Bittner: Our thanks to Omer Ninburg from Novee Security for joining us. The research is titled "From PDF to Pwn." We'll have a link in the show notes. And that's Research Saturday brought to you by N2K CyberWire. We'd love to know what you think of this podcast. Your feedback ensures we deliver the insights that keep you a step ahead in the rapidly changing world of cybersecurity. If you like our show, please share a rating and review in your favorite podcast app. Please also fill out the survey in the show notes or send an email to cyberwire@n2k.com. This episode was produced by Liz Stokes. We're mixed by Elliott Peltzman and Trey Hester. Our executive producer is Jennifer Eiben. Peter Kilpe is our publisher, and I'm Dave Bittner. Thanks for listening. We'll see you back here next time. [ Music ]

Creator: N2K Networks, Inc.