Dec 9, 2019

Maddie Stone: Whatsup with WhatsApp: A Detailed Walk Through of Reverse Engineering CVE-2019-3568

Maddie Stone speaking at the Jailbreak Brewing Company Security Summit on Friday, October 11, 2019.

This talk will be a detailed walk-through of the WhatsApp bug (CVE-2019-3568) used by NSO's 0-day exploit from May 2019. Not only will this talk explain the bug in detail, but it will also walk through the process and tools to find and reverse engineer the details of the bug.

(Source: Jailbreak Brewing Company)

Transcript

Maddie Stone: [00:00:39:05] Hi, thank you for sticking around post-lunch. As he said, my name is Maddie Stone, pronouns she and her. I am a security researcher on Google's Project Zero team as of two months ago, and before that I was working on the Google Android security team doing malware reversing, leading teams that looked at OEM devices. And prior to that I worked down the road at JHUAPL for about four years.

Maddie Stone: [00:01:07:01] So the goal of this presentation is not just to talk about what was this big WhatsApp Zero-Day used by NSO back in May, because, quite frankly, the bug is kind of boring. Instead I thought for this audience let's talk about tools and techniques of when these types of zero days and vulnerabilities become public, what can we do to really understand them for the many different reasons that each of us in this room might want to analyze the bug, think about what the exploit might look like, and things like that.

Maddie Stone: [00:01:37:14] So the agenda, AKA walking through the reverse engineering process, is first what went public and what did we know about the bug? Then we're going to go through a review of four different patch diffing tooling that is available currently, and how did each of those stack up against a very well scoped problem in application, versus a lot of times I see a lot of theoretical papers out about tooling versus how do you use it out of the box when you need it?

Maddie Stone: [00:02:07:07] Next we'll talk about when static analysis came in, what types of things to look for, what was the result in these cases, and finally getting into dynamic analysis with Frida. So I hope we have some fun.

Maddie Stone: [00:02:22:07] So what do we know about this CVE? Facebook published an advisory that told us it's a buffer overflow vulnerability in WhatsApp's VOIP stack, which allowed remote code execution via a specially crafted series of RTCP packets sent to a targeted phone number. So in terms of advisories, this actually provides us quite a few details of information. It helps scope where in WhatsApp you're looking for, exactly what type of vulnerability you may be patch diffing for and things like that.

Maddie Stone: [00:02:53:15] Then that same day Check Point Research came out with a short blog post a few hours after the advisory. And in that they highlighted two different size checks that they saw added in the new patched version of WhatsApp, which they believed could very likely be part of the vulnerability. So the first size check, which I will refer to as size check number one when we are going through some of the patch diffing tooling comparisons, is in one of the RTCP handler functions, which lines up with the Facebook advisory, so that probably helped scope down their patch diffing process. And it's checking the length argument of the packet it's about to process, so again that goes with a buffer overflow type of fix.

Maddie Stone: [00:03:46:19] The next one is they found another size check, called size check number two, which had the same boundary for length, and it was doing a check that it would only do the following memcpy, which takes that length value if the length was less than hex 5C8. So two size checks they found quickly that were added in the patched version of WhatsApp versus the vulnerable version of WhatsApp, and so that's where we're going to start our analysis.

Maddie Stone: [00:04:20:05] These slides will be up right after this talk, so if you're trying to follow along or practice or do some technique skills, here are the samples. Both of them are linked to their VirusTotal submission and just version numbers if you're trying to figure out which ones I was looking at with addresses and stuff.

Maddie Stone: [00:04:36:15] So let's get into patch diffing tooling. Prior to joining Project Zero two months ago, I had been working in malware analysis rather than vulnerability research, so I really hadn't been using any of the patch diffing tooling that was out there. So I put out a call on Twitter asking, "What's your favorite binary patch diffing tools?" And all of the responses only gave me four tools that people liked, of everyone who responded. We had DarunGrim, BinDiff, Diaphora and Radare2, AKA the radiff2 command line tool in Radare. So that's what we're going to look at.

Maddie Stone: [00:05:12:02] But specifically what I wanted to figure out is will these diffing tools highlight this change quickly and correctly? The few times I tried to run patch diffing or binary diffing tools previously, it gave me a lot of data that I didn't know what to do with. So I figured let's have a well scoped problem. So based on Check Point's research, I quickly found that the size check number one they referred to was in subroutine 51E34 in the patched version of WhatsApp, and the size check number two was in subroutine 52D0C in patched. What are those functions in the vulnerable versions? What actually was changed? Are there any other changes as well? Because Check Point had also put in their blog that these were the two changes that quickly stood out to us, but we didn't see everything so there might be more.

Maddie Stone: [00:06:17:14] So let's start with DarunGrim. It's been out for a while and it still only runs on Windows, its current support is not IDA 7.3, but is IDA 5.6. And it is open-source, but it was last publicly updated in February 2017. So this actually nixed me testing it fully because I have Linux and Mac desktops that I have all my tooling setups for, but I don't really use Windows. And also I don't have an IDA 5.6 lying around. So that was the first one that I crossed off the list. Then I went to the OG, BinDiff. BinDiff was created by Zynamics, who was then acquired by Google. It's not open-source, it doesn't have a lot of public releases, but they have updated it to work on IDA 7.x.

Maddie Stone: [00:07:17:04] And so the way BinDiff works is it has this plug-in that gives you integration with IDA. Most of my reversing paths, as you'll see, do still use IDA, although I've been switching to Ghidra for some things more recently. But as of now still lots of IDA. So once it finishes its analysis, you open it up in one database, you tell it the other database you'd like to compare it to, and then it will open four tabs in your IDA. And so it will tell you any matched functions, meaning even though they might be different, we're saying these two functions go together, it will tell you unmatched functions in the first database, and any unmatched functions in the second database, as well as different statistics. For this case, you're going to always have a primary and a secondary one, and so it's just the order that you open these and how it's doing the comparison. I had the vuln database open and compared it for secondary to the patched one.

Maddie Stone: [00:08:20:10] So, for example, this is the functions that had the size check added in right before the memcpy. So what you see is two gray boxes that don't exist over here, and then there are two more boxes down here on the patched version that aren't in the vuln one.

Maddie Stone: [00:09:10:15] The two smaller boxes are highlighted gray rather than yellow, meaning they were not found in the primary, AKA the vulnerable, database. And when you zoom in even more, right here is that size check, verifying whether or not the length is less than hex 5C8. I was actually very pleasantly surprised that in this whole function it highlighted these two important parts where there are size checks happening. So that was a first win for BinDiff.

Maddie Stone: [00:09:52:02] Then we have the other function though, because remember there were two size checks added, and what was interesting about this is that there are actually two sets of red clusters of blocks or boxes missing in the patched version that exist in the vulnerable. So this was a new find because what we knew about previously when we started this analysis was that WhatsApp had added two size checks in, but this clearly shows us that they actually took blocks of code out as well. So that's a great place to then begin static analysis, once we get there.

Maddie Stone: [00:10:35:02] So overall my review of BinDiff was, the matching is pretty spot on for a lot of the cases. It showed me where the vulnerable functions were versus the function addresses I knew were patched. The UI for highlighting changes is pretty good for being able to see what's new, what's changed and what's added. However, I don't know that it's very obvious in the whole listing of the matched functions which ones are important. Especially for something like this, we're currently analyzing the native library libwhatsapp that exists in the WhatsApp APK, so I had already scoped it down to libwhatsapp, which I knew was responsible for the VOIP and RTCP handling, and there were still hundreds of matches that weren't quite 1.0 but were in that 85 to 98% match range, which is just enough little tweaks to probably make it interesting.
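The triage problem described above, hundreds of matches that are close to but not exactly 1.0, can be sketched as a simple filter. This is only an illustration: the function names and similarity scores below are hypothetical, not real BinDiff output, and the 0.85 threshold is just the lower bound mentioned in the talk.

```python
from typing import List, Tuple

# Hypothetical (function name, similarity score) pairs; not real BinDiff output.
matches: List[Tuple[str, float]] = [
    ("sub_A", 1.00),
    ("sub_B", 0.97),
    ("sub_C", 0.42),
    ("sub_D", 0.86),
    ("sub_E", 0.99),
]

def triage(pairs: List[Tuple[str, float]], low: float = 0.85) -> List[Tuple[str, float]]:
    """Keep matches that changed a little (low <= score < 1.0), least similar first."""
    changed = [m for m in pairs if low <= m[1] < 1.0]
    return sorted(changed, key=lambda m: m[1])

for name, score in triage(matches):
    print(f"{name}: {score:.2f}")
```

A filter like this narrows the list but does not rank by security relevance, which is the gap the talk points out.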

Maddie Stone: [00:11:46:18] So that's one place that I didn't think BinDiff, out of the box, solved all the problems for us. They don't support the decompiler, which some of the other tools we will talk about do. The UI for these graph images that I'm showing is outside of IDA. That didn't bother me, but it is something to note. And what was really nice, compared to some of the other tools, is BinDiff did not get caught up in name changes, offset changes, things like that, which is very helpful because those aren't the types of changes we often care about.

Maddie Stone: [00:12:21:03] So the next one up was a new to me tool, Diaphora. It's open-source, constantly maintained and supported. When I made these slides it had been last updated two weeks ago, and it does have current support for IDA from 7.1 through 7.3. They're also currently developing Ghidra support, and their plan is to support Binary Ninja in the future, after Ghidra is launched.

Maddie Stone: [00:12:47:21] So like BinDiff, Diaphora runs through IDA, it creates a bunch of tasks. It's really nice because all you have to do is run script, so you say, "Load script file" and then you run the Diaphora script and it gives you all this output. It does take quite a bit longer than IDA. For my two native libraries, .so files, it took about two and a half hours to run, but if you only really have to do it once, that's not too bad.

Maddie Stone: [00:13:21:00] So this is, again, an output, this was all of the partial matched functions. Unlike BinDiff, which grouped everything that they say is matched together in one, they separate out the complete 100% matches from partial matches. And, again, though, a very long list. So what's great is here we find one of the size checks we were looking for, this is sub_52D0C, the vuln is my name that was added to it, and this is the size check number two, which is the size check prior to the memcpy. And it says it matched it from same rare constant. So you can see the different descriptions, and they have pretty good documentation to understand what they mean.

Maddie Stone: [00:14:08:22] This is then a flowchart that Diaphora creates right inside of IDA rather than separately, but I actually found this one a little more difficult to tell what's changed, because they're actually highlighting things as changed when some are just the offsets have changed, and that's why it's not so clear where is the size check that occurred. So you can see here, the code in the two bottom boxes actually didn't change, they just changed some registers and offsets.

Maddie Stone: [00:14:53:01] And so here's just a reminder of what BinDiff looked like as a comparison.

Maddie Stone: [00:14:59:19] Then we get to the other size check that we knew was added, and here Diaphora actually matches just the complete wrong function, and that becomes pretty obvious also when you open up the two graph views. It did highlight to you that it believes it's an unreliable match, and the reason why they say it was a match was the same address and rare constant. So they weren't confident in it, but it's still disappointing that it couldn't find the right match to help us work off of that.

Maddie Stone: [00:15:36:06] So for my take for this scoped problem, the matching wasn't that great. It tends to get thrown off by naming, different offsets, things like that. It does have support for decompilation diffing, but if the decompilation uses any sort of different variable names, you know how decompilers do, or already set them up, it's very much a straight, you know, string type of diff. So I didn't find the decompilation output very helpful. However, it's open-source and currently maintained, and I think there's a lot to be said for that in our tooling landscape and ecosystem right now, and I do know that they are open and want people to continue just to support and help grow them. And they want to support other tools, like Ghidra and Binary Ninja, which is also something important in our community right now.

Maddie Stone: [00:16:29:13] So then I got to Radare2. Honestly, I had never really worked with Radare; it is something that I keep hearing about. There were actually a lot of people on my Twitter thread who kept saying, "We love Radare's diffing." So I figured check it out, you know, here's the links. They actually have a really great book that has a lot of details in it. It's open-source and currently developed, even more actively developed because the last commit was two hours prior to me making these slides, and the very small text is how Radare sort of describes itself.

Maddie Stone: [00:17:12:16] But they also have the CLI tool radiff2, which people were raving about, so let's check it out. So radiff2 has a lot of different capabilities built in, it's not just one "here's the diffing, now let's look at what UI you want." So the very first thing is radiff2 does a pretty straight binary to binary comparison. So not that helpful because, right down at 52D02, which I've now highlighted in green, that's where one of the functions, 52D0C, is the one we care about and it's just showing us a strip of binary and how those bytes changed at the address 52D1C. It resulted in 150,533 different diffs. So not great, but what else can it do?
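Why a straight byte-to-byte comparison produces so many diffs can be shown with a minimal sketch. It assumes a naive position-by-position comparison (an assumption for illustration, not a description of radiff2's actual algorithm) and uses made-up bytes in place of real code.

```python
def count_byte_diffs(old: bytes, new: bytes) -> int:
    """Count positions where two byte strings differ, plus any length difference."""
    common = min(len(old), len(new))
    diffs = sum(1 for i in range(common) if old[i] != new[i])
    return diffs + abs(len(old) - len(new))

# Hypothetical "code": 64 distinct bytes standing in for instructions.
old = bytes(range(64))
# "Patched" copy with 4 new bytes inserted at offset 16, as an added check would be.
new = old[:16] + b"\xAA\xBB\xCC\xDD" + old[16:]

print(count_byte_diffs(old, new))
```

Only four bytes were added, yet every byte after the insertion point is reported as changed because everything shifted. That is the kind of noise a function-aware differ is meant to avoid.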

Maddie Stone: [00:18:08:16] So then I tried the next one. So if you use the -A and -C flags when you're running it, it's going to do function analysis first and then do a function to function comparison. And you also tell it which architecture you're running on and things like that. So I ran this, it took nine and a half hours to run, which I can't imagine if you have multiple libraries, like you might find in an app, or something larger than the little libraries I was looking at, but this is the output. It gives you a function. If it's unnamed it just gives the fcn prefix, sort of like IDA's sub. If it's named, it will put the name there. And then it tells you is it matched, unmatched, new, things like that. Unfortunately, for both of the functions we care about, it says they're new, there were no matches in the library, which was disappointing.

Maddie Stone: [00:19:09:02] So overall this was obviously one problem, I'm not trying to say across the board these are how the tools stack up, but for me it was very interesting of how do these work out of the box? When you're not an expert in knowing all the secret customizations, will they stand up to the goal you're trying to solve, especially when it's a very straightforward goal? So in the end BinDiff matched both the vuln and patch functions correctly for two out of two, Diaphora did one out of two, Radare did zero out of two, DarunGrim just runs on Windows, which makes me sad. And my take on whether it clearly showed me the changes I would likely care about, yes for BinDiff, meh for Diaphora in the one function they checked correctly. Well, Radare can't do much if it didn't match. But none of them did I think, it's minute zero, Facebook has just released the new patched version of WhatsApp, I have the vulnerable version, will these quickly highlight to me the changes I want to review and highlight? Probably not because all of them still came up with hundreds of options that they suggested, all in the same priority, were important changes. I do think Diaphora could probably be a very strong tool, but just out of the box for this case didn't seem to work.

Maddie Stone: [00:20:46:09] So now I know what's the vulnerable function that matches up with these patched ones, I also now know that for one of these size checks where Check Point said, hey, a size check was added, clusters of code were also removed, thanks to BinDiff. So now we get into our static analysis. Where we're at, we have the two size checks, we know the corresponding functions. So I'm actually very interested in finding out what did WhatsApp remove, because this function is obviously important, they added a size check here, what is this? So this is a cutout of one of those clusters of code that was removed, and this is actually the processing to allow burst packet processing for RTCP. So in this function they removed all of their support for burst processing in the patched version, and removed this recursive call to itself. So this is a cutout of the code that I named Callsvuln51D30 because it calls the function that includes the vulnerable memcpy, and in it they have a recursive call whenever they're trying to do burst packet processing.

Maddie Stone: [00:22:08:09] So that's really interesting, and led to a lot of information of this is probably where exploitation took place, because when you're doing burst processing including recursion in there, it leads to a lot of good signals for buffer overflows, especially if those can lead to RCEs in the customized packets it probably includes.

Maddie Stone: [00:22:35:21] So now what? We have a lot of things that look interesting, we have some facts, but what makes this an interesting bug? What do we want to know about the exploit? Each of us in this room may have different reasons for analyzing, but largely we want to know, okay, we know there's a buffer overflow, but what can we write? What can be done with it? Then you're likely thinking to yourself, to frame your static analysis, how do we exploit that then? And lastly how do we trigger it? They said this can be done remotely by just calling a target phone, so there must be a way that this little bit of buggy code can be triggered. So these are all the thoughts that come into our head in helping us to frame and think through our static analysis.

Maddie Stone: [00:23:25:04] These are the subroutines of interest, where the function with the vulnerable memcpy in the vulnerable library was 52F00, patched by 52D0C, and then the other one with the other size check were those. I had started reversing the ARM32 libraries, so that's why it's there.

Maddie Stone: [00:23:56:10] So let's look at this memcpy. They obviously added a size check right before this memcpy, and that length they're checking is just passed into the function, so there are no checks in the vulnerable version, and they just do the straight memcpy. So when we're looking at the disassembly, the memcpy takes arg0, which is a buffer pointer, and they go to offset 1F7A4, then they have the pointer to the packet that they want to copy, and then the length, which was arg2 to this function. But what's interesting is, right after the memcpy, hex 100 bytes after the start of their buffer where they copy to, they then write the length of what they copied. So it's clear that they always assumed that the amount of data they would be copying from the packet would be less than hex 100 in this case, because then they start writing other important values.

Maddie Stone: [00:25:02:06] So in the patched version, though, up at the very top in the first block we have the size check where they're checking to make sure that length is less than 5C8. Then they do the memcpy down in the middle of this block, but this time, instead of writing the length to hex 100 bytes from the beginning of the packet copy, they do it at 5C8. So that also confirms the gut check that they didn't mean for the value in the vulnerable version, or anything after that in the struct, to be overwritten by the packet buffer, because in the patched version they ensure that these values are after the maximum length that that could be.
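The difference between the vulnerable and patched layouts described above can be sketched with a small simulation. This is a model under stated assumptions, not the real WhatsApp code: the struct is a flat byte array, the buffer is placed at offset 0 for simplicity, the `BURST_FIELD` member is hypothetical, and the strict less-than comparison against hex 5C8 is assumed from the talk's wording.

```python
STRUCT_SIZE = 0x800      # hypothetical size of the enclosing struct
OLD_LEN_OFFSET = 0x100   # vulnerable: length stored 0x100 bytes after buffer start
NEW_LEN_OFFSET = 0x5C8   # patched: length stored 0x5C8 bytes after buffer start
MAX_LEN = 0x5C8          # patched: bound checked before the copy
BURST_FIELD = 0x104      # hypothetical member stored right after the old length

def vulnerable_copy(struct: bytearray, packet: bytes) -> None:
    """Copy with no bounds check, then store the length at buffer + 0x100."""
    length = len(packet)
    struct[0:length] = packet  # stands in for the unchecked memcpy
    struct[OLD_LEN_OFFSET:OLD_LEN_OFFSET + 4] = length.to_bytes(4, "little")

def patched_copy(struct: bytearray, packet: bytes) -> bool:
    """Reject oversized packets, then store the length past the largest copy."""
    length = len(packet)
    if length >= MAX_LEN:  # assumed strict bound: only lengths below 0x5C8 pass
        return False
    struct[0:length] = packet
    struct[NEW_LEN_OFFSET:NEW_LEN_OFFSET + 4] = length.to_bytes(4, "little")
    return True

def read_u32(struct: bytearray, offset: int) -> int:
    return int.from_bytes(struct[offset:offset + 4], "little")

victim = bytearray(STRUCT_SIZE)
victim[BURST_FIELD:BURST_FIELD + 4] = (1).to_bytes(4, "little")
vulnerable_copy(victim, b"A" * 0x200)
print(hex(read_u32(victim, BURST_FIELD)))  # field now holds packet bytes
```

In the vulnerable layout any packet longer than hex 100 bytes spills into the members that follow the buffer. In the patched layout the largest accepted copy ends before the stored length.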

Maddie Stone: [00:25:53:03] So after you start to see that, you're going to do a lot more static analysis that's not that interesting, so that's why I'm focusing more on process than just walking you through step by step. And so then I started backtracking to understand what is the structure we're copying this data into? What, besides the length, can we overwrite? And are those values that are at the end of the struct after the buffer? What is the size of this buffer supposed to be? Because if it's actually shorter than the maximum amount we can copy, the buffer space is embedded in a larger struct, and that's where the length comes in. So if the larger struct that includes the buffer we're writing to, not a pointer to the buffer but the actual buffer, and we can overwrite the end of that struct, then we also potentially have the ability to overwrite the following struct in figuring out what that is. And so those are all of the things that inform the static analysis of what's next, what do I look for? What is still worthy of static reversing versus switching over to other tooling at this point?

Maddie Stone: [00:27:04:10] Let's back up a few steps to help inform where that static analysis should go. So WhatsApp uses an open-source project called PJSIP for its video conferencing VOIP implementation. Thanks to Natalie Silvanovich of Project Zero I did not have to redo all of that learning because she has published a lot on this and other video conferencing and messaging solutions. So when I Google WhatsApp, you know, my teammate's blog post pops up and tells me, oh, it uses PJSIP. So WhatsApp did add additional customizations on top of this, but to answer those questions on the previous slide, I used a lot of PJSIP's implementation and some of our diffing tools to be able to find what are the structs that PJSIP at least defined for these common functions, and then do some matching to see what values and things like that are in the structures in WhatsApp.

Maddie Stone: [00:28:08:22] And so they did differ quite a bit, but at least that gives you the understanding that most often WhatsApp's changes were at the end of the struct, where I could usually figure out what the members were at the beginning and stuff like that. They also used a lot of the same strings for logging so that makes it a lot easier too.

Maddie Stone: [00:28:30:12] So after we start to do that static reverse and figure out what we can overwrite, then the thoughts become, how do we exploit it? We have a buffer overflow, what does that mean? What can we do with it? And so, based on the differences, the removal of the burst packet processing from the vulnerable to the patching, and the way that burst processing works when you dive into the implementation, that is very likely the easiest way to exploit this bug and really overrun that buffer. And also what was interesting is that the values in addition to the length that are in the struct after the end of the buffer are all related to burst processing in the vulnerable one, so you can continue to change those as you continue to process each burst packet.

Maddie Stone: [00:29:23:19] And now we get into how do we trigger it? And that really gets into what path calls this vulnerable memcpy that we care so much about? And so I started, like every reverse engineer, with pressing Ctrl X and then creating the graph of references to it, but it stopped very quickly. And this function I named is not an exported function, so there is not a clear way that it can get called, because remember this is a library, not a standalone executable, there must be a way for something external, AKA the WhatsApp app, to call and cause this path. And so when I look at this top node and do a Ctrl X on it, what I see is there are no call references, just data references in IDA. And so that means that its address is being saved to a spot in memory, but nothing is directly saying a branch and link or a branch or a call to this function.

Maddie Stone: [00:30:30:06] This is where we can see it saved, it's just straight up saved onto the stack in a simple variable. So this gets a little complex, so that's when I decided to switch from our static analysis to dynamic analysis, and in this case Frida. So how to set up and get that information using Frida. Frida is a dynamic instrumentation framework. It runs on just about all platforms, actively developed, open-source. So if you're trying to analyze an Android app then the easiest way to make it work is if you have a rooted Android device and you run the Frida server just on the Android device in the background. And then on your analysis machine, you can then write combo Python JavaScript scripts to hook into and instrument the application or targets that are running on the Android device.

Maddie Stone: [00:31:40:19] This is my setup, Pixel 2 running a userdebug build, which means you can get root. Android 9 build, I have a Verizon test SIM. One of the frustrations, and what makes Natalie Silvanovich a saint for doing all of this type of analysis, is that when you're working with messaging platforms and things like that, you usually need SIMs and for it to authenticate correctly with servers. And then I was injecting the scripts into the Frida server from both, I did it on a Mac and Linux. The reason why I was running 9 is originally Frida didn't work on Android 10, but thankfully, due to its active community, they immediately, a day later, got that fixed.

Maddie Stone: [00:32:31:22] So I installed the vulnerable version of the WhatsApp app on my device. And, oh joy, they won't let me run it, they say it needs to be updated, it's way too old. So these are all the steps it required to get the old WhatsApp to run. So first I had to install the current version of WhatsApp. I registered the app, did the authentication with my SIM card to get the number, and then I force quit the app, you want to quit fully. So then, using ADB, I saved off the contents of the data/data app directory, because that's where all of that authentication and registration will be stored. So that I saved to the other machine. I then fully uninstalled WhatsApp and disconnected the phone from both WiFi and cellular, and then changed the date to a date that was prior to the expiry date. So I changed it to June 10th, since they told me that this app can only run prior to June 24th. And you have to make sure that auto date changing and all of that stuff is turned off, and that's why I also had to disconnect from WiFi and cellular.

Maddie Stone: [00:33:45:17] So then, using ADB, I installed my vulnerable version of the app back onto the device. Then, using ADB, I copied the saved off files back into the new data/data directory and overwrote all the files there. That's one of the reasons you need a rooted device. Then you start up the app, still with WiFi and cellular off, and if it starts up correctly then you can turn on WiFi to do the WiFi calls and WiFi messaging and stuff like that, but do not ever turn on cellular or else you will have to go through this again, because it turns out cellular will override your automatic date and time settings and update it anyway.

Maddie Stone: [00:34:26:16] So now, after running through these steps about six different times, we could hook the functions of interest, the three that were in our little call graph. Those are the three that I know are of interest to me, that's the path I know about thus far, so I want to record off their arguments and who calls them to get a better understanding of when and where they occur in the landscape.

Maddie Stone: [00:34:59:20] But it's not as straightforward as some of the tutorials that exist out there for Frida, because most of the time with an Android app you're interested in Java, because most Android apps are written in Java, but this is a native library. And then for the tutorials that do exist for native libraries most of the time you're interested in the Java native interface functions, because those are the ones that are called from the Android apps, but in this case the three functions we knew we were interested in weren't JNI or exported functions. So instead we need to find a way by using offsets to get Frida to hook those rather than by using names. So, for example, if the function had been exported, we would have been able to use the Module.findExportByName function to automatically get the pointer and have Frida hook it. But instead we need to get the base address of where this library is loaded into memory, and then add the offsets to that address in order to get the correct pointer.

Maddie Stone: [00:36:05:06] So this is what it ended up looking like. First we're going to get the base address of where libwhatsapp.so is loaded into memory. And this is just an example of what it would look like if you are interested in a named exported function, you can instead just do this straight up. But we couldn't, so we added our offsets that we were interested in to the libBaseAddr. One thing to note, you have to use the .add method instead of the addition operator because otherwise it thinks it's strings and will concatenate the two together. So you need to use add, sub, things like that for any arithmetic operations you're interested in.
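The base-plus-offset arithmetic described above can be sketched outside of Frida. The talk's actual script is JavaScript running under Frida; the Python below only models the arithmetic. The base address and offset are hypothetical values, and the string concatenation is an analogy for the pitfall the talk describes with the addition operator.

```python
def to_runtime_address(base: int, offset: int) -> int:
    """Runtime pointer = where the library was loaded + offset seen in the disassembler."""
    return base + offset

LIB_BASE = 0x7A12340000   # hypothetical load address; differs on every run
FUNC_OFFSET = 0x4F2A8     # hypothetical function offset from the static database

print(hex(to_runtime_address(LIB_BASE, FUNC_OFFSET)))

# Analogy for the pitfall: if the operands are treated as strings,
# "+" concatenates instead of adding numerically.
print(hex(LIB_BASE) + hex(FUNC_OFFSET))
```

The second print produces two hex strings glued together rather than a usable address, which is why pointer arithmetic has to go through the pointer type's own add and sub methods.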

Maddie Stone: [00:36:48:17] You may also notice that these offsets are different than what I have been talking about before. I started analyzing the ARM32 versions of the libraries, but most modern devices all run ARM64, so I originally had put all the ARM32 offsets in there and couldn't understand why it was hooking totally different things. I then realized it's running on a different architecture. So I just went, looked it up and quickly found the offsets in the ARM64 lib.

Maddie Stone: [00:37:23:14] So this is what the first thing outputs when we run it, it's going to be different since they're dynamically loaded each time you run, but we get libwhatsapp.so's base address. I was printing JNI_OnLoad to make sure that it matched the offset that I was looking for in my IDA database, and then I can get the three different functions we're interested in. We have vuln, callsVuln and callsCallsVuln.

Maddie Stone: [00:37:52:08] So how do we hook them now to get this information? Thankfully, Frida makes it pretty straightforward. We're going to use Interceptor to say, "that address I just calculated, that's what I want to hook." And we want to do these behaviors that I'm logging right here on entry to the function. If there were things we wanted to do on exit, such as change the return value, we would just do on exit instead of on enter. So what I'm saying here is I want you to log whenever I enter this function, get the return address, which thankfully Frida makes available through an API, and because it's offsets, and I'm looking at my IDA database at the offset rather than the memory address as it's loaded in memory, I'm just going to subtract the BaseAddr so I don't have to do math separately. And, again, using .sub rather than the subtraction operator.
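The subtraction step described above, turning a logged return address back into an offset that can be looked up in the static database, can be sketched as follows. Again this only models the arithmetic in Python rather than reproducing the Frida script; the base address, library size and return address are hypothetical values.

```python
from typing import Optional

def to_static_offset(addr: int, base: int, size: int) -> Optional[int]:
    """Return addr as an offset into the library, or None if it lies outside it."""
    if base <= addr < base + size:
        return addr - base
    return None

LIB_BASE = 0x7A12340000        # hypothetical load address of the library
LIB_SIZE = 0x900000            # hypothetical mapped size
return_address = 0x7A1238F2AC  # hypothetical value logged on function entry

offset = to_static_offset(return_address, LIB_BASE, LIB_SIZE)
print(hex(offset) if offset is not None else "caller is outside the library")
```

The range check is an addition for illustration: a return address outside the library would otherwise give a meaningless offset.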

Maddie Stone: [00:38:50:13] And then for the actual vulnerable memcpy function, I printed out a lot more because that was the pointer to our buffer where we would do the write to, the contents of the packet that we could see, as well as the length that was being passed in. So this gives me a good ground truth of how does this operate in the good state, before we try to exploit it. One of the nice things is Frida also has this hex dump argument, which meant just in line I could still not just print the pointer address, but the actual contents of the pointer.
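Logging the arguments of the vulnerable copy might look something like the sketch below. The transcript does not say which argument positions hold the buffer, packet and length, so the memcpy-style order (destination, source, length) is an assumption, as are the `describeCopyArgs` helper, the `0x1000` offset and the 256-byte dump cap.

```javascript
// Format the arguments of a memcpy-like call: (dst, src, len).
// The argument order is an assumption; `dump` is Frida's hexdump in practice.
function describeCopyArgs(args, dump) {
  const dst = args[0];
  const src = args[1];
  const len = args[2].toInt32();
  // Cap the dump so an unexpectedly large length does not flood the log.
  const dumpLength = Math.max(0, Math.min(len, 256));
  return [
    "dst buffer: " + dst,
    "length: " + len,
    dump(src, { length: dumpLength })
  ].join("\n");
}

// Only runs inside a Frida script, where Interceptor and hexdump are globals.
if (typeof Interceptor !== "undefined" && typeof hexdump === "function") {
  const lib = Process.findModuleByName("libwhatsapp.so");
  if (lib !== null) {
    // 0x1000 is a hypothetical offset for the vulnerable function.
    Interceptor.attach(lib.base.add(0x1000), {
      onEnter(args) {
        console.log("In vuln\n" + describeCopyArgs(args, hexdump));
      }
    });
  }
}
```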

Maddie Stone: [00:39:31:01] So this is what I found, while a call is ringing to my device, it keeps repeating the In callsCallsVuln, In CallsVuln over and over. And after we enter the call was when it triggered the vulnerable method. So now I have a lot of good information, which can then feed back into our static analysis, because now I have the address of exactly where is that callsCallsVuln, which was our stopping point before, where is it called from?

Maddie Stone: [00:40:10:01] And this is important. So the reason why the In callsCallsVuln stopped, and one of the important parts of knowing PJSIP was that one of the main structural things of how PJSIP and other RTC libraries work, is they use a lot of callbacks. So it was saving everything to a callback table. But there are lots of different places that use callbacks and decide which callback table they want. So now we have the address to find where is it actually being called from. And with that you go back through your normal static analysis processes, trace it back, you'll find a JNI exported method that can then be called, that is triggered by the app during normal calls.

Maddie Stone: [00:41:07:21] And where do you go from here? So I hope that I've given you lots of different techniques and ideas and thoughts of going into when you're trying to analyze one of these recently released Zero-Day vulnerabilities, but you have a couple of different options. The first thing I continued to do after this is continue instrumenting with Frida the vulnerable functions and paths you're interested in. So that included, instead of copying the packet that WhatsApp thought it should copy, I replaced that using Frida with much larger, much more random data to see can I at least get some crashes, or what other values of importance began to change.

Maddie Stone: [00:41:56:19] That then leads into beginning to hypothesize and more formalize what you believe this exploit looks like. So once you fully understand the vulnerability and its capabilities, what makes the most sense for exploiting it? And what are the goals? We know that Facebook said it's an RCE, so you're probably looking to overwrite a function pointer rather than corrupting memory to get kernel read/write, things like that.

Maddie Stone: [00:42:27:17] And you're trying to also understand what can I as an attacker control from the other side of the WhatsApp servers? This gets to be a little harder because WhatsApp did say they made changes to their servers after the release of this vulnerability, and so that's why I use the word "hypothesize" versus creating the exact copy, because what we also heard is that these functions were triggered prior to the user ever having to answer the call, that's what open-source media said, whereas I just said in the testing it was clear that parts of it only happened after. So that's something to keep in mind when you're doing these types of changes.

Maddie Stone: [00:43:07:09] And lastly I think one of the most important things we can do as a community, whenever there is in the wild Zero-Days used, is doing the variant analysis, because when you're finding a bug or a vulnerability, you're almost always finding more than one. And that's the same whether you're on the defense side, or you're on the offensive side. But to use the zero-day in the wild you usually only use one at a time. So doing that variant analysis to understand what other patterns, what other bugs are likely the attackers holding onto, to then be able to block, burn, etc.

Maddie Stone: [00:43:45:17] And with that I hope it was somewhat interesting, informative. I know a little different not talking straight up about the bug, because I figure anyone can go and read and buffer overflow is really not that interesting, but instead how can we apply this and continue growing our tools to do these types of analysis quickly, to hopefully help protect people?

Maddie Stone: [00:44:07:09] I do think we still have a ways to go with our binary diffing and patch diffing tool capabilities. Nothing out of the box is able to highlight to you those big changes, like a size check added, versus some of the other changes that were highlighted. But also sometimes it does take getting into a committed relationship with your tools so that you can really begin to understand each of their intricacies and customizations that allow them to be that much more powerful than out of the box. So setting up our tooling sets prior to needing them, and getting to know them well.

Maddie Stone: [00:44:50:17] I love static analysis, and if you've ever seen any of my previous talks, I almost always strongly defend my reasons for doing static analysis and not dynamic analysis, but it would have taken me so much longer to stick with static analysis and not switch to Frida. It took me maybe an hour and a half, after I got WhatsApp installed on the phone, to get all of that data, Frida installed, things like that, and get the data I needed versus however many hours of combing through all the different callback calls to find when it's called. So with that, thank you and are there any questions?

Male Audience Member: [00:45:43:07] [INAUDIBLE]

Maddie Stone: [00:45:47:06] I do not. If anyone at Check Point wants to tell us, that's cool.

Male Audience Member: [00:45:52:22] Do you know how they found it so quickly? Like, with all the diffs – did they find it because of the CVE details?

Maddie Stone: [00:46:06:05] I don't know because I did not talk to them, but I would guess the CVE details in the Facebook advisory were very clear in that it's a part of the RTCP handling in the voice over IP stack. So my guess would be that, if I was to do this over again, now knowing, because I didn't fully know what those details meant to me at the moment when I read it, honestly, is go ahead and do a dynamic analysis first to see what sort of function path happens on a call, since we knew from some of the reporting that this was occurring on people's devices from a call that didn't have to be answered. So do that, get that call stack traced, and then see what functions are also in the diff and localize by that. And also using some of the intricacies. I didn't show you, but in the screenshot of BinDiff it can tell you if there's instruction change versus jump changes versus graphical structure changes, and so using some of that detail to further localize and stuff like that.

Female Audience Member: [00:47:12:07] [INAUDIBLE].

Maddie Stone: [00:47:19:19] Yeah, I definitely spent way too long with static analysis banging my head against the PJSIP libraries and trying to find what all these structures probably were, who was calling them, what was the format of the callback tables, and hypothesizing which functions fit in where, whereas I said it took me maybe 90 minutes as soon as I hit up dynamic analysis. Although BinDiff gave me the greatest results, I kind of moved away from it at the beginning and went to others first because I couldn't figure out how to get it installed correctly, and the IDA plug-in button to show up the way it was supposed to. Although BinDiff has been updated, its documentation has not been updated.

Maddie Stone: [00:48:18:17] Thank you so much.