Has GOLD SOUTHFIELD resumed operations?
Dave Bittner: Hello everyone and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems of protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.
Dave Bittner: Before we jump in here, let's establish - are you an R-evil (ph) guy? Are you an - a REvil guy?
Rob Pantazopoulos: It's - I'm a REvil guy. And above all...
Dave Bittner: All right (laughter).
Rob Pantazopoulos: Above all, I am not a Sodinokibi guy.
Dave Bittner: That's Rob Pantazopoulos. He's a senior security researcher and malware reverse engineer at Secureworks. The research we're discussing today is titled "REvil Development Adds Confidence About GOLD SOUTHFIELD Reemergence."
Rob Pantazopoulos: With families like REvil that we determine rise to the level of - that we do tracking on them, we set up typically a number of different tripwires, if you will. So we will identify samples. And if those samples have any different types of variations in them or unique aspects of them - so if there's a new version value within the binary, if there's a new configuration element, so on and so forth, we will set up notifications. So we'll get alerted to when we find those.
Rob Pantazopoulos: We also do open source monitoring, and that is actually, in this case, how we came across a sample. There was a Twitter post by the director of Malware Research over at Avast - Jakub, Jacob (ph), I'm not quite sure how you pronounce his name. But he had notified the good people on Twitter that they had identified a new REvil sample or what they thought to be a REvil sample. And one of the interesting quirks about it was that it wasn't actually encrypting files. So they weren't quite positive. It seemed like they weren't quite positive of what the sample was.
Rob Pantazopoulos: So as soon as we saw that, pulled down the sample, and now because we've been tracking REvil since it hit the scene, we had all types of research stored up on it, including every single version or every single variant that we've come across, we've analyzed, documented. So one of the first things that we did was pulled up one of the most recent analyses of Version 2.808 that we had done back in October and did a side-by-side comparison of the old sample and this new sample within IDA Pro. And we found that basically the decompiled pseudocode was almost exactly the same. There were some new features which we had called out, but ultimately, the core of the code was nearly identical.
Rob Pantazopoulos: There's other aspects of it as well, such as there's a string format that it uses we call the stats.json, which contains things like the REvil version information, information about the computer that is obtained at runtime, information about the encryption session, is placed inside this JSON data structure. And that information is actually, ultimately sent back to the - was sent back to the threat actor. So that was there as well. So that was really kind of like the absolute, yes, this is REvil.
Dave Bittner: Well, give us a little bit of the background history here. I mean, we're talking about two groups. We've got REvil, and then we've got GOLD SOUTHFIELD. And I guess what we're getting at here is, you know, is that a distinction without a difference? What's the backstory?
Rob Pantazopoulos: Sure. So GOLD SOUTHFIELD is the name of the threat group that runs the ransomware-as-a-service offering leveraging REvil ransomware. So REvil really is the software used by the GOLD SOUTHFIELD threat group.
Dave Bittner: I see. And so let's dig into some of the specifics here, some of the changes that you all were tracking in these most recent samples. Can you take us through some of the highlights?
Rob Pantazopoulos: Yeah, sure. So these samples shared by Jakub actually didn't contain much of interest. But one of the first things that we did once we realized that, yes, this is REvil, we tried to find aspects of that code to try to then perform Retrohunt within VirusTotal to maybe find other samples that had not yet been identified. And sure enough, we had found - we had hit on a sample from March 11 of this year, 2022, that nobody else had reported on. So we retrieved that sample. We did analysis of that sample. And that sample actually contained a lot of the new features that you see in the report that we published. And for some reason that wasn't in the sample that Jakub had published. I don't exactly know why. Jakub's sample was actually compiled later on. That sample was compiled on - I think it was March - or April 12. So this was roughly a full month before, but the older sample contained more - newer features. So once we saw that, we really kind of put our full focus into analyzing that sample.
Rob Pantazopoulos: One of the first features that we found was that there was an inclusion of a new command line argument, dash T. Now, when we submitted the sample initially to our sandbox, it didn't do anything. But then, once we began analyzing it, we realized that this dash T expected to receive some type of token value that it then used for decoding strings at runtime. And these strings were critical to the success of its execution because it decoded strings such as, like, kernel32.dll and all the different function names that it would be dynamically importing at runtime that were critical to its execution. So if you didn't have the appropriate value, it just wouldn't run at all. The token used here was implemented within all of the string decode operations. What we wanted to do is, in order to find out what the token was, we loaded the old sample up that didn't have the string decoding logic into IDA Pro, and we compared it to the new code that was using the string decode logic. And one of the values that this token was being applied to was the key length for the encrypted string, all right?
Rob Pantazopoulos: So in REvil, in order to - the strings used by REvil are encrypted using RC4. And they're stored. They have the key, which is then immediately followed by the actual encrypted string. And the string decrypt function has, you know, the location of the decryption key, the length of the decryption key, the location of the encrypted string and the length of the encrypted string. So that's how it knows, all right, here's the start and end of the key. Here's the start and end of the encrypted string. And this is - when it goes to decrypt it, that's how it kind of extracts that information out. So in this new sample of those four values, the encrypted - or the key length was encoded, and then the address of the encrypted string was encoded. So they did that with the intention to make it impossible to figure out what the actual full key and what the actual full encrypted string was.
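For readers following along, the storage layout Rob describes (an RC4 key immediately followed by the encrypted string, located by two offsets and two lengths) can be sketched in Python. This is only an illustration under stated assumptions: the function names `rc4` and `decrypt_string`, the demo key, and the demo blob are all hypothetical and are not taken from the REvil binary or the Secureworks report.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4. Encryption and decryption are the same operation."""
    # Key-scheduling algorithm
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation, XORed against the input
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def decrypt_string(blob: bytes, key_offset: int, key_len: int,
                   data_offset: int, data_len: int) -> bytes:
    """Hypothetical helper taking the four values mentioned in the interview:
    key location, key length, encrypted-string location, encrypted-string length."""
    key = blob[key_offset:key_offset + key_len]
    data = blob[data_offset:data_offset + data_len]
    return rc4(key, data)


# Demo blob (made up): a 12-byte key immediately followed by the ciphertext.
demo_key = b"twelve-bytes"
blob = demo_key + rc4(demo_key, b"kernel32.dll")

print(decrypt_string(blob, 0, len(demo_key), len(demo_key), 12))
```

If the key length or the address of the encrypted string is stored in an encoded form, as in the newer sample, a helper like this slices the wrong bytes and decryption produces garbage until the values are decoded.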
Rob Pantazopoulos: The problem was - is that the - so the way that they do the encoding is that this token value that you pass in the command line has an XOR operation applied to another four-byte value. And in this case, the token value was XORed with the hex value 2F9B0DCA. Now, fortunately, because the old version didn't have this encoding applied and they also didn't change any of the code around there, we knew that the string being decrypted, that location, had a key length of 12 bytes. So we knew that whatever value was passed into the token field, XORed with this 2F9B0DCA, equals the integer 12. So those are kind of like the - now we have kind of the pieces that we need in order to determine what the token value is.
Rob Pantazopoulos: So the second bit of information that's important to know is for - what is XOR, right? The XOR is a mathematical bitwise operation. And the interesting bit of information is that any time you XOR a value with itself, the resulting value is null or, you know, hex 00. So the integer value 12 only takes up a single byte. However, the code allocates this value within a four-byte memory allocation, which is padded by null bytes. So the first three bytes of this four-byte allocation are nulls. So we can apply that logic to say whatever this token is - XOR with the 2F9B0DCA - if the result of that operation, the first three bytes are null bytes, then the first three bytes of that token must be the 2F9B0D.
Rob Pantazopoulos: So now, instead of having to figure out - like, maybe brute force, you know, all four bytes and trying to find the appropriate value, all we really have to do is brute force that last byte to see what XOR by, you know, the hex value CA equals the integer 12, which is a really fast operation that we could do within, like, a Python script. And it turns out that the hex value C6 XOR by the hex value CA equals the integer of 12. So that means that the expected token passed to the command line was the hex value 2F9B0DC6 or the integer 798690758. So I know it was a long kind of technical explanation, but that was really - for me that was really interesting that they made a little bit of a mistake there in their encoding routine. It was supposed to be this really complicated thing to prevent people from running their malware within sandboxes or performing analysis on it, but it took, you know, a few minutes of us doing this analysis to determine what the key was. And then we reran it through our sandbox. We provided the appropriate token value. And then boom, it fully executed.
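The arithmetic Rob walks through can be reproduced in a few lines of Python. This is a minimal sketch rather than the script Secureworks used: the constant 2F9B0DCA and the key length of 12 come from the interview, while the function names and the structure of the search are assumptions made for illustration.

```python
MASK = 0x2F9B0DCA        # four-byte constant quoted in the interview
KNOWN_KEY_LENGTH = 12    # key length known from the older, unencoded sample


def recover_token_bruteforce(mask: int, expected: int) -> int:
    """Keep the top three bytes equal to the mask's (so they XOR to zero),
    then try all 256 values for the last byte."""
    high = mask & 0xFFFFFF00
    for low in range(256):
        candidate = high | low
        if candidate ^ mask == expected:
            return candidate
    raise ValueError("no single-byte candidate matches")


def recover_token_direct(mask: int, expected: int) -> int:
    """XOR is its own inverse, so the token is simply mask XOR expected."""
    return mask ^ expected


token = recover_token_bruteforce(MASK, KNOWN_KEY_LENGTH)
print(f"{token:#010x} {token}")  # prints: 0x2f9b0dc6 798690758
```

Because XOR undoes itself, the brute force over the last byte is not strictly needed: XORing the constant with the known result 12 gives the same token in one step, which is what `recover_token_direct` shows.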
Dave Bittner: Must have been quite gratifying to - you know, when faced with this attempt from them at obfuscation, to be able to unpack it and figure it out, you know, so quickly.
Rob Pantazopoulos: Yeah. I mean, I feel like that's why a lot of malware researchers and security researchers do what they do because it's just a constant puzzle. So this was kind of figuring out one of those puzzles, and then it's kind of on to the next one.
Dave Bittner: Yeah. Well, what are some of the other things that you all noticed here - things that were of interest to you?
Rob Pantazopoulos: Sure. So the second thing that I found really interesting was the inclusion of a new configuration element - the accs configuration element. Now, we had identified a new sample back in October of 2021. This was after they had been - the takedown - western law enforcement had performed a takedown of GOLD SOUTHFIELD's infrastructure. So this sample was identified after that had occurred, which is what kind of gave us - piqued our interest. This new sample - there was - we didn't publish anything publicly about that, but this sample had contained this accs configuration element, but it didn't contain any values. It was just kind of an empty array.
Rob Pantazopoulos: But through reverse engineering, we knew that it was - it played a role in the encryption of remote resources, such as, like, mapped drives that would try to authenticate to these remote resources using whatever credentials were contained within this accs configuration element. But at that time, there was speculation as far as what type of credentials would be contained within there. Would it be kind of generic, like, admin password, or, you know, admin - welcome123, and it would just be like an opportunistic brute force type of credentials, or would it be more targeted so the malware could operate a lot faster? Targeted credentials that may be obtained through initial compromise of their network - you know, they try to, you know, obtain as many usernames and passwords as they can from that network, and then, when they deploy REvil, REvil is already packaged with the credentials that are for that environment, so they can kind of get maximum impact from an encryption standpoint.
Rob Pantazopoulos: This sample that we identified - the March 2022 sample - actually had credentials stored within it, and they were targeted credentials. So that kind of answered that question that we had of - what kind of credentials would be stored within there? And it turned out to be targeted credentials. One, I guess, unfortunate side effect of this is that because they're targeted credentials, now, if these samples get released into the wild, it may be easy for other people to, you know, figure out that you were compromised and infected with REvil ransomware even though that information may not have been made public.
Dave Bittner: Oh, that's interesting. One of the things that caught my eye - you know, these ransomware groups, I guess, famously have restricted their operations, you know, to not affect what we presume is their own homeland. But in some of the things you examined here, they had deactivated that - that region check.
Rob Pantazopoulos: Yeah. That's definitely an interesting change, and we're not 100% sure of exactly why that is. We know why they implemented it to begin with, right? They don't want to bring heat upon themselves by, basically, friendly fire.
Dave Bittner: Right.
Rob Pantazopoulos: But why they removed it was a curious move. There was definitely a lot of turmoil, if you will, around that time. It was initially removed in that October time frame, and that was roughly when, you know, the takedowns had occurred and then, you know, ramping up of - with tensions with Russia and some of the Ukraine stuff going on. So there's a lot that was happening around this time frame, but nothing really stands out as to why they did it.
Dave Bittner: So what are the take-homes for you? I mean, as you look at the changes that you and your colleagues tracked here, what do you take away from it?
Rob Pantazopoulos: So in the past, when we've seen this kind of activity - meaning when we've seen multiple new samples without a new version value, multiple changes between the samples - it was typically indicative of - you know, we could expect a new sample or new official version to be released - typically within a month to two months is what we've seen. That has yet to play out, as a matter of fact. So we published on our public blog on May 9, and the last actual activity that we've seen from GOLD SOUTHFIELD was on May 6. So the last victim published to their leak site was on May 3, and the last sample that we have was compiled on May 6. And since then, we haven't heard anything. We're not quite sure on what to expect for the next steps. There's many different scenarios that could play out. But, certainly, we're going to be ever-vigilant and, you know, try to keep on top of this.
Dave Bittner: Our thanks to Rob Pantazopoulos from Secureworks for joining us. The research is titled "REvil Development Adds Confidence About GOLD SOUTHFIELD Reemergence." We'll have a link in the show notes.
Dave Bittner: The CyberWire podcast is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies. Our amazing CyberWire team is Rachel Gelfand, Liz Irvin, Elliott Peltzman, Tre Hester, Brandon Karpf, Eliana White, Puru Prakash, Justin Sabie, Tim Nodar, Joe Carrigan, Carole Theriault, Ben Yelin, Nick Veliky, Gina Johnson, Bennett Moe, Chris Russell, John Petrik, Jennifer Eiben, Rick Howard, Peter Kilpe, and I'm Dave Bittner. Thanks for listening. We'll see you back here next week.