Research Saturday 4.29.23
Ep 279 | 4.29.23

HinataBot focuses on DDoS attack.


Dave Bittner: Hello, everyone, and welcome to the CyberWire's "Research Saturday". I'm Dave Bittner and this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems of protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.

Allen West: Yeah. So we were looking through our honeypot logs specifically for results for a certain vulnerability in Hadoop YARN, and Larry was talking about how Go binaries are so much bigger than all the other ones that we've seen. And so I started filtering by megabytes and then we found some pretty interesting samples.

Dave Bittner: That's Allen West. The research we're discussing today comes from Akamai and is titled "Uncovering HinataBot: A Deep Dive into a Go-Based Threat." Joining us are Larry Cashdollar, Chad Seaman, and Allen West.

Dave Bittner: Well, let's go through it together here. I mean what exactly is going on under the hood here? What are we talking about? Chad, do you want to take us through this? We haven't heard from you yet.

Chad Seaman: So yeah. HinataBot comes in through a couple of known CVEs and one Hadoop misconfiguration. A lot of people stick Hadoop out there, Hadoop YARN, and they leave it exposed to the internet, probably due to an oversight, and through this mechanism they -- they can achieve code execution. So it's spreading through those three primary mechanisms that we've found, and also through SSH brute forcing. From there, once the malware gets a foothold, it phones home to a -- a C2. A big part of this research, since the C2 was down at the time of discovery, was actually reverse engineering the command and control protocol so we could cause the binary to stage attacks against a controlled victim and see what that traffic looked like. It -- it has two primary attacks. There's a UDP attack and an HTTP attack at this point. There's been some historical attacks that we've seen in previous samples that have been removed, but we're not really sure, and its primary goal is to push either volumetric UDP or layer seven HTTP and HTTPS attacks.

Dave Bittner: Do you have any sense of the -- the history behind this? So where it may have come from or -- or been influenced by.

Allen West: Yeah. So in looking back through the logs the IP that we found distributing these samples originally which also turned out to be the command and control IP, we looked back and sort of used that as a pivot and saw that it had for a while been distributing a bunch of different Mirai variants for one. It was about a month at the beginning of the year. And then about halfway through the month we started seeing them distributing the HinataBot, and when we looked up the historical DNS resolve of the IP as well we also saw that it was Mirai lovers or hi hi Mirai lovers for a little bit there. So we definitely could see some influence from Mirai, and obviously now they're trying to make their own thing.

Dave Bittner: And -- and who are they targeting here? Who are they trying to infect? Who's the primary victim here?

Chad Seaman: So there's targeting in two senses. One sense we have no visibility as of yet. Because the C2 was down at the time of the discovery, we haven't been able to see them launch attacks and figure out what their primary motive is as far as victimization goes. From the exploit stuff it looks like they're -- they're targeting low hanging fruit, to be honest. There's nothing novel or new about the infection vectors that we saw them using. The Huawei exploit is old and well known. We've seen it used to spread Mirai and Gafgyt and other samples that are competing in this space. The SSH brute forcing was a primary tenet of Mirai spreading back in the day. It was Telnet and SSH brute forcing campaigns that really got a foothold for that one before they moved in to some of these other RCE variants or exploits. So I think at this stage it's just they're victimizing low hanging fruit in general, and I wouldn't be surprised to see them continue down that path because it's easy. The problem is that it's also a very competitive landscape so it's hard to really get a foothold there. In the article we allude to we hope that they stay on this path. And without trying to give them any ideas or credit where -- we would not like to see them be more successful than they are. Put it that way. So if they continue for low hanging fruit where there's a lot of competition, that's good news for everyone.

Dave Bittner: Larry, I'm curious. They're using the Go language here. What -- what is the appeal of that? It strikes me that we're seeing more and more attention there.

Larry Cashdollar: I think because Go is portable and it's fast I think a lot of malware authors are sort of gravitating towards it. You know you have the ability to cross compile it to other architectures and we saw HinataBot had a lot of compiled binaries for many different architectures, ones that I -- ones that I was surprised. I think there were some binaries for Solaris. I think we saw some Solaris [inaudible] binaries in there. So I thought that was pretty wild, but I might, you know -- it's also static, statically compiled. So there's no -- you know, it won't need any libraries or anything linked to it because the -- the binary is a ball of everything it needs. So you can just stick that on the machine and it should run. So I think that's the other appeal. And, you know, I think Go is an up and coming language, and I think a lot of developers are -- are also gravitating towards it. So the malware authors will follow in turn.

Dave Bittner: And, as you mentioned, I mean does that -- that -- all those things lead to it having a larger file size?

Larry Cashdollar: You know, typically with an IoT binary that's been compiled for like MIPS or ARM, and it's written in C, they're usually dynamically linked. They're between, I'd say, 40 and 200 kilobytes. A Golang malware binary is usually between like 4 and like 30 megabytes.
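
The size gap described here is what makes filtering by megabytes a useful first pass over honeypot captures. A minimal sketch of that filter follows, assuming a hypothetical Sample record and made-up file names and sizes. The 4 MB cutoff simply reuses the lower bound mentioned above and is a rough heuristic, not a detection rule.

```go
package main

import (
	"fmt"
	"sort"
)

// Sample is a hypothetical record for a file captured by a honeypot.
type Sample struct {
	Name string
	Size int64 // size in bytes
}

const megabyte = 1 << 20

// largeSamples returns the sorted names of samples that are at least
// minBytes in size. Size alone proves nothing; it only narrows the pile.
func largeSamples(samples []Sample, minBytes int64) []string {
	var names []string
	for _, s := range samples {
		if s.Size >= minBytes {
			names = append(names, s.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Illustrative names and sizes only, not real capture data.
	samples := []Sample{
		{Name: "sample-a.mips", Size: 120 * 1024},
		{Name: "sample-b.amd64", Size: 5 * megabyte},
		{Name: "sample-c.arm", Size: 80 * 1024},
	}
	fmt.Println(largeSamples(samples, 4*megabyte))
}
```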

Dave Bittner: And -- and that's because it's so self contained?

Chad Seaman: No. I think it's also in part because of how the Golang binaries work. As far as it being able to -- to be deployed in multiple places, the Go binary typically generates a 5 meg or more binary because I believe it ships the interpreter inside of the Golang binary itself which is in part why it's so much easier for cross compilation of function because they really need to -- to compile to underpinnings for OS, but then the core code that the author would be writing is -- is operating through that -- that Golang level where it's more abstracted against the various OSs and platforms that they're compiling for.

Dave Bittner: Just for my own understanding here, please pardon -- pardon my ignorance, but is this kind -- similar to like on macOS you hear about like fat binaries, you know, where something's been compiled, for example, for both Intel and, you know, the -- the M series chips. Is it that sort of thing where it has everything it needs to -- regardless of where it's deployed?

Chad Seaman: No. They still -- they still compile in to platform specific packs, but I think each one of those platform specific packs ships a Go interpreter or some kind of bytecode that will take the -- the Golang and operate it on that platform if it makes sense. Like each one of them is large because each one of them is shipping an architecture specific interpreter or environment for Go to execute it.

Dave Bittner: Part of what you all cover here is the process you all went through to map out the C2 communications. Can you take us through that process?

Chad Seaman: As far as the -- the reversing of the protocol was concerned, we kind of started with not really knowing what the binary -- or how to speak with it. Because of the sample that we had and it not being stripped, we did have some hints about what the function names were. So we -- we knew that at some level there's an HT -- HTTP flood capability. At some level there's a UDP flood capability. And the problem became, okay, so in the code how do we get to where those functions are going to be executed? So it became a process of basically patching the binary: the C2 was down, so we overwrote the location of that C2 with an IP that we control. From there we spun up a socket and started communicating in to that binary when it would phone home to us thinking that we were the C2. At that stage it's really just kind of trying to navigate and see where you land in the binary at what stage and then figuring out, you know, at this point in code we're either going to go left or right. You know, we're -- we're going to -- to venture in to different parts of the code based on what input we're providing it. And -- and then it's just kind of mapping that out. So at this point it's expecting, you know, a hex 30 or a hex 31 which translates to the ASCII character for a 0 or a 1. And if we go zero, it goes this way. If we go one, it goes this way. And if it's neither, it goes this way. So we'll want to end up on this side and we can kind of map out that once we get down that fork of the code ultimately we get to the UDP or the HTTP flood and then it starts mapping out, okay, so how many args are we expecting to pass in to that function? What -- what arg in position one? Where is it going to end up in the final payload? Is it going to become the IP address that we're attacking? Is it going to become a packet length directive? Is it going to trigger some other fork of code that we're not familiar with yet? So it's really just a -- a pretty feedback loopish experience. I'll put it that way.
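
The three-way fork on hex 30 versus hex 31 can be pictured with a small sketch. This is not the malware's code. The path labels are placeholders invented for illustration, and the only detail taken from the discussion is that 0x30 and 0x31 are the ASCII characters 0 and 1.

```go
package main

import "fmt"

// branch mirrors the fork described above: one path for ASCII '0' (0x30),
// one for ASCII '1' (0x31), and a fallback for any other byte.
// The returned labels are placeholders, not names from the binary.
func branch(b byte) string {
	switch b {
	case 0x30:
		return "path-zero"
	case 0x31:
		return "path-one"
	default:
		return "path-other"
	}
}

func main() {
	// Feed a few candidate bytes and record which path each one selects,
	// the same way an analyst would note where each input lands.
	for _, b := range []byte{'0', '1', 'x'} {
		fmt.Printf("0x%02x -> %s\n", b, branch(b))
	}
}
```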

Dave Bittner: And what did that reveal as you went down that pathway? How were things uncovered for you? Did it get easier as you went along?

Chad Seaman: It was a little more frustrating than I think any of us would like to admit.

Dave Bittner: Fair enough.

Chad Seaman: Yeah. So, you know, because when you see the final product it's such a simple and fairly straightforward protocol that it becomes frustrating that it was such a pain to kind of map it out. Yeah. The initial stage was just trying to figure out how to even talk to the thing. What was the prefix that we needed to pass in order to get to where we would even start processing the command arguments? Before that there was a handshake and it expected certain things to be responded to from itself and from the C2 to send back a certain response before it would even go in to the -- the attacking functions and start checking for the -- the next stage of payloads. So I won't say it was easy. It wasn't insanely difficult, but it was just a lot of back and forth and tinkering to figure out like, okay, we got here and now what? What happens here? And sometimes it was why did this crash because that was one of the things is as we started to add more parameters, you know, you start out and you're just thinking like, okay, we'll put -- we'll put a numbered argument or something we can identify in memory in each one of these different offsets as we go in to that binary to try and map out where it's going to land in memory. Is it going to be passed? These were x64 so what register is it going to end up in, going back in to Golang's own function documentation and looking at Golang internals to figure out what is this function going to be called and what parameters are -- are expected in to this library function. So it was just a lot of back and forth with that before we finally got it done. I think all in all I don't remember how many days. It took at least a couple to a few days to get everything mapped out before we -- we felt confident that we had all of the fields mapped and could control everything we wanted to control.
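
The numbered-argument trick mentioned here can be sketched as a helper that generates distinct, easy-to-spot markers, one per argument position, so each can be recognized later in memory or in a register dump. The ARGnn format and the space separator are assumptions made for illustration. The actual command format is not described in this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// markerArgs returns n distinct placeholder arguments (ARG01, ARG02, ...).
// Because every marker is unique, finding one in a debugger shows which
// argument position ended up at that location.
func markerArgs(n int) []string {
	args := make([]string, n)
	for i := range args {
		args[i] = fmt.Sprintf("ARG%02d", i+1)
	}
	return args
}

func main() {
	// The separator is a guess for illustration purposes.
	fmt.Println(strings.Join(markerArgs(4), " "))
}
```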

Dave Bittner: I'm curious, you know, Allen, can you give us some insights? What is that collaborative process like for -- for you and your colleagues there? Is -- are you -- are you tossing things back and forth? Are you asking each other questions? Are you documenting things along the way? What is the process that you all use?

Allen West: Sure. Yeah. So originally one of us finds a sample and we start tearing it apart by ourselves and we find something that's particularly interesting about it. Maybe I'll message Larry and I'll be like, "Hey, is this normal to you? Like have you seen anything like this?" And then, you know, spin up a document, start documenting as you go so you don't have to remember everything you did a week ago in detail. Just basically over taking notes. And then as you get stuck phone a friend, and then everybody working on their own side while you're kind of in a meeting talking together and somebody has a breakthrough. You tell everybody else and you just keep going that way until -- until you're fried for the day.

Larry Cashdollar: Yeah. I'll get stuck and then be like, "All right. I'm going to see what Chad thinks." Then Chad will be like, "What are you guys looking at?" And he's like, "Oh. You should try this." And then he'll just fire up a -- a terminal. And he'll start poking at it and then we'll sort of watch what he's doing is. And then we'll sort of throw ideas around at each other. So it's -- it's definitely a major collaboration when we get something like this and one of us gets stuck or the other one gets stuck and we sort of just pass the ball around.

Dave Bittner: Well, let's dig in to the two attack types here. As you mentioned, there's HTTP flood and then also there's a UDP flood. Can -- can you describe to us exactly what it's doing?

Chad Seaman: Sure. So the two -- like you said, there's -- there's two primary attack functions. There's a UDP flood and an HTTP flood. The UDP flood is pretty straightforward. In both cases the attack functions spin up a pool of 512 worker threads. This is super easy to do in Golang. It's a pretty common technique for, you know, high performance, parallelized data processing. Any kind of thing that -- that's going to be quicker if you do it in parallel obviously. So it's no surprise to see that used here. It's cheap. It's fast. It's easy. And I don't know why you wouldn't do it. It's going to give you more throughput and -- and more just overall processing speed for -- for what you're attempting here. From there it's just going to establish a UDP socket and then it appears to pass that socket reference in to the various workers and the workers run through a loop. They -- they have a duration. They're given a duration as part of the attack directive. And they basically check a timer, and for -- for each iteration of their loop they're checking that timer. And if the timer is older than the time that they expect to be done with the attack, then they exit. Otherwise they're going to just continue to shove up a big fat null padded UDP packet through that socket. The other attack, HTTP attack, same kind of deal. We get 512 workers. But it's actually using Golang's native HTTP client. So this is a very well built client. I'll put it that way. It handles a lot of edge cases. It handles redirection, DNS resolving, a lot of stuff that we see less advanced bots kind of struggle with or not do properly. It has no problem. It can handle HTTP, HTTPS. They -- I believe it was hard coded to HTTP 1.1 or 1.0. It did not do HTTP 2 by default. Supports a few different headers. Some of them are randomized. Some of them aren't. With the HTTPS support it will get through a TLS or SSL handshake. No problem. So it's a pretty thorough and functional HTTP attack platform for sure.
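
The worker pattern described here, a pool of workers that each loop until a deadline passes, is an ordinary Go concurrency idiom. The sketch below shows only that pattern. It sends no network traffic: the work function is a harmless stand-in, and the name runPool and the small worker count in main are choices made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runPool starts the given number of goroutines. Each one checks the shared
// deadline on every loop iteration, calls work once per iteration, and exits
// when the deadline has passed. It returns the total iteration count.
func runPool(workers int, d time.Duration, work func()) int64 {
	var total int64
	var wg sync.WaitGroup
	deadline := time.Now().Add(d)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for time.Now().Before(deadline) {
				work()
				atomic.AddInt64(&total, 1)
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&total)
}

func main() {
	// A placeholder task that just sleeps for a millisecond.
	n := runPool(8, 50*time.Millisecond, func() { time.Sleep(time.Millisecond) })
	fmt.Println("iterations:", n)
}
```

Goroutines are cheap to start, which is why spinning up hundreds of workers is unremarkable in Go.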

Dave Bittner: I was going to say how does this compare to other botnet tools that you all have analyzed before? Do these folks seem to be up to the task? Do they know what they're doing?

Chad Seaman: Yes and no. I mean there's -- there's difficulties in operating with raw sockets in Golang in general from a standpoint of like, "I don't -- I don't suspect we'll see a SYN flood come out of this." I'll put it that way. It's possible, but I doubt it just because of using truly raw sockets in Golang is -- is a little bit more difficult. I would say that the two attacks that we see in there, the UDP flood and how it's tooled, and the HTTP flood and how it's tooled, are both pretty straightforward attacks to get wired up. And that's -- that's in part because Go is a great language in my opinion. It's pretty easy to get a UDP socket up and running and start sending data over it and receiving data from it. And the same for the HTTP library. You know, they didn't have to go down and start, you know, figuring out, okay, we've got our socket. Now we need to build our HTTP request. And now from that we've got to do this. We've got to redirect. We need to follow it. You know, the -- the DNS resolution, all of that is -- is handled because the underlying Golang libraries are solid. So I wouldn't -- I wouldn't say it's incredibly advanced. A lot of the code to -- a lot of the underlying code that would empower this you could probably find on GitHub or Stack Exchange with a handful of Googles and it's just a process of slapping it together really.
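
As a small illustration of how much the standard library handles, the sketch below makes one request to a local test server and lets net/http follow a redirect without any extra code. Everything runs on the loopback interface. The paths /start and /final and both function names are made up for the example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newRedirectServer starts a local test server where /start redirects
// to /final.
func newRedirectServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/start", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/final", http.StatusFound)
	})
	mux.HandleFunc("/final", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "final page")
	})
	return httptest.NewServer(mux)
}

// fetch issues a GET with the default client, which follows redirects
// on its own, and returns the response body.
func fetch(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()
	body, err := fetch(srv.URL + "/start")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```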

Dave Bittner: What are your recommendations then for organizations to best protect themselves against this? Let's start with infection. How do you -- how do you prevent this from taking hold on your network?

Allen West: Yeah. Most of all the standard security procedures that you would recommend in general like updating your applications and using strong passwords. As we saw, you know, brute forcing SSH credentials, they were not complex credentials, the ones that we saw. They're like admin admin kind of stuff. And then on top of that these are very old CVEs that are being targeted or old misconfigurations that have been known for a long time. So yeah. Just keeping up to date with the common vulnerabilities and updating your applications.

Dave Bittner: Larry, I'm -- I'm curious. With the C2 being down, I mean is that a typical thing in the playbook here that would, you know, not be active until it's needed or what does that generally indicate?

Larry Cashdollar: It could indicate that the malware was up at one point and then the C2 was either found by, you know, someone -- it was -- like it was on a network and somebody saw malicious traffic and they shut the C2 down. Or it's possible that the actors shut down the C2 and moved it somewhere else or they're relocating their operations. We've seen it before where we get a malware sample and the C2 is offline. I'm actually analyzing a piece of malware right now where the C2 was also down, and the malware that I'm looking at is only a few days old. So I'm -- I'm assuming that they either started up the malware and then tested it and then shut the C2 down and managed to get a sample caught by us -- this is my guess, but it's -- it happens.

Dave Bittner: Yeah. Hard to know for sure.

Chad Seaman: To ride on that as well, in the past we've seen other authors engaged in this kind of activity and typically -- and it's the same for -- for this piece of malware. If the C2 is not up, the malware doesn't die. It just waits. And it's going to phone home every so many seconds until it can establish a successful connection. So what we've seen in the past is the authors will bring the C2 up. They'll let it sit for 5 or 10 minutes and make sure all the bots have phoned home. Once they have a pool of functional attackers, they'll issue their attack command and then they'll turn it back down. You know, it's -- it's kind of a -- a decent technique, I'll say, because if you're a researcher and you're looking at it and you think the C2's down, you're less prone to look in to that. And also if -- if you have adversarial concerns from researchers or law enforcement or whatever, appearing to have gone away is a technique that you could debate the merits of, I guess.
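
One practical consequence for researchers is that a single failed connection says little, so checking an endpoint repeatedly over time is more informative. Below is a minimal sketch of such a poller, assuming hypothetical helper names pollUntilUp and tcpCheck. The address in main comes from a range reserved for documentation, so the example is expected to report the endpoint as unreachable.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// pollUntilUp calls check up to attempts times, sleeping for interval
// between tries. It returns the number of tries made and whether any
// check succeeded.
func pollUntilUp(check func() bool, interval time.Duration, attempts int) (int, bool) {
	for i := 1; i <= attempts; i++ {
		if check() {
			return i, true
		}
		if i < attempts {
			time.Sleep(interval)
		}
	}
	return attempts, false
}

// tcpCheck returns a check that reports whether a TCP connection to addr
// can be opened within the timeout.
func tcpCheck(addr string, timeout time.Duration) func() bool {
	return func() bool {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			return false
		}
		conn.Close()
		return true
	}
}

func main() {
	// 192.0.2.1 is reserved for documentation (RFC 5737), not a real host.
	check := tcpCheck("192.0.2.1:80", 200*time.Millisecond)
	tries, up := pollUntilUp(check, 100*time.Millisecond, 3)
	fmt.Println("tries:", tries, "up:", up)
}
```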

Dave Bittner: Our thanks to Larry Cashdollar, Chad Seaman, and Allen West from Akamai for joining us. The research is titled "Uncovering HinataBot: A Deep Dive into a Go-Based Threat." We'll have a link in the show notes.

Dave Bittner: The CyberWire "Research Saturday" podcast is a production of N2K Networks proudly produced in Maryland out of the startup studios of DataTribe where they're co-building the next generation of cybersecurity teams and technologies. This episode was produced by Liz Irvin and senior producer Jennifer Eiben. Our mixer is Elliott Peltzman. Our executive editor is Peter Kilpe and I'm Dave Bittner. Thanks for listening.