Research Saturday 4.6.24
Ep 324 | 4.6.24

Leaking your AWS API keys, on purpose?

Transcript

Dave Bittner: Hello, everyone. And welcome to the CyberWire's Research Saturday. I'm Dave Bittner. And this is our weekly conversation with researchers and analysts tracking down the threats and vulnerabilities, solving some of the hard problems, and protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.

Noah Pack: It was the beginning of the COVID-19 pandemic. Every conference and career networking event had been canceled that I could find. And I created a script that would email companies and ask for free conference swag.

Dave Bittner: That's Noah Pack, an intern with the SANS Internet Storm Center. The research we're discussing today is titled, What Happens When You Accidentally Leak Your AWS API Keys?

Noah Pack: When I started my time in university, I was doing an introduction to computer science class. And towards the end of that class, the instructor encouraged all the students to pursue a personal project, something they could post publicly on GitHub or learn from. And I saw a video online of a student at a different university who created a Python script to email universities and ask for free swag, so things like a T-shirt or a mug. And he got a great response. A lot of admissions departments sent him things to encourage him to apply there for grad school. I wanted to take that same idea and adapt it a bit. It was the beginning of the COVID-19 pandemic. Every conference and career networking event had been canceled that I could find. And I created a script that would email companies and ask for free conference swag. So I wrote this up in Python. I found a list of 10 companies that I was fond of and I hoped would respond to me with a T-shirt or keychain of some kind. I added those company names to my script. And it worked flawlessly. I sent it out. I checked the email sent folder and saw all 10 messages. But, to celebrate my achievement, I posted this code on GitHub, and shortly thereafter I was receiving multiple login requests to the email address I'd created for the script to use. That was because I hard-coded, which means that I put in plain text inside of the code, the username and password for that email account for my script to use.

Dave Bittner: Oh, Noah. Noah, Noah, dear sweet Noah.

Noah Pack: And I was just a freshman in computer science, so I hadn't learned safe programming practices yet. But I certainly learned from the situation. There had been no ill consequence. But if my project had been bigger and I was using AWS or another cloud provider and hard-coded credentials, that could have dire financial consequences.
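
As an illustration of the safer practice alluded to here, the following is a minimal Python sketch that reads a mail account's username and password from environment variables instead of hard-coding them. The variable names SWAG_MAIL_USER and SWAG_MAIL_PASSWORD and the function name are assumptions made for this example; the original script is not shown in the conversation.

```python
import os


def load_mail_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    SWAG_MAIL_USER and SWAG_MAIL_PASSWORD are hypothetical names chosen
    for this sketch; they do not come from the original script.
    """
    try:
        return env["SWAG_MAIL_USER"], env["SWAG_MAIL_PASSWORD"]
    except KeyError as missing:
        # Fail loudly rather than fall back to a value baked into the code.
        raise RuntimeError(
            f"Set {missing.args[0]} before running the script"
        ) from None
```

With this approach the repository only ever contains the variable names, so publishing the code does not publish the password.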

Dave Bittner: Yeah. So the lesson here, I suppose, is that moments after you put this hard-coded email address and password in -- up on GitHub, I guess automation from other people had searched that out and just started hammering that email address.

Noah Pack: That's exactly what happened. It happened within minutes. And I saw the same thing in my research when I published canary tokens on GitHub for AWS API keys. They were snatched up and used immediately by both threat actors and security companies that were monitoring for them.

Dave Bittner: Well, let's dig into the research that you did here. I mean, this does involve canary tokens. For folks who may not be familiar with that, how do you describe a canary token?

Noah Pack: Yeah. So a canary token is sort of like a honeypot. At the Internet Storm Center, we use honeypots, most of which are Raspberry Pis that look like an attractive target sitting on the internet for a threat actor to go after. The honeypots record what commands the threat actor uses and what files are downloaded. Canary tokens are similar but on a much smaller scale. They work really well to supplement the honeypots that we use. Canary tokens can be things like an Excel document, a QR code, or AWS API keys. When a threat actor opens the document, scans the QR code, or uses that API key, it sends an email alert to whoever made the canary token, alerting them and giving them a little bit of information about how it was used. In my use case of AWS API key tokens, it gave me the IP address and user agent that tried to use those credentials.

Dave Bittner: Well, let's dig into the research here. I mean, what exactly did you set up?

Noah Pack: So, to conduct my research, I added some AWS API key canary tokens to a moderately trafficked but small e-commerce website that I help maintain. It gets roughly 1,000 visitors a day. It's enough that it'll come up at the top of a Google search. But someone who's not looking for the things that they're selling probably won't find it. So I didn't really expect this to be found very quickly, and it wasn't. So it took a while before someone picked up those keys and tested them. They could have been picked up much earlier than they were actually tested. But, when they were tested, the user agent string was pretty interesting. The person that was testing them was using the boto3 library. And they were using Python on Windows Subsystem for Linux. And their IP address came from Proton VPN. So because of the anonymity of a VPN service, it's hard to tie this to any other attacks or to figure out who is actually behind testing this. Could be anyone from threat actors looking to abuse this credential to a security researcher that's just scanning websites.
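
To show how a user agent string can reveal a library, a Python version, and a likely Windows Subsystem for Linux host, here is a small Python sketch. The sample string is illustrative only and is not the one recorded in the research; the function names are hypothetical, and the WSL check is a heuristic that assumes the kernel release contains the word "microsoft".

```python
def parse_user_agent(user_agent):
    """Split a 'Product/version' style user agent into a dict."""
    parts = {}
    for token in user_agent.split():
        name, sep, version = token.partition("/")
        if sep:  # keep only tokens that look like Product/version
            parts[name] = version
    return parts


def looks_like_wsl(parts):
    """Heuristic (an assumption): WSL kernel releases mention 'microsoft'."""
    return "microsoft" in parts.get("Linux", "").lower()


# Illustrative string, not the one seen in the canary token alert.
SAMPLE = (
    "Boto3/1.26.0 Python/3.10.6 "
    "Linux/5.15.90.1-microsoft-standard-WSL2 Botocore/1.29.0"
)
```

Calling parse_user_agent(SAMPLE) yields the Boto3, Python, Linux, and Botocore versions as separate entries, which is roughly the kind of detail an alert's user agent field can expose.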

Dave Bittner: So just so I'm clear here, you had a preexisting website. And, within this website, you embedded the canary token, which to the outside world looked like an AWS API key.

Noah Pack: That's exactly right. And I tried to make it look as though a developer might have accidentally left it there.

Dave Bittner: And so who do you suppose was going after this? I mean, was it -- this was obviously an automated process here.

Noah Pack: No. I would assume that the key was picked up in an automated process but that it was manually tested and that, if it were a larger website that receives more traffic, one with a much different threat profile, full automation, or a different threat actor who uses more automation could pick it up much quicker.

Dave Bittner: I see. Well, you didn't stop here. You posted your AWS key elsewhere. Take us through the next step of the process here.

Noah Pack: That's right. I also added some AWS API key canary tokens to GitHub. Now, I created a GitHub repository that I knew any security researcher who lays their eyes on would know that it's a honeypot. It's there to catch people.

Dave Bittner: Okay.

Noah Pack: The repository was named Canaries, and it had a readme that said something like this is for some research. If you're a bad guy, try these out. If not, please just ignore.

Dave Bittner: Wow. Okay.

Noah Pack: And after making that repository public, the requests just flooded in. It was much different to when I embedded them in the website. I ended up having to turn off the alerts just to preserve my email inbox. But the first one came from AWS, the first attempt at using those credentials. And I didn't touch on this in my research but, when you publish AWS API keys on GitHub, almost before you can even refresh the page, AWS will test those keys themselves. And it's because GitHub has secret scanning built in where they send anything that they think might be an API key to AWS to test it. And AWS will take action. If it's a real API key and not a canary token, they'll send you an email with an urgent subject line, Action required. Your AWS access key is exposed for AWS account, and then it will list your account number. But not even seconds after AWS tested the key, I got a ton of requests from a company called GitGuardian. Now, they have a service that will scan public and private repositories for your secrets. It's a paid service. And they tested the keys multiple times within the first few minutes to verify them, all from similar IP addresses in Canada.

Dave Bittner: We'll be right back. So they're looking for your business here. They're saying, Hey, look what we did. We found this. Look how quickly we found this thing. And if you use our service, we'll help you protect against this sort of error.

Noah Pack: Not necessarily because they don't reach out to you in any way like AWS did. But they certainly did see it right away and test it out to protect their customers.

Dave Bittner: Oh, interesting. Okay.

Noah Pack: Yeah. That would be a great marketing strategy, though.

Dave Bittner: No. I -- clearly, I assumed too much. But that's interesting. So -- so, at this point, I mean, you're getting hammered. You said you had to turn off your email because it's just -- everything's flooding in.

Noah Pack: Right. So this key got a ton of alerts right away. Almost all of them were from GitGuardian. There was also the request from AWS and a couple from IP addresses that had been seen doing similar things and scanning the internet.

Dave Bittner: And so what was your response to that? I mean, you see the degree to which this has triggered all of this activity. As a researcher, what do you do next?

Noah Pack: Yeah. So my next step was to remove the GitHub repository. I had gotten the results that I wanted. From my research, I found out that, if you publish your AWS API keys on GitHub, they will be used. If you publish them on your website, they will be used. It might take a couple of seconds; it might take a couple of days. But we also don't know the difference between when they're picked up and when they're used. You might be able to rewrite your GitHub repository history and erase those API keys. But someone might still have access to them. They might have downloaded your repository or the source to your website or scanned your website before they used those keys. So the best practice is definitely to rotate them, to remove all permissions from those keys and create new keys with the permissions that your code needs.
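
The rotation step described here could be sketched in Python roughly as follows. This is a minimal sketch, not a complete incident response procedure: the function name is hypothetical, and the `iam` argument is assumed to behave like the client returned by boto3.client("iam"), whose create_access_key and update_access_key calls are used. It also assumes the IAM user has room for a second key, since AWS limits each user to two access keys.

```python
def rotate_access_key(iam, user_name, exposed_key_id):
    """Issue a replacement key, then disable the exposed one.

    `iam` is assumed to behave like boto3.client("iam").
    Returns the new key's ID and secret so the caller can store them
    somewhere safe (not in the repository).
    """
    new_key = iam.create_access_key(UserName=user_name)["AccessKey"]
    # Mark the leaked key Inactive so it stops working immediately;
    # it can be deleted once nothing legitimate depends on it.
    iam.update_access_key(
        UserName=user_name,
        AccessKeyId=exposed_key_id,
        Status="Inactive",
    )
    return new_key["AccessKeyId"], new_key["SecretAccessKey"]
```

Creating the new key before disabling the old one keeps the outage short; if the exposed key is already being abused, disabling it first may be the better order.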

Dave Bittner: Yeah. That was going to be my next question. So, like, once you had removed the information from GitHub, were those keys still being activated? Were people still trying to hammer away using those credentials?

Noah Pack: They were. It took about an hour after removing the repository before my last alert came in. So you could chalk that up to someone having the repository open or downloading it before looking through it. Perhaps their scanner that they're using to find these leaked secrets has a bit of a delay or a bit of a backlog from other code that's being uploaded.

Dave Bittner: I wonder if they'll ever get hit again. You know, are there folks out there who will grab this and then say, Okay. Well, clearly this person realized they had a problem, but we're going to check again in a month just in case.

Noah Pack: Oh, I am extremely excited if -- or I will be extremely excited if I see that because that would be so cool. We know that a lot of threat actors like to lie in wait on networks before they execute their attack. So I'm sure a similar thing is possible here.

Dave Bittner: Yeah. Well, I mean, I think that the lessons here are pretty clear. How do you sum them up for -- in terms of the things that you've learned?

Noah Pack: Yeah. So leaking your AWS API keys or any credentials is an extremely big deal. According to Verizon's 2023 Data Breach Investigations Report, they said that 61% of data breaches were due to leaked credentials. And while leaking credentials might seem kind of silly, it seems like a fixable problem. I mean, it's the equivalent to leaving keys to a building in the parking lot. But it really is harder to stop than you might think. Users reuse usernames and passwords on sites that are breached. Anyone can fall for a social engineering attack. Even experienced developers can accidentally publish credentials, and all of those reasons are -- or those are just three of many reasons that this issue exists and why it's so prevalent and why entire companies like Truffle Security and GitGuardian exist to solve this problem. I've seen horror stories from small businesses that had their AWS account hacked. And the attackers racked up bills in excess of $300,000 before the developers could figure out how to rotate those keys and mitigate the problem because they didn't have the incident response experience. And they didn't have tools integrated into their code pipeline to find these secrets and stop them from being published.
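
As a rough idea of what a secret-finding step in a code pipeline does, the Python sketch below flags strings shaped like AWS access key IDs. It is a simplified illustration, not how any named product works: it assumes the widely documented shape of 20 uppercase alphanumeric characters beginning with AKIA or ASIA, and the function name is hypothetical.

```python
import re

# Assumed shape: 'AKIA' or 'ASIA' followed by 16 uppercase letters or digits.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, match) pairs for strings shaped like key IDs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((number, match.group()))
    return hits
```

A check like this could run before each commit and block the commit when it returns any hits; real scanners cover many more secret types and also verify what they find.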

Dave Bittner: As -- it's a really good reminder of what I -- you know, I suspect there are folks in our audience who are just nodding along and saying, you know, what a -- what a basic straightforward thing this is. And, yet, as you say, despite that, it does happen to so many people.

Noah Pack: It happens all the time. There was a cryptocurrency. It was sort of what some people call a meme coin back in I think 2022 called Shiba Inu coin. And the developers had a code repository on GitHub where they accidentally leaked their AWS credentials. Luckily, some security researchers who were fans of the crypto project found them. And, unfortunately, they had no way to contact the developers. There was no bug bounty program. There was no security.txt on their website. And those researchers noticed that, after a few days, the AWS API credentials were revoked. They stopped working, which means that either they did get ahold of someone at Shiba Inu, or the people at Shiba Inu noticed that someone else, maybe a bad actor, was using those credentials.

Dave Bittner: Right. What's your advice for folks to help mitigate something like this if it does happen?

Noah Pack: The first advice I ever heard on how to mitigate this issue is actually bad advice, and that would be to rewrite your code repository history on GitHub. That's because things like the Wayback Machine exist. And you don't know if somebody's downloaded that code with the API keys in it. So the better idea is to rotate those keys. You could also do things like looking at your CloudTrail logs in AWS or set up alerts. At SANS, we like to say that prevention is preferred; detection is a must. So finding out that those keys were accessed is extremely important. Teaching secure coding practices is also a -- probably the best and easiest way to prevent this. This includes avoiding the Git command git add with a wildcard because that can very easily add sensitive files to your repository. Name the files that contain sensitive information in your .gitignore and your .npmignore files. Those are sort of like the robots.txt of your website but for Git. And then, as a threat hunter, one of the techniques that I really like to use is to take a baseline of something. So, on a network, I would take a packet capture and look at all of the traffic for the network, slowly eliminating things that I know aren't bad. And, at the end, I'll end up with just the network traffic that could be malicious. And I'll have a bunch of filters that will filter out all the stuff that I know was good. Then I could dig into those things that are bad. And do the same exercise again in a month or a week or a quarter. And add those same filters and find the new traffic that's bad. That same concept can apply to any log type, including logs from your cloud provider. So look at those CloudTrail logs. Understand what is supposed to be running in your AWS account. Who is supposed to be running what? And look for services you don't recognize. There are over 200 AWS services at this point, so it's hard to know them all. But you can at least know what ones you use, and everything else you can assume is something that you don't. And you can dig into it more.
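
The baseline idea applied to cloud logs can be sketched in Python as below. This is a minimal sketch under stated assumptions: the events are taken to be already-parsed CloudTrail records (dicts) carrying an eventSource field such as "s3.amazonaws.com", the baseline is a set of services the account is known to use, and the function name is hypothetical.

```python
from collections import Counter


def unexpected_services(events, baseline):
    """Count events whose eventSource is not in the known-good baseline.

    `events` is assumed to be an iterable of parsed CloudTrail records;
    records without an eventSource are counted under 'unknown'.
    """
    sources = (event.get("eventSource", "unknown") for event in events)
    return Counter(source for source in sources if source not in baseline)
```

Anything the function returns is a candidate for a closer look; once a service is confirmed as legitimate it can be added to the baseline so the next review only surfaces what is new.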

Dave Bittner: Our thanks to Noah Pack from the SANS Internet Storm Center for joining us. The research is titled, What Happens When You Accidentally Leak Your AWS API Keys? We'll have a link in the show notes. The CyberWire Research Saturday Podcast is a production of N2K Networks. N2K's strategic workforce intelligence optimizes the value of your biggest investment: your people. We make you smarter about your team while making your team smarter. Learn more at n2k.com. This episode was produced by Liz Stokes. Our mixer is Elliott Peltzman. Our executive producers are Jennifer Eiben and Brandon Karpf. Our executive editor is Peter Kilpe. And I'm Dave Bittner. Thanks for listening. We'll see you back here next time.