Code comments cause SAML conundrum.
Dave Bittner: [00:00:02] Hello everyone, and welcome to the CyberWire's Research Saturday, presented by the Hewlett Foundation's Cyber Initiative. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities, and solving some of the hard problems of protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.
Dave Bittner: [00:00:26] And now, a moment to tell you about our sponsor, the Hewlett Foundation's Cyber Initiative. While government and industry focus on the latest cyber threats, we still need more institutions and individuals who take a longer view. They're the people who are helping to create the norms and policies that will keep us all safe in cyberspace. The Cyber Initiative supports a cyber policy field that offers thoughtful solutions to complex challenges for the benefit of societies around the world. Learn more at hewlett.org/cyber.
Kelby Ludwig: [00:01:02] The vulnerability first came on my radar as part of an internal review for a potential dependency.
Dave Bittner: [00:01:08] That's Kelby Ludwig. He's a Senior Application Security Engineer at Duo Security. The research he's discussing today is called "Duo Finds SAML Vulnerabilities Affecting Multiple Implementations."
Kelby Ludwig: [00:01:21] During that process, I identified a somewhat unintuitive behavior when I inserted comments, which are valid in the SAML world, into these messages. And so, from there, once we identified the single instance and looked into the root cause of the issue, it turned out that this unintuitive behavior that we were noticing actually seemed like it could be a bit more common than just this one specific instance. So once we realized that, we determined that it would be an interesting idea to look at other, you know, SAML service providers and libraries and see if it was a widespread issue, and it turns out that there were a number of vendors and libraries that were affected by this type of issue.
Dave Bittner: [00:02:09] So, let's back up just a little bit. For folks who may not be familiar with it, can you describe to us, what is SAML, what does it stand for, and what's it used for?
Kelby Ludwig: [00:02:17] Yeah, so SAML is the Security Assertion Markup Language, which is why people say "SAML," because the full name is a bit more of a mouthful to say. It's a common language that you'll see used, most often, in single sign-on systems. So, single sign-on is frequently used within organizations to give employees an easier authentication experience. So, with single sign-on, you log into one application and that grants you access to, in turn, multiple different applications. So, this is great for users because it requires them to remember fewer passwords, they're entering passwords less frequently because they sign in once, and then they are granted access to multiple applications.
Dave Bittner: [00:03:04] So that's what it is. What's going on under the hood that sort of exposed this vulnerability that you discovered?
Kelby Ludwig: [00:03:11] The way that SAML works is that these messages, written in SAML itself, this common language that multiple parties speak, are passed through a user's browser. So, if you are logging into one service and then you want to log into all these other services, these messages pass through your browser. So, ultimately, as the attacker, you can effectively hold onto these messages and edit them as you please.
Kelby Ludwig: [00:03:39] Now, generally speaking, you shouldn't be able to do that because there are cryptographic signatures on these messages as they're passed through your browser. So, what we identified was a way to edit these messages and change their meaning, without invalidating these signatures. So, as an attacker, when these messages are passed through your browser during the normal authentication experience, you end up possibly authenticating as different users than what the service intended you to log in as.
Kelby Ludwig: [00:04:12] So take, for example, say you're the user John Doe, and you log into your service as normal. Say it's your email provider, which you log into every day when you start your day at work. Once you log in as John Doe, you then want to access, say, your financial information for payroll. The SAML messages pass through your browser, but if you were malicious and wanted to access the financial information of someone with a slightly different username, like John, you could edit these messages, authenticate, and see John's financial information.
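The point about editing a message without invalidating its signature can be sketched in a few lines. This is only an illustration under stated assumptions: the `NameID` values are hypothetical (the username "johndoe" follows the John Doe example), the signature is assumed to be computed over a canonical form of the XML that omits comments, and Python's standard-library `canonicalize` (which implements C14N 2.0) stands in for whatever canonicalization a given SAML library actually uses.

```python
import hashlib
from xml.etree.ElementTree import canonicalize

# Hypothetical identifier elements; not taken from any real SAML message.
original = "<NameID>johndoe</NameID>"
tampered = "<NameID>john<!-- injected -->doe</NameID>"


def digest(xml_text: str) -> str:
    """Hash the comment-free canonical form, as a signature digest might."""
    canonical = canonicalize(xml_text, with_comments=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The comment is dropped during canonicalization, so both digests match.
print(digest(original) == digest(tampered))
```

Because both inputs canonicalize to the same bytes, a signature over that digest would still verify after the comment is inserted.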
Dave Bittner: [00:04:51] So, let's dig into some of what's going on under the hood here, why you're able to make this work. So, take us through some of the details.
Kelby Ludwig: [00:05:00] Yeah, so the way this ends up working is these SAML implementations have made a false assumption about what the DOM would look like once they're processing this data. And so, the DOM is, basically, you can think of it like a tree. What these implementations expect is that your username in this tree should have just one branch. And so, what it ends up doing, in some cases, is taking the first bit of information from this branch and using that as your username. But, in reality, these comments, when you add them, actually split the tree out into multiple branches. And so, you end up only taking out part of the username that you're using to authenticate these users.
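The tree splitting described here can be reproduced with a generic XML DOM parser. The following is a minimal sketch, not the code of any affected library: the element name `NameID` and the username "johndoe" are assumptions chosen to match the John Doe example, and `xml.dom.minidom` is used only because it ships with Python.

```python
from xml.dom import minidom

# Hypothetical identifier element with a comment inserted mid-username.
doc = minidom.parseString("<NameID>john<!-- injected -->doe</NameID>")
name_id = doc.documentElement

# The comment splits the element's content into three child nodes:
# a text node, a comment node, and a second text node.
node_kinds = [child.nodeName for child in name_id.childNodes]
print(node_kinds)

# A naive extraction that assumes a single text child reads only the
# first branch, so "johndoe" is truncated to "john".
naive_username = name_id.firstChild.data
print(naive_username)
```

Without the comment, the same element would have a single text child holding the full username, which is the shape these implementations appear to have assumed.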
Dave Bittner: [00:05:44] And this is all within XML, correct?
Kelby Ludwig: [00:05:47] This is all within XML, yeah. This is not technically wrong, it's just kind of unintuitive. You would have to know, basically, that a comment won't invalidate a SAML assertion signature, but it will split this tree into multiple branches. So, what's also interesting about this vulnerability is it's not exclusive to truncating towards the end of the text. So, earlier I gave the example of John Doe being truncated to John. So, there are possible variants, and someone has self-reported a variant to us, of something like, not a truncation, but actually extracting the text after the comment. So, in the case of, you know, John Doe becoming John, you could also have a variant of this vulnerability where, instead of adding a comment and getting "John," you could add a comment and get "Doe."
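Both variants, and one possible way to avoid them, can be sketched side by side. The helper names `first_text`, `last_text`, and `full_text` are hypothetical and exist only for this illustration, as is the `NameID` element; joining every text node is one plausible approach under these assumptions, not a description of how any particular library was fixed.

```python
from xml.dom import Node, minidom


def first_text(element):
    """Variant 1: read only the first child, truncating at the comment."""
    return element.firstChild.data


def last_text(element):
    """Variant 2: read only the last child, keeping the text after the comment."""
    return element.lastChild.data


def full_text(element):
    """Join every text child so a comment cannot change the extracted value."""
    text_types = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(
        child.data for child in element.childNodes if child.nodeType in text_types
    )


# Hypothetical tampered identifier, following the John Doe example.
tampered = minidom.parseString("<NameID>john<!-- injected -->doe</NameID>")
name_id = tampered.documentElement

print(first_text(name_id))  # the "John" variant
print(last_text(name_id))   # the "Doe" variant
print(full_text(name_id))   # the full identifier that was signed
```

The third helper returns the same value with or without the comment, which matches what the comment-free signed form represents.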
Dave Bittner: [00:06:42] I see. So, it's one of those situations where it's not necessarily a bug, it's more a behavior, albeit an unexpected one.
Kelby Ludwig: [00:06:50] Exactly. It would be like saying SQL injection is a bug class, and you could have SQL injection affect a different service in a completely different way. So everyone has their own instance of SQL injection if they're affected, but the general idea is the same.
Dave Bittner: [00:07:09] So help me understand how this concept of the tree works, and what the real world effects would be of being able to get, I guess, essentially two different answers depending on how the question was asked. Is that a good way to say it?
Kelby Ludwig: [00:07:23] Yeah, that's a great way to say it. So, for SAML to work, you need to have a shared understanding of what a user identifier is. So this could be something like an email, this could be something like a username, it could be a set of information. But ultimately what you want is that the two parties that are sending these SAML messages should have a shared understanding of what a user is identified by. And so, when you're inserting these comments and changing the tree, and then certain things are getting extracted in weird ways, you're ultimately changing what the understanding of the tree is to both parties, which gives you some leeway in who you authenticate as.
Dave Bittner: [00:08:04] Now, is there a non-bad guy reason to insert comments like this?
Kelby Ludwig: [00:08:11] You know, possibly. In practice, I haven't seen this, so I assume that there is a valid reason because, you know, it is something that is standardized, so there could be a very practical reason why someone does want to include comments as part of these messages, but, in practice, I've never seen it done, yeah.
Dave Bittner: [00:08:34] Yeah. So just, again, for my clarification, can you really lay it out basically for us here, how, in the real world, how someone would exploit this?
Kelby Ludwig: [00:08:45] Yeah, so that's where it gets really interesting, is that this shared language, SAML, allows all the other parties participating to kind of have some flexibility in how they handle things. Because, ultimately, what they're doing is relying on the shared language to convey something like a username. The way that someone might exploit this is by gaining access to someone's identity provider. So, in this single sign-on context, you log into one application which grants you access to multiple applications. An attacker who has access to the identity provider could be, say, someone who has legitimate access, like an insider threat, or someone who has phished valid credentials for a user.
Kelby Ludwig: [00:09:29] So once you have that level of access, you now have some leeway in what applications you can access and what username you can use to log into those applications. So, say you log into your identity provider, and that grants you access to an internal chat application. Now you can take these SAML messages and tamper with them, under the context of that user, and possibly gain access to a different user that has a similar username.
Dave Bittner: [00:09:58] Now, have there been any reports of anyone using this out in the wild?
Kelby Ludwig: [00:10:02] No, we've seen no evidence of this being discussed before, or exploited before.
Dave Bittner: [00:10:08] And how about protecting yourself against this, what are your recommendations there?
Kelby Ludwig: [00:10:13] So I think our recommendation here would be contacting SAML service providers that you may use within your organization, and just asking them if they're familiar with the vulnerability, what they've done to protect against this, or if they've already confirmed that they're not affected, just kind of getting that information from them. Because it's fairly difficult to tell, unless you're, like, a penetration tester or something like that, it may be difficult to tell if you're affected from a black box perspective. So, we just think it's best to just contact the people that have engineered these systems and see if they have a response for what they've done about this vulnerability class.
Dave Bittner: [00:10:51] And is there anything to be done, sort of on the other side of it, the way that things are parsed? Are there updates or corrections to be done from that end, or is it, again, like we said at the beginning, just more a way that something functions, rather than a true bug?
Kelby Ludwig: [00:11:06] Yeah, so I think there's clearly some level of unintuitiveness to what has happened here, because, like, this isn't a mistake that one person made, this is a mistake that multiple people made independently of one another. So there's definitely, like, some level of unintuitiveness, or a lack of clarity around what should be done in this scenario. So this could be something that, say, like, a SAML specification update could address, so that maybe people that are implementing these specifications have a consideration for how to handle this particular situation.
Dave Bittner: [00:11:41] Yeah, it's an interesting one. It's sort of, um, it's almost like a translation sort of thing, you know, I imagine if two people are speaking a different language you really need to depend on your interpreter, and two different interpreters might have subtle nuances in the way they interpret things, and this seems to me to be one of those situations where a nuanced interpretation can make all the difference.
Kelby Ludwig: [00:12:04] Yeah, absolutely. It's kind of interesting. One of the things that made it click for us that, hey, multiple people could be affected, was that once I found this initial bug, I started writing up possible examples that I could find in other XML libraries that exhibited this behavior. And one of the ones I wrote was not technically affected, I just wrote the code wrong. And so, it turned out, I, in writing a proof of concept to describe to other people, ended up making the same exact mistake, which really made it click for us, like, oh, this person made this mistake, I just made this mistake knowing it's a thing, like, this could be much larger than this single instance.
Dave Bittner: [00:12:51] Yeah, yeah. It's an easy mistake to make.
Kelby Ludwig: [00:12:54] I think the most important thing is that contacting your SAML service provider is probably the best option that an organization can take in the face of, you know, looking to remediate this vulnerability. Unless you have built a SAML service provider yourself, it's kind of hard to get to that level of intimate detail from, like, say, a black box perspective. So, we strongly suggest just contacting SAML service providers that you may use in your organization, or maybe in a different context, and just asking if they have been affected by this vulnerability, or if they've patched it.
Dave Bittner: [00:13:27] Now, just from a community point of view, you discovered this sort of thing, is this the sort of thing where you go out to the other SAML service providers and spread the word and say, hey everybody, heads up, we found something here?
Kelby Ludwig: [00:13:38] For our disclosure process, we went through CERT/CC. So, once we identified that this vulnerability may affect other implementations, one of the things that we did was look at open-source SAML libraries that we could find on, say, GitHub, and see if we could replicate this across those libraries. And so, once we did (that was all local to my computer), once we had some, like, local results that suggested that this may be more widespread than just a single instance, that's when we went to CERT/CC to actually contact other SAML service providers, like, maybe ones that have cloud services, to disclose the general concept and to determine if they were affected before we published our results.
Dave Bittner: [00:14:22] It's interesting how nuanced it is, and it must have been fun for you to kind of have those "aha" moments when you go, wait a minute, is this, is what's happening what I really think is happening? You know, I think, if you're like me, those kinds of moments can be really kind of fun, I think.
Kelby Ludwig: [00:14:37] The best ideas and experiences always start with, hmm, that's a little weird.
Dave Bittner: [00:14:42] (laughs) Right, yes, exactly. I wonder what would happen if I did this.
Kelby Ludwig: [00:14:47] (laughs)
Dave Bittner: [00:14:52] Our thanks to Kelby Ludwig for joining us. You can find the complete report on the SAML vulnerabilities on the Duo website. It's in their blog section.
Dave Bittner: [00:15:01] Thanks to the Hewlett Foundation's Cyber Initiative for sponsoring our show. You can learn more about them at hewlett.org/cyber.
Dave Bittner: [00:15:09] The CyberWire Research Saturday is proudly produced in Maryland, out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technology. It's produced by Pratt Street Media. The coordinating producer is Jennifer Eiben, editor is John Petrik, technical editor is Chris Russell, executive editor is Peter Kilpe, and I'm Dave Bittner. Thanks for listening.