Research Saturday 6.29.19
Ep 92 | 6.29.19

Giving everyone a stake in the success of Open Source implementation.

Transcript

Dave Bittner: [00:00:03] Hello everyone, and welcome to the CyberWire's Research Saturday, presented by Juniper Networks. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities, and solving some of the hard problems of protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.

Dave Bittner: [00:00:26] And now a word about our sponsor, Juniper Networks. Organizations are constantly evolving and increasingly turning to multicloud to transform IT. Juniper's connected security gives organizations the ability to safeguard users, applications, and infrastructure by extending security to all points of connection across the network. Helping defend you against advanced threats, Juniper's connected security is also open, so you can build on the security solutions and infrastructure you already have. Secure your entire business, from your endpoints to your edge, and every cloud in between, with Juniper's connected security. Connect with Juniper on Twitter or Facebook. And we thank Juniper for making it possible to bring you Research Saturday.

Dave Bittner: [00:01:13] And thanks also to our sponsor, Enveil, whose revolutionary ZeroReveal solution closes the last gap in data security: protecting data in use. It's the industry's first and only scalable commercial solution enabling data to remain encrypted throughout the entire processing lifecycle. Imagine being able to analyze, search, and perform calculations on sensitive data, all without ever decrypting anything - all without the risks of theft or inadvertent exposure. What was once only theoretical is now possible with Enveil. Learn more at enveil.com.

Tim Mackey: [00:01:53] So, this is actually our fourth edition of the report.

Dave Bittner: [00:01:55] That's Tim Mackey. He's principal security strategist within the Synopsys Cyber Research Center. The research we're discussing today is titled, "2019 Open Source Security and Risk Analysis report."

Tim Mackey: [00:02:07] This comes from an initiative that we had within the Black Duck Software community. Synopsys acquired Black Duck Software in December of 2017. And so the research work, when it was Black Duck, was known as the COSRI report, or the Center for Open Source Research and Innovation. And we brought that forward into Synopsys, and this is the second incarnation of this under the Synopsys branding.

Tim Mackey: [00:02:29] The research itself is looking at one aspect of Black Duck business, which is all about doing audits of commercial software code bases, typically in either a merger and acquisition scenario or some new VC funding round, basically as part of a tech due diligence. So, we're looking at actual real code, real applications, real libraries as opposed to, say, doing a survey.

Dave Bittner: [00:02:54] I see. So, there's some really interesting data here in the report. Let's just start off with sort of an overview. Can you give us the lay of the land? Where are we when it comes to the prevalence of open source software in code these days?

Tim Mackey: [00:03:08] The easy statement for that is open source development's where it's at. We typically see that the majority of software components making up a commercial application are open source in nature, and if you look at how development teams have evolved, say, over the last five or ten years, this kind of makes sense. We have a preponderance of libraries and options and frameworks and runtimes that enable development teams to create their unique functionality feature set offering, without necessarily having to be stuck with, hey, if I don't have the expertise in-house, I'm really going to struggle. That expertise can be anywhere in the world, and that's one of the key values that open source development brings to modern applications.

Dave Bittner: [00:03:53] So, you have sort of a tried-and-true component of functionality that has built a good reputation for itself over the years, and a development team can basically take that off the shelf and plug that functionality into whatever they're developing?

Tim Mackey: [00:04:05] True. And within the report, we saw that, of the code that we analyzed, ninety-six percent of it contained at least one open source component, and that on average about sixty percent of the code was open source in nature. And that was independent of industry.

Dave Bittner: [00:04:25] Take us through - what are some of the most prevalent places where open source is being used?

Tim Mackey: [00:04:30] It really has no meaningful locus. So, for example, that could be IoT development, that could be new mobile applications, that could be cybersecurity, that could be heavy industry. They're all using some level of open source componentry in order to build their systems. And if we look at how an application stack is created, it kind of makes sense. Maybe you're deploying on top of Linux, or you have containerized applications or bringing Docker into the mix. You might have common runtimes like Java or .NET, all of which are open source. So, you're bringing in open source technologies as part of your overall solution delivery, and that's kind of a good thing.

Dave Bittner: [00:05:10] What are the most common components that you're seeing in use?

Tim Mackey: [00:05:14] The most common component last year was jQuery. So, we were seeing a fair number of applications that had a web-based front end to them. And as you would kind of expect, there's an awful lot of JavaScript in a modern web-based application so jQuery kind of topped the list.

Dave Bittner: [00:05:30] Hmm. Any other notable components that you see there that come up a lot?

Tim Mackey: [00:05:35] It really does vary. So, we see things like Font Awesome coming up - that was actually the third most common component that came up. But it's clear across the board, so if someone's gone down the Node path, we're going to see an awful lot of Node. If they've gone down the Angular path, we're going to see an awful lot of Angular. And that's true also on the server back end, where Java and .NET, Golang, all of the capabilities that you would expect out of those languages are represented in an open source form in these applications.

Dave Bittner: [00:06:02] I see. So, let's dig into some of the security issues here. Again, give us an overview - what are the vulnerabilities that we're talking about?

Tim Mackey: [00:06:10] We saw quite a spectrum in terms of the vulnerabilities and the patch state, and I think the patch state's really a key thing to focus in on. One of the vulnerabilities that we saw last year - and this is right now top of the leaderboard, we've never seen it quite this striking - is a vulnerability that was in FreeBSD. And so, this particular application was using a very old version of FreeBSD that had a vulnerability that was disclosed in May of 1990... 

Dave Bittner: [00:06:36] Wow.

Tim Mackey: [00:06:34] ...Or the way we put it, probably it is older than some of the developers working on modern code.

Dave Bittner: [00:06:42] (Laughs) Right. Right.

Tim Mackey: [00:06:44] (Laughs) And so, we looked into how this could be, and one of the things we came out with was this was an application that just fundamentally met its requirements, and no one saw any reason to deviate from this until they brought in a company like us to go and assess the software and say, well, what exactly are the, quote unquote, smoking guns that might be present here, and what do we need to do to move forward. What we saw was, it's working, why do we need to change it?

Dave Bittner: [00:07:11] Yeah, that's really fascinating, because, I mean, I can see sort of if it's not broken don't fix it sort of thing. If everything's working the way it's designed, and you're also - I would imagine you're not getting complaints from the users about functionality problems as well.

Tim Mackey: [00:07:26] Correct. And this actually ends up manifesting itself in a different aspect of patching when it comes to open source components, and that is - there's no one vendor. There's no quote vendor known as "open source" where you can just go and get all your patches from. Your patch has to match wherever you obtained your code from. So, the easy example is, if I have a patch for OpenSSL, I could have a patch that comes from upstream. I could have a patch that comes from, say, Canonical. I could have a patch that comes from Red Hat. If I apply the wrong patch, I could change the behavior of OpenSSL in ways that I don't expect, and that could be a really, really bad thing. So I have to know not only that I have to patch something, but where to get the correct patch from.
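
Editor's sketch: the point about matching a patch to where the code came from can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any real tool. The `Component` and `Patch` types, the `origin` labels, and the patch identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    origin: str  # where the code was obtained (hypothetical labels)


@dataclass(frozen=True)
class Patch:
    component: str
    origin: str      # who published this patch
    identifier: str  # placeholder name, not a real release


def matching_patches(component, patches):
    """Keep only patches published by the source the component came from."""
    return [
        p for p in patches
        if p.component == component.name and p.origin == component.origin
    ]


# Hypothetical example: OpenSSL obtained from a Red Hat distribution.
installed = Component(name="openssl", origin="redhat")
available = [
    Patch("openssl", "upstream", "patch-from-upstream"),
    Patch("openssl", "canonical", "patch-from-canonical"),
    Patch("openssl", "redhat", "patch-from-redhat"),
]

for patch in matching_patches(installed, available):
    print(patch.identifier)
```

The sketch only works if the origin of each component was recorded when it was brought in, which is the inventory question that comes up later in the conversation.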

Dave Bittner: [00:08:12] If I have a piece of software that's working fine, and there's been multiple versions over the years, that the parts that are working just fine, it's unlikely that I'm going to go back and check the parts that haven't changed - there's been no functional change since the last version - is it likely that I'm not going to go back and check to see if that open source component has had any updates or patches?

Tim Mackey: [00:08:37] That's actually a very common scenario. What we see development teams doing - and when you step back, this makes perfect sense that they would do this - is here's a component that meets my requirements. I don't want to run the risk of, say, the component not being available from where I downloaded it from, so I'm going to bring it in-house and I'm going to cache it in some form of binary repository. This is awesome, because it enables a very consistent build environment. That application is going to come out exactly the same way every time. Over time, there might be security disclosures of one form or another against that component in its specific version. If I don't have some process to go and keep it up to date, I'm now going to get progressively out of date. And when the time comes to actually update it - it might be six months, it might be a year, it might be two years later - the delta in functionality can pose some significant tax on the organization when they go and apply that new patch and suddenly there's some behavioral change or configuration change or so forth.

Tim Mackey: [00:09:33] And that's why one of the big things that we saw as an alarm is that eighty-five percent of the code bases contained a component that was more than four years out of date from whatever the current version is, or had absolutely no development within the last two years. And so, it's that level of awareness that teams really need to have, is, am I getting stale? Am I getting out of date? What's the, quote unquote, operational risk that's going to be associated with updating to the new version when I finally get around to it?
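
Editor's sketch: the two staleness signals mentioned here (more than four years behind the current version, or no development in two years) could be checked against a component inventory roughly as follows. The record fields, the component names, and the dates are hypothetical, and measuring "four years out of date" as the gap between two release dates is an assumption about how one might apply the report's criterion.

```python
from dataclasses import dataclass
from datetime import date

FOUR_YEARS_DAYS = 4 * 365
TWO_YEARS_DAYS = 2 * 365


@dataclass(frozen=True)
class ComponentRecord:
    name: str
    used_release_date: date    # release date of the version in the code base
    latest_release_date: date  # release date of the newest available version
    last_activity_date: date   # most recent upstream development activity


def staleness_flags(record, today):
    """Return the staleness signals that apply to one component."""
    flags = []
    if (record.latest_release_date - record.used_release_date).days > FOUR_YEARS_DAYS:
        flags.append("more than four years behind the current version")
    if (today - record.last_activity_date).days > TWO_YEARS_DAYS:
        flags.append("no development in the last two years")
    return flags


# Hypothetical inventory, evaluated as of the episode date.
today = date(2019, 6, 29)
lib_a = ComponentRecord("lib-a", date(2013, 1, 1), date(2019, 1, 1), date(2019, 5, 1))
lib_b = ComponentRecord("lib-b", date(2016, 3, 1), date(2016, 3, 1), date(2016, 3, 1))
lib_c = ComponentRecord("lib-c", date(2018, 10, 1), date(2019, 4, 1), date(2019, 6, 1))

for record in (lib_a, lib_b, lib_c):
    print(record.name, staleness_flags(record, today))
```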

Dave Bittner: [00:10:02] And I suppose, I mean, there's an assumption that if we're bringing this in-house, then are we relying on our own team to keep it up to date in a way - does that make sense?

Tim Mackey: [00:10:12] It does. And so, it effectively becomes a question of if you're bringing it in-house, what is the procedure and process that you're going to run through in order to keep things, quote, secure and current. And that might mean that, hey, I'm going to maintain an independent fork because that makes sense for my organization, and there's a conscious decision behind it and there's humans with competencies in order to do that maintenance, or I'm going to build a process that has, for example, an engagement with that community to be aware of when they release new updates, when they release new versions, how they communicate their patch and security information. That needs to just be part of the overall, how do I responsibly consume open source software?

Dave Bittner: [00:10:52] So, let's go through some of the vulnerabilities that you saw. I mean, there were some that popped up over and over again. What were some of the ones that you kept seeing there?

Tim Mackey: [00:11:02] The one that really popped up over and over was associated with Jackson Databind. And there were three vulnerabilities that were part of this puzzle: CVE-2018-7489, CVE-2017-7525, and CVE-2017-15095. They all fundamentally had the same root scenario. And for the people who don't know what the databind is all about, it really provides a serialization/de-serialization capability to bind data into Java objects, so that people working with Java can just use that data as if it was a member variable off of that object. And so, what was at the root of this is that some class types could have a polymorphic or a dynamic binding model associated with them.

Tim Mackey: [00:11:44] And so, the first attempt to fix this was, oh, we've stumbled across one of these classes and it's something we really shouldn't be touching, let's just go put a simple if statement in there that says we're going to ensure we don't touch this. The second attempt was, oh, there's a couple more so let's go put a case statement in there that says if you're in this set. And the third attempt was, well, you know what, we really need to have a different approach, because if there are yet more, we're going to end up in some serious trouble. So, you effectively had three separate attempts to patch this, and what we saw was that not everyone had actually moved on to what the final set of patches were, and there were a number of people who were on the intermediate steps. And it may have worked for them, but they now have a latent risk if the developer at some point in time in the future goes and says, aha, I want to go and do this, now you expose yourself to that unrefactored code, if you will.
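
Editor's sketch: the progression described here, from a single check to a set of blocked names to a different approach, can be shown conceptually in Python. This is not the Jackson Databind code, and the conversation does not say what the third approach actually was. An allowlist is used below purely as one example of a structurally different approach. All type names are hypothetical.

```python
# Attempt 1: a single "if" for the one dangerous type found so far.
def allowed_v1(type_name):
    if type_name == "danger.GadgetA":
        return False
    return True


# Attempt 2: a set of known-dangerous names, which has to keep growing.
DENYLIST = {"danger.GadgetA", "danger.GadgetB", "danger.GadgetC"}


def allowed_v2(type_name):
    return type_name not in DENYLIST


# A different approach (assumed here for illustration): bind only to
# types the application explicitly expects.
ALLOWLIST = {"app.Order", "app.Customer"}


def allowed_v3(type_name):
    return type_name in ALLOWLIST


# A dangerous type nobody has catalogued yet slips past both denylists.
newly_found = "danger.GadgetD"
for check in (allowed_v1, allowed_v2, allowed_v3):
    print(check.__name__, check(newly_found))
```

The design difference is that the first two versions fail open when a new dangerous type appears, while the third fails closed. That gap is the latent risk for anyone still on an intermediate patch.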

Dave Bittner: [00:12:36] Yeah, I mean, that's an interesting point as well, because I can imagine an inquiry in-house, so someone saying, hey, have we patched this bit of code? And someone can do a quick check and say, yes, we have patched it, but it might not be the most recent patch.

Tim Mackey: [00:12:50] Correct. And if we put our developer hats on and forget about the whole open source angle, what we're effectively saying is, in our own development teams, we've probably had a situation where we've attempted to fix a bug and it didn't necessarily work out correctly the first try. So we went and we came up with a different avenue of attack. That's exactly what happened here, except it's an open source, freely downloadable version - there's no vendor control where they can go and push that update out. So, the onus very much is on the consumer of this component to go and ensure they're up to date.

Dave Bittner: [00:13:22] One of the things that you all looked into here were license risks, and how different components have different licenses attached to them. Take us through what you discovered here.

Tim Mackey: [00:13:33] One of the key things that we look for is anything associated with license conflicts or challenges to the intellectual property. It's part and parcel of what we're trying to do from a tech due diligence perspective. The canonical example is let's assume that someone has some GPLv3 code, but they're trying to release their project under an Apache license. That's going to create some challenges for them. And so, looking at the license is definitely high on the list of tech due diligence and equally important to looking at what the security state of the situation is. So, what we found was that sixty-eight percent of the code bases had some form of a license conflict. Sixty-one percent contained some form of a GPL conflict.

Tim Mackey: [00:14:15] And those are relatively straightforward things to work through. The more alarming scenario was that thirty-two percent contained some form of custom license that would need a legal review in order to interpret it. That someone had taken a standard license and modified it in some way, added some clause into it, wrote their own version of a license and said, gee whiz, this is open source as long as you go into these set of things, but it's not a standard example that might be endorsed under SPDX or under the OSI model.

Tim Mackey: [00:14:44] But even worse, thirty-eight percent of the components we saw had no identifiable license associated with them. Which means that, who owns that code and what are the rights and obligations that are granted? So, as we go through, the core thing that we want to call out on the license side of things is make certain that you can actually identify where you got the code and what the rights are, so that you can fulfill any obligations that are associated with that license.
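
Editor's sketch: the license categories discussed here (standard licenses, custom licenses that need legal review, and components with no identifiable license) could be triaged with something like the following. The approved set is a hypothetical company policy, and `KNOWN_STANDARD` is a tiny stand-in for a full list of standard license identifiers rather than the real SPDX list.

```python
from typing import Optional

# Hypothetical policy: licenses developers may use without asking.
APPROVED = {"MIT", "Apache-2.0"}

# Small stand-in for a complete list of standard license identifiers.
KNOWN_STANDARD = APPROVED | {"GPL-3.0-only", "BSD-3-Clause"}


def classify_license(license_id: Optional[str]) -> str:
    """Sort a component's declared license into a review category."""
    if not license_id:
        return "no identifiable license"
    if license_id in APPROVED:
        return "approved"
    if license_id in KNOWN_STANDARD:
        return "standard, check for conflicts"
    return "custom, needs legal review"


# Hypothetical components and their declared licenses.
declared = {
    "lib-a": "MIT",
    "lib-b": "GPL-3.0-only",
    "lib-c": "Acme-Modified-License",
    "lib-d": None,
}

for name, license_id in declared.items():
    print(name, "->", classify_license(license_id))
```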

Dave Bittner: [00:15:12] Yeah, I would imagine that could create a real remediation headache, if you find something like that in your code, and now the folks at legal have to go digging around to figure out what our situation here is.

Tim Mackey: [00:15:23] Correct, and what a lot of companies try to do in order to avoid this situation, is they will say, you can use anything that, say, has an MIT license or an Apache license, or we want everything to be GPL - if it's not GPL, we don't want it. They'll pick one of the very standard, understandable, recognizable, tested licenses and say developers can run with those. But there's always going to be some exception someplace where a component fits exactly what the requirements are, but it has a license that's a little bit off. And so, one of the big things that I personally advocate for is that development teams make friends with their lawyers...

Dave Bittner: [00:16:00] Hmm.

Tim Mackey: [00:16:00] ...Take them out to lunch, hang out with them a little bit. It's like - it seems so goofy, but at some point in time, that legal team is going to need to be there for you. They should at least know that you're on, quote, the good guys' side of camp and you're not trying to do anything and bend any rules, you just are legitimately trying to do the right thing for the company. And when you have that relationship, it's a whole lot easier to go and have a conversation and say, look, I did this, how do we get ourselves unstuck? 

Dave Bittner: [00:16:29] Yeah, building up that relationship ahead of time, rather than when everyone's in a bit of a scrambling mode, I suppose.

Tim Mackey: [00:16:36] Exactly. It's like, when you're in crisis, the default is, how do we get ourselves out of here as quickly as possible? And if there's no relationship there, it's not going to make matters any better. But if there is a relationship, at least you know how the person's thinking about certain things in advance.

Dave Bittner: [00:16:52] So, let's walk through some of the recommendations here. What tips do you have for folks who are out there making use of these open source bits of code?

Tim Mackey: [00:16:59] So, the first thing I definitely want to call out is that, at the beginning, we said open source is kind of the way the world's at. We saw a sixteen percent increase in the number of open source components in the code bases we were looking at. And despite all of the license side of things, we found that the twenty most popular open source licenses covered ninety-eight percent of the code in place. So, it really is a case of people are doing the right kinds of things.

Tim Mackey: [00:17:23] However, if I want to move to action items, I need to recognize the first rule is you can't patch what you don't know you have. And so, no matter what kind of tooling and process that you put in place, you have to understand that you have that component in place and where you got it from. So, you have to have some form of inventory discovery tooling in place to solve that, because eventually there is going to be a patch. Eventually there's going to be something that needs to be updated. And if you don't have that process in place, you're kind of stuck. And as a result, looking for a vendor known as "Open Source" isn't going to necessarily help things.
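
Editor's sketch: "you can't patch what you don't know you have" comes down to joining an inventory against published advisories. A minimal version, assuming a hypothetical inventory format and made-up advisory data, might look like this. Real tooling would match version ranges rather than exact version strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItem:
    name: str
    version: str
    obtained_from: str  # needed later to pick the right patch source


@dataclass(frozen=True)
class Advisory:
    advisory_id: str  # placeholder identifier, not a real CVE
    component: str
    affected_versions: frozenset


def affected_items(inventory, advisories):
    """Pair each advisory with the inventory items it applies to."""
    hits = []
    for advisory in advisories:
        for item in inventory:
            if (item.name == advisory.component
                    and item.version in advisory.affected_versions):
                hits.append((advisory.advisory_id, item))
    return hits


# Hypothetical data for illustration only.
inventory = [
    InventoryItem("example-parser", "1.2.0", "upstream"),
    InventoryItem("example-http", "3.0.1", "distro"),
]
advisories = [
    Advisory("EXAMPLE-0001", "example-parser", frozenset({"1.1.0", "1.2.0"})),
    Advisory("EXAMPLE-0002", "example-crypto", frozenset({"0.9.0"})),
]

for advisory_id, item in affected_items(inventory, advisories):
    print(advisory_id, item.name, item.version, item.obtained_from)
```

A component that is in the product but missing from the inventory never produces a hit, however serious the advisory, which is why discovery has to come first.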

Dave Bittner: [00:18:01] Hmm. How about having an audit done? Is that something that folks should have routinely?

Tim Mackey: [00:18:06] I would say having an audit done periodically for a major event is a really good thing. Having an audit done, say, when you're going to release the first version of your product, or a major refactor of your product, would be a very good thing. But by the same token, there is tooling available that you can bake into your SDLC, so that you can ensure that, say, a non-compliant license isn't introduced at the outset, that you have continuous monitoring in place for new vulnerability disclosures against whatever your application looks like. So that maybe you're about to ship tomorrow, and all of a sudden there is something that's really hairy and audacious that comes down the pipe today, maybe you're able to fix that in time but you have to push the update out till Monday. Okay, that's fine. Knowing that in advance and having that level of awareness is also key.

Dave Bittner: [00:18:58] And I think, you know, one of the points that's made throughout this research here is that it's not the open source software itself that's necessarily risky - it's how you implement it. It's how you go about using it.

Tim Mackey: [00:19:11] Exactly. And one of the key things that I advocate for everyone to do is try and identify what your most critical components are. That might be a framework like Node.js or Angular. That might be a database like, say, a Mongo or an Elastic. That might be a delivery paradigm like, say, Kubernetes or Docker. Find out what your top ten, fifteen most critical components are in your environment, and then engage with those communities. Find out how they work, where they work, do they have meetups? How are they discussing their future direction? And be an active participant, not just on the consumption side, but on the steering of the future, because some of these components - they only have a handful of developers, and they have a backlog of activity, so if you have development energy that could go and solve a problem, they're probably really willing to accept it. And if you're in that community, you're also feeling more engaged with the future direction of everything.

Tim Mackey: [00:20:07] The only other thing that I'd probably call out is that, when adopting open source, you probably want to make certain that you have a robust strategy for its consumption. And that would cover all of the things that we've been talking about, but it would also make certain that the development teams, and the legal teams, and various software architects, and so forth are actively engaged in that process, so that it's a standard within the culture of the development team in an organization, as opposed to something that's, say, lawyers saying we must do this, or an executive saying we must do this, so that everyone has a stake in the success.

Dave Bittner: [00:20:47] Our thanks to Tim Mackey from Synopsys for joining us. We were discussing the 2019 Open Source Security and Risk Analysis report. We'll have a link in the show notes.

Dave Bittner: [00:20:57] Thanks to Juniper Networks for sponsoring our show. You can learn more at juniper.net/security, or connect with them on Twitter or Facebook.

Dave Bittner: [00:21:07] And thanks to Enveil for their sponsorship. You can find out how they're closing the last gap in data security at enveil.com.

Dave Bittner: [00:21:15] The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technology. The coordinating producer is Jennifer Eiben. Our CyberWire editor is John Petrik. Technical Editor, Chris Russell. Our staff writer is Tim Nodar. Executive Editor, Peter Kilpe. And I'm Dave Bittner. Thanks for listening.