DevSecOps and securing the container.
Rick Howard: Hey everyone, and welcome to CyberWire-X, a series of specials where we highlight important security topics affecting security professionals worldwide. I'm Rick Howard, the chief security officer, chief analyst and senior fellow at the CyberWire. And today's episode is titled "DevSecOps and Securing the Container."
Rick Howard: Now, we all know that the move to the cloud had great potential to improve security, but the required process and cultural changes can be daunting. There are a vast number of critical vulnerabilities that make it into production and demand more effective mitigations. Although shifting security left should help, organizations are not able to achieve this quickly enough, and shifting left does not account for runtime threats. Organizations must strive to improve the prioritization of vulnerabilities to ensure the most dangerous flaws are fixed early. But even then, some risk will be accepted, and a threat detection and response program is required for full security coverage. So on this show, we will be discussing how to secure your software development lifecycle, how to use a maturity model like the Building Security In Maturity Model, or BSIMM, where containers fit in that process, and the Sysdig 2022 Cloud-Native Security and Usage Report.
Rick Howard: A programming note - each CyberWire-X special features two segments. In the first part of the show, we will hear from industry experts on the topic at hand, and in the second part, we will hear from our show's sponsor for their point of view. And since I brought it up, here's a word from today's sponsor, Sysdig.
Rick Howard: I'm joined by Tom Quinn. He's the T. Rowe Price CISO and regular guest here at the CyberWire Hash Table. Tom, welcome. It's always great to have you come on the show and explain stuff to us.
Tom Quinn: Rick, I'm happy to be here. Thanks for having me.
Rick Howard: So you've been doing cybersecurity in the financial vertical for most of your career. You've worked at State Street, BNY Mellon, JPMorgan Chase, and you've been at T. Rowe Price now for over six years. So other than not being able to hold down a steady job - (laughter) right? - you have tons of valuable experience securing big organizations. I'm interested in whether or not software security best practices have moved into the CISO's realm of responsibility. In other words, does the buck stop at the CISO for securing your company's software development life cycle, or is it more of a shared responsibility where you guys work?
Tom Quinn: I'll take a - maybe a bit of a historical perspective, a pre-cloud and then a post-cloud perspective on this. So I think for pre-cloud companies, software security usually was the realm of the security team. Security teams would install static code analysis capabilities. They would do pen tests on code, dynamic testing on code - they may even do red team testing and the like - and would come up with best practices for, you know, creating environment for an application to be in, as well. Post-cloud, and even modern environments, where you've got CI/CD pipelines and you have build trains and common ways to deploy, we're seeing more of the architecture and application engineering teams take more of a role.
Tom Quinn: Static code analysis, dynamic analysis, IAST may be an acronym people have heard - those things are getting embedded into the build trains themselves. And any defect that comes out of it is being tracked through the build trains and through the architecture practices that the firms have. And I think clouds, in general, have security capabilities and configurations by default and by design. And we're finding - right? - that cloud developers in many cases - and certainly cloud engineering and cloud architect and cloud operations people - have thought the control aspects of this through a bit, and I'm sure there's variability in there. But what I'm finding for more modern maybe organizations is - and certainly for cloud - they're building security in and others are participating.
Rick Howard: I guess what I'm getting at though is, so when those architecture guys are building the CI/CD pipeline, are they figuring out the security angles themselves or does the CISO own that stuff? Like, are we checking with the latest patch levels or the latest versions of open source code? Whose job is it to make sure that's all secure?
Tom Quinn: I'll speak for my firm, and I'm proud of this. The architecture team is running the play, and the security team is participating. We have a group called developer services, and they're the ones that are creating this environment for the developers and with a goal of making it as easy as possible for us to build safe, secure and performant software as fast as possible. So in that case, the goal is really speed of deployment and removing - I think, the phrase, you know, I've heard regularly is toil - right? - removing toil from the process. So we've done a very good job of integrating security capabilities to the left, embedding it into the practice itself. And you're getting security for free. And, Rick, to your question, who's kind of running the show? It's the architecture teams, developer services team, engineering teams, and we're participating. And I think it's exactly what it should look like, by the way, as well.
Rick Howard: So, I mean, I get that the architecture teams build the pipeline and they build the infrastructure as code - stuff that we've been talking about for a number of years. But when your team looks at that and says, you know, we need to figure out how to pull SBOM information out of that pipeline now, is that you guys suggesting that and saying, that's a new feature we need? Or how does that come out?
Tom Quinn: It's a great question. We are providing requirements. And engineers and developers are building those capabilities to meet our requirements. So it's really unique. I find it like a breath of fresh air - right? - in many cases. But it's awesome where that is the case. And then there - not only is there the system, like CI/CD pipeline or whatever. But the system is being designed to meet a variety of needs, and security is just being one. Part of those needs are evidence and artifacts - right? - that auditors or external parties may want to see as well, and providing the transparency by design as well to it.
Rick Howard: So that's how T. Rowe Price is doing it. Do you have a sense of how the rest of the financials are doing it? Is that, like, standard practice? Or are you guys on your own here?
Tom Quinn: I won't name any other names, but I have peers of mine that have similar kinds of environments that have been built. Again, I think pre-cloud is going to look different than cloud or post-cloud. And again, I think also, depending on the environment, security teams may still be playing more of a traditional role - right? - where they're manually reviewing things that need to go into the cloud or applying those concepts to a cloud environment where it's critically important to ensure that you are applying modern cloud approaches. And you mentioned security as software, that you're embedding those kind of requirements into the code fabric itself. And that includes performance, resiliency, security controls and the like. And, you know, I think a lot of the cloud providers themselves are practicing that approach in the way they build software and certainly encouraging their people who are using their platforms to do the same thing. So - but I'm certainly aware that there are peers of mine that have similar approaches to building and deploying software.
Rick Howard: So that covers infrastructure as code. But the new kid on the block these days is containers. I think we all feel like the notion of containers have been around for a while and I guess they have. But I think we forget that it wasn't until 2013, less than 10 years ago, when Docker released an open source container management platform called dotCloud and established a partnership with Red Hat Linux, that it started to take off. Still, containers are just collections of software that we have to deploy. Are they part of the continuous delivery, continuous deployment pipeline? Or is securing containers a separate task somehow?
Tom Quinn: No, it is part of the pipeline. But I think, Rick, you've raised a very good point about containers - right? - is, why do we have them? And I think that's important, right? And in some cases, we have containers for management reasons. We have containers so that we don't get, maybe, lock-in for one cloud vendor or to another. Or people want the flexibility of deploying content or instances with maybe more control or more management to them. So our approach to containers has been the same thing as our approach to CI/CD pipelines, right? Where the pipelines are deploying to - it's OK if they're going to deploy to a container. It's just another code base - right? - within a code base...
Rick Howard: Right. It's just software, right? It's just software.
Tom Quinn: It is. But I think the thing is, is if you've not designed control and management and security and resiliency into that, what could happen is that you'll accelerate anarchy. And that's a concern that we have when we started looking at a variety of different container solutions is making sure - right? - that there's enough - I'll call it architecture - architecture, engineer and control - but that we've designed how we want them to be used, put constraints in what they can be used for as well.
Rick Howard: So in the mid-2000s, two open source models emerged for developing secure code. One is called BSIMM, the Building Security In Maturity Model, originally created by Gary McGraw but now sponsored by Synopsys. And this is not a prescriptive model. Their latest report, BSIMM12, is a survey of some 128 firms about what they actually do with their own internally developed software. The other is called SAMM, the Software Assurance Maturity Model. This one is a prescriptive model originally created by Pravir Chandra in 2009, but now managed by OWASP, the Open Web Application Security Project. And the question I have for you, Tom, is, are you using any of those models? Are they useful? Or are they just mostly academic?
Tom Quinn: Yeah. So at a previous firm, we were an early adopter for BSIMM. You know, I think Gary is, you know, a security luminary. And for folks that may not know Gary and some of the work that he's done, I would certainly encourage people to do a little research on the work that Gary has done on software - not only software security, but resiliency and good coding practices and quality assurance. But BSIMM, I've been a participant in BSIMM at three or four companies, including my current one. And what I found, I found them to be valuable. And one thing is, it's a yardstick to use to measure oneself against and to make determinations on whether or not there are improvements that could be made, and then having that yardstick be able to be comparing yourself against hundreds, right - you know, I think you said it was just below 150 or so - but hundreds of other firms is helpful, too, in understanding, again, where you're at and where you're going to. I find it valuable. I find it more than academic. And I think if you're using it as a diagnostic tool, it's a terrific thing.
Tom Quinn: I haven't used the other tool. I'm a big fan of OWASP and the work that they've done. But I can't comment on the other tool. But again, I think having a standard way to understand where you're at and where you're going to is helpful. And as long as you're willing to take the advice that comes out of that tool and make changes or drive change, then I think it's great. What I'm not sure of - right? - is, like, that you could publish your BSIMM score like a Moody's or an S&P rating and that it would be valuable to an outsider to compare and contrast. I'm not sure that that's an appropriate use for it. But certainly, as a way to measure yourself, where you're at and what you could do, I found it to be a useful diagnostic.
Rick Howard: I'm with you. I would never go to the board and say, you know, our competitors over there, they do these 10 things from the BSIMM model that we're not doing, and therefore, we should do it. I don't think that's the right way. But to be able to see your peers try these things, you know, and see how they do and say, well, maybe that is something I should try to get done because it looks like it's useful to them - I know that's a subtle distinction, but do you agree with me on that?
Tom Quinn: I am in alignment with your perspective. And I think in addition to that, it's also helpful to understand why. In some cases, people have chosen not to do best practices - maybe we'll call them - or good ideas because the risk model that they have isn't right. The maturity of the company or the software development practices aren't right. It's not the most important thing maybe they could improve upon or that they need to improve upon. And then in some cases, people are doing pretty high-end controls and validation and reconciliation because of the nature of the work they're doing. It's appropriate to put the extra or additional rigor, either in protection or testing in place. So the why really does matter as well. You may be perfectly fine where you're at for the purpose of what you're doing. And that's another aspect of the BSIMM - is to be a bit self-reflective.
Rick Howard: This is all good stuff, Tom, and it's really great to get your perspective. But we're going to have to leave it there. That's Tom Quinn, the CISO of T. Rowe Price. Tom, thanks for coming on the show.
Tom Quinn: You bet. Great to see you, Rick.
Rick Howard: Next up is my conversation with Anna Belak, the director of thought leadership at Sysdig.
Rick Howard: Anna, thanks for coming on the show.
Anna Belak: It is my pleasure. Thank you for having me.
Rick Howard: So let's talk some cloud basics. With everybody moving more and more stuff to the cloud, the permutations of how you do that seem exponential. We have cloud service providers like Google, Amazon, Microsoft and others providing virtual environments that we can run workloads in. We can run fully functional servers in those environments and connect them all with virtual networks. We can run containers in those environments. And we can run serverless functions in those environments. Most are running a combination of all three. So can you just give our listeners a little thumbnail sketch about the difference of those three things and why you would use one over the other?
Anna Belak: Well, I could probably write a whole book on that, actually. It's a big question.
Rick Howard: Maybe we should.
Anna Belak: So - maybe we should, yeah. So let's go super basic, right? So if we start at the highest-level abstraction, which is, like, function as a service, that's basically a little piece of application that just runs without the infrastructure because the infrastructure's abstracted, and the cloud provider provides all of that for you. So you can almost just write code and run that code, and everything else is handled for you. It's super nice for small things. It's a little difficult to use that as the only way to run applications because it removes a lot of the flexibility. So oftentimes, these things are used, as people say, glue to connect to other services or connect applications that are maybe more substantial. The applications that are more substantial can be run in many different ways. You can use containers. You don't have to use containers. If you do use containers, you will probably use something like an orchestration system, which the most popular one is Kubernetes. And what that will do is it will orchestrate - it will collect your containers, and it will put them on the correct piece of infrastructure. And it will also manage the relationships between them. So if a particular part of your application needs to scale because there's more demand from the client, Kubernetes will add more infrastructure to allow that to happen.
Anna Belak: And then the infrastructure itself, again, you could have more or less control. So, for example, if you were to use Kubernetes, you could use a Kubernetes-managed service like EKS, GKE or AKS. And that means that the cluster will be managed for you by the cloud service provider. You don't have to worry about setting up all the nitty-gritty bits of Kubernetes, which is quite complex, actually. Or you can roll your own on top of IaaS, which means you would provision the infrastructure-as-a-service instances, which are just virtual machines, and then you would overlay the cluster on top of that in any way that pleased you, which again, if you want more control, it's a reasonable way to go, but if you would like an easier time, the managed service is probably the option you would prefer.
Rick Howard: As I said, exponential ways to do these things, right? So it's all very complicated. But you guys, in January, published the fifth annual "Cloud-Native Security and Usage Report." And at a high level, it summarizes how Sysdig customers of all sizes and industries are using and securing cloud and container environments. So in one alarming finding that I ran across, 75% of containers are running with high or critical vulnerabilities. Yikes. I thought containers were supposed to reduce that kind of thing. What's going on here?
Anna Belak: Yeah, we also thought that, actually.
Rick Howard: (Laughter).
Anna Belak: We still hope it will happen, honestly. Containers are still kind of young. So there's multiple ways to think about that finding. I think if you're kind of a veteran of the industry, you might smirk a little and say, well, really, 75% of anything is running with critical vulnerabilities, if we're really honest about it.
Rick Howard: Oh, that's a good point (laughter).
Anna Belak: So I don't know if it's really worse than the status quo. But you are right that we sold this vision that containers would make things much easier to fix and replace, and that this patching nightmare that we've been living for several decades now will finally become better. And I think the reality is that there's a substantial culture shift and actually technical shift as well in adopting all of the best practices that would enable that to happen.
Anna Belak: So one thing I will say is if you look at this - at the data more granularly, there are customers that we have that are able to get that number to something like 4% criticals - high or criticals, right? So in their environment where they have implemented a lot of these best practices, it's actually fairly clean. The issue is that those environments tend to be smaller, and those customers tend to be more mature. So that means they have really committed to building out this shift left story. And they're probably checking for a lot of these flaws at multiple stages of the process from source to runtime.
Rick Howard: If I can restate what you said, so I understand it, containers give us the ability to reduce this problem, but there's still a huge giant culture shift to get people to do this, right? And is there anything besides culture that's preventing people or organizations from doing this, or is it all just basically, oh, we did it this way before then, we're going to continue to do it the old way?
Anna Belak: I'll actually push back a little on you in that containers do give us the ability, but they don't themselves give us much of anything, right? So they are a way to package applications. And they come with a lot of this philosophy of immutability and disposability, right? So in theory, you wouldn't patch a container, you would replace it. And the issue is that this only works if your application is designed to function this way. So a lot of folks will take an existing application and they will modernize it over the course of months or years. And part of that modernization is to containerize it. So if they're in kind of an early phase of that process and they've just repackaged so it hasn't become yet a distributed microservice kind of application where the different pieces are substantially decoupled from each other, so replacing one doesn't hugely affect the rest, they can't really subscribe to the whole philosophy and this whole culture and best practice approach because it just isn't going to make any sense. They're essentially taking a legacy workload or a, like, partially modernized workload and trying to manage it.
Anna Belak: So if your workload is not purely cloud native, the way in which you're going to deal with it can't be purely cloud native either. And that's I think what's happening in most of these situations is a lot of folks who went green field into containers and built everything from scratch have a much easier time adopting that culture shift and using these best practices and shifting left because they don't have that baggage and technical debt from before. And the folks who are more in these transitional states of taking your business-critical application and modernizing it are going to take much longer too to get to that end state.
Rick Howard: Right. So it's more than just moving the workload from the old data centers up to the cloud in the same way that you did before. You want to adopt this cloud-delivered model philosophy. And that's kind of where people struggle, right? Because it's kind of new to - even though it's been around for a decade now, it's still pretty new to most organizations.
Anna Belak: That's fair. Yeah. And the technical aspect is not that simple either, right? Like, you are potentially dealing with a lot of new tools and new processes. So if you want to get into shift left and CI/CD pipelines, you know, there's a whole collection of technology that you now have to string together and operationalize. And this is before you even hit security, so it's not simple.
Rick Howard: So we're going to get to shift left in a second. But I want to talk about another alarming find that just kind of struck me. I did a spit take of coffee across the room when I read the number - right? - that says - it was that most people are overpaying their cloud service providers an average of $400,000 per cluster. Well, explain what that means. And oh, my God, what do we do to fix that?
Anna Belak: Yeah. I think that's maybe not surprising for the folks that have gotten that crazy bill from their cloud provider. I usually kind of roll my eyes if anyone says they're going to save money by going to cloud because I don't know that this ever happened. What I do believe is that...
Anna Belak: ...If you go cloud intelligently, you can get a lot more for your money, right? So if you actually were to adopt some of these provisioning approaches that are more thin provisioning style, you can get a lot more for your money. But you're not really going to save money. You'll just hopefully achieve more things with your money. So the number about overprovisioning comes from the fact that people fail to impose capacity limits on their workloads, CPU and memory limits. And so if you don't put a limit on it, it'll use as much as it wants. And often, that's not really what you want. What you should do is overprovision it just slightly so that your workload can't run out of memory because that can also be very bad. But you should always set some kind of explicit limit, and if you don't do that, you just end up paying lots of money you shouldn't be paying.
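To make the point about explicit limits concrete, here is a minimal sketch in Python. It assumes the workloads are described by Kubernetes-style pod specs, where each container can carry `resources.limits.cpu` and `resources.limits.memory`. The function name `containers_missing_limits` and the sample spec are hypothetical, for illustration only, and are not taken from the Sysdig report.

```python
def containers_missing_limits(pod_spec):
    """Return {container_name: [missing limit names]} for a pod-spec dict.

    Assumes the Kubernetes layout: spec.containers[].resources.limits
    with optional "cpu" and "memory" keys.
    """
    missing = {}
    for container in pod_spec.get("containers", []):
        limits = container.get("resources", {}).get("limits", {})
        absent = [key for key in ("cpu", "memory") if key not in limits]
        if absent:
            missing[container["name"]] = absent
    return missing


# Hypothetical pod spec: one container fully limited, two that are not.
example_pod_spec = {
    "containers": [
        {"name": "web", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
        {"name": "worker", "resources": {"limits": {"memory": "512Mi"}}},
        {"name": "sidecar"},
    ]
}

if __name__ == "__main__":
    for name, absent in containers_missing_limits(example_pod_spec).items():
        print(f"{name}: no limit set for {', '.join(absent)}")
```

A check like this could run against manifests before deployment, so that a container without limits is flagged before it reaches the cluster and the bill.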
Rick Howard: So there's this whole evolution of software development that's got us to this point. And one of the biggest reasons to move to the cloud was to transition, like you said, the development arm of your organization away from - you know, I'm an old timer - the old waterfall method that took us years to get new software out the door and towards an agile software development model, a model where one key metric is producing working code at regular short intervals. That led to the DevOps movement. So instead of keeping the software developers and operators monitoring and managing the software separate so they don't talk to each other, we considered all that a system of systems and made those guys work as a team. And then, you know, Amazon, Google and others used those techniques to build infrastructure as code. And so, of course, the security community wanted in on that action and started talking about what you said, shifting left on code development. So can you describe what shifting left is and how we're doing as a community trying to accomplish it?
Anna Belak: Yeah. So the simplest version of shifting left is just doing things earlier, basically. And it doesn't apply just to security. Security did totally show up to the party a little late, but it makes perfect sense to include security in the shift left process. So I believe originally shift left has to do with testing. There's like this old Toyota story about assembly lines, right? Like, instead of building a whole car and then figuring out if it's broken, you actually have some kind of assessment of its quality and functionality at every stage of the assembly line. And then if you see a defect, you can very quickly fix a defect, which saves you lots of suffering, time and money later. So that's exactly what shift left is in software, except we're building software and not cars. So in some ways it's easier, and in some ways it's harder because it's easier to see a flaw in a car than it is in software, as it turns out.
Rick Howard: (Laughter) That's very true. But we can use the shift-left mentality to, you know, implement basic secure software best practices, you know? Like, you could implement rules from the OWASP Top 10 or rules from the BSIMM model, the maturity model for developing secure software, even the SAMM model. You could - if you were good at this, you could put those rules as far left as possible so nobody would break them when the developers wrote the thing that's making the company money. Is it - so that's where shift left comes in?
Anna Belak: Yes. So that is exactly what you are supposed to do.
Rick Howard: (Laughter).
Anna Belak: And it makes a lot of sense. But as with anything, there is, like, the romantic dream and then there's the practical reality. And the practical reality is that the more checks you do - at any point, actually, in time - the more data you get. So you're going to get a bunch of information about your system that will say, like, all of this stuff is broken, or all of this stuff is out of best practice. And then if you're a developer, you have some deadline by which you're trying to ship features. And so you're going to have to choose which of those things you're going to deal with. And the first issue is you're a developer. You may not have a lot of security expertise to decide which of these security problems is the most critical, you know, source of risk. And so that's difficult. You either need to develop that expertise or somebody has to help you figure that out, whether that's a person or a tool. And then the second piece is you can only do so much, right? So you're going to choose, like, the top five things. You're going to fix them, and then you're going to ship it.
Anna Belak: And ideally, there is, again, some guidance in your process. So if your pipeline has policy built in that says, you know, like, our company's risk appetite says that we can only ship things that have no critical vulnerabilities, then that's a pretty clear scenario, right? If it's more kind of like, oh, it's up to you. Like, you get to decide, then it's not - it's very difficult for somebody to know when something should be failed versus when something can be shipped. So one of the first steps is actually deciding what those gates should be and then tuning those gates over time because you're never going to get it right the first time.
Anna Belak: But, yeah, so the challenge is having good kind of understanding of company policy in terms of security translated into this workflow that makes sense for developers and then enabling them to actually make the decision of what they have to fix first so that their work sort of makes sense and they don't just, you know, either ignore the results because they don't know what to do or suffer indefinitely, trying to figure out what to do and slow down the release.
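The kind of gate described above, where company policy decides what may ship and developers get a short fix-first list, might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the finding format, the severity names, the `evaluate_gate` and `fix_first` helpers, and the `VULN-*` identifiers are hypothetical and do not come from any particular scanner or from Sysdig's tooling.

```python
from collections import Counter

# Assumed severity scale, most urgent first.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


def evaluate_gate(findings, max_allowed):
    """Compare scan findings against a policy of per-severity maximums.

    Returns (passed, violations), where violations maps each severity
    that exceeded its limit to the number of findings at that severity.
    """
    counts = Counter(finding["severity"] for finding in findings)
    violations = {
        severity: counts[severity]
        for severity in SEVERITY_ORDER
        if severity in max_allowed and counts[severity] > max_allowed[severity]
    }
    return (not violations, violations)


def fix_first(findings, top_n=5):
    """Return the top_n findings, ordered from most to least severe."""
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
    return ranked[:top_n]


# Hypothetical scan output and a policy of "no critical vulnerabilities".
example_findings = [
    {"id": "VULN-1", "severity": "high"},
    {"id": "VULN-2", "severity": "critical"},
    {"id": "VULN-3", "severity": "low"},
]
example_policy = {"critical": 0}

if __name__ == "__main__":
    passed, violations = evaluate_gate(example_findings, example_policy)
    print("ship" if passed else f"blocked: {violations}")
    print([finding["id"] for finding in fix_first(example_findings, top_n=2)])
```

Keeping the thresholds in a small policy dictionary reflects the tuning point from the conversation: the gate can start strict or loose and be adjusted over time without changing the pipeline logic itself.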
Rick Howard: So it's not a cookie-cutter process for any organization. Everybody's going to have their own view of this. So what they implement in the shift-left philosophy - what I do at the CyberWire is going to be different from what you do at Sysdig based on priorities and based on philosophy and those kinds of things. Is that what you're saying?
Anna Belak: Absolutely. I think that's actually the hardest part of security overall, is that there's no, like, grand correct answer. Everybody's correct answer is very unique to them, and it's very hard to figure it out. And then you - like, you have to own the answer. And if you - if the policy you chose is incorrect and somehow there's a failure or an issue, really, you're just like, sorry, that's too bad, right? But nobody can tell you what it should be. So it's very difficult in that sense. I mean, obviously, we have guidelines on the side of compliance and so on, but at the end of the day, it's about your company's assessment of risk.
Rick Howard: I thought originally that one of the things that everybody would do in this shift-left idea was implementing basic vulnerability management functions so that when, like, a Log4j pops up, most of us, when we reacted, that didn't have - we - it was all manual - to actually go out and look to find out where that thing was running in our networks and then submitting a patch to update it where it was found. Is this shift-left mentality a way to get better at this, or how does - or what's your solution? What are you telling your clients about how to get better at this?
Anna Belak: I have to hedge and say that I think the answer is actually to have both a strong shift-left process and mentality and also, when you get to runtime on the right, to have really solid runtime controls for things that hit the fan after the fact, right? So Log4j is a great example of something that you wouldn't have known because it's a zero-day. So if you shipped software that had Log4j in it the day before it was disclosed, you thought you were fine. Even if you did everything perfectly - you had the most beautiful pipeline - you ship this thing with Log4j. And then tomorrow, you're like, oh, great, like, all my stuff is vulnerable.
Anna Belak: So you can use your shift-left tools to go back and say, OK, let me just rerelease all this stuff. I'm going to rescan everything, and I'm going to decide which of these things I have to block, which of these things I can mitigate and which of these things I have to patch right this second and just rerelease all your software. In reality, there are very few people that can really do that, especially with all of their ops. So what you end up doing is taking a hybrid approach and saying, OK, a lot of these things are running right now, and I can scan them in runtime, or I can audit the image that spawned the container or whatever. And then I can decide which of those I have to take down, which of those I have to kind of isolate somehow, which of those I - I think if you have both approaches, it's the most efficient because then you can leave the ones running that are not necessarily at risk. Like, maybe it's not exposed to anything. Maybe it's very difficult to access for an attacker. Maybe the exploit doesn't apply for whatever reason. As opposed to rereleasing everything all the time because that's also time-inefficient and difficult, right? But for things that you do have to rerelease if you have this rapid pipeline, it's just much, much faster to identify the exact problems and then to fix the ones that are really important.
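The hybrid triage described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Workload` fields, the three outcome labels, and the ordering of the rules are hypothetical choices made for this example, not a procedure given in the conversation or by Sysdig.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical summary of what a runtime scan knows about one workload."""
    name: str
    has_vulnerable_package: bool  # e.g. the newly disclosed library is present
    internet_exposed: bool        # reachable by an outside attacker
    exploit_applies: bool         # the vulnerable code path is actually usable


def triage(workload):
    """Pick an action for a running workload after a new disclosure."""
    if not workload.has_vulnerable_package or not workload.exploit_applies:
        return "leave running"
    if workload.internet_exposed:
        return "patch and rerelease now"
    return "isolate, then patch on the normal schedule"


# Hypothetical fleet used only to exercise the rules above.
example_fleet = [
    Workload("billing-api", True, True, True),
    Workload("batch-report", True, False, True),
    Workload("static-site", False, True, False),
]

if __name__ == "__main__":
    for workload in example_fleet:
        print(f"{workload.name}: {triage(workload)}")
```

The design choice is that only workloads that are both affected and reachable go back through the pipeline immediately, which is the efficiency argument made above for combining shift-left scanning with runtime controls.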
Rick Howard: This is all good stuff, Anna, but we're going to have to leave it there. That's Anna Belak, the director of Thought Leadership at Sysdig. Anna, thanks for coming on the show.
Rick Howard: We'd like to thank Tom Quinn, the T. Rowe Price CISO, for adding his valuable expertise to this discussion and Sysdig for sponsoring the show. CyberWire-X is a production of the CyberWire and is proudly produced in Maryland at the startup studios of DataTribe, where they are co-building the next generation of cybersecurity startups and technologies. Our senior producer is Jennifer Eiben. Our executive editor is Peter Kilpe. And I am Rick Howard. Thanks for listening.