CSO Perspectives (Pro) 8.10.20
Ep 17 | 8.10.20

Incident response around the Hash Table.


Rick Howard: When I hear the phrase incident response, my mind immediately goes to the SOC and the supporting network defender teams who track down the geeky details of the suspected cyberattack. And if it turns out to be a real attack and not just some weird anomaly, with actual cyber-adversaries trying to cause material impact to my organization, what actions do those technical teams take to prevent the success of that cyber-adversary?

Rick Howard: But as I invited some of the CyberWire's pool of experts to sit around the Hash Table and discuss the topic, I discovered that the tech team's activity is really just table stakes, that there are so many other things that have to be managed and coordinated and escalated up the chain as more and more evidence comes in that a real intrusion has taken place, that it is no longer just the technical team's responsibility anymore to respond to the incident. It is way bigger than that. And the CISOs of the world spend a good portion of their time planning and practicing those nontechnical things that involve people and process in order to successfully navigate a real cyber crisis when it happens.

Rick Howard: My name is Rick Howard. You are listening to "CSO Perspectives," my podcast about the ideas, strategies and technologies that senior security executives wrestle with on a daily basis. Today, we are talking to four cybersecurity thought leaders about their incident response experiences in the real world.

Rick Howard: Let's begin with getting organized and start with the ending in mind. I was talking to an old Army buddy of mine who just happens to be my best friend. His name is Steve Winterfeld, and he is the advisory CISO for Akamai Technologies.

Rick Howard: And, by the way, do you know how I categorize somebody as my best friend? I mean, we all have friends, right? We have colleagues and peers and friends we share a meal with or maybe even a beer. In those settings, I can say things that I would never say out loud in public because I'm not afraid of saying something crazy in a moment of weakness. All of that is important. But my best friend is the person I call when I need to get out of the country for, you know, reasons. Steven is the guy for me. He wouldn't even ask why I needed to go. He would just execute.

Rick Howard: Anyway, here's what he said about getting ready for incident response.

Steve Winterfeld: I would like to start by talking a little bit about frameworks. I think it's important to have a framework for auditors showing up, being internal or external. Or worst case is if you end up in, you know, post-breach, some kind of class-action lawsuit, to have something to point back to that you built your program around.

Rick Howard: What he is saying is that if it all goes south for me - if the cyber gods go against me and, despite my best efforts, the bad guys were still successful - it would be useful if I could say something intelligent about my incident response plan after everything happened, that it wasn't something I just made up out of whole cloth, that I was trying to be compliant with maybe an internationally recognized standard.

Steve Winterfeld: I would say probably the oldest and most well-established is the NIST Special Publication 800-61. You know, and it's the classic prepare, detect and analyze, contain and eradicate and then post-incident feedback, you know, that learning loop. 

Steve Winterfeld: Another great framework is the MITRE ATT&CK framework, which is a great way to look at all the threat vectors and start quantifying how well you're able to do it. 

Steve Winterfeld: So typically, we think about, you know, the red team will come in. They'll pick some adversarial techniques from the common body of knowledge. They'll test those. They'll go in. They'll execute some payload on the endpoint. You'll be able to detect, did the alert go out? Did the alert hit your SIEM? From the SIEM, was it seen in the SOC? Did your SOC react to it? And was there remediation? So I can do that kind of entire lifecycle using the MITRE ATT&CK framework. 
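The chain of questions Winterfeld lists (did the alert go out, did it hit the SIEM, did the SOC see it, did the SOC react, was there remediation) can be tracked per technique. The following is a minimal sketch, not anything described on the show: the stage names, the `first_gap` and `coverage` helpers, and the test results are all hypothetical. The two technique IDs are real ATT&CK identifiers, used only as sample keys.

```python
# Stages of the detection chain, in the order listed in the discussion.
# The wording of each stage name is an assumption made for this sketch.
STAGES = [
    "payload executed on endpoint",
    "alert generated",
    "alert reached SIEM",
    "alert seen in SOC",
    "SOC reacted",
    "remediation completed",
]


def first_gap(results):
    """Return the first stage that failed or was not recorded, else None."""
    for stage in STAGES:
        if not results.get(stage, False):
            return stage
    return None


def coverage(tests):
    """Map each technique ID to its first broken stage (None = full pass)."""
    return {technique: first_gap(results) for technique, results in tests.items()}


# Hypothetical red-team results, keyed by ATT&CK technique ID.
red_team_results = {
    "T1059": {stage: True for stage in STAGES},
    "T1003": {
        "payload executed on endpoint": True,
        "alert generated": True,
        "alert reached SIEM": True,
        "alert seen in SOC": False,
    },
}

for technique, gap in coverage(red_team_results).items():
    print(technique, "passed end to end" if gap is None else f"broke at: {gap}")
```

Reporting only the first broken stage keeps the output focused on where the chain stopped, since the later stages cannot succeed once an earlier one fails.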

Steve Winterfeld: And then the last is - and this may be more tactical - the SANS Incident Response, the PICERL - Prepare, Identify, Contain, Eradicate, Recover and Lessons Learned. I think SANS has a lot of great resources. And so those would probably be the three frameworks and tools I would put out as useful in incident response. 

Rick Howard: But be cautious here. You don't want to claim you are following an international standard and then get discovered later that you weren't. 

Steve Winterfeld: The one thing I would caution you about it in your incident response plan is it is going to be able to be called as evidence in a class-action lawsuit. So the things you put in there - if you don't follow them, there is a consequence. So it's something you want to make sure is actually how you're going to function. And then you're going to have your processes below those two things on how you actually, you know, execute those two documents. 

Rick Howard: From the NIST standard and the SANS recommendations, the concept of incident response is not that complicated on paper. But in the real world, the implementation of it can get messy quickly depending on how big your organization is. Because at a certain point in the investigation, the infosec team has to start bringing in other members of the organization, like IT, like legal and risk, and eventually, the executive leadership team, and maybe even some outside contractors. 

Rick Howard: Ted Wagner is another old Army buddy of mine. He and I, along with Winterfeld, worked in the Army's computer emergency response team back when the internet was young. One of our first big cases was trying to track down a hacker that completely owned the Army's networks, not because he was trying to do us wrong or steal some of our most treasured secrets. No, nothing like that. But the hacker was sure that we were storing the secret documents that proved beyond a shadow of a doubt that aliens walked among us on Earth. And just between you and me, I wish he had found those documents. 

Rick Howard: Anyway, Ted worked for me at the ACERT, and later as my deputy CISO, too, when I worked for TASC. But now, he's a big fancy pants CISO in his own right at SAP's national security services. He makes it clear that incident response is a team sport. Just because the infosec team knows security, that doesn't mean they know everything about what is going on in our networks. Here's Ted. 

Ted Wagner: This is what I would say. So when we discover something, we start asking a lot of questions about it. We don't rely just on the expertise of our security team. We'll go over to the operations team and say, you operate this server every day, does this look odd to you? And that way, we have a diverse perspective on the problem and we can usually come to an understanding very quickly. 

Ted Wagner: And what I really like about that one example - we were already onto our procedures and it got really reinforced during some tabletop exercises and some incident response exercises that we conducted. We came to the understanding that, you know, the security team - I hate to say this - we're not experts in everything. 

Rick Howard: What? 


Ted Wagner: Our ops brethren actually have some great insight into how these systems work. It's useful to go and talk to them. Sometimes it's adjacent teams because we are working with customers. It's not really an incident situation, it was more - we saw some odd traffic once and it was related to a customer. It was not - it was just an investigation where it turned out to be normal traffic. But the delay in getting the right person at the customer's site to weigh in, say, is this expected communication? Of course it was, but sometimes we've had to go back and make sure all of our points of contact are accurate, and when we reach out to a customer, we have the right person to make that go faster. So we continually work with adjacent teams who have a more day-to-day relationship with the customer to make sure that information gets updated. 

Rick Howard: But once the technical teams have done their evaluations, it may be time to escalate the situation higher in the organization. Every organization is different, but there are triggers within the organization that compel escalation. 

Rick Howard: Rick Doten and I have known each other for years. He is one of the smartest people I know about organizing security tasks and explaining it in terms that CEOs can understand. He is currently the VP of Information Security at Centene. 

Rick Doten: So somehow it gets in. And then it's an event, and it gets investigated into an incident. And then from an incident, gets classified as a severity level. And the severity may be based on the number of records, the sensitivity of the data, you know, whether it's involving an executive, you know, whether it's involving something that is especially sensitive, perhaps from compliance things, like, you know, GDPR that we need to know that, you know? So anything that is compliance-related, that affects our compliance, then would certainly get escalated to notifying the executives, you know, anything that may cause an impact to the business process. 

Rick Howard: Jerry Archer has been the CSO for Sallie Mae for over a decade. His escalation process has been battle-tested and oiled like a fine machine. 

Jerry Archer: So we have three levels of incident, right? There's alerts, there's events and there's incidents. So an alert is something that typically the SOC would get. So the SOC gets an alert that says, hey, there's something that looks funny. They then reach out to the cyber team. The cyber team will assemble, they'll look at it, and it could be what we call an event, which means it's a relatively minor thing that you wouldn't have to deal with too much in terms of a response team, and then an incident, which is a big deal, right? Meaning that we've got to have everything from legal and communications and all these other response teams brought into the picture. 

Jerry Archer: If it were a big enough incident, we'd bring in a third party to help us manage that event, meaning just provide additional arms and legs that we wouldn't have. When we reach those kinds of levels, then, you know, everything comes into play. We bring together what's called an incident management office. The incident management office is composed of the vice president of asset protection, so the guy who is our physical security VP. It brings together the president of the bank, and it brings together legal. 

Jerry Archer: Now, at a very minimum, those three constituents look at the event and decide what's next. So if there needs to be communications, if there needs to be additional legal support, then we assemble response teams as a result of the incident, whatever it might be, right? 

Jerry Archer: So the IMO is the next step. Now, the IMO - which, again, remember, is the vice president of asset protection and the bank president and legal - they then will make a determination if we have an incident, right? If it's an incident, then we bring together what's called the executive crisis management team. And that's basically all the C-level executives in the corporation. They're briefed, and the IMO will typically make recommendations on what the next steps are. And then the executives will receive those recommendations, make whatever comments they want to make, and approve or reject or however it would roll out, as being the ultimate authority, right? So that could include everything about briefing regulators when we do that sort of thing. So that three levels decides, you know, sort of manages the incident going forward. 
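Archer's escalation path can be read as a small decision ladder. The sketch below is one loose interpretation of his description, not Sallie Mae's actual process: the `Level` enum, the `groups_engaged` function and its two flags are hypothetical names invented for illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    ALERT = 1     # something looks funny; the SOC receives it
    EVENT = 2     # relatively minor; little response-team involvement
    INCIDENT = 3  # a big deal; broader response teams are pulled in


# The three IMO constituents named in the discussion.
IMO_MEMBERS = ["vice president of asset protection", "bank president", "legal"]


def groups_engaged(level, imo_escalates=False, needs_third_party=False):
    """List the groups involved at a given level (an assumed reading)."""
    # The SOC hands every alert to the cyber team for triage.
    groups = ["SOC", "cyber team"]
    if level >= Level.INCIDENT:
        groups.append("incident management office")
        if needs_third_party:
            # Extra "arms and legs" for a big enough incident.
            groups.append("third-party responders")
        if imo_escalates:
            # The IMO briefs the C-level executives, who approve or reject.
            groups.append("executive crisis management team")
    return groups


print(groups_engaged(Level.EVENT))
print(groups_engaged(Level.INCIDENT, imo_escalates=True))
```

Modeling the levels as an ordered enum makes the comparison `level >= Level.INCIDENT` read like the escalation rule itself; the two flags stand in for the judgment calls the IMO makes.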

Rick Howard: As the situation gets more serious, Jerry brings in more and more help from around Sallie Mae, and if needed, from outside contractors like technical incident response experts and professional PR firms. To keep all those elements moving in the right direction, he uses something called the DACI model. DACI is a decision-making framework developed by the Intuit company as an improvement to the older RACI model, spelled with an R. 

Rick Howard: The DACI acronym spells out the framework - D, the driver, the person who organizes the potential decisions - A, the approver, the person who makes the final call on any decisions - C, the contributors, the people doing the legwork - and I, the informed, the people who might be impacted by the decisions being made. Here's Jerry again. 

Jerry Archer: I think some people refer to it as a RACI model. In any project, you form a project and you create a DACI model. There's a driver, there's a person who's running the project; there's approvers who approve the project and elements of the project; contributors, who are providing the input to solve the problem, the arms and legs, the real workers who are doing the real work; and then there's those people who are informed, they simply get information. 

Rick Howard: Winterfeld is also an advocate of the DACI/RACI model. 

Steve Winterfeld: I think one of the best tools out there to map out those roles and responsibility is a RACI. And a RACI, if you haven't seen one, is a spreadsheet that talks about - on the left, who is going to be doing it? On the top, what is going to be done? Reverse those if you want. And then you're going to talk about, you know, is this person, for this task, responsible, accountable, consulted or informed? 

Steve Winterfeld: When I build my RACI, only one person can be responsible. Multiple people can be accountable, consulted or informed. And then, you know, you break that out to different stakeholders for, you know, legal and public relations and, you know, leadership and all of these, and then, you know, deciding if there is a breach, and deciding to go public, and making the public announcement. So it's just a way to organize everything so in one graphic you can tell who's supposed to do what. 
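The chart Winterfeld describes can be held in a small data structure and checked against his one rule, that only one person can be responsible for a task. This is a minimal sketch under stated assumptions: the three tasks and three of the stakeholders come from his description, while the "security team" row, every role assignment, and the `validate` and `tasks_for` helpers are hypothetical.

```python
from collections import Counter

VALID_ROLES = {"R", "A", "C", "I"}  # responsible, accountable, consulted, informed

# task -> {stakeholder: role}. The assignments are illustrative only.
RACI = {
    "decide if there is a breach": {
        "security team": "R",
        "leadership": "A",
        "legal": "C",
        "public relations": "I",
    },
    "decide to go public": {
        "leadership": "R",
        "legal": "A",
        "public relations": "C",
        "security team": "I",
    },
    "make the public announcement": {
        "public relations": "R",
        "leadership": "A",
        "legal": "C",
        "security team": "I",
    },
}


def validate(raci):
    """Return a list of problems; an empty list means the chart passes."""
    problems = []
    for task, assignments in raci.items():
        counts = Counter(assignments.values())
        unknown = set(counts) - VALID_ROLES
        if unknown:
            problems.append(f"{task}: unknown role codes {sorted(unknown)}")
        # The rule from the discussion: exactly one responsible party per task.
        if counts["R"] != 1:
            problems.append(f"{task}: expected exactly one R, found {counts['R']}")
    return problems


def tasks_for(raci, stakeholder):
    """Look up what one stakeholder is expected to do across all tasks."""
    return {task: roles[stakeholder] for task, roles in raci.items() if stakeholder in roles}


print(validate(RACI))
print(tasks_for(RACI, "legal"))
```

Keeping the chart as data means the same structure can be printed as the spreadsheet he describes and re-validated whenever people change roles.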

Rick Howard: While we were talking about the escalation plans and RACI, Winterfeld mentioned something that I hadn't heard before. Remember, at the top of the show, Steve was worried about future legal actions after the incident was over, that maybe customers or shareholders might sue the company for damages. To protect the communications of the DACI team during the crisis, it might be possible to invoke something called a Kovel arrangement, designed to protect attorney-client privilege. In essence, you can try to protect all DACI or RACI team communications with that protection. 

Steve Winterfeld: Certainly. So, I mean, most security is a team sport. And in this particular case, you know, you're going to understand the impacts on the business. And so, typically, you know, I would define the team as an incident commander, their role being directing the investigation and informing leadership about what's been found, providing them with recommendations on what can be done and allowing them to make decisions for the leadership. Typically, that's on a committee that would include somebody like, you know, your legal team. 

Steve Winterfeld: I'll pause for a second on the legal team. The legal team can have a Kovel agreement with vendors. And that means that if it's done properly, anything discovered and discussed is privileged and cannot be discovered during subsequent legal action. And that means, basically, that the legal team has hired your forensics team, and the forensics team is supporting the lawyer, not the incident response commander. 

Rick Howard: That is a very interesting idea. Check with your in-house lawyers to see if that is possible for your organization. 

Rick Howard: The DACI or RACI model helps network defenders organize themselves for a crisis. And that is all well and good to have everything written down on paper, but as people come and go in the company or as people move laterally to get new responsibilities, how do you make sure that everybody understands their role when the crisis actually happens, especially in medium-to-large-sized organizations? Just as violinist Mischa Elman answered two tourists when they saw his violin case and asked, how do you get to Carnegie Hall, the answer is practice. 

Rick Howard: Here's Jerry Archer again, the Sallie Mae CSO. 

Jerry Archer: We do two exercises a year, and we went through - we literally bring every C-level executive into the room, along with the supporting cast. And we will run through - on a twice-a-year basis, we will run through scenarios that are brought in from the outside. So we usually hire a third party to come in and pose an incident. And then we will, at a table-top level, deal with that. So we do that twice a year so everybody's up to speed. They're all - you know, they're all looking at their playbooks so that if there are any gaps in their playbooks, they'll update their playbook. So you're talking about, you know, people who change. Anybody that takes them - you know, comes into the role has that playbook right away. So he knows what he's responsible for doing. 

Rick Howard: Ted Wagner, the SAP CISO, agrees. 

Ted Wagner: We do exercise a lot, and we ask really hard questions about - if this is our situation, as dire as it may sound, what exactly do we have to do in response to it, and do we have enough internal resources to properly respond, or should we seek assistance from an out - another provider, whether it be a crisis management team or incident response - forensic incident response capability, whatever those cases might be? It does include the executive leadership. We found some interesting questions from our communication folks and from our contracts team that they might have to be involved in some of these response activities. And it was interesting to see that the larger team had a role to play, and they didn't necessarily realize it at first. And so those exercises have been really enlightening. 

Rick Howard: My friend Winterfeld, the Akamai Advisory CISO, had some interesting thoughts on exercise scenario design. 

Steve Winterfeld: I think that your exercise has to be designed to validate that the plan's going to work. And so, typically, you want to do an exercise around data loss. How are you going to deal with, you know, credit cards being stolen, something pretty straightforward? You may want an exercise around merger and acquisition to validate that you're able to do that correctly and securely and people understand their roles and responsibility. You may want to do it around the loss of availability, ransomware, denial of service - something like that. And so you want to think about the exercise you do being designed to validate your plan or program. 

Steve Winterfeld: So let's take data loss. You write out a data loss scenario. It needs to be technically feasible, so it could really happen. Then you're going to want to go in and talk to the leadership about we've discovered this, and we're alerting you. And they ask their questions. And, you know, then you say, OK, six hours later, we're coming back to you, and we now know we lost data. And they're going to say, how much data? What kind of data? I'm going to say, I don't know. The - legal's going to say, well, if we've lost data, we have to - you know, we have to say so publicly within 72 hours. And public relations is going to say, what are we going to say? You know, we don't know how much data was lost. 

Steve Winterfeld: And so you have that kind of, you know, series of letting them know what kind of information they would get. And within two hours, you kind of walk them through four or five steps and allow them to understand - you know, they can see the RACI. They understand who's going to be providing information. You have a chance to hear from them what kind of information they need to be successful. 

Steve Winterfeld: Then you take that same exercise, and you go do it with the SOC and the forensics folks. And so you take that same scenario, and you do it with three or four groups. You may go take that same scenario and do it with a business continuity disaster recovery crisis management team or with, you know, a standalone - with a privacy officer. And so a lot of ways to use one scenario multiple times in a quarter - well thought-out, technically feasible and validating the plan. 

Steve Winterfeld: If you can do all those - for me, it took a lot of work to make sure I accomplished all of those and didn't waste anybody's time. 

Rick Howard: With all of that, though, the CSOs that I talked to about incident response always came back to a central theme. You need good people. You can have all the technology and process you want, tracked with cool-looking RACI charts, exercised to death with diabolical cyber scenarios, but without the people that have the intellectual curiosity to be good at incident response, you are dead in the water. Ted Wagner, the SAP CISO, captured this idea brilliantly. 

Ted Wagner: I will say that finding qualified personnel is such a struggle. I like the idea of identifying folks who have that character, that intellectual curiosity, and then investing and developing them and training them and giving them all the tools to be successful. But the people, you know, we always talk about people, technology and process. Boy, the people are important. 

Rick Howard: Boy, the people are important. I love that, Ted. Rick Doten, the Centene CSO from before, had a hot take that I had not heard before, or at least not expressed in that particular way. I personally believe that we need people in the network defender community who are self-learners, but Rick says there is a type of person who succeeds here. Here's Rick. 

Rick Doten: About the kind of people, that there's a certain type of people who do incident response. You know, it's - and you're born and not made. It's like detectives and teachers. And, you know, there's a personality type that's inquisitive, that doesn't give up easily, that continues to ask questions. You know, somebody years ago told me it's like I don't want a team who knows all the answers. I want a team that continually asks questions. And that's the kind of person that you need. 

Rick Doten: You look for people who are interested in doing it. You know, sometimes it's people in IT who are just - you notice that when you ask for things, that they give you more than you want, and that they're, like, interested in following on because they're inquisitive. They're - you know, you kind of put in job descriptions on NinjaJobs for people who want that. And it's really about kind of like talking to them and making sure you have the right personality. 

Rick Howard: This entire podcast series has been about first principle cybersecurity thinking. When I asked that question at the Hash Table - what is the primary purpose of incident response? - the answers I got were not in a very tight shot group. When I asked Rick Doten about it - who, by the way, has literally written a master's-level class on how to do incident response - he said this... 

Rick Doten: Well, you know, to identify threats on the enterprise, respond to them before they can spread and remediate them before they can cause harm. 

Rick Howard: So that sounds right out of a textbook, my friend. 

Rick Doten: Yeah. 

Rick Howard: So how do you measure that? 


Rick Doten: Right. Well... 

Rick Howard: And thanks, by the way. You've written that textbook. 

Rick Doten: Right. Yeah. Yeah. Sorry. I wrote that class, master's-level class. Really, the measurements is, you know, is kind of in outcomes. You know, what happened? How did it happen? Make it stop. Make sure it doesn't happen again. If we're not learning from these things, then that's not good. You know, there are all these traditional metrics of, you know, how - you know, the time to identify, the time to remediate, the time to, you know. And that's not as important to me as kind of like, you know, did we get all the information we need? 

Rick Howard: Jerry Archer, the Sallie Mae CSO, was much more straightforward. 

Jerry Archer: Well, I mean, look. No. 1, we're going to stop the bleeding. 

Rick Howard: But I'll give the last word to Ted Wagner, the SAP CSO. 

Ted Wagner: One big goal is to provide us visibility on those risks - areas of risk and vulnerabilities that we know are present. No matter how strong your security program is, you're not going to be able to patch every system. Your architecture, for good business reasons, might be vulnerable to some risk. And you're going to have to find a way to mitigate that. And frequently monitoring is an effective way to mitigate that. 

Rick Howard: When I first started putting these last two episodes on incident response together, I presumed that I would be spending a lot of time thinking about the more technical aspects of the problem. What I discovered was that while the technical incident response teams are important and you couldn't do incident response without them, filled with the right kind of people, what CSOs spend more time thinking about relatively is the escalation plan and the preparation that goes into developing that plan. 

Rick Howard: I also learned a couple of things. First, there is a name and an actual methodology for something that I have been doing informally for years called DACI or RACI, if you want to pick your poison there. And there is a legal thing that I need to consider called the Kovel arrangement that might protect my organization even more down the line, way after we put the current crisis to bed. 

Rick Howard: And that's a wrap. Next week we will be talking about data loss protection, so you don't want to miss that. In the meantime, if you agreed or disagreed with anything I have said in the last two episodes about incident response, hit me up on LinkedIn, and we can continue the conversation there. 

Rick Howard: The CyberWire's "CSO Perspectives" is edited by John Petrik and executive produced by Peter Kilpe. Our theme song is by Blue Dot Sessions, and the mix of the episode and the remix of the theme song was done by the insanely talented Elliott Peltzman. And I am Rick Howard. Thanks for listening.