Enterprise Resiliency: Breakfast of Champions
Nic Fillingham: Hello, and welcome to Security Unlocked, a new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security, engineering and operations teams. I'm Nic Fillingham.
Natalia Godyla: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft Security, deep dive into the newest threat intel, research and data science.
Nic Fillingham: And profile some of the fascinating people working on artificial intelligence in Microsoft Security.
Natalia Godyla: And now let's unlock the pod. Hi Nic, I have big news.
Nic Fillingham: Big news. Tell me a big news.
Natalia Godyla: I got a cat. Last night at 8:00 PM, I got a cat.
Nic Fillingham: Did it come via Amazon Prime drone?
Natalia Godyla: No.
Nic Fillingham: Just, that was a very specific time. Like 8:00 PM last night is not usually the time I would associate people getting cats. Tell me how you got your cat.
Natalia Godyla: It was a lot more conventional. So I had an appointment at the shelter and found a picture of this cat with really nubby legs and immediately-
Nic Fillingham: (laughs).
Natalia Godyla: ... fell in love obviously. And they actually responded to us and we went and saw the cat, got the cat. The cat is now ours.
Nic Fillingham: That's awesome. Is the cat's name nubby.
Natalia Godyla: It's not, but it is on the list of potential name changes. So right now the cat's name is tipper. We're definitely nervous about why the cat was named tipper.
Nic Fillingham: (laughs).
Natalia Godyla: We're hiding all of the glass things for right now.
Nic Fillingham: How do we get to see the cat? Is there, will there be Instagram? Will there be Twitter photos? This is the most important question.
Natalia Godyla: Wow. I haven't planned that yet.
Nic Fillingham: You think about that and I'll, uh, I'll start announcing the first guest on this episode.
Natalia Godyla: (laughs).
Nic Fillingham: On today's episode, we speak with Irfan Mirza, who is wrapping up our coverage of the Microsoft Digital Defense Report with a conversation about enterprise resiliency. Now, this is really all of the chapters that are in the MDDR, the nation state actors, the increase in cyber crime sophistication, business email compromise that you've heard us talk about on the podcast, all gets sort of wrapped up in a nice little bow in this conversation where we talk about all right, what does it mean, what does it mean for customers? What does it mean for enterprises? What does it mean for security teams? And so we talk about enterprise resiliency. And we actually recorded this interview in late 2020, but here we are, you know, two months later and those findings are just as relevant, just as important. It's a great conversation. And after that, we speak with-
Natalia Godyla: Andrew Paverd. So he is a senior researcher on the Microsoft Security Response Center team. And his work is well, well, he does a ton of things. I honestly don't know how he has time to pull all of this off. So he does everything from safe systems programming to leveraging AI, to help with processes within MSRC, the Microsoft Security Response Center. And I just recall one of the quotes that he said from our conversation was hackers don't respect your assumptions, or something to that effect, but it's such a succinct way of describing how hackers approach our systems and technology. So another really great conversation with a, a super intelligent researcher here at Microsoft.
Nic Fillingham: On with the pod.
Natalia Godyla: On with the pod. Today, we're joined by Irfan Mirza, Director of Enterprise Continuity and Resilience, and we'll be discussing the Microsoft Digital Defense Report and more specifically enterprise resilience. So thank you for being on the show today, Irfan.
Irfan Mirza: Thanks so much glad to be here. And hope we have a, a great discussion about this. This is such an important topic now.
Natalia Godyla: Yes, absolutely. And we have been incrementally working through the Microsoft Digital Defense Report, both Nic and I have read it and have had some fantastic conversations with experts. So really looking forward to hearing about the summation around resilience and how that theme is pulled together throughout the report. So let's start it off by just hearing a little bit more about yourself. So can you tell us about your day-to-day? What is your role at Microsoft?
Irfan Mirza: Well, I lead the enterprise continuity and resilience team and we kind of provide governance overall at the enterprise. We orchestrate sort of all of the, the risk mitigations. We go and uncover what the gaps are, in our enterprise resilience story, we try to measure the effectiveness of what we're doing. We focus on preparedness, meaning that the company's ready and, you know, our critical processes and services are always on the ready. It's a broad space because it spans a very, very large global enterprise. And it's a very deep space because we have to be experts in so many areas. So it's a fun space by saying that.
Natalia Godyla: Great. And it's really appropriate today then we're talking about the MDDR and enterprise resilience. So let's start at a high level. So can you talk a little bit about just how security has changed as a product of the pandemic? Why is resilience so important now?
Irfan Mirza: Yeah, it's a great question. A lot of customers are asking that, our field is asking that question, people within the company are asking. Look, we've been 11 months under this pandemic. Maybe, you know, in some places like China, they've been going through it for a little bit longer than us, you know, a couple of months more. What we're finding after having sort of tried to stay resilient through this pandemic, uh, one obviously is on the human side, everyone's doing as much as we possibly can there. But the other part of it is on the enterprise side. What is it that we're having to think about as we think of security and as we think of enterprise resilience?
Irfan Mirza: There are a couple of big things that I think I would note, one is that, look, when this pandemic hit us, our workforce lifted and shifted. I mean, by that, I mean that we, we, we got up out of our offices and we all left. I mean, we took our laptops and whatever we could home. And we started working remotely. It was a massive, massive lift and shift of personnel, right? We got dispersed. Everybody went to their own homes and most of us have not been back to the office. And it's not just at Microsoft, even, even a lot of our customers and our partners have not gone back to the office at all, right? So that, that's a prolong snow day, if you want to call it that.
Irfan Mirza: The other thing that happened is our workload went with us. Wasn't just that, "Hey, you know, I'm taking a few days off, I'm going away or going on vacation and, and I'll be checking email periodically." No, I actually took our work with us and we started doing it remotely. So what that's done is it's created sort of a, a need to go back and look at what we thought was our corporate security boundary or perimeter.
Irfan Mirza: You know, in the classical model, we used to think of the corporation and its facilities as the, the area that we had to go and secure. But now in this dispersed workforce model, we have to think about my kitchen as part of that corporate perimeter. And all of a sudden we have to ensure that, that my kitchen is as secure as the corporate network or as the facilities or the office that I was working from. That paradigm is completely different than anything we'd thought about before.
Nic Fillingham: And so Irfan, in the MDDR, uh, this section, um, and if you've got the report open, you're playing along at home, I believe it's page 71. This enterprise resiliency is sort of a wrap-up of, of a lot of the observations that are in the MDDR report. It's not a new section. It's as you're getting towards the end of the report, you're looking for, okay, now what does this mean to me? I'm a CSO. I need to make new security policies, security decisions for my organization. This concept of enterprise resiliency is sort of a wrap up of everything that we've seen across cyber crime, across the nation state, et cetera, et cetera. Is that, is that accurate? Is that a good way to sort of read that section in the report?
Irfan Mirza: Yeah. It is really the, the way to think of it, right.? It's sort of like a, the conclusion, so what, or why is this relevant to me and what can I do about it? When you think about the report and the way that it's structured, look, we, you know, the report goes into great detail about cyber crime as you called out Nic. And then it talks about nation state threats.
Irfan Mirza: These are newer things to us. We've certainly seen them on the rise, actors that are well-trained, they're well-funded they play a long game, not necessarily a short game, they're looking, they're watching and they're waiting, they're waiting for us to make mistakes or to have gaps, they look for changes in tactics, either ours, uh, they themselves are quite agile, right?
Irfan Mirza: So when you think about the environment in which we have to think about resilience, and we have to think about security, that environment itself has got new vectors or new threats that are, that are impacting it, right? In addition to that, our workforce has now dispersed, right? We're all over the, all over the globe. We see emerging threats that are, that are, non-classical like ransomware. We see attacks on supply chain. We continue to see malware and malware growing, right?
Irfan Mirza: And, and so when you think about that, you have to think if I need to secure now my, my dispersed corporate assets and resources, my people, the workload, the data, the services and the processes that are all there, what are the, the sort of three big things I would need to think about? And so this report sort of encapsulates all, all of that. It gives the details of what, what's happening. And, and then page 71 is you say that resilience piece sort of comes back and says, "Look, your security boundaries extended. Like it or not, it is extended at this point. You've got to think beyond that on-site perimeter that we were thinking about before."
Irfan Mirza: So we have to start thinking differently. And th- there's three critical areas that are sort of called out, acknowledging the security boundary has increased, thinking about resilience and performance, and then validating the resilience of our human infrastructure. This is like new ideas, but these are all becoming imperatives for us. We're having to do this now, whether we like it or not.
Irfan Mirza: And so this report sort of gives our customers, and, and it's a reflection of what we're doing in the company. It's an open and honest conversation about how we propose to tackle these challenges that we're facing.
Nic Fillingham: And so Irfan if we can move on to that critical area, number two, that prioritizing resilient performance. When I say the word performance and resilient performance, is that scoped down just to sort of IT infrastructure, or does that go all the way through to the humans, the actual people in the organization and, um, how they are performing their own tasks, their own jobs and the tasks that are part of their, their job and et cetera, et cetera? What's the, I guess what's the scope of that area too?
Irfan Mirza: As we were thinking about resilience, as you know, shortly after we dispersed the workforce, we started thinking about, about what should be included in our classical understanding of resilience. But when you think about, about typical IT services and online services, and so on, a lot of that work is already being done with the life site reviews that we do and people are paying very close attention to service performance. We have SLAs, we have obligations, we have commitments that we've made that our services will be performing to a certain degree, but there are also business processes that are associated with these services very closely.
Irfan Mirza: When you think about all of the processes that are involved and services that are involved from the time a customer thinks of buying Office, uh, 365, as an example, to the time that they provision their first mailbox, or they receive their first email, there are dozens of process, business processes.
Irfan Mirza: Every single service in that chain could be working to 100% efficiency. And yet if the business processes, aren't there, for instance, to process the deal, to process the contract, to process, uh, the customer's payment or, uh, acknowledge receipt of the payment in order to be able to provision the service, all of these processes, all of a sudden have to, we have to make sure that they're also performing.
Irfan Mirza: So when we start thinking about resilience, up to now, business continuity has focused on, are you ready? Are you prepared? Are your dependencies mapped? Have you, have you done a business impact analysis? Are you validating and testing your preparedness? You know, are you calling down your call tree for instance? But I think where we're going now with true enterprise resilience, especially in this sort of modern
Irfan Mirza: ... day, we're, we're looking at performance, right? What, what is your preparedness resulting in? So if you stop and you think about a child at school, they get homework. Well, the homework really, they bring it home. They do it. They take it back to the teacher. They get graded on it. That's wonderful. This means that the child is ready. But at some point in time, the class or the teacher is going to give them a test, and that test is going to be the measure of performance, right?
Irfan Mirza: So we need to start thinking of resilience and continuity in the same way. We're prepared. We've done all our homework. Now let's go and see how many outages did you have? How critical were the outages? How long did they last? How many of them were repeat outages? How many of the repeat outages were for services that are supposed to have zero downtown, like services that are always supposed to on like your DNS service or your identity auth- authentication service, right? So, when you start thinking about, uh, resilience from that perspective, now you've got a new set of data that you have to go and capture, or data that you're capturing, you have to now have to have insights from it. You've got to be able to correlate your preparedness, meaning the homework that you've done with your actual performance, your outage and your, and your gap information. All right?
Irfan Mirza: So that, that's what prioritizing resilient performance is all about. It's about taking realtime enterprise preparedness and mapping it to real time enterprise performance. That tells you if your preparedness is good enough or not, or what it is that you need to do. There's a loop here, a feedback loop that has to be closed. You can't just say that, well, you know, we've done all the exercises theoretically. We're good and we're ready to take on any sort of a crisis or, or, or disaster. Yeah, that's fine. Can we compare it to realtime what you're doing? Can we break glass and see what that looks like? Can we shut you down and or shut down parts of your operation as in the event of an earthquake for instance, or a hurricane wiping out, uh, access to a data center, right? Can we do those things and still be resilient when that happens? So this is what performance and resilience come together in that space.
Natalia Godyla: So am I right in understanding that beyond, like you said, the theoretical where you think about the policies that you should have in place, and the frameworks that you should have in place, you have the analytics on, you know, the state of, the state of how performant your systems are to date. And then in addition, is there now the need for some sort of stress testing? Like actually figuring out whether an additional load on a system would cause it to break, to not be resilient? Is that now part of the new approach to resilience?
Irfan Mirza: Yeah. There are, there are several, several things to do here, right? You absolutely said it. There's a stress test. Actually, this pandemic has, is already a stress test in and of itself, right? It's stressing us in a many ways. It's stressing, obviously the psyche and, and, you know, our whole psychology, and our ability to sustain in quarantine, in isolated, in insulated environments and so on. But it's also testing our ability to do the things that we just so, uh, so much took for granted, like the ability to patch a server that's sitting under my desk in the office whenever I needed to, right? That server now has to become a managed item that somebody can manage remotely, patch remotely, update remotely when needed, control administrative access and privileges remotely. But yes, for resilience, I think we need to now collect all of the data that we have been collecting or looking at and saying, can we start to create those correlations between our preparedness and between our real performance?
Irfan Mirza: But there's another area that this dovetails into which is that of human resilience, right? We talked a little bit earlier about, you know, sort of the whole world enduring this hardship. We need to first and foremost look at our suppliers, subcontractors, people that we're critically dependent on. What is their resilience look like? That's another aspect that we have to go back. In the areas where we have large human resources or, or workforces that are working on our behalf, we need to make sure that they're staying resilient, right?
Irfan Mirza: We talked on a lot about work/life balance before. Now I think the new buzzword in HR conference rooms is going to be work/life integration. It's completely integrated, and so we need to start thinking about the impact that would have. Are we tracking attrition of our employees, of certain demographics within the employees? Are we looking at disengagement? People just sort of, "Yeah, I'm working from home, but I'm not really being fully engaged." Right? The hallway conversations we used to have are no longer there. And we need to start thinking, are people divesting? Our resources, are they divesting in the workplace? Are they divesting in their, in their work or work/life commitment? These measures are all now having to be sort of like...
Irfan Mirza: We used to rely on intuition, a look, a hallway gaze, look at the, the snap in somebody's walk as they walked away from you or out of your office. We don't have that anymore. Everybody's relatively stagnant. We're, we're, we're seated. We don't get to see body language that much. We don't get to read that. There's a whole new set of dynamics that are coming into play, and I think smart corporations and smart companies will start looking at this as a very important area to pay attention to.
Nic Fillingham: How are we measuring that? What tools or sort of techniques, or, or sort of frameworks exist to actually put a metric around this stuff, and determine sort of where, where an organization is in terms of their level of resiliency?
Irfan Mirza: This question is actually the whole reason why we brought this enterprise resilience sort of a conclusion to this fourth chapter, and, and, you know, the summation of this, of this report.
Irfan Mirza: What we're doing now is we're saying, look. Things that used to be fundamentally within the domain of IT departments, or used to be fundamentally with, within the domain of live site, or used to be fundamentally in the domain of human resource departments are now all floating up to be corporate imperatives, to be enterprise imperatives. I think the thinking here is that we need to make sure that the data that we've been collecting about, as an example to answer your question, attrition, right? A certain demographic. Millennials, uh, changing jobs, leaving the company, just to pick an example more than anything else. This is no longer just data that the HR Department is interested in, or that recruiting would be interested in, or, or retention would be interested. This is data that's about to significantly impact the enterprise, and it needs to be brought into the enterprise purview.
Irfan Mirza: Our classical and traditional models of looking at things in silos don't allow us to do that. What we're recommending is that we need to have a broader perspective and try to drive insights from this that do tell a more comprehensive story about our ent- enterprise resilience. That story needs to include the resilience of our services, our business processes, our suppliers, our human capital, our infrastructure, our extended security boundary, our data protection, uh, prevention of data loss, our intrusion detection. I mean, there's such a broad area that we have to cover. That's we're saying. And, and as we implement this new sort of zero trust model, I think the, the effectiveness of that model, how much progress we're making is becoming an enterprise priority, not just something that the IT department is going to go around on it's own.
Nic Fillingham: Irfan, I wonder if I could put you on the spot, and were there any interesting bits of data that you saw in those first couple months of the shift to remote work where like, yeah, the number of unique devices on the Microsoft corporate network quadrupled in 48 hours. Like any, anything like that? I'd just wondering what, what little stats you may have in hand.
Irfan Mirza: Yeah. The number of devices and sort of the flavors of devices, we've always anticipated that that's going to be varied. We're cognizant of that. Look, we have, you know, people have PCs. They have MACs. They have Linux machines, and, and they have service o- operating software. There's a lot of different flavors. And, and it's not just the device and the OS that matters, it's also what applications you're running. Some applications we can certify or trust, and others perhaps we can't, or that we still haven't gotten around to, to verifying, right? And all of these sit, and they all perform various functions including intruding and potentially exfiltrating data and Spyware and Malware and all of that. So when you think about that, we've always anticipated it.
Irfan Mirza: But the one thing that, that we were extremely worried about, and I think a lot of our Enterprise customers were worried about, is the performance of the workforce. What we found very early on in, in the, in the lift and shift phase was that we needed to have a way of measuring is our, our built processes working? Are we checking in the same amount of code as we were before? And we noted a couple of interesting things. We looked at our, our VPN usage and said, what are those numbers look like? Are they going up and down?
Irfan Mirza: And I think what we found is that initially, the effect was quite comparable to what we had, uh, when we experienced snow days. Schools are shut down. People don't go to work. They're slipping and sliding over here. We're just not prepared for snow weather in, in this state like some of the others. So what happened is, we saw that we were, we were sort of seeing the same level of productivity as snow days. We say that we had the same level of VPN usage as snow days, and we were worried because that, you know, when, when it snows, people usually take the day off, and then they go skiing.
Irfan Mirza: So what happened? Well, after about a week things started picking back up. People got tired of sort of playing snow day and decided that, you know what? It's time to, to dig in, and human nature, I think, kicked in, the integrity of the workforce kicked in. And sure enough, productivity went up, VPN usage went up, our number of sessions, the duration of sessions. Meetings became shorter.
Nic Fillingham: Can I tell you hallelujah? (laughs)
Irfan Mirza: (laughs)
Nic Fillingham: That's one of the, that's one of the great-
Irfan Mirza: Absolutely.
Nic Fillingham: ... upsides, isn't it? To this, this new culture of remote work is that we're all meeting for, for less amount of time, which I think, I think is fantastic.
Irfan Mirza: Look, you know, in times of crisis, whether it's a natural disaster, or a pandemic, or, or a manmade situation such as a war or a civil war, or whatever, I, I think what happens is the amount of resources that you are customarily used to having access to gets limited. The way in which you work shifts. It changes. And so the, the true test of resilience, I think, is when you are able to adapt to those changes gracefully without requiring significant new investment and you're able to still meet and fulfill your customer obligations, your operational expectations. That really is.
Irfan Mirza: So what you learn in times of hardship are to sort of live, you know, more spartan-like. And that spartan-ism, if there's such a word as that, that's what allows you to stay resilient, to say what are the core things that I need in order to stay up and running? And those fundamental areas become the areas of great investment, the areas that you watch over more carefully, the areas that you measure the performance of, the areas that you look for patterns and, and trends in to try to predict what's happening, right?
Irfan Mirza: So that is something that carries over from experiences of being in the front lines of a, uh, a war or, or from being, uh, you know, in the midst of a hurricane trying to recover a data center, or an earthquake, or any other, uh, type of power outage, right? These are all the sort of key scenarios that we would be going to look at. And that's one of the things they all have in common. It's really that you don't have the resources or access to the resources that you thought you did, and now you've got to be able to do some things slightly differently.
Natalia Godyla: Thank you for joining us on the podcast today. It's been great to get your perspective on enterprise resilience. Really fascinating stuff. So, thank you.
Irfan Mirza: Thank you, Natalia. And, and thank you, Nick. It's been a great conversation. As I look back at this discussion that we had, I feel even, even stronger now that the recommendations that we're making, and the guidance that we're giving our customers and sharing our experiences, becomes really, really important. I think this is something that we're learning as we're going along. We're learning on the journey. We're uncovering things that we didn't know. We're looking at data in a different way. We're, we're trying to figure out how do we sustain ourselves,
Nic Fillingham: ... not just through this pandemic, but also beyond that. And I think the, whatever it is that we're learning, it becomes really important to share. And for our customers and people who are listening to this podcast to share back with us what they've learned, I think that becomes incredibly important because as much as we like to tell people what we're doing, we also want to know what, what people are doing. And so learning that I think will be a great, great experience for us to have as well. So thank you so much for enabling this conversation.
Natalia Godyla: And now let's meet an expert from the Microsoft security team to learn more about the diverse backgrounds and experiences of the humans creating AI and tech at Microsoft. Welcome back to another episode of Security Unlocked. We are sitting with Andrew Paverd today, senior researcher at Microsoft. Welcome to the show, Andrew.
Andrew Paverd: Thanks very much. And thanks for having me.
Natalia Godyla: Oh, we're really excited to chat with you today. So I'm just doing a little research on your background and looks like you've had a really varied experience in terms of security domains consulting for mobile device security. I saw some research on system security. And it looks like now you're focused on confidential computing at Microsoft. So let's start there. Can you talk a little bit about what a day in the life of Andrew looks like at Microsoft?
Andrew Paverd: Absolutely. I think I have one of the most fascinating roles at Microsoft. On a day-to-day basis, I'm a researcher in the confidential computing group at the Microsoft Research Lab in Cambridge, but I also work very closely with the Microsoft Security Response Center, the MSRC. And so these are the folks who, who are dealing with the frontline incidents and responding to reported vulnerabilities at Microsoft. But I work more on the research side of things. So how do we bridge the gap between research and what's really happening on the, on the front lines? And so I, I think my position is quite unique. It's, it's hard to describe in any other way than that, other than to say, I work on research problems that are relevant to Microsoft security.
Natalia Godyla: And what are some of those research problems that you're focused on?
Andrew Paverd: Oh, so it's actually been a really interesting journey since I joined Microsoft two years ago now. My background, as you mentioned, was actually more in systems security. So I had, I previously worked with technologies like trusted execution environments, but since joining Microsoft, I've worked on two really, really interesting projects. The, the first has been around what we call safe systems programming languages.
Andrew Paverd: So to give a bit more detail about it in the security response center, we've looked at the different vulnerabilities that Microsoft has, has patched and addressed over the years and seen some really interesting statistics that something like 70% of those vulnerabilities for the pa- past decade have been caused by a class of vulnerability called memory corruption. And so the, the question around this is how do we try and solve the root cause of problem? How do we address, uh, memory corruption bugs in a durable way?
Andrew Paverd: And so people have been looking at both within Microsoft and more broadly at how we could do this by transitioning to a, a different programming paradigm, a more secure programming language, perhaps. So if you think of a lot of software being written in C and C++ this is potentially a, a cause of, of memory corruption bugs. So we were looking at what can we do about changing to safer programming languages for, for systems software. So you might've heard about new languages that have emerged like the Rust programming language. Part of this project was investigating how far we can go with languages like Rust and, and what do we need to do to enable the use of Rust at Microsoft.
Natalia Godyla: And what was your role with Rust? Is this just the language that you had determined was a safe buyable option, or were you part of potentially producing that language or evolving it to a place that could be safer?
Andrew Paverd: That's an excellent question. So in, in fact it, it was a bit of both first determining is this a suitable language? Trying to define the evaluation criteria of how we would determine that. But then also once we'd found Rust to be a language that we decided we could potentially run with, there was an element of what do we need to do to bring this up to, let's say to be usable within Microsoft. And actually I, I did quite a bit of work on, on this. We realized that, uh, some Microsoft security technologies that are available in our Microsoft compilers weren't yet available in the Rust compiler. One in particular is, is called control flow guard. It's a Windows security technology and this wasn't available in Rust.
Andrew Paverd: And so the team I, I work with looked at this and said, okay, we'd like to have this implemented, but nobody was available to implement it at the time. So I said, all right, let me do a prototype implementation and, uh, contributed this to the open source project. And in the end, I ended up following through with that. And so I've, I've been essentially maintaining the, the Microsoft control flow guide implementation for the, the Rust compiler. So really an example of Microsoft contributing to this open source language that, that we hope to be using further.
Nic Fillingham: Andrew, could you speak a little bit more to control flow guard and control flow integrity? What is that? I know a little bit about it, but I'd love to, for our audience to sort of like expand upon that idea.
Andrew Paverd: Absolutely. So this is actually an, an example of a technology that goes back to a collaboration between the MSRC, the, the security response center and, and Microsoft Research. This technology control flow guard is really intended to enforce a property that we call control flow integrity. And that simply means that if you think of a program, the control flow of a program jumps through two different functions. And ideally what you want in a well-behaved program is that the control always follows a well-defined paths.
Andrew Paverd: So for example, you start executing a function at the beginning of the function, rather than halfway through. If for example, you could start executing a function halfway through this leads to all kinds of possible attacks. And so what control flow guard does is it checks whenever your, your program's going to do a bronch, whenever it's going to jump to a different place in the code, it checks that that jump is a valid call target, that you're actually jumping to the correct place. And this is not the attacker trying to compromise your program and launch one of many different types of attacks.
Nic Fillingham: And so how do you do that? What's the process by which you do en- ensure that control flow?
Andrew Paverd: Oh, this is really interesting. So this is a technology that's supported by Windows, at the moment it's only available on, on Microsoft Windows. And it works in conjunction between both the compiler and the operating system. So the compiler, when you compile your program gives you a list of the valid code targets. It says, "All right, here are the places in the program where you should be allowed to jump to." And then as the program gets loaded, the, the operating system loads, this list into a highly optimized form so that when the program is running it can do this check really, really quickly to say, is this jump that I'm about to do actually allowed? And so it's this combination of the Windows operating system, plus the compiler instrumentation that, that really make this possible.
Andrew Paverd: Now this is quite widely used in Windows. Um, we want in fact as much Microsoft software as possible to use this. And so it's really critical that we enable it in any sort of programming language that we want to use.
Nic Fillingham: How do you protect that list though? So now you, isn't that now a target for potential attackers?
Andrew Paverd: Absolutely. Yeah. And, and it becomes a bit of a race to, to-
Nic Fillingham: Cat and mouse.
Andrew Paverd: ... protect different-
Natalia Godyla: (laughs).
Andrew Paverd: A bit of, a bit of a cat, cat and mouse game. But at least the nice thing is because list is in one place, we can protect that area of memory to a much greater degree than, than the rest of the program.
Natalia Godyla: So just taking a step back, can you talk a little bit about your path to security? What roles have you had? What brought you to security? What's informing your role today?
Andrew Paverd: It's an interesting story of how I ended up working in security. It was when I was applying for PhD programs, I had written a PhD research proposal about a topic I thought was very interesting at the time on mobile cloud computing. And I still think that's a hugely interesting topic. And what happened was I sent this research proposal to an academic at the University of Oxford, where I, I was looking to study, and I didn't hear anything for, for a while.
Andrew Paverd: And then, a fe- a few days later I got an email back from a completely different academic saying, "This is a very interesting topic. I have a project that's quite similar, but looking at this from a security perspective, would you be interested in doing a PhD in security on, on this topic?" And, so this was my very mind-blowing experience for me. I hadn't considered security in that way before, but I, I took a course on security and found that this was something I was, I was really interested in and ended up accepting the, the PhD offer and did a PhD in system security. And that's really how I got into security. And as they say, the rest is history.
Natalia Godyla: Is there particular part of security, particular domain within security that is most near and dear to your heart?
Andrew Paverd: Oh, that's a good question.
Natalia Godyla: (laughs).
Andrew Paverd: I think, I, I think for me, security it- itself is such a broad field that we need to ensure that we have security at, at all levels of the stack, at all, places within the chain, in that it's really going to be the weakest link that an attacker will, will go for. And so I've actually changed field perhaps three times so far. This is what keeps it interesting. My PhD work was around trusted computing. And then as I said, I, since joining Microsoft, I've been largely working in both safe systems programming languages and more recently AI and security. And so I think that's what makes security interesting. The, the fact that it's never the same thing two days in a row.
Natalia Godyla: I think you hit on the secret phrase for this show. So AI and security. Can you talk a little bit about what you've been doing in AI and security within Microsoft?
Andrew Paverd: Certainly. So about a year ago, as many people in the industry realized that AI is being very widely used and is having great results in so many different products and services, but that there is a risk that AI algorithms and systems themselves may be attacked. For example, I, I know you had some, some guests on your podcast previously, including Ram Shankar Siva Kumar who discussed the Adversarial ML Threat Matrix. And this is primarily the area that I've been working in for the past year. Looking at how AI systems can be, can be attacked from a security or a privacy perspective in collaboration with researchers, from MSR, Cambridge.
Natalia Godyla: What are you most passionate about? What's next for a couple of these projects? Like with Rust, is there a desire to make that ubiquitously beyond Microsoft? What's the next stage?
Andrew Paverd: Ab- absolutely.
Natalia Godyla: Lots of questions. (laughs).
Andrew Paverd: Yeah. There's a lot of interest in this. So, um, personally, I'm, I'm not working on the SSPL project myself, or I'm, I'm not working on the safe systems programming languages project myself any further, but I know that there's a lot of interest within Microsoft. And so hopefully we'll see some exciting things e- emerging in that space. But I think my focus is really going to be more on the, both the security of AI, and now we're also exploring different areas where we can use AI for security. This is in collaboration, more with the security response center. So looking into different ways that we can automate different processes and use AI for different types of, of analysis. So certainly a lot more to, to come in that space.
Nic Fillingham: I wanted to come back to Rust for, for a second there, Andrew. So you talked about how the Rust programming language was specifically designed for, correct me on taxonomy, memory integrity. Is that correct?
Andrew Paverd: For, for memory safety, yeah.
Nic Fillingham: Memory safety. Got it. What's happening on sort of
Nic Fillingham: ... and sort of the, the flip side of that coin in terms of instead of having to choose a programming language that has memory safety as sort of a core tenet. What's happening with the operating system to ensure that languages that maybe don't have memory safety sort of front and center can be safer to use, and aren't threats or risks to memory integrity are, are sort of mitigated. So what's happening on the operating system side, is that what Control Flow Guard is designed to do? Or are there other things happening to ensure that memory safety is, is not just the responsibility of the programming language?
Andrew Paverd: Oh, it's, that's an excellent question. So Control Flow Guard certainly helps. It helps to mitigate exploits once there's been an, an initial memory safety violation. But I think that there's a lot of interesting work going on both in the product space, and also in the research space about how do we minimize the amount of software that, that we have to trust. If you accept that software is going to have to bugs, it's going to have vulnerabilities. What we'd like to do, is we'd like to trust as little software as possible.
Andrew Paverd: And so there's a really interesting effort which is now available in, in Azure under the, the heading of Confidential Computing. Which is this idea that you want to run your security sensitive workloads in a hardware enforced trusted execution environment. So you actually want to take the operating system completely out of what we call the trusted computing base. So that even if there are vulnerabilities in, in the OS, they don't affect your security sensitive workloads. So I think that there's this, this great trend towards confidential computing around compartmentalizing and segmenting the software systems that we're going to be running.
Andrew Paverd: So removing the operating system from the trusted computing. And, and indeed taking this further, there's already something available in Azure, you can look up Azure Confidential Computing. But there's a lot of research coming in from the, the academic side of things about new technologies and new ways of, of enforcing separation and compartmentalization. And so I think it's part of this full story of, of security that we'll need memory safe programming languages. We'll need compartmentalization techniques, some of which, uh, rely on new hardware features. And we need to put all of this together to really build a, a secure ecosystem.
Nic Fillingham: I only heard of Confidential Computing recently. I'm sure it's not a new concept. But for me as a sort of a productized thing, I only sort of recently stumbled upon it. I did not realize that there was this gap, there was this delta in terms of data being encrypted at rest, data being encrypted in transit. But then while the data itself was being processed or transformed, that that was a, was a gap. Is that the core idea around Confidential Computing to ensure that at no stage the data is not encrypted? Is, is that sort of what it is?
Andrew Paverd: Absolutely. And it's one of the key pieces. So we call that isolated execution in the sense that the data is running in a, a trusted environment where only the code within that environment can access that data. So if you think about the hypervisor and the operation system, all of those can be outside of the trusted environment. We don't need to trust those for the correct computation of, of that data. And as soon as that data leaves this trusted environment, for example if it's written out of the CPU into the DRAM, then it gets automatically encrypted.
Andrew Paverd: And so we have that really, really strong guarantee that only our code is gonna be touching our data. And the second part of this, and this is the really important part, is a, a protocol called remote attestation where this trusted environment can prove to a remote party, for example the, the customer, exactly what code is going to be running over that data. So you have a, a very high degree of assurance of, "This is exactly the code that's gonna be running over my data. And no other code will, will have access to it."
Andrew Paverd: And the incredibly interesting thing is then, what can we build with these trusted execution environment? What can we build with Confidential Computing? And to bring this back to the, the keyword of your podcast, we're very much looking at confidential machine learning. How do we run machine learning and AI workloads within these trusted execution environments? And, and that unlocks a whole lot of new potential.
Nic Fillingham: Andrew, do you have any advice for people that are m- maybe still studying or thinking about studying? Uh, I see so you, your initial degree was in, not in computer engineering, was it?
Andrew Paverd: No. I, I actually did electrical engineering. And then electrical and computer engineering. And by the time I did a PhD, they put me in a computer science department, even though-
Nic Fillingham: (laughs).
Andrew Paverd: ... I was doing software engineering.
Nic Fillingham: Yeah. I, so I wonder if folks out there that, that don't have a software or a computer engineering degree, maybe they have a, a different engineering focus or a mathematics focus. Any advice on when and how to consider computer engineering, or sort of the computing field?
Andrew Paverd: Yeah. Uh, absolutely. Uh, I think, eh, in particular if we're talking about security, I'd say have a look at security. It's often said that people who come with the best security mindsets haven't necessarily gone through the traditional programs. Uh, of course it's fantastic if you can do a, a computer science degree. But if you're coming at this from another area, another, another aspect, you bring a unique perspective to the world of cyber security. And so I would say, have a look at security. See if it's something that, that interests you. You, you might find like I did that it's a completely fascinating topic.
Andrew Paverd: And the from there, it would just be a question of seeing where your skills and expertise could best fit in to the broad picture of security. We desperately need people working in this field from all different disciplines, bringing a diversity of thought to the field. And so I, I'd highly encourage people to have a look at this.
Natalia Godyla: And you made a, quite a hard turn into security through the PhD suggestion. It, like you said, it was one course and then you were off. So, uh, what do you think from your background prepared you to make that kind of transition? And maybe there's something there that could inform others along the way.
Andrew Paverd: I think, yes, it, it's a question of looking at, uh, of understanding the system in as much detail as you possibly can. And then trying to think like, like an attacker. Trying to think about what could go wrong in this system? And as we know, attackers won't respect our assumptions. They will use a system in a different way in which it was designed. And that ability to, to think out of the box, which, which comes from understanding how the system works. And then really just a, a curiosity about security. They call it the security mindset, of perhaps being a little bit cautious and cynical. To say-
Natalia Godyla: (laughs).
Andrew Paverd: ... "Well, this can go wrong, so it probably will go wrong." But I think that's, that's the best way into it.
Natalia Godyla: Must be a strong follower of Murphy's Law.
Andrew Paverd: Oh, yes.
Natalia Godyla: (laughs).
Nic Fillingham: What are you watching? What are you binging? What are you reading? Either of those questions, or anything along in that flavor.
Andrew Paverd: I'll, I'll have to admit, I'm a, I'm a big fan of Star Trek. So I've been watching the new Star Trek Discovery series on, on Netflix. That's, that's great fun. And I've recently been reading a, a really in- interesting book called Atomic Habits. About how we can make some small changes, and, uh, how these can, can help us to build larger habits and, and propagate through.
Nic Fillingham: That's fascinating. So that's as in looking at trying to learn from how atoms and atomic models work, and seeing if we can apply that to like human behavior?
Andrew Paverd: Uh, no. It's just the-
Nic Fillingham: Oh, (laughs).
Andrew Paverd: ... title of the book.
Natalia Godyla: (laughs).
Nic Fillingham: You, you had me there.
Natalia Godyla: Gotcha, Nick.
Nic Fillingham: I was like, "Wow-"
Natalia Godyla: (laughs).
Nic Fillingham: " ... that sounds fascinating." Like, "Nope, nope. Just marketing." Marketing for the win. Have you always been Star Trek? Are you, if, if you had to choose team Star Trek or team Star Wars, or, or another? You, it would be Star Trek?
Andrew Paverd: I think so. Yeah.
Nic Fillingham: Yeah, me too. I'm, I'm team Star Trek. Which m- may lose us a lot of subscribers, including Natalia.
Andrew Paverd: (laughs).
Nic Fillingham: Natalia has her hands over her mouth here. And she's, "Oh my gosh." Favorite Star Trek show or-
Andrew Paverd: I, I have to say, it, it would've been the first one I watched, Deep Space Nine.
Nic Fillingham: I love Deep Space Nine. I whispered that. Maybe that-
Natalia Godyla: (laughs).
Nic Fillingham: ... it's Deep Space Nine's great. Yep. All right, cool. All right, Andrew, you're allowed back on the podcast. That's good.
Andrew Paverd: Thanks.
Natalia Godyla: You're allowed back, but I-
Nic Fillingham: (laughs).
Natalia Godyla: ... (laughs).
Andrew Paverd: (laughs).
Nic Fillingham: Sort of before we close, Andrew, is there anything you'd like to plug? I know you have a, you have a blog. I know you work on a lot of other sorta projects and groups. Anything you'd like to, uh, plug to the listeners?
Andrew Paverd: Absolutely, yeah. Um, we are actually hiring. Eh, well, the team I work with in Cambridge is, is hiring. So if you're interested in privacy preserving machine learning, please do have a look at the website, careers.microsoft.com. And submit an application to, to join our team.
Natalia Godyla: That sounds fascinating. Thank you.
Nic Fillingham: And can we follow along on your journey and all the great things you're working at, at your website?
Andrew Paverd: Eh, absolutely, yeah. And if you follow along the, the Twitter feeds of both Microsoft Research Cambridge, and the Microsoft Security Response Center, we'll, we'll make sure to tweet about any of the, the new work that's coming out.
Nic Fillingham: That's great. Well, Andrew Paverd, thank you so much for joining us on the Security Unlocked Podcast. We'd love to have you come back and talk about some of the projects you're working on in a deep-dive section on a future episode.
Andrew Paverd: Thanks very much for having me.
Natalia Godyla: Well, we had a great time unlocking insights into security, from research to artificial intelligence. Keep an eye out for our next episode.
Nic Fillingham: And don't forget to tweet @MSFTSecurity. Or email us at firstname.lastname@example.org with topics you'd like to hear on a future episode. Until then, stay safe.
Natalia Godyla: Stay secure.