Security Unlocked 10.28.20
Ep 4 | 10.28.20

How to Catch a Villain With Math


Nic Fillingham: Hello and welcome to Security Unlocked, a new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security, engineering and operations teams. I'm Nic Fillingham.

Natalia Godyla: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft security, deep dive into the newest threat Intel, research and data science.

Nic Fillingham: And, profile some of the fascinating people working on artificial intelligence in Microsoft security. If you enjoy the podcast, have a request for a topic you'd like covered or have some feedback on how we can make the podcast better.

Natalia Godyla: Please contact us at or via Microsoft security on Twitter. We'd love to hear from you. Hi Nic. How is it going?

Nic Fillingham: Hi Natalia, it's good. But bit of a first world problem here at Chateau Fillingham. I left a packet of dark chocolate covered mangoes open at my desk all throughout the weekend, and so now I'm inundated with the perfume of dark chocolate covered mangoes, which is a double edged sword. It's fantastic. And it's also terrible. It is better than the smell of acoustic foam, which you and I have both invested in, in order to make our microphone sound a little bit better. I will take dark chocolate covered mangoes over acoustic foam.

Natalia Godyla: It sounds like I have something to learn from you. Maybe I should go get a bag of mangoes and open them on my desk, just leave them there. That's our podcasting Nic.

Nic Fillingham: We're four episodes in, and this is the wisdom that we have devised, being professional podcasters.

Natalia Godyla: I feel like with that hot tip, we should dive into episode four.

Nic Fillingham: This is episode four, which means we've had three episodes out there, which means people have been listening and downloading and rating and sending us tweets and emails with their feedback. And thank you so much to everyone that listens, thank you so much to everyone that has rated, that has sent us a tweet or sent us an email. We're reading every single one of them, we're actively following up on all of them, and adding them to our editorial calendar for topics we can cover on future episodes. So thank you.

Natalia Godyla: Yes, definitely want to second that. And with that, our first segment of episode four is a great one. We have three different perspectives on the episode. We speak with an expert on statistics and machine learning, an expert on threat analytics, and an expert on security research, as we dive into Microsoft 365 Defender and all of the great technology that underpins that product.

Nic Fillingham: We learned a lot about Microsoft 365 Defender, and also how to interview three guests at once. That posed some interesting logistical challenges, but I think we got there.

Natalia Godyla: I think we earned our badge for it.

Nic Fillingham: We unlocked an achievement with that one. And keeping with the theme of numbers and math, our expert that we're going to talk to is Dr. Anna Bertiger, who has a PhD and did post-doctorate work in math. And I learned a new word, genuinely learned a new word: combinatorics. Not even sure if I'm saying it right, but Dr. Anna will explain what that word is. Combinatorics, had you heard that word? I'd never heard that word.

Natalia Godyla: I had not.

Nic Fillingham: The teaser is that it's a fancy word for counting things. And I'm quoting Dr. Anna there. We're going to learn how Dr. Anna approaches finding villains with math. That was a pretty cool conversation.

Natalia Godyla: She's got an incredible passion for the intersection of those disciplines, and it's great to hear how she partners with security research in order to apply her knowledge of mathematics to our detections and to security as a whole.

Nic Fillingham: Enough jibber-jabber, let's get on with the episode. Who says jibber-jabber?

Nic Fillingham: So Mike, Cole and Justin, welcome to the Security Unlocked podcast. This is our first episode where we're going to be interviewing three guests at once. Thank you in advance for your time. I'd love to start, if you could just give a brief introduction of yourself, your role, what that means day to day at Microsoft. Mike, if we might start with you.

Mike Flowers: Yeah, definitely. Mike Flowers, and I am a security researcher within the Microsoft Threat Protection team. My day-to-day business is to try and bring together the different alerts that each of our subcomponents are exposing into a single incident that our customers can then look at, to be able to get the whole picture of what's going on as part of an attack.

Nic Fillingham: Great and Cole.

Cole Sodja: Hi, my name is Cole Sodja. I'm a statistician. I also work in the Microsoft Threat Protection team. Role-wise, I primarily serve to help implement machine learning for security applications, but pretty much what I spend my time on is, one, collaborating with people like Mike and Justin to understand the threat landscape and identify attacks to model. Two, a lot of preparing and analyzing the data needed so we can actually model those attacks appropriately using machine learning. And three, pretty much just writing the code around the machine learning implementation itself, and then working with engineers to deploy it into our products such as MTP.

Nic Fillingham: Excellent, welcome. And Justin.

Justin Carroll: Hey, I'm Justin Carroll. I'm a threat analyst for the Threat Intelligence Global Engagement and Response team. My role is essentially threat hunting, typically across endpoint data, looking for new or novel behaviors that are associated with known or suspected activity groups, or new behavior that we may have interest in, and providing intelligence on those behaviors that we're seeing, or new techniques, to the different protection product teams to help inform them of detection opportunities, or understanding what threats are doing and how the threat landscape is changing.

Nic Fillingham: Excellent. Welcome to the podcast to the three of you. The three of you are co-authors on a blog post from June the 10th that talks about how attack modeling is used to find and stop lateral movement in the MTP product, which has been recently renamed to Microsoft 365 Defender; MTP stood for Microsoft Threat Protection. Mike, perhaps starting with you, I wonder if you could help kick us off with just an overview of what was discussed in this blog and an introduction to that technique.

Mike Flowers: Definitely. So when we take a look at the different incidents across our customers, one of the things that we noticed is that when dealing with lateral movement, a lot of them had key characteristics that we could use to be able to bring together those different parts into a single incident. And so leveraging a lot of the real-world cases provided by Justin and also leveraging some ML models from Cole, we're able to bring all those signals into a single place. So that way our customers can take a look at those attacks in a single view.

Nic Fillingham: Identifying lateral movement feels like it's obviously a very complex challenge, but also a pretty critical one. Up front, what is obviously a basic question here, but let's cover it: what are some of the most challenging elements in identifying lateral movement as part of a breach?

Justin Carroll: I can at least speak to some of what we're seeing, which is that the techniques used by the attackers aren't really all that different from what administrators do. Most good attackers are trying to look like administrators when they're doing these attacks, or administrators that are great at their job, per se. Differentiating the legitimate behaviors that you're seeing that are associated with these protocols, such as SMB or WMI, versus the malicious ones can be kind of challenging, because there is so much noise that you have to suss out quite a bit to infer what the main differentiator is that sets it apart as malicious. And it's particularly challenging when you have multiple different machines, and sometimes the attacker's box isn't visible to you in telemetry, or you're only getting half the equation. So you're trying to piece together this multi-part incident and figure out all of it when you don't necessarily have the complete picture.

Mike Flowers: One other key part about that worth mentioning is that a lot of times we'll end up seeing connections being made throughout a network where not only are they part of an attack, but they might not necessarily result in actual activity happening on the remote end. We'll see scanning, for example, happening in a network, and in cases like that, the remote end won't actually have code execution on it, but it's still worthwhile to be able to see that type of telemetry. In that sense, what we're really trying to do is bring together both components: being able to see that type of telemetry, but also being able to bring in, particularly, the instances in which code execution does happen on the remote end.

Natalia Godyla: How are we using ML to help solve some of these challenges?

Cole Sodja: Really the challenge is how do you identify, or rather quantify, legitimate behavior versus the attack, and that's where ML will help. There's two things we do. One is we do a form of supervised learning, where in essence we create labeled data of attacks. For example, people like Justin will give us examples of actual attacks, that will provide some labels, and then we'll take the data associated with these attacks and basically encode them into features. Think of a feature as the way we represent an attack, which is in the form of a graph.

Cole Sodja: The features form nodes of this graph, and features are stuff like: what are the network connections? What users are logging into the different machines? Are there any alerts on these machines? What are the different files that were dropped on the machine? What are the commands that are running on these machines? What's their parent-child relationship? We take all these features, basically, and we'll train a model to learn which combination of features actually correlates with the attack. In this case, we're looking at attacks that had an element of lateral movement. So we'll compute: what's the probability of observing lateral movement, given all these features and the examples we feed to the model. That's one way we use ML. The other way is through anomaly detection. Per what Justin was saying, where you have an administrator who's making, let's say, connections, we can build a model to learn what's normal for this particular account. Let's say that's an administrator making connections. How frequently do they make network connections to other machines? What do they do? Do they use scanning tools? How do they use the scanning tools? So we'll also employ anomaly detection, which is more unsupervised. We don't have labeled data there, just to quantify what is normal. And that will also be used as an indicator to help basically filter out or remove the cases that are legitimate from what are actual attacks.

Natalia Godyla: So, can you talk us through one of the attacks in the wild, the attacks that we're using to educate our ML algorithms?

Justin Carroll: Yeah, so this attack is one that we've kind of seen quite often in the security space, more and more: a human-adversary, hands-on-keyboard attack, where they gain entrance to a network. Often for this group, for instance, it's typically through remote desktop brute forcing. So in this instance, what makes that a little bit difficult is they're typically brute forcing a local administrator account. So when they land on the machine, they are an admin, which gives them capabilities of tampering with antivirus solutions. It makes credential dumping very easy.

Justin Carroll: They're not really restricted, right? In essence, if you are an administrator on a machine, you own that device and they don't have to, in this case, typically use many exploits or anything fancy. Once they have done the credential dump, as in the case that we saw, they can actually use those credentials typically with, if it's a server machine that they're landing on, at some point a domain administrator or somebody with some elevated privileges will have logged onto that machine, that's quite likely, and they can dump the credentials on that box and then use those credentials to continue their attack.

Justin Carroll: Or other times what they'll actually do is look for password files and text documents, which is also quite common. In the attack that we found, they dumped credentials and then did significant scanning in the environment to find vulnerable targets, with the main goal of distributing ransomware widely across the network. They then used a combination of the Sysinternals tool PsExec and Windows Management Instrumentation, or WMI, to execute remote commands and code on the other devices in the network. And from that point on, ransomed many of the machines.

Nic Fillingham: That was an example of an attack that you or your team found that had actually occurred. And then from that example, you were able to sort of perform a post-mortem and work out sort of what the attacker did. And then that formed intelligence that fed back into the machine learning model to sort of learn how these kinds of attacks would happen in the future. Is that correct?

Cole Sodja: That's correct. So basically, we get examples like this. You can think of it as a cold start problem, where initially we don't have any information or labels on this type of attack. Justin, for example, discovers this attack, we get one or two labels, and we're able to build a graph with these labels, an attack graph essentially, to start training the model. Then, for the model to continue learning, it will go use these nodes we've now built in the graph. For example, as Justin said, credential dumping could be a node in the graph, how they did lateral movement over WMI or PsExec could be another node of the graph, scanning and so on. We'll build these graphs. Then the model will basically go search historical data, looking for these nodes, and bring back additional examples that the model feels are similar to this example.

Cole Sodja: If it basically looks the same with high confidence, that is, the probability that this is the exact attack is higher relative to any other attack out there or any other example of another attack, it will actually create labels itself, and it will expand the graph, accumulating additional information in the graph. In those cases, if we didn't have those as nodes, the model will actually add those as nodes, and it will keep that and then compute probabilities of those. And again, if there's higher likelihood that those nodes are associated with real attacks or attacks like this, it will retain them. And if not, it will then learn how to filter them, or compute them as very low likelihood, and they won't receive a lot of weight in the actual construction or prediction. So that's how we train the model through these examples.
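The supervised side described above, learning which combinations of graph nodes (credential dumping, WMI execution, scanning, and so on) correlate with an attack, can be illustrated with a toy naive Bayes scorer over binary node features. The node names, training examples, and modeling choice are all assumptions for illustration; the real system is far richer:

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (set_of_nodes, is_attack) pairs."""
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    vocab = set()
    for nodes, is_attack in examples:
        vocab |= nodes
        if is_attack:
            n_pos += 1
            pos.update(nodes)
        else:
            n_neg += 1
            neg.update(nodes)
    return pos, neg, n_pos, n_neg, vocab

def p_attack(nodes, model):
    """Probability the incident is an attack, given which nodes it shows."""
    pos, neg, n_pos, n_neg, vocab = model
    # Accumulate log-odds with Laplace smoothing so unseen nodes
    # don't zero everything out.
    log_odds = math.log(n_pos / n_neg)
    for node in vocab:
        p = (pos[node] + 1) / (n_pos + 2)
        q = (neg[node] + 1) / (n_neg + 2)
        if node in nodes:
            log_odds += math.log(p / q)
        else:
            log_odds += math.log((1 - p) / (1 - q))
    return 1 / (1 + math.exp(-log_odds))

# Invented labeled examples, in the spirit of the attack Justin described.
labeled = [
    ({"credential_dump", "wmi_exec", "network_scan"}, True),
    ({"credential_dump", "psexec", "network_scan"}, True),
    ({"software_update", "remote_admin"}, False),
    ({"network_scan", "remote_admin"}, False),
]
model = train(labeled)
print(p_attack({"credential_dump", "wmi_exec", "network_scan"}, model))
print(p_attack({"software_update"}, model))
```

Note that `network_scan` alone is ambiguous here (it appears in both classes); it's the combination of nodes that pushes the score up, which mirrors what Cole describes.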

Natalia Godyla: And what's new about this technique? What were we doing before this technique, or what were security operations teams doing before?

Cole Sodja: So, before this technique was available, the alerts happening on the different devices were siloed. So along those lines, if a ransomware attack happened within an organization that had, let's say, 10 devices, then each of those 10 devices would have separate alerts on them based off of what they're able to detect. And what we're trying to do is bring together all 10 of those incidents into a single one. So that way you can go into that one place to take a look at it all together.
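The alert-to-incident correlation Cole describes boils down to grouping per-device alerts by a shared linking entity. A minimal sketch, with invented field names and alert data (in practice the linking logic spans accounts, files, network connections, and the probabilistic model above):

```python
from collections import defaultdict

def correlate(alerts):
    """Merge alerts from different devices into incidents when they
    share a linking entity (here, simply the same account)."""
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[alert["account"]].append(alert)
    return list(incidents.values())

alerts = [
    {"device": "srv-01", "account": "admin01", "title": "Credential dump"},
    {"device": "wks-07", "account": "admin01", "title": "Suspicious WMI call"},
    {"device": "wks-12", "account": "jsmith",  "title": "Phishing click"},
]
incidents = correlate(alerts)
print(len(incidents))  # the two admin01 alerts collapse into one incident
```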

Natalia Godyla: So one centralized location, that makes sense. And what's next for the team then as you evolve the product?

Cole Sodja: Oh sure, from an ML side, there's two things. One, we already started to work on and have had some success, but it's ongoing. It's currently not implemented in our product, it's more of a proof of concept or pilot right now: classifying threat actors, like the [inaudible 00:16:53] example. So when we see attacks, rather than just correlating what we observe in the attack, we could actually start computing the probability that this attack is this known threat actor. And given that, we could start asking questions like: what's the probability, given that we believe it's this threat actor, that we're actually going to see ransomware in the coming stage of the attack, or some other type of objective from the attacker? So those are two active areas of research and stuff we plan on integrating into our product at some point in the future: the classification of threat actors that we're tracking, and predicting the attack stages.

Cole Sodja: What's going to come next in the attack, given the intel we have about the threat actor, that's one. The other one is basically expanding these types of correlations beyond lateral movement. We've had quite a bit of focus recently on human operated ransomware, but there's other things we plan on doing, integrating or extending this type of framework for better correlations of these alerts that still are hard to correlate and end up in silos. So it's something we want to extend this framework to just better correlate alerts that are probable as part of the attack, but we can't infer it like deterministically.
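The kind of question Cole poses, "given that we believe it's this threat actor, what's the probability we'll see ransomware next?", is at heart a total-probability calculation over the attribution. A tiny sketch; all the numbers are invented for the example:

```python
def p_next_stage(p_actor, p_stage_given_actor, p_stage_given_other):
    """Probability the next attack stage (e.g. ransomware) appears,
    marginalizing over whether the actor attribution is correct."""
    return p_actor * p_stage_given_actor + (1 - p_actor) * p_stage_given_other

# Say we're 65% confident it's an actor who deploys ransomware in 90% of
# tracked intrusions, while other intrusions see it only 10% of the time.
print(round(p_next_stage(0.65, 0.90, 0.10), 3))
```

Attribution confidence and per-actor playbook statistics are exactly the two quantities the team says they are building toward tracking.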

Nic Fillingham: I'm just wondering, are you seeing adversaries change their approach or their techniques in response to the success really, I guess, of these new tools and techniques in the product?

Justin Carroll: Yeah, you'll frequently see them changing techniques, depending. So it kind of depends. Adversaries most often use what works, and typically in lots of instances they're trying to deploy, for instance, ransomware on numerous targets quickly, right? They want to have high confidence that something's going to work. So in that instance, they only change techniques when they hit a roadblock, essentially when that no longer becomes valid or they're being stopped too quickly in their attack to fully execute it. We have seen quite a few different adversaries actually specifically switch to different techniques and have registry files named things that basically indicate frustration with the way our products are stopping them. So they get very frustrated at Defender, for instance, so they will try and use different tools and actually name them, as you can tell, in ways that show they are quite antagonized by how we are constantly monitoring them and trying to stay basically one step ahead of them.

Nic Fillingham: Damn you, Defender WXA, something like that?

Justin Carroll: A little bit more explicit, but yeah, a lot more explicit. But yeah, so we do see them modify quite a bit, but it kind of depends. I know with some of the more recent threat actor tracking that Cole and I and Mike have been kind of digging in and working on, we kind of see a slow progression over time where you'll see some techniques kind of drop off for a bit and then eventually, sometimes they resurface again, it just kind of depends on what is the most effective for them to get their job done.

Nic Fillingham: And it feels like utilizing machine learning as a tool here in this process has one of the additional benefits there, to your point, Justin: if an attacker decides to revert back to... Obviously it's good at identifying variations on attacks, but if they want to revert back to something they haven't done in many, many years, you're not asking a human analyst to then dig back into their dusty, cobwebbed memory bank. The machine learning model has that there, sort of somewhat instantaneously...

Justin Carroll: That's the advantage of ML as well, tied to what you're saying, as far as understanding that old techniques are still part of the model, so it knows how to handle them. Most attackers typically aren't altering all of their techniques, right? It's different subcomponents of the attacks that either have been made more difficult by different product changes or things like that. The advantage of the ML is you're able to find those attacks where overall 70 to 80% of it is the same, and then you can use that surfaced information to know what they've changed, to then put it back into the model to continue to modify with that. So unless the attacker completely changes from the ground up, which often they just don't do, you have a really good method of kind of keeping your finger on the pulse.

Natalia Godyla: So it's actually benefiting us in a way, because we're able to just continue to evolve the ML model because we already have that base data and can just adjust based on the subtle changes. Is that accurate?

Cole Sodja: Yeah, that's accurate. I think Justin stated it nicely. Those are really the benefits because if an attacker completely changes everything of course, which would mean we don't have any previous features to even leverage on, to start computing a probability, that's a different story. But since that's extremely rare, that's where ML is quite useful. It can continue as I said, to grow and shrink this graph and dynamically learn these probabilities and through surfacing the probabilities, we could rank them accordingly.

Cole Sodja: And that's where people like Justin could go look at it and say, "Oh, okay, yeah. We think with, let's say 65% confidence right now, it's still the same actor, but here's some new things that the ML model discovered as part of this attack" that then Justin could look at and then basically further interpret and help give that feedback back to the algorithm. So then it understands what these new features are, and how they are related. Giving that context essentially to the model, I would say is key.

Nic Fillingham: And is there anything that the customer needs to do, or the individual Security analysts or practitioners need to do, to take advantage of this technique? Or is it just sort-of baked into the product?

Mike Flowers: Yeah, these things are baked into the product. So whenever a customer pulls up their list of incidents, they can look at whether there are any that span multiple machines, and if they contain alerts that have cited lateral movement activity in them, then they'll automatically be brought into that single incident for them to be able to take a look at.

Natalia Godyla: How did the model originate? What was the driver for this coming to light?

Justin Carroll: A lot of how it came about was just a need on the analysts' part of having a model to basically combine a wide set of disparate signals that at first glance may not appear related, and that required a significant amount of work to correlate all the behaviors in a meaningful fashion to understand that they were tied to one specific incident, or one actor. It came about organically, as data science is one of the perfect partners for security: empowering each other and then working together to continually build new models, then using those models to help inform the analysts of new behaviors and allowing them to quickly find interesting incidents that may drive the intelligence conversation, or understanding where we have product alerting opportunities. It's a very natural collaboration that is extremely effective.

Cole Sodja: I will just add one thing to that. So, one thing data science brings, it's not just like the methodologies, if you will, in terms of how we design the right tool for the jobs, there's an exploration phase. So, one thing like Justin was mentioning is you have this huge space of signals to search through, and yeah, we have some previous examples and there is also what we like to call the "unknown unknown," stuff we haven't seen, even the threat experts missed. For example, because they are kind of weak signals in themselves. So, it's searching through this large dimensionality and then correlating them all and returning essentially what a model or what the scientist believes is to be indications of attacks that we might have missed, or a part of an attack that we capture, but we didn't completely get the whole story of the attack.

Cole Sodja: And that's where that collaboration becomes quite natural. So, we will explore, then we'll go back and have a discussion. We'll review, and that will be feedback into how we further explore and we'll keep going, generating new examples from that, and so on. Eventually, that will lead to the definition of the model, actually.

Nic Fillingham: There's almost always massive, massive numbers behind the scenes here, and I know a lot of our audience like to learn or to hear about the immense scale that's happening behind the scene. Anyone got a big number you want to throw at us to impress as to the scale and output of what this can do?

Mike Flowers: We do generate tens of thousands of alerts every single day for our different customers. And what I find to be particularly awesome about the work that we've done with this project is bringing together, or picking out, those alerts within that giant set, to be able to filter it down to the select 30, 40, 50 alerts that are part of a single incident that's happening within a given network, and making it so that we're able to classify it.

Mike Flowers: So that way it's all part of one attack and brought together for the end analyst. So I would say: taking that number, hundreds of thousands even, of different alerts across the entire timeframe, and pulling out the less than a hundred that are relevant to this specific attack.

Natalia Godyla: Great. Well, thank you for that. And thank you, Cole, Mike, Justin for joining us today, it was great to walk through all the great work you're doing.

Cole Sodja: Thank you. My pleasure.

Justin Carroll: Yeah, thank you.

Mike Flowers: Thanks. Happy to be here.

Natalia Godyla: And now, let's meet an expert in the Microsoft security team, to learn more about the diverse backgrounds and experiences of the humans creating AI and tech at Microsoft.

Nic Fillingham: Doctor Anna Bertiger, thank you so much for joining us. Welcome to the Security Unlocked podcast.

Dr. Anna Bertiger: Thank you so much for having me.

Nic Fillingham: If we could start with, what is your title and what does that really mean in sort-of day-to-day terms? What do you do at Microsoft?

Dr. Anna Bertiger: So my title is Senior Applied Scientist, but what I do is I find villains.

Nic Fillingham: You find villains, so how do you find villains?

Dr. Anna Bertiger: So I find villains in computer networks. It's all the benefits of a job as a superhero with none of the risks, and I do that using a combination of security expertise and mathematics and statistics.

Nic Fillingham: So, you find villains with math?

Dr. Anna Bertiger: Yes, exactly.

Nic Fillingham: Got it. And so, let's talk about math. What was your path to Microsoft? Because I know it heavily involves math. How did you get here? And maybe what other sort of interesting entries might be on your LinkedIn profile?

Dr. Anna Bertiger: So I got here by math, I guess. So, I come from academic mathematics. I have a PhD in math, and then I had a postdoctoral fellowship in the Department of Combinatorics and Optimization at the University of Waterloo in Waterloo, Ontario, Canada.

Nic Fillingham: Could you explain what that is, because I heard syllables that I understood, but not words?

Dr. Anna Bertiger: So that is a department unique to the University of Waterloo. Optimization is maximizing, minimizing type problems.

Nic Fillingham: Got it.

Dr. Anna Bertiger: And Combinatorics is a fancy word for counting things.

Nic Fillingham: Combinatorics?

Dr. Anna Bertiger: Yeah. Which you can do in fancy and complicated ways. And so that's what I did when I was an academic mathematician: I counted things in fancy and complicated ways that told me interesting things, frequently about geometry. And then I decided that I wanted to see the impact of what I did in mathematics in the real world, in a timeframe that I could see, and not in the sort of way where you think beautiful thoughts, which is really lovely.

Dr. Anna Bertiger: It's a lot of fun, and then hopefully someone uses them eventually. And so I looked for jobs outside of academia, and then one day a friend at Microsoft sent me a note that said, "If you like your job, that's great. But if you don't, my team wants to hire somebody with a PhD in combinatorics." And I said, "That's me!" And so, it took a while; I flew out for an interview, and they asked me lots of questions. When I'm interviewing for a job, I evaluate how cool the job is by how cool the questions they ask me are. If they ask me interesting questions, that's a good sign. If they ask me boring questions, maybe I don't want to work there.

Nic Fillingham: Do you remember any of the interesting questions? Anything stick out?

Dr. Anna Bertiger: Yeah. So, that team was involved in the anti-credit-card-fraud system at Microsoft. So, someone is typing your credit card number into Microsoft's website: are you going to call up and say that was fraud? If the answer is yes, we don't want to complete that sale. If the answer is no, then we would like your money. And so they asked me a bunch of questions about how you get the right data for credit card fraud. Like, how do you get a bunch of labeled data for credit card fraud that says this is fraud, this isn't fraud?

Natalia Godyla: Was there something that drew you to the cybersecurity industry? When your friend showed you this job, did you see security and go, "Yeah, that's cool"?

Dr. Anna Bertiger: So, I didn't actually see security in that job. That team didn't only work on fraud; we also worked on a bunch of marketing-related problems, but I really loved the fraud-related problems. I really loved the adversarial problems. I like having an adversary. I view it as this comforting, friendly thing: you solve the problem, don't worry, they'll make you a new one.

Nic Fillingham: Haha.

Dr. Anna Bertiger: It's true.

Nic Fillingham: So, hang on, you go to bed at night and sleep soundly knowing that there are more villains out there?

Dr. Anna Bertiger: I mean, I would kind of like to get rid of all the villains, but also like they're building me some really cool problems.

Nic Fillingham: Yeah, you're a problem solver and they're out there throwing some good challenges at you.

Dr. Anna Bertiger: Right. On the "make the world a better place" school of thought, I would like them all to disappear off the face of the planet; on the "entertaining me" portion, their problems are pretty good. And so I worked a bunch on credit card fraud-related problems on that team. And at some point a PM joined that team who was a cybersecurity person who had migrated to fraud. And I said, "Well, I'm not a cybersecurity person." He said, "Oh no, you are. It's a personality type. And it's you." And then I worked on some other things, worked on some other teams at Microsoft, did some Windows quality-related things, and it just wasn't as much fun. And I found my way back to cybersecurity, and I've been here since.

Natalia Godyla: And how do you apply your academic background to that role today? What do you see transfer the most?

Dr. Anna Bertiger: So, I think a lot about mathematics and statistics on graphs. Maybe it's networks of computers and I'm looking for surprising connections; that's something I think about a bunch. And surprising connections might just be that people are weird, or it might be that someone who doesn't know your network, and doesn't behave like the people who are usually in your network, is making connections between computers, and that is lateral movement. So that suggests there's some advanced human actor in your network.

Nic Fillingham: So how do you use math to determine if it's just, "Oh no, this person is doing something funky, but benign," versus a bad actor making a lateral move?

Dr. Anna Bertiger: So that is sort of the secret sauce of cybersecurity expertise. The math tells you this is weird, this is not typical, but the math doesn't tell you whether it's good or bad; the math just tells you it's atypical. And so then you hope to look for atypical along an axis where atypical is likely to also be poor behavior, is likely to also be someone malicious. And that is about working with people who are cybersecurity experts, working with threat researchers, working with threat intel, and trying to find the right axis to work along: oh yeah, if it were weird in this way, that's probably bad. And you talk to them, you try something, rinse and repeat, a lot.

Natalia Godyla: How do you use AI or ML tools to solve some of these problems?

Dr. Anna Bertiger: So the AI and ML is about learning what's normal, and then when it's not normal, saying, "Hey, this isn't normal. That might be malicious. Someone should look at it." So our AI and ML is human-in-the-loop driven. We don't act on the basis of the AI and ML the way that some other folks might; there are certainly security teams that have AI and ML that makes decisions and then acts on them on its own. That is not the case here. My team builds AI and ML that powers humans who work in security operations centers to look at the results. And so I use ML to learn what's normal. Then for what's not normal, I say, "Hey, you might want to look at this because it's a little squiffy looking," and then a person acts on it.

Nic Fillingham: So what are some of those techniques, AI and ML, obviously very broad terms, they could have quite a wide scope, what are some of the techniques or approaches that you use mostly? Is that even an answerable question or do you use everything in the tool belt?

Dr. Anna Bertiger: I mean, I most prefer the technique that solves the problem, but that said, I do have favorites. And so I use a lot of statistical modeling to figure out what's normal. So fit a statistical distribution to some numerical data about the way the world is working and then calculate a p-value, which you might remember from Stats 101 if that's something you've done, to say, oh yeah, well, there's only a tenth of a percent chance that this many bytes transferred between this pair of machines under normal behavior. Someone should look at that; that's a lot of data moving. And then I like to use a group of methods called spectral methods. So they're about: if I have this graph, I have a bunch of vertices and I can have edges between them, and I can make a matrix that has a one in cell (i, j) if there's an edge between vertex i and vertex j. Let me know if I'm getting too technically deep here.
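The "fit a distribution, then flag tiny p-values" idea Anna describes can be sketched roughly like this. Everything here is an illustrative assumption, not her team's actual pipeline: the toy traffic numbers, the log-normal model, and the one-in-a-thousand threshold are all made up for the example.

```python
import math
import statistics

# Historical bytes transferred between a pair of machines (illustrative data).
history = [1200, 950, 1800, 1100, 1400, 1600, 1250, 980, 1500, 1350]

# Fit a simple statistical model: assume log(bytes) is roughly normal.
logs = [math.log(x) for x in history]
mu = statistics.mean(logs)
sigma = statistics.stdev(logs)

def p_value(observed_bytes):
    """One-sided p-value: the chance of seeing at least this much
    traffic between this pair of machines under normal behavior."""
    z = (math.log(observed_bytes) - mu) / sigma
    # Survival function of the standard normal, via the
    # complementary error function from the standard library.
    return 0.5 * math.erfc(z / math.sqrt(2))

# A transfer far above historical norms gets a tiny p-value:
# "someone should look at that, that's a lot of data moving."
print(p_value(50_000_000) < 0.001)   # huge transfer: flag it
print(p_value(1_300) > 0.05)         # typical transfer: ignore it
```

The point is only the shape of the workflow: model "normal," score new observations against it, and hand the improbable ones to a human.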

Nic Fillingham: You are, but keep going.

Dr. Anna Bertiger: And now I have a giant matrix, and so I can apply all the tools of linear algebra class to it. And one of the things I can do is look at its eigenvalues and eigenvectors. And one way you might compress this data is to project along the eigenvectors corresponding to eigenvalues that are large in absolute value. And now we can say things like, all the points that are likely to be connected end up close together, and we can try and learn something about the structure of the network and what's strange. And we've done a bunch of research in that direction. That is stuff I'm particularly proud of.
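A minimal sketch of that spectral idea, on an assumed toy graph rather than real network telemetry: build the adjacency matrix with a one in cell (i, j) for each edge, take the eigenvectors whose eigenvalues are largest in absolute value, and use them as coordinates for the vertices.

```python
import numpy as np

# Toy graph: vertices 0-2 form one cluster, 3-5 another, one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6

# Adjacency matrix: a one in cell (i, j) if there's an edge between i and j.
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# A is symmetric, so eigh gives real eigenvalues and orthogonal eigenvectors.
vals, vecs = np.linalg.eigh(A)

# Project along the eigenvectors for the 2 eigenvalues largest in absolute value.
top = np.argsort(np.abs(vals))[-2:]
embedding = vecs[:, top]  # each row is one vertex's 2-D spectral coordinates

# Vertices in the same cluster land closer together than vertices in
# different clusters, which is what makes unusual edges stand out.
d_same = np.linalg.norm(embedding[0] - embedding[1])
d_diff = np.linalg.norm(embedding[0] - embedding[4])
print(d_same < d_diff)
```

On this little graph the two triangles separate cleanly in the embedding; on a real computer network, a connection that lands far from where the structure says it should would be the kind of surprise worth showing an analyst.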

Natalia Godyla: So I know you mentioned this is very human in the loop, so you're bringing this to somebody and they now have the information that they can make a determination based on. What about plugging it back into the Microsoft Solutions? Are we using this information to inform our products as well? Or are you focused really on empowering our security folks?

Dr. Anna Bertiger: Well so, here, the security folks are our customers. The product we're selling them is the "alerting you that something is wrong" product. So sometimes it's security folks at Microsoft; I've written things that went to the hunters that power Microsoft Threat Experts, who look at them and say, "Eh, not so much, Anna," or sometimes, "This is really gold." And they have more tolerance than many for results that can be lousy sometimes, as long as they're gold sometimes too. And then I've also written things that go to our customers via the products we sell.

Natalia Godyla: What are you most interested in solving next? What are you really passionate about?

Dr. Anna Bertiger: I'm really passionate about two things. One of which is sort of broadly speaking, finding villains, finding bad guys. So part of what I do is dictated by what they do, right? They change their games, I have to change mine too. And then also I have a collection of tools that I think are really mathematically beautiful, that I'm really passionate about. And those are spectral methods on graphs and sort of graphs in general. And so I'm really passionate about finding good applications for those. I'm passionate about understanding the structure of how computers, people, what have you, connect with each other and interact and how that tells us things about what is typical and what is atypical and potentially ill-behaved on computer networks and using that information to find horrible people.

Dr. Anna Bertiger: I think I've stopped being surprised by what our adversaries can do, because they are smart people who work hard. Sometimes I'm disappointed in the sense of, damn, I thought I solved that problem and they're back. But that's mostly just feeling like the sad balloon three days after the party.

Natalia Godyla: At the end of the day, why do you do what you do?

Dr. Anna Bertiger: I think there are two reasons I do what I do. The first is that I want to make the world a better place with the way I spend my time, and I think that catching horrible people on computer networks makes the world a better place. And the other is that it's really just a ton of fun. I really do have a lot of fun. We think about really cool things, neat concepts in computing, and beautiful mathematics. And I get to do that all day, every day with other smart people. Who wouldn't want to sign up for that?

Natalia Godyla: You've called mathematics beautiful a couple of times. Can you elaborate? What do you find beautiful about math? What draws you to math?

Dr. Anna Bertiger: I find the ideas in math really beautiful. And I think that's a very common thing for people who have a bunch of exposure to advanced mathematics, but it isn't a thing we filter down to folks in school as well as I would like. Think about the Pythagorean theorem. That's a theorem that most people learned in high school geometry that says-

Nic Fillingham: I know that one.

Dr. Anna Bertiger: The squares of the lengths of the two legs of a right triangle, summed together, equal the square of the length of the hypotenuse. And if you-

Nic Fillingham: Correct.

Dr. Anna Bertiger: That is a fact, okay? And if you learn it as a piece of trivia, then you go, okay, that's the thing that I need to know for the test and you write it down and you put it on a flashcard or whatever. But what I think is really beautiful is the idea of how do you think that up? And the sort of human ingenuity and figuring out that that's true and the beautiful ways you can show that that is true, for sure. There are some really, really beautiful ways to be able to prove to yourself that that is true.

Nic Fillingham: And is that math or is that human ingenuity? Is that the human mind, is that sort of creativity or is it altogether?

Dr. Anna Bertiger: It's sort of both. I mean, the things that I love about math are the creativity and the new ideas, and so to me those are very wrapped together. There's some saying about truth and beauty, and math is about those things.

Nic Fillingham: Changing topics, sort of slightly, are you all math all the time? Do you have a TV show you're binging on Netflix? Do you have computer games you like to play? Are you a rock climber? What's the other side of the math brain?

Dr. Anna Bertiger: So the other side of the math brain for me is things that force my brain to focus on something that is entirely not work. And so I really love horses and I have a horse and I love spending time with her. And I love riding her. She's both a wonderful pet and just a thrill to ride.

Nic Fillingham: What's her name?

Dr. Anna Bertiger: I call her Elsa, but on paper, her name is Calloway's Blushing Bride.

Nic Fillingham: Wow.

Dr. Anna Bertiger: I didn't give her either of those names.

Nic Fillingham: Do you think of horse riding in mathematical terms? Do you sort of think about velocity and angles and friction and all that kind of stuff?

Dr. Anna Bertiger: No.

Nic Fillingham: Or is it-

Dr. Anna Bertiger: No.

Nic Fillingham: Just organic?

Dr. Anna Bertiger: I really think about horseback riding in terms of sort of what it feels like. It's the opposite of sort of dry and technical.

Nic Fillingham: Awesome.

Natalia Godyla: Well, Anna, it was a pleasure to have you on the show today. Thank you for sharing your love of math and horses and hopefully we'll be able to bring you back to the show another time.

Dr. Anna Bertiger: Thank you so much for having me.

Natalia Godyla: Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.

Nic Fillingham: And don't forget to tweet us @msftsecurity or email us at with topics you'd like to hear on a future episode. Until then, stay safe.

Natalia Godyla: Stay secure.