Unpacking the New ML Threat Matrix
Nic Fillingham: Hello, and welcome to Security Unlocked. A new podcast from Microsoft, where we unlock insights from the latest in news and research from across Microsoft security engineering and operations teams. I'm Nic Fillingham.
Natalia Godyla: And I'm Natalia Godyla. In each episode, we'll discuss the latest stories from Microsoft security, deep dive into the newest threat intel, research and data science.
Nic Fillingham: And profile some of the fascinating people working on artificial intelligence in Microsoft security. If you enjoy the podcast, have a request for a topic you'd like covered, or have some feedback on how we can make the podcast better.
Natalia Godyla: Please contact us at email@example.com or via Microsoft security on Twitter. We'd love to hear from you. Hi Nic. Welcome back. How were your holidays?
Nic Fillingham: Yes. Thank you, Natalia. Welcome back to you as well. Mine were great. You know, normally you drive somewhere or you fly somewhere, you go visit people, but this was all the FaceTimes and the Zooms and the Skypes, staycation, but it was still nice to eat too much and drink too much over the holiday period. How about you?
Natalia Godyla: Yes, it was... to quote my boss. "It was vegetative." It was definitely just... well actually you know what? I did have a big moment over the holidays. I got engaged.
Nic Fillingham: Oh, what!
Natalia Godyla: I know.
Nic Fillingham: Congratulations.
Natalia Godyla: Thanks.
Nic Fillingham: That's amazing.
Natalia Godyla: I feel like it was absolute relaxation, really high point during the five minute proposal. And then we went back to our natural state and just absolute relaxation, lots of video games.
Nic Fillingham: Hang on. So were you both sitting on the couch, playing some switch, eating your 95th packet of Doritos, and then all of a sudden your partner pauses and says, "You want to get hitched?"
Natalia Godyla: There was a little bit more pomp and circumstance to it. Though I think that would have been very fitting for us.
Nic Fillingham: Wow! Good on you guys. That's awesome.
Natalia Godyla: I'm sure that like us, everyone has forgotten what they were doing at work, and I'm sure also what this podcast is doing. So why don't we give everyone a after the holiday refresher?
Nic Fillingham: So just before the holidays, we partnered with Petri who run the Petri.com site Thurrott.com. First Ring Daily, a bunch of other great blogs, podcasts, email newsletters, and so welcome to all our new listeners who've come to us from Petri, from Throughout from First Ring Daily. Yeah. So what is security unlocked? Well, first and foremost, Natalia, and all your co-hosts, we are Microsoft employees and we will be interviewing, and we do interview on this podcast, other Microsoft employees, but we talk about security topics that hopefully are relevant to all security professionals and those who are interested in the state of cybersecurity.
Nic Fillingham: And what we'll do in each episode is the first half is we'll pick a sort of a recent ish topic and we'll speak to a subject matter expert or an author of a recent blog post and ask them about the thing that they're working on, or that they've announced in the AI and ML space, hopefully try and demystify some new terms or concepts that may be either nascent or sort of difficult to wrap one's head around. And then in the second half...
Natalia Godyla: We talk to again, another Microsoft security expert, this time more focused on the individual and their path to cybersecurity. So we'll ask them about what interested them about cyber security, what compelled them to join the industry, what jobs they've had, how they've come to Microsoft or their current role. In addition, we also have a new announcement about the podcast, which is we'll be switching to a weekly cadence. So prior to this, we were bi-weekly, now more goodness coming your way.
Nic Fillingham: More pod in your pod app. What is the collective receptacle for pod? What is it? More pods in your cast, more cast in your pod?
Natalia Godyla: More beans in your pod.
Nic Fillingham: I like that. More beans in your pod. And I think the other thing that's worth reiterating Natalia is if you have a cyber-security topic you would love to learn more about, or a perspective you'd like to hear from, please let us know, we'll go after it for you and try and bring that to a future episode.
Natalia Godyla: Yes, absolutely. We're really thankful to everyone who has reached out thus far and just keep it coming.
Nic Fillingham: On today's episode in the first segment, which we call our deep dive, we speak with Ram Shankar Siva Kumar, whose title I will not give away in the intro because we talk about it in the conversation. And it's an awesome one. Ram works in the Azure Trustworthy ML team. And he's here to talk to us about a blog post that Ram co-authored with Ann Johnson that announces a new adversarial ML threat matrix that has been built and published up on GitHub as a collaboration between Microsoft, MITRE, IBM, Nvidia, Bosch, a bunch of other organizations as a sort of open source approach to this upcoming sort of nascent threat category in adversarial machine learning. And it was a great conversation. And then after that, we speak with...
Natalia Godyla: Justin Carroll of the Microsoft Threat Intelligence Global Engagement and Response team. He started in networking very on the ground and only got his education in cybersecurity later in his career, which I think to anybody out there, who's looking to transition to security, who has a different background in security and is wondering whether they can make it, you can. He also chats a little bit about what inspired him to join cybersecurity. Some of it came from video games, which is a theme we're seeing again and again.
Natalia Godyla: So he had a unique spin on vigilantism within video games and ensuring that those who had an unfair advantage by using mods were checked and tried to level the playing field for all the rest of the players of that game. And of course we touch on Ninja Turtles, which is really the highlight of the episode. I think, with that on with the pod.
Nic Fillingham: Ram Shankar Siva Kumar, thank you for joining us on Security Unlocked.
Ram Shankar Siva Kumar: Hey, thanks for having me, Nick and Natalia. Really appreciate it.
Nic Fillingham: So we're going to talk about a blog post that you co-authored with the wonderful Ann Johnson. The title is, it's a great title. I'll get straight to the point. Cyber attacks against machine learning systems are more common than you think. Before we get into that, though, I just have to ask, you list your title as data cowboy, which is fantastic. I would love data cowboy, anything cowboy. I would love that for my title. Could you explain to people, what does a data cowboy do and what is the Azure Trustworthy ML group?
Ram Shankar Siva Kumar: Oh, totally. First of all, this is like every kid's dream is to be Woody from Toy Story. It's just like, I realize it in my own way. So when I joined Microsoft in 2013, there really wasn't an ML engineer position. So my boss was like, "You can be whatever you want. You can pick your own title." I was like, "Yes, Toy Story comes to life." So it was like, this is a brown version of this Woody that you kind of get. So basically what the Trustworthy Machine Learning group does is our promise to Microsoft is to essentially ensure we can enable engineers and customers to develop and deploy ML systems securely. So it's kind of a broad promise that we make to Microsoft and our customers.
Nic Fillingham: Got it. I would love to come back to just the data cowboy one more time. Tell me what you do. I mean, I have visions of you riding around the office on a hobby horse. Lassoing errant databases. Tell us about your day to day. What does it look like?
Ram Shankar Siva Kumar: Yeah. So what really happens is that, like I said, I really wish I can ride it on my office, now I am at my home and my 500 square foot apartment- definitely not recommended. But most of the time we end up doing is this wonderful Hiram Anderson who's part of our team, he's militantly looking at how we can detect attacks on machine learning systems. So really working with him and the rest of the Microsoft community to kind of keep our eyes and ears on the ground, see like what sort of attacks on machine learning systems we are seeing, our various different channels and trying to see how we can detect and respond and remediate those sort of attacks. So that's the first one big one. The second thing is like I get to work with a wonderful Will Pears. So I get to work with him to think about actively attacking red teaming Microsoft's machine learning system. So even before our attackers can look at, exploit the vulnerabilities Will and Hiram go and actively attack Microsoft ML systems.
Natalia Godyla: So how does the work you do connect to the different product groups. So as you're identifying these cyber attacks, are you then partnering with our products to build those into the detections?
Ram Shankar Siva Kumar: Yeah, that's a great question. So one of the things I really like about Microsoft is that super low slake to meet with somebody from another product team. So the amazing Mira Lane who heads the Azure Cognitive Services, really worked very closely with her. And I believe you ever had a Holly Stewart in your podcast as well, so worked very closely with her team. So it's really a big partnership with working with leaders from across Microsoft and kind of shopping around what we're doing and seeing how we can kind of help them and also learn from them because they also have sensors that necessarily might not have.
Nic Fillingham: Let's talk about this blog post. So you and Ann both announced this really interesting sort of consortium of 11 organizations, and you're releasing an adversarial ML threat matrix. It's open source, it's on GitHub. Very exciting. Tell us about it.
Ram Shankar Siva Kumar: So the goal of the adversarial ML threat matrix is essentially to empower the security analyst community so that they can start thinking about building detections and updating their response playbooks in the context of protecting ML systems. And one of the things that's kind of like we want to be mindfully different is the attacks that we see to this framework with, all these techniques, we kind of only put the ones that Microsoft and MITRE jointly vetted that were effective to be against production machine learning systems.
Ram Shankar Siva Kumar: So first of all, the whole area of attacking machine learning systems goes all the way back to 2004. In fact, you can find Daniel Loud, whose Twitter handle is Dloud on Twitter today. He continues to work on this super cool fields and there's a wonderful timeline by this other researcher called Battista Bisho that he also linked to the blog, but he can basically see that this work has gotten immense academic interests for the last 16 years. And especially in the last four years after a very seminal paper was released in 2014.
Ram Shankar Siva Kumar: So when a lot of people think about spiel, they think of as, oh, this is something that is really theoretical. This is something that... Oh, Great, you're working in academic setting, but no, that's not true. There are marquee companies, who've all had their ML systems subverted for fun and profit. So the whole point of this blog post with MITRE and this whole corpus of industry organizations was, this is real. Attacks on machine learning systems is real, you need to start thinking about this.
Ram Shankar Siva Kumar: Gartner released a report on 2019 saying, 30% of all cyber attacks in 2022 is going to involve a tax on machine learning systems. So this is not a pie in the sky. Oh, I'll get to it when I get to it. 2022 was a year and a half, it's a year away from now. So we got together in this blog post to really empower our security analysts community and help them orient for this new threats.
Natalia Godyla: Can you talk a little bit more about what exactly is the adversarial ML threat matrix and how you envision security analysts using this tool?
Ram Shankar Siva Kumar: Yeah, totally. So one of the things that before we even put this matrix together, we kind of conducted a survey of 28 organizations. We spoke to everybody from SMBs to governments to large organizations and we spoke to the security analyst Persona, as well as the MLG person. I asked them, "Hey, how do you think about securing ML systems? This is a big deal. What are you doing about it?" And they were like, "Well, we don't have the tools and processes in place to actually go and fix these problems." So the first thing we realized is that we wanted the security analysts community to be introduced to adversarial ML as a field, try to condense the work that's happening in a framework that they already know. Because the last thing we want to do is to put another framework another toolkit on their head.
Ram Shankar Siva Kumar: And they're just going to be like, "Nope, this is not going to work out. This is one more thing for them to learn." So we took the MITRE's attack framework. So this is something that was again, bread and butter for any security analyst today. So we took the attack framework and we kind of said, "Hey, we've been really cool." If you took all the ML attacks and put it in this framework, and that's exactly what we did. So if you look at our track matrix, it's modeled after the MITRE attack framework.
Ram Shankar Siva Kumar: So the wonderful folks from MITRE's ML research team and us, we got together and we basically aligned the attacks on machine learning systems, along reconnaissance persistence, model evasion, ex-filtration. So if you look at the top of our matrix, the column headers are essentially tactics and the individual ones are techniques.
Ram Shankar Siva Kumar: So let's say that an attacker wants to gain initial access to a machine learning subsystem, let's say that's her goal. So she has a couple of options to kind of execute her goal. She has a couple of techniques in her kit. The first thing is that she can just send a phishing email to an ML engineer. That's very valid. Phishing is not going to go away. The second thing that she can do is she can take a pre-trained ML model available that people generally download and she can backdoor it. So the whole point of this attack matrix is to A, build a common corpus of attack techniques and attack tactics in a framework that a security analyst already has knowledge of.
Natalia Godyla: Are you seeing any trends? What's most common to combine.
Ram Shankar Siva Kumar: Oh, that's a great question. So before I just step into this, I first want to tell you about this attack called model replication. So the easy way to think about this and Natalia, I will get to this, I promise.
Natalia Godyla: I love the excitement. I'm so ready for it.
Ram Shankar Siva Kumar: We're going to take a little detour like Virgil and Homer. So essentially the best way to think about model replication is that open AI is a very famous ML start up. And they last year released a model called GPT-2, and they said, "Hey, you know what? We're not going to release the entire model immediately. We're going to release it in a stage process." We're going to just... because we want to do our own verification and before they could release the entire model, these spunky researchers, so I love that. They're still cool. Vania Cohen. And I know this other person's name is Skylion with a O, they replicated GPT-2 it was like 1.5 billion parameter model, and they've leased it on the internet on Twitter. And they call it open GPT-2. And I love their tagline, which is GPT-2 of equal or lower value.
Ram Shankar Siva Kumar: So even before the company could release, they replicated the ML model based on the data sets that were available based on the architecture. And they basically at the end of the day, and we also references our case study is that they basically tweaked an existing model to match GPT-2 and they publish that for everybody to use. No, it does not have the same accuracy or the same metrics as the original GPT-2 model. But the fact that an attacker can even replicate a ML model using publicly available data sets and having some insights about the architecture is something for people to think about.
Ram Shankar Siva Kumar: So now to come back to your excellent question. So what exactly is a common pattern? So what essentially we see attackers doing is that they go interact with the machine learning system, attackers might send some data. They might get some responses back and they keep doing that enough amount of time. And they now have sufficient data to replicate the ML model. So the first step is that they go and replicate the ML model and from the ML model that they have replicated, they go do an offline attack. Because now they their own ML model, they try to evade this ML model and then they find a way to evade the ML model. And they take the examples of the test points that evade the ML model and now evade the online, the real ML that's out there taking that and then boom, fooling the real online ML model. So that's a common data point, but three case studies in our adversarial ML GitHub page that actually kind of shows this.
Nic Fillingham: So the sort of takeaway from that. If your data set is public, don't make your ML architecture public and or vice versa.
Ram Shankar Siva Kumar: That's a great question. And I've been thinking about this a lot, first of all, we definitely want to be transparent about the baby builder ML models, right? Marcus Sanovich, Oh gosh, he's such an amazing guy. But for the last so many years in RSA has been like militantly, been talking about how we build our ML models for security purposes, because we want to give insights into our customers about how we actually built ML models. And the data sets are machine learning as a field, it has as norms of opening up our data sets. In fact, one can attribute the entire deep learning revolution to Dr. Fei-Fei Li's image in a dataset which really sparked this whole revolution. So, I really don't want anybody to think that being open with our data sets or being open with our ML platforms is a good idea.
Ram Shankar Siva Kumar: Because even if you think of traditional cyber security, right? Security by obscurity is never a good strategy. So the way we want to push people to think about is how are you thinking about detection? How are you thinking about response? How are we thinking about remediation? So really trying to take the assumed breach mindset and feeding it into your ML systems is how we want to push the field towards. So if you take away anything from this is continue to be opening your systems for scrutiny, because that's the right thing to do, that's the norms that we've set. And that's important to advance research in this field and think about detection strategies and think about, and assume breach strategies for building ML systems.
Ram Shankar Siva Kumar: We wanted to distinguish between traditional attacks and attacks on ML systems. So the one thing that I want to think about is the threat matrix contains both traditional attacks and attacks on ML systems. Whereas the taxonomy only contains attacks on ML systems. The second difference is that, like I said, the matrix is meant for security analysts. This one is meant for policymakers and engineers. The third that's the more important difference is that in the context of the threat matrix, essentially we are only putting attacks that we have validated against commercial ML systems. It's not a laundry list of attacks. We're not trying to taxonomize.
Nic Fillingham: I wonder if you could talk about the approach and the philosophy here for putting this on GitHub and making it open to the community. How do you hope folks will contribute? How would you like them to contribute?
Ram Shankar Siva Kumar: Yeah, absolutely. So Miguel Rodriguez, who runs the MITRE, who we collaborated with, wonderful team over there before putting this out on GitHub, there was a little bot of angst, right? Because this is not fully baked product. This is something that 13 organizations found useful, but doesn't mean everybody in the community might find useful. And I think he said something to the effect of-
Nic Fillingham: It's almost as if you're a cowboy.
Ram Shankar Siva Kumar: Yeah. There you go, herding people. It was like, we're putting this out, acknowledging this is a first cut attempt. This is a living document. This is something that we have found useful as 13 organizations, but we really are hoping to get feedback from the community. So if you're listening to this podcast and you're excited about this, please come and contribute to this matrix. If you think there are attacks that are missing, if you would like to spotlight a case study on a commercial ML system, we are super looking to get feedback on this.
Ram Shankar Siva Kumar: And we also kind of realized that we wanted a safe space almost to talk about attacks on ML systems. So we were like, you know what? We're just going to have a little Google groups. And the membership of the Google groups is extremely diverse. You've got philosophers that are interested in adversarial machine learning. We've got people who are looking from various perspectives, joining our Google groups and kind of like giving us feedback and how we can make it better.
Natalia Godyla: Yeah. As you mentioned, there are tons of different perspectives coming into play here. So how do you envision the different roles within the community interacting? What do you think needs to happen for us to be successful in combating these threats?
Ram Shankar Siva Kumar: Yeah. This is a great question. The one thing that I've learned is that this topic is immensely complex. It's mind boggling to wrap the different personas here. So I'll just give you a rundown, right? So, so far we know that policymakers are interested in securing ML systems because every national AI strategy out there is like, securing ML systems is top priority for them. ML engineers are thinking about this, academic researchers. There were like 2000 papers published in the last, I want to say five or six years on this topic. So they are like a hotbed of research we want to rope into. We've got security analysts from these companies that we're talking to are interested. Csos are also thinking about this because this is a new threat for them. So as a business decision maker, how should they think about this?
Ram Shankar Siva Kumar: One thing that I got an opportunity with Frank Nagle, who's a professor at HBS. We wrote up piece at Harvard Business Review talking about, is it time to insure ML systems. ML systems are failing so if you're ML powered like vacuum cleaner burns a home down, what do you do about it? We try and rope in the insurers to come in participate in this. So, Natalia this is such a green field and the only way we're going to like get ahead to really get people excited and try for clarity together as a community.
Nic Fillingham: How would an ML powered vacuum cleaner work?
Natalia Godyla: I was going to say that sounds like a 2020 headline, ML powered vacuum cleaner burns down house and threat.
Ram Shankar Siva Kumar: Oh my gosh. So, okay-
Nic Fillingham: Man bites dog.
Ram Shankar Siva Kumar: There you go. It's funny because this was not an example that I made up. I wish I did. I know. Yes, Nic. I see, yes.
Nic Fillingham: What?
Ram Shankar Siva Kumar: Yes.
Nic Fillingham: All right.
Ram Shankar Siva Kumar: This is a well-documented paper called a concrete problems in AI safety. And they talked to the most it's like Final Fantasy. Everything that needs to go wrong is going wrong. So, they're like robots that are burning down homes, breaking things that they can clean up. So if your machine learning system is not trustworthy, there are going to be problems. And you really need to think about that.
Nic Fillingham: I can't even get my kettle to boil.
Ram Shankar Siva Kumar: But the thing that really worries me is ML applications used in health care. You keep seeing headlines like machine learning systems being used by radiologists, amidst radiologists when it comes to identifying Mulligan tumors and things like that. There's a fantastic work by Samuel Finlayson from Harvard. He show that if you take an x-ray image, just take it and slightly rotate it and you give it to the ML system. It goes from very confidently thinking that it's malignant to very confidently judging it's benign. And that is really scary.
Ram Shankar Siva Kumar: In the beginning of the podcast, we spoke a lot about how an adversary can subvert machine learning systems for fun and profit. Oh boy, there is an entirely separate world of how machine learning systems can fail by themselves. What we call unintentional failure modes. And trust me, you will want to go live in the middle of the North cascades in a cabin after you read that work. It'd be like, I am not getting anything ML powered until they figure this out. But the good news is there're extremely smart people, including Hiram and Will from my team who are looking into this problem. So you can feel a little bit like a shore that they're the true Avengers out there.
Natalia Godyla: I love all the head nods from Nic. I feel like it underscores the fact that we only know a percentage of the knowledge on ML. So we just need a community behind this. No one company person can know all of it.
Ram Shankar Siva Kumar: Absolutely. Oh my gosh. Yeah. When we open the adversarial ML threat matrix Google group, we now went from zero. We felt like nobody's going to join this Google group. It's going to be like a pity party where I'm going to email Michel from MITRE and he's going to respond back to me. But no, we went from zero to 150 right now over just the last four days.
Natalia Godyla: Ram, thank you for giving us all of this context on the adversarial ML threat matrix. So what's Microsoft's continued role. What's next for you in ML?
Ram Shankar Siva Kumar: First of all, we are hiring. So, if you'd like to come and join us, we are looking for developers to come and join us in this quest. So please email anybody, even Nic, and he can forward his resume.
Nic Fillingham: Do you need to have a cowboy hat? Is a cowboy hat a necessity?
Ram Shankar Siva Kumar: Not at all. We will accept you for who you are.
Natalia Godyla: Do you provide the cowboy hats?
Ram Shankar Siva Kumar: We will provide everything. Anything to make you feel comfortable. So we are growing and we'd love to work with the folks. With the adversarial ML threat matrix, like I said, we really are looking for feedback from the community. We really think that like Natalia very correctly pointed out this is a problem so big that we can only solve it if we all come together. So please go to our GitHub link. I'm sure Nic and Natalia might put the link to it. We'd love to get their feedback.
Ram Shankar Siva Kumar: The second thing is if you kind of are... We are especially looking for people to come in at case studies, if you think we're missing a tactic, or if you think that you've seen an attack on a ML system on a commercial Ml system, please reach out to us and we'd be happy to include that in the repository.
Nic Fillingham: If your autonomous vacuum cleaner has attempted to undermine democracy, let us know.
Ram Shankar Siva Kumar: And the one thing that I want everybody to take away is that when we did our survey, 25 out of 28 organizations did not have tools and processes to kind of secure the ML systems. So if you're listening to this podcast and you're like, "Oh my gosh, I don't have a guidance." Do not feel alarmed. You're tracking with the majority of the industry. In fact, three organizations, all of whom were large in our survey even thought about this problem. So there are tools for you and processes that we put out. So in our docs at Microsoft.com, there's a chat modeling guidance, there's taxonomy, there's a bug bar that you can give to your incident responders so that they can track bugs. And for the security analysts community, there is the adversarial ML chat matrix. So please go read them and please give us feedback because we really want to grow.
Natalia Godyla: I love it. Thank you for that. That's a great message to end on.
Ram Shankar Siva Kumar: Awesome. Thank you, Nic and Natalia for having me. Really appreciate it. This was really fun.
Natalia Godyla: And now let's meet an expert in the Microsoft security team to learn more about the diverse backgrounds and experiences of the humans, creating AI and tech at Microsoft. Today, we're joined by Justin Carroll, threat analyst on the Microsoft threat intelligence, global engagement and response team. Well thank you for joining us, Justin.
Justin Carroll: Thanks for having me.
Natalia Godyla: Well can we kick things off by you just sharing your role at Microsoft. What does your day to day look like?
Justin Carroll: So my role is related to threat hunting across large data sets to find advanced adversaries and understand what they're doing. Look for detection opportunities and communicate out the behaviors of the specific threats that we're finding to partner teams or to our customers to help them understand the threat landscape and kind of staying on top of what attackers are doing.
Natalia Godyla: That's super interesting. And can you talk a little bit about any recent patterns that you've identified or interesting findings in your last six, eight months?
Justin Carroll: Well, it's been a busy six or eight months, I would say, because everybody's been very busy with COVID. We've been seeing quite a large increase in human-operated ransomware and stuff like that. So I've been working really hard to try and figure out different ways to try and surface their behaviors as early as we can to customers to help them take action before the ransom happens. And we've been seeing quite a few other different really advanced adversaries compromising networks.
Justin Carroll: A lot of it's kind of the same old, same old, just more of it, but it's always interesting and there's never a shortage of new findings each day and kind of moments of, "Oh, that looks like this, or they're doing this now." Awesome. Great.
Natalia Godyla: You mentioned you're constantly trying to find new ways to identify these faster. What are the techniques that you're trying to use to find the threats quicker?
Justin Carroll: There's a whole bunch of different ways that you kind of try and surface the threats quicker. Some of it's research and reading other people's work and blogs and stuff like that. I tend to live in the data most of all, where I'm constantly looking at existing attacks and then trying to find similar related behaviors or payloads or infrastructure and pivoting on those to try and attempt to find the attack, to be ready to find it as early as possible. And what's called the kill chain.
Justin Carroll: So from the time that the attacker gets in the network, how quick can we find them before they've had a chance to conduct their next set of actions? So whether if they're stealing credentials or something like that, can we surface them before they've had a chance to do the credential theft and then kind of always trying to move earlier and earlier in the kill chain to understand how they got there. And then what are some of the first things that they did when they did get there and how do we surface those next?
Justin Carroll: Because a lot of those are a little bit more difficult to surface because it can kind of tend to blend in with a lot of other legitimate activities.
Nic Fillingham: What kind of tools do you use Justin? Are you in network logs and sort of writing queries, is there a big giant futuristic dashboard that you sit in front of and you have virtual reality gloves moving big jumps of numbers left and right. Well, what are the tools of your trade?
Justin Carroll: So one of the tools that we use a lot, there is a bunch of data that's stored... Customer facing, it's usually called Azure data Lake. It's these huge databases with large amounts of information where you can construct queries with what's called KQL, I believe it's Kusto query language. So there's a specific tool for kind of deep diving into all of that data across our many different sources. And then using that to basically structure and create different queries or methods of finding interesting data and then kind of pivoting on that data.
Justin Carroll: Then in addition, I've built some of my own tools to kind of help improve my efficiency or automate some of the stuff that I have to do all the time and then just to make me faster at hunting for the things that I'm looking for.
Nic Fillingham: Is it an AI version of yourself? Is it a virtual Justin?
Justin Carroll: No. We work with the ML team to try and share as much knowledge with them as possible. There is no tool for an AI Justin, as of yet.
Nic Fillingham: Well, let's back it up a bit. So one of the things we would like to do in these interviews with the security SMEs, I'm not even sure if we've explained what an SME yet. We call it a Subject Matter Expert. That's an acronym. We use a lot here at Microsoft. I think it's pretty broadly known, but if you've heard of SME or SME, that's what it means.
Nic Fillingham: Now, you and I, we crossed paths about a year ago for the first time when Jessica Payne, who actually hasn't been on the podcast yet, Jessica introduced me to you and she said, "You have to talk to Justin." And she gave me three sort of very disparate, but intriguing bits of data about you. She said, "Justin used to climb telegraph poles. He is a big Star Wars fan and is in a metal band." And I'm sure I've gotten those three things slightly wrong. Could you kind of talk about your journey into the security space and then sort of how you found yourself working for Microsoft. But first of all, these three things that Jessica told me are any of them true?
Justin Carroll: Mostly they are. So some of these will kind of combine for the telephone climbing aspect. I used to work for a wireless internet provider that had leases or specific towers, cell phone towers or other towers on top of mountains, essentially, where we would have wireless radio dishes that would communicate to each other. So I was occasionally tasked with installing and or fixing said towers, which is okay if you are fine with heights, I wasn't at first, but you just kind of get used to it. And you kind of realize once you're above 20 feet, it really doesn't make any difference. If you fall, it's going to hurt, but climbing a tower in the winter and in the wind and where you can barely feel your hands and all that wasn't great.
Justin Carroll: I was a pretty big Star Wars fan growing up as a kid, even more of a Ninja Turtle fan. And as for metal, I used to be in a band with some friends and have been playing guitar for 25 or 26 years. And music has been a very huge part of my life and remains to be.
Nic Fillingham: I think we'll circle back to Ninja Turtles. I'm not going to let that one go, but so let's talk about your path into security. So was this you're working for the wireless internet provider was this your first job. Was this mid career. Where does that fit in your sort of LinkedIn chronology? And at what point did you use formerly into insecurity?
Justin Carroll: So it's been a long and winding road to get here I would say. So the internet provider was what I would guess I'd call my first career job of sorts. I had started there in my early 20s and worked for them for about... sorry my cat is right in front of the microphone. One second.
Nic Fillingham: There's a cat there.
Justin Carroll: She wanted to say her piece. So I worked for the internet company for just under a decade. I used to do some networking type fun stuff in Halo 2, to kind of maybe garner a little bit of an advantage, I guess I would say, and use those learned skills to land that first job. And I did that for quite a while, but realized I was kind of stuck in this job. It was in a city that I didn't want to live in. And I had kind of maxed out my capabilities there. I had attempted to move to Portland because I wanted to have a bigger city experience. I applied to 254 jobs, got one interview for basically an office tech support role was the only position I got hired, but it wasn't feasible to live in Portland.
Justin Carroll: So after quite a bit of soul searching and realizing that basically nobody cared that I had eight years of on the job experience because I didn't have a college degree. There were not any doors open for me for the most part. I then decided to take a pay cut and go get a job at a university that was just a city over and work full-time and go to school for a degree in cybersecurity while working full-time for the university doing kind of technical work for them, helping them understand their... Sorry, my cat is a whole thing right now.
Nic Fillingham: Your cat's just trying to interject with like don't. Hey, you glossed over that Halo 2 thing, you better to come back to that.
Justin Carroll: Aria, come here.
Nic Fillingham: We're leaving all this in, by the way.
Natalia Godyla: Yeah. We're very much enjoying it.
Justin Carroll: So kind of advising the university on different technologies that they could use for their students. So I did that for about three and a half years while going to school and then graduated top of my class and applied for another 150 some odd jobs and mostly the Seattle area this time and was about to give up because even though I now had a degree and almost 10 years of experience, it still wasn't enough. And everybody that I kept losing to had between 10 and 20 years experience. And it just wasn't an option for folks with less specific cybersecurity experience to kind of enter the field.
Justin Carroll: There were a lot of walls that were put up. I had a friend of a friend who worked for cybersecurity at a company somewhere in Arizona, who I'd never met. And he decided to go out of his way, even though I'd never met him and looked for some cybersecurity type jobs in my area that he thought maybe I'd be good for and helped me look at my resume and stuff like this. And that helped me land a vendor role for Microsoft, where I kind of started my path and career towards cybersecurity specific stuff.
Justin Carroll: I had basically given up at that point on ever working in cybersecurity and had kind of thought that it just wasn't meant for me. So that was kind of a big break and a guy almost closed the application to apply for the job and then figured what's the worst they can say is no, that is kind of how I finally got to Microsoft and cybersecurity, where I was able to work as a vendor for the team evaluating kind of telemetry. And I was kind of given an opportunity to learn a lot and that eventually transitioned into when a position became available, where I started working full-time as a Microsoft employee and went from there.
Natalia Godyla: So what in your soul search brought you to cyber security? Was it your background, the fact that you already had those foundations as a network admin, or was there something in particular in the cybersecurity world that just attracted you?
Justin Carroll: I'd always found it fascinating. When I started university, they just launched the cybersecurity program. The quarter that I started there, and one of my friends who was a computer science major, basically called me up immediately and was like, "Hey, they just launched this. You need to do this." And there's the very popular culture aspect of it where everybody thinks it's fascinating and you sure there was a little bit of a grab with that. But I like learning how computers work and I like kind of the constant problem solving nature of everything. And the first class I took on it I was hooked and still remains that day where it's just, it's fascinating and it's really fun to just kind of continually work to see what attackers are doing. But I also, there's a huge aspect of it like I like helping people. I think it's important and having a role where I'm able to help millions or even potentially billions of people through better detections or stopping malware. It feels pretty great.
Nic Fillingham: What other aspects Justin, of your path to security, your path to Microsoft, do you feel you're sort of bringing forward? I want to ask about you very briefly mentioned something about Halo 2 and I want to know what that was. And then I wonder if there were other sort of dare I say, sort of maybe unorthodox or non-traditional things that you worked on where you learned a bunch of bunch of tools or tricks of the trade that you're bringing forward to your work right now.
Justin Carroll: So Halo 2 was a fun one. Back in those days, there were lots of what were called modders, who would mod their Xbox's to gain an unfair advantage. So I would use my networking know-how basically, and learned a lot of it too, when encountering a modder to kick them out of the game. I think it was possibly a little frowned upon, but I was tired of having cheaters constantly win, so I did a lot of research and I didn't know a whole lot about networking at that point, but I tried to not use it as a competitive advantage, but more to just level the playing field, but it was a great way to learn how firewalls worked and network traffic and building more on my understanding of computers.
Justin Carroll: And then, kind of, that set a foundation for me, of understanding, there's always going to be stuff that I don't know and what I have done, but I did it all through college and continued all the way till basically getting full-time employment at Microsoft was I set up a lab environment and I would set up servers and clients and I would attack them and monitor the logs on my own little private lab on my machine and see what worked, what didn't, try and figure out why it worked, what didn't and try and build different tools to see how I could make it more effective or deal with different issues.
Justin Carroll: Just kind of both playing attacker and defender at the same time on my network, all by myself, essentially and kind of learning from all of that data was massively important and anybody who's looking to get into security, I highly recommend both learning how to attack, on a safe, your own little lab environment where you're not hurting anybody. And what's it like to try and defend and find those attacks because both sides are-
Nic Fillingham: Red Justin versus blue Justin.
Justin Carroll: Exactly. Yes.
Natalia Godyla: You noted earlier that just the sheer amount of data can be overwhelming, especially as you moved through your career and then came to Microsoft where we have billions of signals. So the same transition happens from Halo to now just the sheer scale and scope of your role and the amount of good that you can do. So, how did you handle that overwhelming amount of information, amount of impact that you can have?
Justin Carroll: So when I was first brought on one of the things that made a significant difference was I had somebody that kind of instructed me in a lot of the ways of kind of how to work with the data, but I was also given quite a bit of an area for trial and error. So there was lots of opportunity to fail and to learn from what didn't work and to kind of keep building on that. And then any time that I got stuck or I would kind of just do everything I could to attempt to solve the problem or work with the data. If I kind of hit a wall that I couldn't climb on my own, I could go to him and then we would solve it together. So it was kind of both a mentoring and a guidance thing, but also kind of given that ability to experiment and try and learn. So that was kind of one of the biggest ways of learning to pivot on that data and understand it and consume it.
Justin Carroll: And then honestly, collaboration with other folks on my team and other team was massively instrumental to be able to kind of learn what they had already learned or pass on my knowledge to them. And just that constant sharing and understanding because there is so much data, it's quite impossible almost to be an expert at all of it. So having those folks that you can reach out to you that are experts in each basically set of their data. So you can understand what the data is trying to tell you, because that's one of the things that is particularly difficult is to take the data and actually glean understanding from it. The data is trying to tell you something, you just need to make sure you're interpreting the message correctly.
Natalia Godyla: How do AI and ML factor into your role into helping you manage this data and collaborating with other teams.
Justin Carroll: So I work quite a bit with a lot of different data science folks on a few different teams to either use a lot of the models that they're creating to kind of a source, a lot of the malicious information or a particular attackers or stuff like that. And then also collaborating back in sharing my knowledge and intelligence to them to say, this is what an attack looks like. This is what it should look like in the data and kind of giving them the ideas and signals for what they should be looking in their data to kind of train those models.
Justin Carroll: It's really important to have that partnership between security and data science for AI and ML to kind of help them understand the security sphere of it. And then they can kind of take the real math and data prowess that they've got and turn our knowledge into ML or AI to detect and surface a lot of these things.
Nic Fillingham: If it's possible, Justin, how would you sort of summarize your guidance to other Justin Carroll's that are out there that are... They want to get into security, they're fascinated by cybersecurity in sort of a macro sense, but they feel either don't have a degree or they're not even sure what they should go study or they're trying to work at, how can they translate their current sort of career experience and sort of skills? Can you summarize that into some guidance of what folks should do to try and break in?
Justin Carroll: Sure. One, if you're in school, remember that school is not going to teach you a lot of the stuff that you need to know. It's lots of taking what you're learning and building upon it outside. So if it's cybersecurity, that's an interest, try and experiment and fail. Cyber security is huge. There are so different facets of it. Find out the thing that kind of scratches the itch and piques your interest. For me, that was setting up a lab, right? Where I could play both the attacker, the defender, the person monitoring logs, the person setting up all the configurations to try and stop the attacks and was able to kind of see all different aspects of the industry.
Nic Fillingham: So just jumping in, was that literally just a bunch of VMs on your machine or did you have multiple PCs sort of networked together? Just very quickly, what did that look like? How accessible is setting up a lab? I guess I'm what I'm asking.
Justin Carroll: It is pretty accessible. So while I was in college, it was actually multiple machines and I had four different machines and I set up a router that you can pick up for 50 bucks and a smart switch that I could mirror the traffic on to understand everything for 100 bucks. So there's a little bit of cost. That was kind of my college setup. And as I was kind of learning where I at that point, it made a little more sense to do it with actual machines and for extra clarity. My college was only a couple of years ago. I did not go to college young. So the next route that I did once I headlined did my vendor role and was kind of like security is for me and I want to keep building on it.
Justin Carroll: I did it all with VMs. So I just had a desktop computer that had okay specifications and I configured two clients, the domain controller, server on the device and then a mail server. And then basically you just connect to each client and then network them all together. So at that point you can use VirtualBox, you can use lots of different stuff. So the availability of doing that, it's actually pretty good. There isn't a lot of overhead costs or anything like that. You just have to have a okay computer.
Natalia Godyla: What about resources to learn how to do all of that? Are there organizations or sites that someone could turn to, if they're interested in starting to do some of this starting to experiment with what they're interested in?
Justin Carroll: Honestly, I would say one of the best resources that I had throughout was YouTube. It was a great place to get walkthroughs for every different thing. So like I wanted to learn how to set up a VM and configure it with networking to another VM. I turned to YouTube. I wanted to learn how to attack the VM using Kali Linux, YouTube. And there's a whole bunch of different channels out there that specifically focus on that. And then the other thing is because it's so much more open for creators to share content. You can find people who are at a similar level or maybe just a few steps ahead of you. So you can really kind of join along with other people.
Justin Carroll: There are a few websites for coding, I think one's called hacking the box as far as attacking different things. And that was also kind of fun where a lot of the devices that need to be attacked we're already pre-configured for you. But for me, honestly, a lot of the fun was setting up those devices and then learning what I did that worked and didn't and what allowed it to be attacked and what I could do to stop that.
Natalia Godyla: Quick plug Microsoft security also has a YouTube channel in case somebody would like to get any, how to content on our products.
Nic Fillingham: Natalia may or may not have been involved in that channel, just full disclosure there.
Natalia Godyla: Yeah. I couldn't help myself. But it is also great to hear that you found people to work with in the community as well. That's something that's been noted by a few of our guests, like Michelle Lamb, that as she was entering the space, she found mentors. She found conversations, people readily available to either work on a problem alongside her, or just answer questions. So I'm glad that you've also been able to turn to the community for that. So what's next for you? Is there a new challenge that you'd like to solve?
Justin Carroll: Definitely want to work on the toolkit that I'm building and kind of continue that growth. It's been interesting to kind of see the hurdles I run into. And even last week I ran into one that felt insurmountable and was able to chat with one of the devs and solve in a few minutes and learned a whole lot and going forward, now I have that in my pocket. And then both-
Nic Fillingham: Hang on. Did you say you went from found a new challenge, thought all this is insurmountable and then a few minutes later you solved it?
Justin Carroll: With a little support from people that knew how to solve the problems. So collaborating with like one of the other devs on the team and basically having him kind of explain the part it felt like a giant wall, but really once you kind of have somebody to break it down a little bit for you, it was just like, "Oh, okay. I see what I'm missing here." And then it was just like, "Got it. Okay. Moving forward."
Nic Fillingham: Oh, I see. So that that's more an endorsement. Yeah, I got it.
Justin Carroll: Yeah. Yeah. It's more an endorsement of others teaching abilities and just kind of those times of being able to reach out to others for when you really get stuck and how much of a difference it can make. I had spent an hour on something and was just like, this is ridiculous. This should work. Why isn't it working? What's wrong with me. I'm not smart. And then just chatting with them a little bit and then figuring it out and then like, "Oh, okay. Oh, okay. That's actually pretty simple." I wasn't thinking about it in the right way and kind of getting that other perspective.
Justin Carroll: And then what's next kind of going forward is a kind of continued partnership with a lot of the data science folks to, I think we've only scratched the surface in many ways as an industry on how data science and cybersecurity can work together. So I am very excited to kind of see what kind of stuff we can accomplish, whether it's, you know, surfacing attacks shortly after they happen, very early in the kill chain or understanding related behaviors and trying to understand who the might be, or I think most of all, the intent of the attack or adversary.
Justin Carroll: Intent can sometimes be a very difficult to suss out, even for SOCs and their entire center. They have all these folks that are trying to figure out what happened. Why did it happen? What does it actually mean? So if we can have data science that can provide a lot of context on that, through understanding existing attacks and modeling what future ones might look like, I think there's some pretty exciting opportunities there.
Nic Fillingham: All right, I'm doing it. We're coming to Teenage Mutant Ninja Turtles. You're a fan. How much of a fan are you, Justin?
Justin Carroll: I'd say quite a fan. I do have a couple of figurines and a mint package unopened from '87 I think, something like that. And then have a Ninja Turtles tattoo on my back of Raphael. So that was kind of one of those moments where I was trying to think about what steps I wanted to take forward in life and things like that. And I had kind of thought about what are the things that actually make me happy?
Justin Carroll: This was probably my mid 20s quarter life crisis kind of thing. And I was like, "I always liked the Ninja Turtles as a kid." They always brought me great joy. I still get excited about watching them. The movies are definitely a guilty pleasure. I realized they're not great. But now I'm talking about the original movies, not the new ones. We won't talk about the new movies. And it was just one of those like, "Yeah, I identify with this. This is a huge part of my life. It's been around since I was... it was started the year I was born." So I was just like, "All right, let's do it." And haven't regretted it at all.
Nic Fillingham: I was going to ask who your favorite turtle was, but you've obviously... If you've inked Rafaelle on your back so that question is moot. I'm a Donatello guy. I've always been a Donatello guy.
Justin Carroll: I would think of myself as Raf, but really I'm more of a Donatello. Ralph was kind of the cool guy with a little bit of an attitude, but really I was Donatello. When I was 10 dressed up for Halloween, I was Donatello. I'm definitely Donatello with a little bits Raf thrown in for good measure.
Nic Fillingham: Well, this has been a blast. Thank you, Justin, for walking us down, Teenage Mutant Ninja Turtle memory lane, and Halo 2 memory lane and sharing your story with us. It was great. Wonderful to get your perspective. Great to have you as a part of the threat hunter team here at Microsoft and contributing in all the ways that you do. Thanks for joining us. I'm sure we'll talk to you again at some point on the Security Unlocked podcast, but keep doing you Cowabunga, dude.
Justin Carroll: Thanks very much for having me. I appreciate it. It was great to talk to you all.
Natalia Godyla: Well, we had a great time unlocking insights into security from research to artificial intelligence. Keep an eye out for our next episode.
Nic Fillingham: And don't forget to tweet us @msftsecurity or email us at firstname.lastname@example.org with topics you'd like to hear on a future episode. Until then stay safe.
Natalia Godyla: Stay secure.