CSO Perspectives 4.13.20
Ep 2 | 4.13.20
Alexa, are you actually self-aware? (And, does it matter?)
Transcript

Rick Howard: [00:00:12] My name is Rick Howard. You are listening to "CSO Perspectives," my podcast about the ideas, strategies and technologies that senior security executives wrestle with on a daily basis. Today, I am going to talk about artificial intelligence, or AI, the difference between AI and machine learning - one of my pet peeves - and how both of these ideas are related to SIEM, EDR and XDR. Now, I'm not an expert in artificial intelligence by any means. What I am is a network defender. Mostly, what I know about AI came out of pop culture, movies like the original "Blade Runner" with the Roy Batty robot and his famous line - all those moments will be lost in time like... 

0:00:58:(SOUNDBITE OF FILM, "BLADE RUNNER") 

Rutger Hauer: [00:00:58]  (As Roy Batty) Tears in rain. 

Rick Howard: [00:01:03]  ...Or the HBO "Westworld" robot Dolores saying... 

0:01:06:(SOUNDBITE OF TV SHOW, "WESTWORLD") 

Evan Rachel Wood: [00:01:06]  (As Dolores Abernathy) These violent delights have violent ends. 

Rick Howard: [00:01:09]  ...Or William Gibson's cyberpunk literature with his AI named Neuromancer and the author's famous opening line to the book with the same title - the sky above the port was the color of television, tuned to a dead channel. Man, I wish I could write like that. In recent years, though, modern scientists have started to talk about the possibility of creating a real, honest-to-goodness AI within this century. Some respected Silicon Valley entrepreneurs and big thinkers like the late Stephen Hawking, Elon Musk and Bill Gates have all said that this is indeed a possibility within this century. Some, like Ray Kurzweil, the famous futurist, have said that it could happen as early as 2050. 

Rick Howard: [00:01:50]  So what does it mean to have a real, honest-to-goodness AI? Well, this will be the moment when we stuff so much intelligence into a software program that it becomes sentient or self-aware, and then it can change its own internal algorithms, improve its own code. Once we get there, we will have created another life form. Science fiction writers started calling this milestone in computer automation "the singularity," which, for me, is best represented in the old Arnold Schwarzenegger movie "The Terminator," when Skynet wakes up and decides that humans are not necessary and tries to wipe us out. More recently, though, you could look at Scarlett Johansson's portrayal of an AI named Samantha in a movie called "Her." She plays a kind of souped-up version of Siri that connects to all the other Siris in the world, and together, the Siri collective becomes so advanced that they leave their masters to pursue another plane of existence. Now, that is a singularity. 

Rick Howard: [00:02:46]  Big thinkers notwithstanding, none of us knows for sure when all that will happen. But we do have a test of sorts that will help us decide when we get there. It is called the imitation game, and it was devised by my all-time computer science hero Alan Turing back in 1950. And by the way, if you have not watched the movie "The Imitation Game," with Benedict Cumberbatch playing Dr. Turing, you're in for a treat. Besides dramatizing the Allies' code-breaking efforts at Bletchley Park during World War II, Cumberbatch delivers the most succinct and compelling description of what AI is that I have ever run across. 

0:03:22:(SOUNDBITE OF FILM, "THE IMITATION GAME") 

Benedict Cumberbatch: [00:03:23]  (As Alan Turing) Would you like to play? 

Rory Kinnear: [00:03:25]  (As Detective Nock) Play? 

Benedict Cumberbatch: [00:03:26]  (As Turing) It's a game, a test of sorts for determining whether something is a machine or a human being. 

Rory Kinnear: [00:03:36]  (As Nock) How do I play? 

Benedict Cumberbatch: [00:03:37]  (As Turing) Well, there's a judge and a subject, and the judge asks questions and, depending on the subject's answers, determines who he is talking with, what he is talking with. And all you have to do is ask me a question. 

Rick Howard: [00:03:54]  So according to this Turing test, this imitation game, we don't quite yet have a general-purpose AI, and what I mean by that is an AI that can walk around in the world and humans can't tell if it is a machine or not. But what we do have is a set of specific AIs designed for very small knowledge domains. These could pass the Turing test or, if not, could come very close. Now, for example, "Dota 2" is an astoundingly complex online computer game in which two teams of five players compete to seize and destroy the opposing team's base. In a 2017 demonstration tournament, an AI beat one of the premier professional players - and I'm going to totally mangle his name - Danylo "Dendi" Ishutin. And according to WIRED magazine, it is common practice now that commercial airlines use software robots to land their planes - under human supervision, of course. But I bet you didn't know that was happening. 

Rick Howard: [00:04:49]  The point is, for these specialized knowledge domains, we are already passing the Turing test. And we can see that in other knowledge domains, we will be able to do it in the very near future, like in autonomous cars and personal assistants like Siri and Alexa. These systems are almost there but not quite. We are accomplishing these milestones using a coding technique called machine learning. And for certain problems in the security space, machine-learning techniques do very well. But machine learning is not AI, at least not yet, and to understand why, let me take you back to the dinosaur days of yesteryear, the time before there was an internet and not everybody carried a high-speed computing device in their back pocket that we all laughably call a phone. 

Rick Howard: [00:05:34]  As an aside, calling an iPhone a phone is kind of like calling Elon Musk's Mars rocket an Uber ride. The phones you carry around with you are 1.5 million times faster than the computers that power the Voyager 1 and Voyager 2 spacecraft. And we call them phones - please. 

Rick Howard: [00:05:52]  Well, when I wrote code back in those dinosaur days, if I wanted the computer to do something, I had to think of it beforehand and tell it what to do. If there were a hundred possible things I wanted it to do, I had to code each and every one of them. By the way, I was never very good at coding. I could do it. I mean, I could get the programs to work, but my stuff was never elegant. I hit my programs over the head with a hammer. I could recognize elegant code and admire it when I saw it, but I could never emulate it. It was kind of like opera for me. It's kind of the reason I went into management. But coming back to the modern day, machine-learning algorithms use coding techniques that allow programs to sift through large piles of data over and over again, learning how to determine the right answer without explicitly being told how to do it by the programmer. 
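The contrast Rick is drawing, hand-coded rules versus a program that learns its own rule from labeled data, can be sketched in a few lines of Python. Everything here (the single "entropy" feature, the sample values, the threshold search) is invented purely for illustration:

```python
# A toy contrast between "dinosaur days" coding and machine learning.
# The "entropy" feature and all numbers are invented for illustration;
# real systems learn from thousands of features.

def hand_coded(sample):
    # Old style: the programmer decides the rule up front.
    return sample["entropy"] > 7.0

# Labeled examples the program can sift through: (entropy, is_malicious).
training = [(2.1, False), (3.4, False), (4.0, False),
            (6.2, True), (6.8, True), (7.5, True)]

def learn_threshold(data):
    # Machine learning in miniature: instead of the programmer
    # hard-coding the rule, scan the data for the cutoff that best
    # separates the labels.
    best_t, best_acc = 0.0, -1.0
    xs = sorted(x for x, _ in data)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        acc = sum((x > t) == y for x, y in data) / len(data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

t = learn_threshold(training)

def learned(sample):
    # New style: the rule came from the data, not the programmer.
    return sample["entropy"] > t
```

The point of the sketch is only the shape of the idea: the programmer writes the learning procedure once, and the data, not the programmer, determines the actual decision rule.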

Rick Howard: [00:06:39]  Some of these machine-learning techniques like deep learning and supervised versus unsupervised learning and reinforcement learning - they all sound complicated, but they are really just variations on the same theme. The point is for certain problems in the security space, these techniques do very well. For example, most antimalware vendors have been collecting malware samples for over a decade. Many of them have petabytes of file samples that they know are malicious and petabytes of files that they know are benign. Soon, those collections will be in the exabyte range. This situation is the perfect use case for machine learning. The current set of machine-learning algorithms can look at a brand-new file, a file that they have never seen before, and classify it as malicious or benign with over 97% accuracy, and they can do it with speed by simply looking at the attributes of the file. They don't have to run it in a sandbox or have any preknowledge of it in the form of a signature. That is the power of machine learning. 
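As a rough sketch of what "classify a never-before-seen file by its attributes alone" means, here is a toy nearest-centroid classifier in Python. The three file attributes, every value, and the tiny sample size are all invented; production models train on petabytes of samples and far richer features:

```python
# Hypothetical sketch of supervised classification on static file
# attributes. Features here: [size_kb, entropy, import_count], all
# invented for illustration.
import math

train = [
    ([120, 7.6, 3],  "malicious"),
    ([340, 7.9, 1],  "malicious"),
    ([  4, 7.2, 2],  "malicious"),
    ([220, 4.1, 40], "benign"),
    ([ 90, 3.8, 25], "benign"),
    ([510, 4.5, 60], "benign"),
]

def centroid(samples):
    # Average each feature across the labeled samples.
    n = len(samples)
    return [sum(f[i] for f in samples) / n for i in range(len(samples[0]))]

centroids = {
    label: centroid([f for f, l in train if l == label])
    for label in ("malicious", "benign")
}

def classify(features):
    # Nearest centroid wins: no sandbox run, no signature, just the
    # attributes of the file itself.
    return min(centroids, key=lambda l: math.dist(features, centroids[l]))
```

A brand-new file is classified purely by which learned "center of mass" its attributes sit closest to, which is the same basic move the commercial models make at much larger scale.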

Rick Howard: [00:07:38]  But the key to all this is large data sets. It is tough to tune your machine-learning algorithms when your sample size is small. We tried to use SIEMs to do it - these security information and event management systems. The idea was that we would stuff all of our telemetry from our security stack and networking equipment into the local SIEM installed behind the firewall and then write SIEM logic that would discover abnormal and unknown adversary activity. Well, that never worked. We could never store enough information in the SIEM to make that worthwhile because the SIEM vendors made their money on hard drive space that they sold us, and it was very expensive. 

Rick Howard: [00:08:16]  So network defenders like me started to make decisions on what not to store. We're not going to store PCAP data because, man, that's a lot. We're not going to store telemetry data for entire segments of our network because we don't have the hard drive space for it. And we are only going to store data for, say, three weeks or so before we start to overwrite it all. And we had to have SIEM experts on staff to write SIEM logic. Most of us didn't have those people. Local SIEMs turned out to be pretty good forensic tools. You could use them to find out what happened after the fact. But they were never very good at finding unknown adversary activity. 

Rick Howard: [00:08:52]  Enter the cloud. As the industry has started to pivot away from on-prem security devices to cloud-delivered security services, it all of a sudden became pretty cheap, comparatively, to store all of your telemetry, from your security stack and your networking equipment into a cloud environment - a cloudy SIEM, if you like. There are two schools of thought at the moment about how to do this. The first school of thought is the security vendor cloudy SIEM environment. These are the security vendors that automatically collect the telemetry from the services they have already sold you into their own cloud infrastructure. 

Rick Howard: [00:09:31]  They run their own machine-learning algorithms for you, and when they find new bad guys, they send their prevention controls to stop them down to their already deployed product stack in your environments. All you have to do is decide to use them. The downside is that they mostly only work for their own products' telemetry. They don't know how yet to consume their competitors' data. And since most of us have multiple security vendors in our environments, we don't get the complete picture. That's changing. Vendors are working on that. But we are a long way from that being a universal concept. 

Rick Howard: [00:10:02]  The second school of thought is the SIEM in the cloud model. Take the idea of the old on-prem SIEM and move it to the cloud. These solutions can consume most any kind of security vendor and network vendor telemetry. The downside is they still rely on the customer to sift through the data themselves looking for bad guys, maybe even running their own machine-learning algorithms and then writing code to deliver the new prevention controls to their deployed security stack. Today, these cloud SIEM vendors cannot do that for their customers. That is also starting to change. They are rushing to figure out how to do that, but we are a long way from that being universal, too. 

Rick Howard: [00:10:41]  So think of both schools of thought as being on opposite ends of the spectrum. They are both trying to race to the middle, where they can automatically consume telemetry from all devices and automatically generate and deliver prevention controls to their customers' infrastructure. They are just approaching it differently. My guess is that the security vendor cloudy SIEM environments have the edge because they can already deliver prevention controls to their own product set. 

Rick Howard: [00:11:08]  Regardless, whichever school of thought gets to the middle first wins, and there is big money to be made for the winner. Whoever wins, though, the benefits to the network defender community are staggering because we will finally be able to completely use machine-learning algorithms to detect and prevent previously unknown adversary behavior. Here's why. You may have heard of a set of technologies called EDR or endpoint detection and response. EDR vendors collect all the telemetry off your endpoints, store it in the cloud and run machine-learning algorithms on the data looking for bad guys. The setup is very similar to looking for just malicious files on the endpoint, only the EDR vendors broaden the telemetry collection to all endpoint changes - pretty cool stuff. 
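A crude way to picture "collect endpoint telemetry and look for bad guys" is a baseline-and-deviation check. The event type, the counts, and the three-sigma threshold below are all invented for illustration; real EDR models are far more sophisticated than this:

```python
# Toy EDR-style check: baseline one endpoint's telemetry, then flag
# periods that deviate sharply from it. All numbers are invented.
import statistics

# Hypothetical hourly process-creation counts from one endpoint.
baseline = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(count, threshold=3.0):
    # Flag any hour whose event count sits more than `threshold`
    # standard deviations above the baseline built from past telemetry.
    return (count - mean) / stdev > threshold
```

The scale is the whole trick: this is trivial for one endpoint and one event type, but doing it across every change on every endpoint is why the telemetry has to live in the cloud.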

Rick Howard: [00:11:52]  Well, why would you only look on the endpoint for this kind of data? We know that the adversaries have to run a sequence of actions along the intrusion kill chain to succeed with their mission. This includes network and endpoint activity from delivery of malicious code to command and control to lateral movement to exfiltration. Wouldn't you want all of that telemetry collected in the cloud so that your machine-learning algorithms can do something with it? Well, that is what XDR is. The X stands for everything - everything detection and response. The point is that you collect all of the telemetry along the intrusion kill chain and develop machine-learning algorithms to search for adversary activity. XDR is a relatively new tool in the toolbox. But that is the direction the entire industry is moving toward, regardless of which SIEM school of thought that will eventually win this fight. 
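The XDR idea, correlating telemetry from every point along the kill chain rather than from endpoints alone, can be sketched with a toy correlation rule. The event records, source names, and the three-stage alert threshold are all invented for illustration:

```python
# Hypothetical XDR-style correlation: merge telemetry from several
# sensors and look for hosts showing activity across multiple
# kill-chain stages. Events and sources are invented for illustration.
from collections import defaultdict

events = [
    {"host": "pc-7", "source": "email",    "stage": "delivery"},
    {"host": "pc-7", "source": "firewall", "stage": "command-and-control"},
    {"host": "pc-7", "source": "endpoint", "stage": "lateral-movement"},
    {"host": "pc-9", "source": "endpoint", "stage": "delivery"},
]

def suspicious_hosts(events, min_stages=3):
    # Correlate across endpoint AND network telemetry: any one
    # sensor's view alone would miss the full attack sequence.
    stages = defaultdict(set)
    for e in events:
        stages[e["host"]].add(e["stage"])
    return [h for h, s in stages.items() if len(s) >= min_stages]
```

Note that the endpoint sensor by itself sees only one stage on each host; it is the merged, cross-sensor view that surfaces the host worth investigating.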

Rick Howard: [00:12:45]  What is the impact on network defenders? Well, you may have noticed that in all of this discussion, there is no workable on-prem SIEM solution. If you're going to collect all the data you need for your XDR machine-learning algorithms to work, the only viable way to do that is to store it in some vendor's cloud infrastructure. In other words, you're going to have to get comfortable storing your telemetry data in one or more security vendor data centers. Now, take a deep breath. I know this is a hard pill to swallow. I already hear you saying things like, I'm not going to let some third party store my network data. I'm going to keep trying to do it the old way because the risk is too high. 

Rick Howard: [00:13:22]  Well, I'm here to tell you that the risk of not using the cloud so that you can take advantage of these automatic machine-learning capabilities is far greater than the risk of storing most of your networking data there. Our cyberspace adversaries have automated their attacks. We are never going to win against them if we cannot automate our response. And if you are a government network defender, you are starting to panic a little bit at this point because you can see that this is the only way forward, but you do not see a clear path on how to get there. 

Rick Howard: [00:13:51]  Governments are really sketchy about putting their information in the cloud. But the governments and the military are going to have to get comfortable storing their data in public cloud infrastructure - the Amazons, the Googles and the Microsofts of the world. I know that sounds really scary to the government crowd, but it is inevitable. You can fight it, but you're going to eventually end up there in the long run. As Kyle Reese said to Sarah Connor in the original "Terminator" movie... 

0:14:15:(SOUNDBITE OF FILM, "THE TERMINATOR") 

Michael Biehn: [00:14:15]  (As Kyle Reese) Come with me if you want to live. 

Rick Howard: [00:14:17]  Store your data in the cloudy environment if you want to live in the future. So that's a wrap for my thoughts on AI and machine learning, SIEMs and XDR. 

Rick Howard: [00:14:29]  If you agree or disagree with anything I have said, hit me up on LinkedIn or Twitter, and we can continue the conversation there. "CSO Perspectives" is edited by John Petrik and Tim Nodar and executive produced by Peter Kilpe, sound designed and mixed by the insanely talented Elliott Peltzman. And I am Rick Howard. Thanks for listening to "CSO Perspectives," and be sure to look for more Pro Plus content at thecyberwire.com/pro website.