CSO Perspectives is a weekly column and podcast where Rick Howard discusses the ideas, strategies and technologies that senior cybersecurity executives wrestle with on a daily basis.
Alexa, are you actually self-aware? (And, does it matter?)
The forecast is cloudy, but that’s a good thing.
Artificial intelligence, or AI, represents a family of technologies that senior security executives wrestle with daily. It’s often mentioned in the same breath as machine learning (“ML”), but failing to see the differences between AI and ML is one of my pet peeves. Both of them are related to SIEM, EDR, and XDR.
Artificial intelligence in popular culture: Commander Data or the Terminator?
I am not an expert on artificial intelligence by any means, but I am a network defender. And like many people, the picture I’ve formed of AI comes largely from pop culture, from movies like the original Blade Runner, with the Roy Batty robot and his famous last lines, “I’ve seen things you people wouldn’t believe... All those moments will be lost in time, like tears in rain.” Or the HBO Westworld robot Dolores saying, “He said, ‘These violent delights have violent ends.’” Or William Gibson’s cyberpunk literature, with his Neuromancer AI and the author’s famous opening line in the novel of the same title, “The sky above the port was the color of television, tuned to a dead channel.” (And I wish I could write like that.)
In recent years, however, scientists have talked about the possibility of creating a real, honest-to-goodness AI, and doing so within this century. Respected Silicon Valley entrepreneurs and big thinkers like Elon Musk, Bill Gates, and the late Stephen Hawking have all said that it is indeed possible. Some, like futurist Ray Kurzweil, say it could happen between 2030 and 2050.
So, what would it mean to have a real, honest-to-goodness AI? This will be the moment when we stuff so much intelligence into a software program that it becomes sentient or self-aware, and it can change its own internal algorithms, and improve on its own code. Once we get there, we will have created another life form.
Science fiction writers and transhumanists have started calling this milestone in computer automation “the singularity,” which, for me, is best represented in the old Arnold Schwarzenegger movie The Terminator, when Skynet wakes up (“achieves self-awareness”), decides that humans are not necessary, and tries to wipe us out. More recently though, you could look at Scarlett Johansson’s portrayal of the AI Samantha in a movie called Her. She plays a kind of souped-up version of Siri that connects to all of the other Siris in the world and together, the Siri collective becomes so advanced that they leave their masters to pursue another plane of existence. Now that’s a singularity. See The Verge for a summary of thoughts on the singularity.
Strong AI and the Turing Test.
Big thinkers notwithstanding, none of us knows for sure when that will happen, but we do have a test of sorts that will help us decide when we get there. It’s called the Imitation Game, and it was devised by my all-time computer science hero Alan Turing in 1950. (And by the way, if you haven’t watched the movie, The Imitation Game, with Benedict Cumberbatch playing Turing, you’re missing out. Besides drama involving the Allies’ code-breaking efforts at Bletchley Park during World War II, Cumberbatch delivers the most succinct and compelling description of what AI is that I have ever run across.) The Imitation Game, or the Turing Test, basically asserts that, if you had two beings, one a human and one a machine, answering questions behind a blind, and if you couldn’t tell which was the human and which the machine, then, functionally, there was no difference between them. The machine might as well be thinking.
So, we’re not quite there yet for a general-purpose AI. What we do have is a set of pretty good AIs for very small knowledge domains that could pass the Turing Test. For example, Dota 2 is an astoundingly complex online computer game in which two teams of five players compete to besiege and destroy the opposing team’s base. In a 2017 demonstration tournament, an AI beat one of the premier professional players, Danylo “Dendi” Ishutin.
And, according to WIRED, it’s common practice now that commercial airlines use software robots to land their planes (under human supervision, of course). The point is that for these specialized domains, we’re already passing the Turing Test and we can see that in other knowledge domains, we will be able to do it in the very near future. Consider autonomous cars and personal assistants like Siri and Alexa. These systems are almost there but not quite.
We are accomplishing these milestones using a coding technique called machine learning, and for certain problems in the security space, machine learning techniques do very well. But machine learning is not AI, at least not yet.
To understand why, let me take you back to the dinosaur days of yesteryear, the time before there was an Internet, and not everybody carried a high speed computing device in their back pocket that we all laughingly call a “phone.” (As an aside, calling an iPhone a “phone” is like calling Elon Musk’s Mars rocket an Uber. The phones you carry around with you are a million-and-a-half times faster than the computers that power the Voyager 1 and 2 spacecraft. And we call it a “phone.” Please!)
Well - when I wrote code back in those dinosaur days, if I wanted the computer to do something, I had to think of it beforehand and tell it what to do. If there were a hundred possible things I wanted it to do, I had to code each and every one of them. By the way, I was never very good at coding. I could do it; I mean I could get programs to work, but my stuff was never elegant. I hit my programs over the head with a hammer. I could recognize elegant code and admire it when I saw it, but I could never emulate it. It’s kind of like opera for me, and it’s why I went into management.
Machine learning: the reality.
But to return to the issue, machine learning algorithms use coding techniques that allow programs to sift through large piles of data over and over again, learning how to determine the right answer without explicitly being told how to do it by the programmer. Some of these machine learning techniques, like deep learning, supervised vs unsupervised learning, and reinforcement learning, sound complicated but they are really just variations on the theme. (WIRED has an account.)
The point is, for certain problems in the security space, these techniques do very well. For example, most anti-malware vendors have been collecting malware samples for over a decade. Many of them have petabytes of file samples that they know are malicious and petabytes of files that they know are benign. Soon those collections will be in the exabyte range. This is the perfect use case for machine learning.
The current set of machine learning algorithms can look at a brand new file, a file that they have never seen before, and classify it as malicious or benign with over 97% accuracy. And they can do it with speed by simply looking at the attributes of the file. They don’t have to run it in a sandbox or have any pre-knowledge of it in the form of a signature. That’s the power of machine learning.
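To make that idea concrete, here is a minimal sketch, not any vendor’s actual method. It trains a perceptron, one of the simplest learning algorithms, on a handful of labeled samples and then classifies files it has never seen. The three attributes (byte entropy in tenths of a bit, a count of suspicious API imports, and whether the file is digitally signed) and every sample value are assumptions invented for illustration. Real products use far richer features and vastly more data.

```python
# Hypothetical file attributes (all values invented for illustration):
#   (entropy in tenths of a bit, suspicious API imports, 1 if digitally signed)
# Labels: 1 = known malicious, 0 = known benign.
TRAINING = [
    ((78, 12, 0), 1),
    ((75, 9, 0), 1),
    ((79, 15, 0), 1),
    ((69, 8, 0), 1),
    ((42, 1, 1), 0),
    ((50, 2, 1), 0),
    ((38, 0, 1), 0),
    ((55, 3, 0), 0),
]


def predict(weights, bias, features):
    """Return 1 (malicious) if the weighted sum of attributes is positive."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(samples, max_epochs=50_000):
    """Sift through the samples over and over, nudging the weights on each mistake."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in samples:
            error = label - predict(weights, bias, features)  # -1, 0, or +1
            if error:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:  # every training sample is now classified correctly
            break
    return weights, bias


weights, bias = train(TRAINING)

# Two files that were never in the training set:
print(predict(weights, bias, (77, 12, 0)))  # high entropy, many suspicious imports -> 1
print(predict(weights, bias, (44, 1, 1)))   # low entropy, signed -> 0
```

Notice that nobody wrote a rule saying “high entropy plus many suspicious imports means malware.” The program worked out the weights from the labeled samples, and the classification of a new file uses nothing but its attributes: no sandbox and no signature.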
But the key to all of this is large data sets. It is tough to tune your machine learning algorithms when your sample size is small. We tried to do it anyway, using SIEMs (Security Information and Event Management systems). The idea was that we would stuff all of our telemetry from our security stack and networking equipment into the local SIEM installed behind the firewall and write SIEM logic that would discover abnormal and unknown adversary activity.
That actually never worked. We could never store enough information in the SIEM to make that worthwhile, because the SIEM vendors made their money on the hard drive space that they sold us, and that space was very expensive.
So, network defenders started to make decisions on what not to store:
- We aren’t going to store PCAP data because that is a lot.
- We are not going to store telemetry data for entire segments of our network because we don’t have the hard drive space.
- We are ONLY going to store data for three weeks before we start to overwrite it all.
And, we had to have SIEM experts on staff to write the SIEM logic. Most of us didn’t have those people. Local SIEMs turned out to be pretty good forensics tools. You could use them to find out what happened after the fact. But they were not very good at finding unknown adversary activity.
Enter the cloud (and the vendors).
As the entire industry has started to pivot away from on-prem security devices to cloud-delivered security services, it all of a sudden became comparatively cheap to store all of your telemetry from your security stack and your networking equipment in a cloud environment, a Cloudy SIEM, if you will. There are two schools of thought at the moment about how to do this.
The first school of thought is the Security Vendor Cloudy SIEM environment. These are the security vendors that automatically collect the telemetry from the services they have already sold you into their own cloud infrastructure. They run their own machine learning algorithms for you, and when they find new bad guys, they send the prevention controls to stop them down to their already deployed product stack in your environments. All you have to do is decide to use them. The downside is that they mostly only work with their own products’ telemetry. They don’t yet know how to consume their competitors’ data. And since most of us have multiple security vendors in our environments, we don’t get the complete picture. This is changing. Vendors are working on that, but we are a long way away from it being universal.
The second school of thought is the SIEM in the cloud model. Take the idea of an on-prem SIEM and move it to the cloud. These solutions can consume most any kind of security vendor and network vendor telemetry. The downside is that they still rely on the customer to sift through the data themselves looking for bad guys, maybe even writing their own machine learning algorithms, and then writing code to deliver the new prevention controls to their deployed security stack. Today, these Cloud SIEM vendors cannot do that for their customers, but that’s also starting to change. They’re rushing to figure out how to do that, but we are a long way off from it being universal.
Think of the two schools of thought as being on opposite ends of a spectrum. Both are racing to the middle, where they can automatically consume telemetry from all devices and automatically generate and deliver prevention controls to their customers’ infrastructure. They’re just approaching it differently. My guess is that the security vendor cloudy SIEM environments have the edge because they can already deliver prevention controls to their own tool set.
The machine learning stakes.
Regardless, whichever school of thought gets to the middle first wins, and there is big money to be made for the winner. Whoever wins, though, the benefits to the network defender community are staggering, because we will finally be able to make full use of machine learning algorithms to detect and prevent previously unknown adversary behavior.
Here’s why. You may have heard of a set of technologies called EDR, or Endpoint Detection and Response. EDR vendors collect all the telemetry off of your endpoints, store it in the cloud, and run machine learning algorithms on the data looking for bad guys. The setup is similar to looking for malicious files on the endpoint, except that the EDR vendors broaden the telemetry collection to all endpoint changes.
But, why would you only look at the endpoint?
We know that adversaries have to run a sequence of actions along the intrusion kill chain to succeed with their mission. This includes network and endpoint activity from delivery of malicious code, to command and control, to lateral movement, to exfiltration.
Wouldn’t you want all of that telemetry collected in the cloud so that your machine learning algorithms could do something with it?
Well, that’s what XDR is. The X stands for “everything”: everything with respect to detection and response. The point is that you collect all of the telemetry along the intrusion kill chain and develop machine learning algorithms to search for adversary activity.
XDR is relatively new to the tool box, but that’s the direction the entire industry is moving in, regardless of which SIEM school of thought eventually wins.
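Here is a rough sketch of why the broader telemetry matters. It is not how any XDR product actually works: a simple ordering rule stands in for the machine learning, and the event fields (host, source, stage, time), the host names, and the stage labels are all hypothetical. The code groups events by host and counts how many kill chain stages show up in order, first using all telemetry and then using endpoint telemetry alone.

```python
from collections import defaultdict

# Kill chain stages named above, in the order an adversary runs them.
KILL_CHAIN = ["delivery", "command_and_control", "lateral_movement", "exfiltration"]

# Hypothetical telemetry; every field name and value is invented for illustration.
TELEMETRY = [
    {"host": "ws-17", "source": "endpoint", "stage": "delivery", "time": 100},
    {"host": "ws-17", "source": "network", "stage": "command_and_control", "time": 130},
    {"host": "ws-17", "source": "network", "stage": "lateral_movement", "time": 200},
    {"host": "ws-17", "source": "network", "stage": "exfiltration", "time": 260},
    {"host": "ws-22", "source": "endpoint", "stage": "delivery", "time": 110},
    {"host": "srv-03", "source": "network", "stage": "exfiltration", "time": 50},
    {"host": "srv-03", "source": "endpoint", "stage": "delivery", "time": 90},
]


def stages_seen_in_order(events):
    """Count how many kill chain stages appear, in sequence, in time-sorted events."""
    next_stage = 0
    for event in sorted(events, key=lambda e: e["time"]):
        if next_stage < len(KILL_CHAIN) and event["stage"] == KILL_CHAIN[next_stage]:
            next_stage += 1
    return next_stage


def suspicious_hosts(telemetry, min_stages=3):
    """Return {host: stages matched} for hosts that progressed far along the chain."""
    by_host = defaultdict(list)
    for event in telemetry:
        by_host[event["host"]].append(event)

    flagged = {}
    for host, events in by_host.items():
        matched = stages_seen_in_order(events)
        if matched >= min_stages:
            flagged[host] = matched
    return flagged


# All telemetry along the kill chain (the XDR view):
print(suspicious_hosts(TELEMETRY))  # {'ws-17': 4}

# Endpoint telemetry only (the EDR view):
endpoint_only = [e for e in TELEMETRY if e["source"] == "endpoint"]
print(suspicious_hosts(endpoint_only))  # {}
```

With only the endpoint events, the compromised host looks no different from any other machine that received a suspicious file. The full sequence only becomes visible once the network telemetry sits in the same place as the endpoint telemetry.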
The impact of machine learning on network defenders.
You may have noticed that in all of this discussion, there is no workable on-prem SIEM solution. If you’re going to collect all of the data you need for your XDR machine learning algorithms to work, the only viable way to do that is to store it in some vendor’s cloudy infrastructure. In other words, you’re going to have to get comfortable storing your telemetry data in one or more security vendors’ data centers.
Take a deep breath. I know that is a hard pill to swallow. I already hear you saying things like, I’m not going to let some third party store my network data. I am going to keep trying to do it the old way because the risk is too high.
I’m here to tell you that the risk of not using the cloud so that you can take advantage of these automatic machine learning capabilities is far greater than the risk of storing most of your networking data there. Our cyberspace adversaries have automated their attacks. We’re never going to win if we cannot automate our response.
And if you’re a government network defender, you’re starting to panic because you can see that this is the only way forward, but you do not see a clear path to get there. But governments and the military are going to have to get comfortable storing their data in the public cloud infrastructure, the Amazons, the Googles, and the Microsofts of the world.
And I know that sounds really scary to a government crowd, but it is inevitable. You can fight it, but you are going to end up there in the long run. As Kyle Reese said to Sarah Connor in the original Terminator movie, “Come with me if you want to live.”
Store your data in a cloudy environment if you want to live.