Microsoft Azure through a first principle lens.

By Rick Howard

Jan 25, 2021

CSO Perspectives is a weekly column and podcast where Rick Howard discusses the ideas, strategies and technologies that senior cybersecurity executives wrestle with on a daily basis.

Amazon started the cloud revolution when it rolled out AWS in 2006. Microsoft followed suit with a competing service in 2010 with Azure. Google started to compete in the space with Google Cloud Platform (GCP) in 2012. There are other players in the market—Oracle and IBM come to mind—but the big three that most security executives talk about are Amazon, Microsoft, and Google.

The network defender community started to get serious about how to secure these environments around the same time the Cloud Security Alliance came online in 2009. Since then, we’ve all been trying to get our heads around the various cloud architecture ideas like Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS). We have also been struggling with how to get in sync with our IT brothers and sisters around the ideas of DevOps and DevSecOps within those infrastructures. Furthermore, we are trying not only to understand new coding techniques, like containers and serverless computing, but also to secure them in a cloud environment.

At the same time, the cybersecurity intelligence community has been bracing for some advanced adversary group to run a campaign directed purely at cloud resources. As an aside, this hasn’t happened yet. Apart from leveraging low-hanging fruit like S3 buckets left open to the internet, the bad guys haven’t run a purely cloud-centric campaign. Looking through the Mitre ATT&CK Framework Cloud Matrix, you see some cloud data theft and denial-of-service actions, but in general, the way adversary groups break into cloud environments is by stealing credentials from on-prem locations and then using them to log in legitimately to cloud resources. The most recent example is the SolarStorm campaign run by the adversary group UNC2452.

What I have noticed is that our entire community has been running heads-down now for years, thinking tactically about the technical widgets required to get these new environments running and then flipping switches and turning dials on those widgets to provide some modicum of security. I figured it was time to take a beat and consider the strategic picture. How do you think about cloud deployments through a first principle lens? How do you implement the four keystone strategies in each environment and how do you orchestrate those strategies not only in hybrid cloud environments but also in SaaS applications, mobile devices, and data centers back at headquarters as a single system of systems?

We’re going to spend the next few weeks thinking about those questions and we’re going to start with Microsoft Azure.

What is cloud computing?

Before we get to Microsoft, however, let’s talk about some cloud basics that apply to all cloud service providers. The first thing to note is that they offer some kind of networking infrastructure designed for their customers’ automation workloads. These come in the form of infrastructure (IaaS) or platform (PaaS) subscriptions. The idea is that managing hardware networking and server infrastructure shouldn’t have to lie at the core of your business. CIOs benefit from offloading that management burden to a third party so they can concentrate on developing software that will make their businesses competitive.

The second thing to note is that all cloud providers offer software as a service (SaaS) products to help you manage your workloads in their environments. Sometimes they provide them as part of the infrastructure service and sometimes you have to pay extra for them.

I bring this up because it might be useful to consider IaaS, PaaS, and SaaS subscriptions as individual products that are managed by different product management teams within the larger company. Depending on how old they are, you could consider some of them to be startup products. In other words, some are more mature than others.

AWS has been around for over a decade. That’s a mature service. Dell’s cloud offering launched in 2019; it’s probably not as mature. Google launched “Cloud Identity” as a SaaS product in 2018. Microsoft launched Azure Active Directory in 2019. These products may be fantastic, but they’re at most three years old. How mature can they be? Just because they have a big brand name over them doesn’t mean that they’re ready for prime time. That’s especially true for security products. Amazon released their AWS Network Firewall in 2020. You can’t expect that product to have the same feature set and maturity that the traditional firewall vendors like Check Point, Cisco, Palo Alto Networks, and Fortinet have in theirs. The cloud providers have been bolting on IT and security services without any real thought about how they might work together in some form of strategic plan. And all of those services are in various states of maturity.

What’s interesting is that CIOs and CSOs tend to use the cloud service provider’s SaaS services. Mind you, these are the same leaders who probably wouldn't deploy a prevention platform for the bulk of their security services on-prem because they don’t want to have a single vendor handling everything and they want to ensure that they have deployed best-of-breed security tools. But when they move to the cloud, somehow those two best practices don’t apply anymore.

With those basics explained, let’s look at Microsoft Azure.

The basics of Microsoft Azure.

In its simplest form, clients rent virtual network spaces through subscriptions that are bound to geographical regions. The smallest number of IP addresses you can have for any specific virtual network is three. The largest is almost 17 million. By default, all IP addresses in a virtual network are private, meaning that the Internet can’t touch them and any virtual host using that IP address can’t get to the Internet. You can make them public by using the Source Network Address Translation (SNAT) protocol and placing them behind the Azure load balancer SaaS product.
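The two address counts above can be reproduced with a little arithmetic. This is a minimal sketch that assumes the smallest network is a /29, the largest is a /8, and that Azure holds back five addresses in each subnet for its own use; the prefixes and the constant name are illustrative assumptions, not values taken from this essay.

```python
import ipaddress

# Assumption: Azure reserves five addresses in every subnet
# (network, gateway, two for DNS, and broadcast).
AZURE_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Return the number of addresses left for hosts in the given CIDR block."""
    network = ipaddress.ip_network(cidr)
    return network.num_addresses - AZURE_RESERVED_PER_SUBNET


# Assumed smallest (/29) and largest (/8) private ranges.
print(usable_addresses("10.0.0.0/29"))  # 8 total minus 5 reserved = 3
print(usable_addresses("10.0.0.0/8"))   # 16,777,216 total minus 5 reserved = 16777211
```

Under those assumptions, the /29 leaves three usable addresses and the /8 leaves just under 17 million, which lines up with the numbers quoted above.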

You can create multiple subnets within your virtual network. This will be important later, when we establish Network Security Groups (NSG) for each subnet to enhance our zero trust posture. Think of NSGs like mini stateful inspection firewalls that allow blocking rules between subnets based on IP addresses, ports, and tags.

To connect two different virtual networks, Microsoft provides an ExpressRoute circuit capability that isn’t transitive. That means that if a hub virtual network has an ExpressRoute circuit to two different spoke virtual networks, spoke 1 and spoke 2, the two spoke networks can’t talk to each other. They can both talk to the hub virtual network, but can’t pass traffic directly to each other. This hub model is important to understand because if you were going to insert a security stack between the two spoke virtual networks in the Azure model, this is how you would do it.
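One way to picture the non-transitive behavior described above is as a lookup against a list of direct links, with no chaining allowed. The sketch below is a toy model, not an Azure API; the network names and the function are hypothetical.

```python
# Toy model of non-transitive connectivity. Names are hypothetical.
# Each entry is one direct connection between two virtual networks.
DIRECT_LINKS = {
    frozenset({"hub", "spoke1"}),
    frozenset({"hub", "spoke2"}),
}


def can_talk_directly(a: str, b: str) -> bool:
    """True only when a direct link exists; links are never chained."""
    return frozenset({a, b}) in DIRECT_LINKS


print(can_talk_directly("spoke1", "hub"))     # True
print(can_talk_directly("spoke2", "hub"))     # True
print(can_talk_directly("spoke1", "spoke2"))  # False: no path through the hub
```

Because spoke-to-spoke traffic has no direct link, anything that needs to cross between spokes has to be routed deliberately through the hub, which is where a security stack would sit.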

You can also connect a virtual network back to your on-prem networks using a similar ExpressRoute circuit idea. And, if you want to get fancy, you can use the ExpressRoute circuit to connect to your SASE vendor. The SASE vendor establishes something called a “Meet Me” location in their data centers that has multiple peering connections to the Microsoft Regional Network Gateway (RNG) supporting your virtual networks.

One quick note about the “Azure Active Directory” SaaS product: “Azure Active Directory” isn't “Active Directory.” I know. That’s confusing. Let me explain. “Azure Active Directory” is an unfortunate marketing name for a product that provides federated identity management services using “Active Directory” as the authoritative source. What makes this even more confusing is that you can install an “Active Directory” service on a virtual host running in an Azure virtual network that, in order to provide federated services, will have to talk to the “Azure Active Directory” SaaS product.

Names of things: Don’t get me started. They are so important, maybe not in the heat of the initial assignment moment perhaps, but later, when things get complicated, having well-thought-out names for things can avoid heaps of confusion down the road. Case in point, the way the industry uses the same names to discuss adversary groups, adversary campaigns, and malware. Talk about confusion.

A look at Microsoft Azure through the lens of security first principles.

At the end of this essay, I provide a list of Microsoft Azure security tools that customers could deploy in their cloud environments: CASB, Data Classification, DDOS Protection, Federated Identity Management Services, SIEM and Data Analytics, Web Application Firewall, and XDR. Presumably, Azure customers would use these products to deploy our four first principle strategies.

Clouds and resilience: Microsoft Azure's strong suit.

This is where Microsoft, and pretty much all cloud providers, shine. The gap between the relative simplicity of creating system backups and high availability situations in cloud environments compared to the headache-inducing complexity of doing the same in your own data centers is wide. But for Microsoft Azure, there are a few concepts to understand.

Remember, a virtual network sits in a region. A region is a collection of data centers (Microsoft calls them availability zones, or AZs) that can have no more than 2 ms of latency between them. If data center 1 has more than 2 ms of latency to data center 2, they are not in the same region. To protect your workloads from a single data center failure, spread them across multiple AZs using the Microsoft load balancer SaaS product between them.

Each availability zone (AZ) contains racks and racks of physical servers. Azure availability sets (AS) distribute workloads across three server racks at a time (Microsoft calls them fault domains) and protect customers from rack-level failure.
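To see why spreading a workload across three racks protects against a rack-level failure, consider this small sketch. It assumes a simple round-robin placement, which is an illustration only and not a description of how Azure actually schedules instances; the function names are hypothetical.

```python
from collections import Counter

FAULT_DOMAINS = 3  # from the text: workloads are spread across three racks


def place_instances(instance_count: int) -> list[int]:
    """Assign each instance to a rack in round-robin order (an assumption)."""
    return [i % FAULT_DOMAINS for i in range(instance_count)]


def survivors_after_rack_failure(placement: list[int], failed_rack: int) -> int:
    """Count the instances that are not on the failed rack."""
    return sum(1 for rack in placement if rack != failed_rack)


placement = place_instances(6)
print(Counter(placement))                                      # two instances on each rack
print(survivors_after_rack_failure(placement, failed_rack=0))  # 4 of 6 keep running
```

With six instances spread evenly, losing any one rack still leaves four running, so the service degrades rather than disappears.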

For backups and disaster recovery, there are many options to consider but probably the simplest model to understand is to distribute the workload across multiple regions. In this model, there is no concept of a hot and cold site, or a hot and a warm site. Both sides are hot, meaning that each is allowing read/write transactions and the underlying databases in both regions are keeping themselves in sync. This is way easier to say than it is to do, and there are cost considerations to examine, but with a Microsoft Azure deployment, resilience through code is possible. You can see why DevOps teams love cloud deployments. They can build robust high availability solutions, disaster recovery operations, and backup procedures and their infrastructure is code.

Where does Microsoft fall with respect to zero trust?

In terms of zero trust, Microsoft doesn’t have a product per se, but they do have a substantial collection of written advice around the topic that incorporates their product set. How can I say this politely? It’s a lot. The number of potential combinations of services is exponential. They have tools to monitor and manage identity, devices, SaaS applications, data, and networks. In 2017, they rolled out a SaaS application called Microsoft Secure Score, a security dashboard that tracks things like virtual inventories, security alerts, and compliance, and attempts to prioritize the security to-do list.

Microsoft recommends you start with complete visibility by registering all of your users, endpoints, and applications with Azure Active Directory. I admit, that’s a really good first step. Then use Microsoft Secure Score to help prioritize the workload and lock everything down. It will still be a bit of a black box to get a precise sense of your zero trust posture by reviewing this telemetry yourself, but it’s a start. I don’t recommend throwing this task into the SOC as an additional duty for the already overwhelmed SOC analysts either. In order to get this done, you would need a small team that focuses only on this zero trust task and can also automate the steps as it goes. By the way, I think that is the correct path anyway. If zero trust is a foundational stone in our first principle wall, then we might want to have some dedicated resources implementing it, not just people, but people who can code.

The one capability that Microsoft offers that has a one-to-one connection to zero trust is its Network Security Groups (NSGs). In every virtual network that you deploy, you can also establish one or more subnets and use NSGs to prevent the subnets from talking to each other. In one simple model, you can have subnets for finance, marketing, and the development team, just to name three. You would then deploy rules on each NSG to prevent communication outside of those subnets. In that way, you would reduce the attack surface of each of those business groups and limit the potential damage from lateral movement if attackers somehow compromised any one of them.
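The subnet isolation idea can be sketched as a priority-ordered rule list where the first matching rule wins. This is a simplified model for illustration and not the Azure API: the subnet ranges, the `Rule` class, and `is_allowed` are all hypothetical, and the sketch ignores the default rules that real NSGs ship with.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int         # lower numbers are evaluated first
    source: str           # CIDR block of the sender
    destination: str      # CIDR block of the receiver
    port: Optional[int]   # None matches any port
    allow: bool


# Hypothetical address ranges, chosen only for illustration.
VNET = "10.0.0.0/16"
FINANCE = "10.0.1.0/24"
MARKETING = "10.0.2.0/24"

FINANCE_NSG = [
    Rule(100, FINANCE, FINANCE, None, True),  # finance may talk to finance
    Rule(200, VNET, FINANCE, None, False),    # everyone else in the network is blocked
]


def is_allowed(rules: list, src_ip: str, dst_ip: str, port: int) -> bool:
    """Return the verdict of the first matching rule, in priority order."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if (src in ipaddress.ip_network(rule.source)
                and dst in ipaddress.ip_network(rule.destination)
                and rule.port in (None, port)):
            return rule.allow
    return False  # simplification: deny when nothing matches


print(is_allowed(FINANCE_NSG, "10.0.1.5", "10.0.1.9", 443))  # True: finance to finance
print(is_allowed(FINANCE_NSG, "10.0.2.7", "10.0.1.9", 443))  # False: marketing to finance
```

A compromised marketing host in this model cannot reach the finance subnet at all, which is the lateral-movement limit the paragraph above describes.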

One note about the zero trust strategy and the SolarStorm supply chain attack campaign that became public in December 2020: the consensus from the security community is that a strong zero trust deployment might have prevented the success of the campaign. According to Microsoft's director of identity security, Alex Weinert, “Even in the worst case of SAML token forgery, excessive user permissions and missing device and network policy restrictions allowed the attacks to progress.” From my understanding of how the Microsoft Azure Active Directory SaaS product works, SolarStorm victims could have configured their virtual networks to prevent the Golden SAML attack and that is a good thing.

Microsoft and Intrusion Kill Chain prevention and risk assessment.

Microsoft is silent on how to install prevention and detection controls for all known cyber adversaries with Azure. Other than the Microsoft Threat Prevention Team periodically publishing threat reports on various actors and associated campaigns, the kill chain strategy is absent. Again, this is not to single Microsoft out. No security vendor that I know of does this. The bottom line is that if you want to pursue this strategy, you will have to do that yourself either with the collection of Microsoft SaaS products or with third party security products. The same goes for risk assessments. And again, that is not a hit on Microsoft. If you are running a mature risk assessment program, you are not doing it with any security vendor product that I know.

First principles aren't yet set-it-and-forget-it in the cloud.

If you are trying to implement the four pillar strategies from my first-principle vision, you aren’t doing that with Microsoft Azure alone. Intrusion kill chain prevention and risk assessment notwithstanding, zero trust and resilience inside of Microsoft Azure are not fully formed products either. They remain just a collection of tactical tools that the customer has to manage in order to accomplish a larger strategy. And they are incomplete solutions with respect to how such strategies might be orchestrated in hybrid cloud environments and in other key places where our data might reside (like in our own data centers and employee devices). That said, this is no different from the current situation with on-prem security solutions either. No one single vendor can do it all. Security platforms can do a lot of it, but they will have to rely on the cloud service providers to supplement them, too.

One final note: the YouTube videos produced by John Savill regarding the inner workings of Microsoft Azure are well done. He not only understands how everything works, but he also understands what IT and security executives are trying to accomplish. There are links to his material in the reading section below, but I would recommend anything that he publishes.

A list of Microsoft security-as-a-service (SaaS) and consulting services.

As of this writing, these are the Microsoft Azure security offerings:

Consulting services.

Azure Well-Architected Review: a review of your Azure deployment for Cost Optimization, Operational Excellence, Performance Efficiency, Reliability, and Security.