Amazon AWS security: resilience, zero trust, intrusion kill chain prevention, and risk assessment.

By Rick Howard

Feb 8, 2021

CSO Perspectives is a weekly column and podcast where Rick Howard discusses the ideas, strategies and technologies that senior cybersecurity executives wrestle with on a daily basis.

In the last essay, I did a deep dive into Microsoft Azure and first principles. In the process, I learned that cloud providers have developed a vocabulary to describe the concepts of how their services work. The concepts shared among all cloud providers are similar, but many of them have different names and offer subtle differences in capability, which isn't confusing at all.

Comparing basic networking concepts in Microsoft Azure and Amazon AWS.

For example, customers rent space on each cloud provider’s network. Microsoft calls them “virtual networks.” Amazon calls them “Virtual Private Clouds” (VPCs).

The IP addresses that support these ephemeral networks are private, meaning that any virtual host within can’t get to the internet, and that any host on the internet can’t see the virtual hosts. In order to allow access to and from the internet, cloud providers use an old networking trick called network address translation, or NAT. According to the GeeksforGeeks website, “The idea of NAT is to allow multiple devices to access the Internet through a single public address…. it is a process in which one or more local IP addresses are translated into one or more Global IP addresses and vice versa.” Amazon calls its version a NAT Gateway and Microsoft calls its version Source Network Address Translation (SNAT).
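To make the NAT idea concrete, here is a minimal sketch of a translation table in Python. It is a toy model, not how either provider implements its gateway; the class name, the addresses, and the starting port are all assumptions I made up for illustration.

```python
import itertools


class SourceNat:
    """Toy source-NAT table: many private hosts share one public address."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Rewrite a private source address to the shared public address."""
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, public_port):
        """Map a reply back to the private host, or None if nothing matches."""
        return self._back.get(public_port)


# Hypothetical addresses: two private hosts behind one public IP.
nat = SourceNat("203.0.113.10")
a = nat.outbound("10.0.1.5", 44321)
b = nat.outbound("10.0.2.9", 44321)
```

Both hosts appear on the internet as the same public address with different ports, and an unsolicited inbound packet on a port with no table entry has nowhere to go.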

Both companies have the same concept and name for organizing their data centers. A region is a physical location around the world where one or more data centers are clustered. An availability zone (AZ) is a logical construct of one or more discrete data centers. For both companies, each availability zone has its own subnet that the customer can divide into multiple smaller subnets. Within those subnets, customers can deploy virtual workloads. Microsoft calls their workloads “Azure Virtual Machines” or “Azure Instances,” and Amazon calls theirs “Amazon Elastic Compute Cloud” or “Amazon EC2.”

To reduce the attack surface between subnets, both Microsoft and Amazon have this idea of a simplified virtual stateful inspection firewall. Microsoft calls theirs a “Network Security Group” (NSG) and Amazon simply calls theirs a “Security Group.” For example, let’s say that you have one subnet dedicated to finance and another dedicated to devops. Security administrators can restrict access to each subnet by IP and port, which is similar to the old hardware stateful inspection firewalls that were invented back in the 1990s.
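Restricting access by IP and port boils down to checking traffic against an allow list and denying everything else. The sketch below shows that logic with Python's standard ipaddress module. The rule shape, the subnet ranges, and the function name are my own assumptions, not the providers' rule formats.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    cidr: str  # source range that is allowed in
    port: int  # destination port that is allowed


def allowed(rules, src_ip, dst_port):
    """Default deny: traffic passes only if some rule matches it."""
    src = ipaddress.ip_address(src_ip)
    return any(
        src in ipaddress.ip_network(rule.cidr) and dst_port == rule.port
        for rule in rules
    )


# Hypothetical policy: only the finance range may reach finance on 443.
finance_rules = [Rule("10.0.1.0/24", 443)]
```

With this policy, a devops host at 10.0.2.7 is refused on port 443, and a finance host is refused on any port other than 443.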

In order to connect their customers’ virtual instances back to their traditional on-prem data centers, Azure has a SaaS application called “ExpressRoute” and AWS has a SaaS application called “Direct Connect.”

Storage is also a SaaS application that sits outside the customer’s cloud infrastructure. Microsoft calls their storage service “Azure Storage,” and Amazon calls theirs “Simple Storage Service” (S3). Colloquially, the network defender community has defaulted to calling the AWS Simple Storage Service “S3 buckets” and these have been in the security news since the initial launch of the service back in 2006. Many organizations don’t configure them properly, and mistakenly leave them open to anybody on the internet, thus permitting anybody to copy the data within. According to the Register’s Shaun Nichols, as of August 2020, “Leaky AWS S3 buckets are so common, they're being found by the thousands now – with lots of buried secrets.” Azure Storage has its leaky moments too, but the tech news media don’t report as many.

The key point to remember is that both companies offer a mix of virtual infrastructure capability and supporting SaaS applications that include security tools. Some SaaS tools come with the subscription, but others you have to pay for. With SaaS security applications, network defenders can try to orchestrate their four first principle strategies: intrusion kill chain prevention, zero trust, resilience, and risk assessment. But before I talk about that, let’s go over some Amazon details that are unique to the AWS service.

Amazon Web Services (AWS) basics.

AWS has a notion of Network Access Control Lists (NACLs) that provide stateless controls between subnets. By stateless, I mean that they don’t keep track of the back-and-forth communication between two endpoints. If you want a NACL rule to block a specific IP address between the finance subnet and the devops subnet, you have to install two rules: one to the blocked IP and one from the blocked IP.
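Here is a small sketch of why a stateless control needs two rules. It models a deny list keyed by direction and address; the class, the direction labels, and the IP are hypothetical and are not the AWS NACL rule format.

```python
class StatelessAcl:
    """Toy stateless filter: every packet is judged on its own."""

    def __init__(self, deny_rules):
        # Each rule is a (direction, remote_ip) pair.
        self.deny_rules = set(deny_rules)

    def permits(self, direction, remote_ip):
        # No connection tracking: only an exact rule match blocks a packet.
        return (direction, remote_ip) not in self.deny_rules


blocked_ip = "10.0.2.50"  # hypothetical devops host

# One rule only stops packets arriving from the blocked address.
one_rule = StatelessAcl({("inbound", blocked_ip)})

# Two rules stop traffic in both directions.
two_rules = StatelessAcl({("inbound", blocked_ip), ("outbound", blocked_ip)})
```

With one_rule, packets sent to the blocked address still leave the subnet, because the filter has no memory tying the two directions together.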

AWS also specifically requires virtual router configuration. Even if you allow communication between subnets with security groups and NACL rules, you still have to add the communication routes to the router. Microsoft Azure automatically configures the routes for the customer. Logically then, the communications path between two AWS subnets is this:

EC2 workload in subnet 1
through the Security Group for subnet 1 (Optional)
through the NACL for subnet 1 (Optional)
through the router for the VPC (Mandatory)
through the NACL for subnet 2 (Optional)
through the Security Group for subnet 2 (Optional)
to the EC2 workload in subnet 2

If you want to insert a security stack between the subnets, let’s say a virtual firewall, you would create a third subnet and adjust the router to point to it instead of subnet 2.
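The routing step can be sketched as a next-hop lookup. In this toy model, the route table maps a (current location, destination) pair to the next hop. The subnet names and the table layout are assumptions for illustration, not AWS route table syntax.

```python
def path(routes, src, dst):
    """Follow next-hop entries from src until dst is reached."""
    hops = [src]
    while hops[-1] != dst:
        next_hop = routes.get((hops[-1], dst))
        if next_hop is None:
            # No route installed: traffic goes nowhere, whatever the
            # security groups and NACLs would have allowed.
            raise LookupError(f"no route from {hops[-1]} to {dst}")
        if next_hop in hops:
            raise LookupError("routing loop")
        hops.append(next_hop)
    return hops


# Direct route between the two subnets.
direct = {("subnet-1", "subnet-2"): "subnet-2"}

# Same destination, but the router now points at a third subnet that
# hosts a hypothetical virtual firewall, which forwards to subnet 2.
inspected = {
    ("subnet-1", "subnet-2"): "subnet-3-firewall",
    ("subnet-3-firewall", "subnet-2"): "subnet-2",
}
```

Calling path(inspected, "subnet-1", "subnet-2") returns the three-hop path through the firewall subnet, while an empty route table raises LookupError.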

Amazon AWS through a first principle lens.

With those AWS basics out of the way, let’s see how Amazon helps us deploy our four first principle strategies: resilience, zero trust, intrusion kill chain prevention, and risk assessment.

Amazon AWS resilience.

As I said in the Azure essay, resilience is where most cloud providers shine. They make it relatively easy to establish redundant workloads and redundant data storage in geographically separate locations. Compared to how the IT and security communities have been doing this for years with our own physical data centers (the ones we own and maintain ourselves), virtual data centers from the likes of Microsoft, Google, and Amazon take the burden of data center management off the plate. We can do it all with software using provider SaaS applications and APIs, which is probably the most compelling argument to adopt the devops and devsecops philosophies.

As with Azure, Amazon organizes AWS into physical regions around the world, and each region offers customers one or more availability zones, each backed by one or more data centers. For critical workloads, as with Azure, there are many resiliency options to consider in AWS. The most compelling, from my view, is establishing redundant workloads in different regions that keep each other updated automatically; a hot-hot model compared to a hot-cold model or even a hot-warm model. The hot-hot model provides another degree of resilience in case a data center in one region fails for whatever reason. The other region will just pick up the slack and, if done correctly, can scale workloads automatically until the crisis is rectified.

For backups, AWS offers something called “Amazon EBS snapshots” that function similarly to how traditional backup systems have worked back on prem. They’re incremental backups, meaning that the system only saves the changes since the last snapshot. AWS also offers a database backup SaaS application called CloudEndure Disaster Recovery that continuously replicates EC2 workloads (including operating system, system state configuration, databases, applications, and files) into low-cost S3 buckets.
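The incremental idea is easy to show in a few lines. This sketch treats a volume as a dictionary of numbered blocks and saves only the blocks that changed. It is a simplification I wrote for illustration (it ignores deleted blocks, for one) and says nothing about how the AWS service stores its data.

```python
def incremental_snapshot(previous_blocks, current_blocks):
    """Keep only blocks that are new or changed since the last snapshot."""
    return {
        index: data
        for index, data in current_blocks.items()
        if previous_blocks.get(index) != data
    }


def restore(snapshots):
    """Replay snapshots oldest to newest to rebuild the volume."""
    volume = {}
    for snapshot in snapshots:
        volume.update(snapshot)
    return volume


# Hypothetical volume contents on two consecutive days.
day1 = {0: b"boot", 1: b"ledger-v1", 2: b"logs"}
day2 = {0: b"boot", 1: b"ledger-v2", 2: b"logs", 3: b"new-report"}

snap1 = incremental_snapshot({}, day1)    # first snapshot holds everything
snap2 = incremental_snapshot(day1, day2)  # second holds only the changes
```

Here snap2 contains just blocks 1 and 3, yet replaying snap1 and then snap2 rebuilds the full day-two volume.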

For backup encryption, AWS offers server-side encryption, meaning that once the data is on the backup system, the server encrypts it. For ransomware protection, AWS offers an immutable storage feature called S3 Object Lock: once the data is stored, it can’t be changed, deleted, or encrypted for a specified period of time, in a write-once-read-many (WORM) model.
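The write-once-read-many idea can be sketched as a store that refuses changes until a retention period expires. This is a toy model under my own assumptions: the class and exception names are invented, and it simply rejects overwrites, which may not match how the real service behaves, so check the AWS documentation before relying on any detail here.

```python
from datetime import datetime, timedelta, timezone


class RetentionError(Exception):
    """Raised when a change is attempted during the retention period."""


class WormStore:
    """Toy write-once-read-many store with a fixed retention period."""

    def __init__(self, retention):
        self.retention = retention
        self._objects = {}  # key -> (data, locked_until)

    def _check_unlocked(self, key, now):
        entry = self._objects.get(key)
        if entry is not None and now < entry[1]:
            raise RetentionError(f"{key} is locked until {entry[1].isoformat()}")

    def put(self, key, data, now):
        self._check_unlocked(key, now)
        self._objects[key] = (data, now + self.retention)

    def delete(self, key, now):
        self._check_unlocked(key, now)
        self._objects.pop(key, None)

    def get(self, key):
        return self._objects[key][0]


# Hypothetical backup written with a 30-day retention period.
t0 = datetime(2021, 2, 8, tzinfo=timezone.utc)
store = WormStore(retention=timedelta(days=30))
store.put("backup.tar", b"clean copy", now=t0)
```

An attempt to overwrite backup.tar one day later raises RetentionError, which is the property that matters for ransomware: the clean copy survives until the lock expires.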

All of these SaaS applications and infrastructure capabilities are fine, but as the AWS documentation points out, these are resiliency tools, and not a resiliency solution. You still have to design your plan yourself and implement it.

Amazon AWS zero trust.

According to Amazon’s Mark Ryland and Quint Van Deman, a best practice is to apply the general purpose zero trust idea to workloads: “If two components don’t need to talk to one another across the network, they shouldn’t be able to, even if these systems happen to exist within the same network or network segment.” You configure that within the AWS Security Groups’ SaaS application. Ryland and Van Deman point out that security groups can be used for north-south traffic, meaning network traffic in and out of the security group, as well as east-west traffic, meaning network traffic between workloads on the same subnet. For east-west traffic between two VPCs, you can also establish a PrivateLink between them that no other VPC will have access to.

You can perform a similar zero trust function with AWS APIs by installing your applications behind an Amazon API Gateway and limiting who and what can access it. This also has the side benefit of providing application distributed denial of service (DDoS) protection.

Amazon does offer a web application firewall called AWS WAF, but it is specifically designed to protect web applications against attacks like SQL injection and cross-site scripting. I want to be clear here. AWS WAF isn't a next-generation firewall that you might get from one of the big security platform companies like Cisco, Check Point, Palo Alto Networks, or Fortinet. These vendors designed their platforms from the beginning to enforce zero trust rules at the application layer: rules, for example, like one allowing the devops team to use the GitHub application, but prohibiting the finance department from doing so. You aren’t getting that done with the AWS WAF.

Of course, you can’t really do zero trust without a robust identity and access management system. Most organizations that have been around for a while have deployed Microsoft’s Active Directory back at headquarters and in their data centers. You can use the AWS AD Connector to connect AWS apps to your on-premises directory. But if you’re a small organization that, as they say, is cloud-native, meaning that you don’t really have any on-prem infrastructure, you can use the AWS Identity and Access Management (IAM) system as a starting point.

As with Microsoft Azure, it’s a really good idea to install some sort of two-person control for critical functions and to watch closely what is being done with the root accounts for AWS services. Louis Columbus at Forbes Magazine recommends that AWS administrators “Vault AWS Root Accounts and Federate Access for [the] AWS Console,” and then audit everything with two SaaS applications, AWS CloudTrail and Amazon CloudWatch. He also says that if you haven’t implemented “Multi-Factor Authentication Everywhere,” you should stop what you are doing right now and get that done.

Wise words indeed.

Amazon AWS intrusion kill chain prevention and risk assessment.

For intrusion kill chain prevention, Amazon offers some rudimentary tools that might help, but they are the bare minimum. AWS has an intrusion detection SaaS application called “Amazon GuardDuty,” a netflow collection service called VPC Flow Logs, a security cloud analytics service called AWS Security Hub, and a simplistic XDR capability with Amazon Detective. But like Microsoft, Amazon doesn’t seem to embrace the intrusion kill chain idea. There is no literature on their website that talks about how AWS embraces the strategy, or even whether they can map their services to the MITRE ATT&CK framework.

The same goes for risk assessment offerings too. With the telemetry collected from the AWS SaaS applications I mentioned above for the intrusion kill chain strategy, plus the telemetry from the zero trust and resilience SaaS applications, you might be able to calculate the risk probability yourself, but you’ll have your work cut out for you. Remember, we’re trying to calculate the probability of material impact due to cloud cyber attack within the next three years or so. There is no easy button for that within the AWS cloud offering.

The bottom line on security in the cloud.

After deep dives into two of the three big cloud providers (Microsoft and Amazon), I have come to the preliminary conclusion that network defenders can reasonably design and deploy our resilience and zero trust strategies with some degree of rigor in cloud environments. I will reserve judgement until I look at Google in a couple of weeks, but that looks like the general direction this analysis is taking. On the other hand, cloud providers have only rudimentary capabilities for deploying the intrusion kill chain and risk assessment strategies, if they have any at all. If you’re going to deploy all four first principle strategies in the cloud, which you know you should, you’re going to have to supplement the cloud security SaaS offerings with other third-party solutions.