Risk Forecasting with Bayes Rule: A practical example.
Sep 19, 2022

CSO Perspectives is a weekly column and podcast where Rick Howard discusses the ideas, strategies and technologies that senior cybersecurity executives wrestle with on a daily basis.

This is the third and final essay in my series about how we should reframe our models for calculating cybersecurity risk. In the first essay, I talked about the book “Superforecasters” by Dr. Tetlock and how it convinced me that it’s possible to forecast the probability of complex questions using some basic techniques like Fermi estimates.

Since the beginning of these essays, I’ve said that the absolute cybersecurity first principle is reducing the probability of material impact due to a cyber event. If that’s true, and I think it is, then all of us have to be able to calculate our organization’s current probability of material impact and what it will be when some new thing happens (people, process, or technology). I made the case that, from the beginning, we’ve all been too overwhelmed by that problem. The complexity seemed beyond our ability to calculate and, because of that complexity, we punted. We waved our hands at the problem and said it couldn’t be done, that there were too many variables, that we’ll just have to be satisfied with qualitative estimates (like high, medium, and low), build our color-coded heat maps for the board, and be done with it. After studying Tetlock’s book, I changed my mind. I realized that rough estimates that are in the same ballpark as a detailed calculation are probably good enough to make resource decisions with.

In the second essay, I explained the mathematical foundation as to why Superforecasting techniques work. It comes from Bayes’ Theorem, formulated by Thomas Bayes in the 1700s. I explained his billiard table thought experiment where an assistant would roll a cue ball onto a billiard table and a guesser would estimate the cue ball’s location based on new evidence. The assistant would roll subsequent balls onto the billiard table and inform the guesser of each ball’s location relative to the cue ball. The guesser would revise her prior cue ball estimates based on this new evidence. Essentially, Bayes allows superforecasters to estimate an initial answer (the prior) by whatever means (staff informed guesses, basic outside-in stats about the industry, etc.) and then adjust that estimate over time as new evidence comes in.

In the cybersecurity case, that new evidence comes in the form of how well we are doing in implementing our first principle strategies. In other words, if the outside-in estimate this year, the prior, that any U.S. company will be materially impacted by a cyber attack is 32%, what would that estimate be for your organization if you have a fully deployed zero trust program; or a robust intrusion kill chain prevention implementation; or a highly efficient and practiced resilience operation; or some combination of all of them?

Let’s figure it out.  

The Prior: A Fermi outside-in forecast. 

In order to calculate our first estimate of the probability of material impact to our organization this year, the first question (the prior) we should probably answer is what is the probability that any company would get hit with a material impact cyber attack. In this analysis, I'm going to restrict my calculations to U.S. organizations for two reasons. The first is that U.S. data on breaches is relatively abundant compared to other countries. The second is that most of my readers are from the U.S. 

You recall from the last essay that Enrico Fermi, the Italian American physicist, was famous for his rough estimates about outcomes that were remarkably close to the real answers generated by physical experiments. That’s what we're going to do here. 

Let’s start with the FBI’s Internet Crime Report of 2021.  In that study, the FBI’s Internet Crime Complaint Center (IC3) said that they received just under a million complaints (847,376). Let’s assume that all of those represent material losses. That’s probably not true, but let’s assume that for now. 

But the IC3 also estimated that only 15% of organizations actually report their attacks. So, how many total should there be? Doing the math, that means that over five and a half million (5,649,173) US organizations got hit with a cyber attack that year. That said, my assumption is that there are many reasons why organizations don't report their cyber incidents to the FBI and the main one might be that the incident didn’t turn out to be material. As a conservative estimate then, let’s assume that only 25% of the potential unreported incidents were material. That number is probably way smaller but it is good enough for now. 

The number of unreported complaints then is equal to what the IC3 expected, the five and a half million (5,649,173), minus the just under one million known reported complaints (847,376). Doing the subtraction, that number is about 4.8 million (4,801,797). With my assumption that only 25% of the unreported complaints were material, 25 percent of 4.8 million (4,801,797) is an estimated 1.2 million (1,200,449). So, the total number of material complaints is the known reported complaints, just under one million (847,376), plus the estimated unreported material complaints, 1.2 million (1,200,449), for a total of just over 2 million (2,047,825).
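The chain of arithmetic above can be sketched in a few lines of Python. This is only an illustration of the Fermi steps, not an official IC3 calculation; the function and variable names are made up for this example, and the 25% figure is the essay's assumption rather than a reported statistic.

```python
# Inputs quoted in the essay (FBI IC3 2021 figures plus the author's assumption).
REPORTED_COMPLAINTS = 847_376       # complaints the IC3 received
REPORTING_RATE = 0.15               # share of incidents that get reported
UNREPORTED_MATERIAL_SHARE = 0.25    # the essay's assumption, not an IC3 figure


def estimate_material_incidents(reported, reporting_rate, unreported_material_share):
    """Walk the Fermi chain: reported -> expected total -> unreported -> material."""
    expected_total = reported / reporting_rate
    unreported = expected_total - reported
    unreported_material = unreported * unreported_material_share
    total_material = reported + unreported_material
    return {
        "expected_total": round(expected_total),
        "unreported": round(unreported),
        "unreported_material": round(unreported_material),
        "total_material": round(total_material),
    }


estimate = estimate_material_incidents(
    REPORTED_COMPLAINTS, REPORTING_RATE, UNREPORTED_MATERIAL_SHARE
)
for step, value in estimate.items():
    print(f"{step:>20}: {value:,}")
```

Keeping each assumption as a named input makes it easy to rerun the estimate when better evidence for any one of them turns up.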

Hold that number in your head for a second.

I’m assuming that no organization is getting hit twice in the same year too. That’s probably not true either but for now, let’s roll with it. Let’s also assume that any nation state attacks that caused material damage will be included in the IC3 stats.

The question then arises about how many organizations exist in the United States that could potentially report to the IC3. We know from stats published by the U.S. Census Bureau in 2019 that the United States had 6.1 million (6,102,412) registered companies. Employee sizes for that group range from five to over 500. For the moment, we will assume that employee size doesn’t matter in our forecast. We know that’s probably not true either but we will list it as an assumption and look for data later that will inform the assumption one way or the other. We will also assume that the number includes NGOs (Nongovernmental Organizations).

Further, according to the National Center for Education Statistics, in 2020, there were 129 thousand (128,961) total schools for Public and Private prekindergarten, elementary, middle, secondary, postsecondary, and other schools. For the postsecondary schools, that’s a mix of four year and two year programs of various student sizes. It also represents a mix of student sizes for the elementary schools. We will also assume that student body size doesn’t matter for this forecast.

Interestingly, we don’t really have an official number of sanctioned federal government entities. According to Clyde Wayne at Forbes in 2021, there is no official, authoritative list maintained by any federal agency. In other words, no one U.S. federal government entity is officially tasked with keeping track of all the other federal agencies. I know that sounds crazy, but apparently it’s so. He lists eight different reports, from the Administrative Conference of the United States to the Federal Register Agency List, that put the number of government agencies anywhere from 61 to 443 depending on how they count it. Let’s take the average across those reports, 274, as a starting point.

Finally, from the US Census Bureau in 2017, 90 thousand (90,126) local governments existed in the United States. Assume that the size of the local government doesn’t matter for this forecast either.

To summarize then, within the United States, there are 

  • 6.1 million (6,102,412) registered companies
  • 129 thousand (128,961) schools
  • 274 federal government agencies
  • 90 thousand (90,126) local government organizations (state, city, county, etc)
  • Total: 6.3 million (6,321,773) US Organizations

all of which could register a material report to the FBI’s IC3. With our assumption that 2 million (2,047,825) organizations should have reported to the IC3 in 2021, that’s roughly a 32% chance (2 million estimated material incidents divided by 6.3 million total organizations) that any officially recognized organization in the United States could have had a material cyber attack that year.
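As a quick check on the division, here is a minimal Python sketch using the organization counts and the estimated incident total quoted above. The dictionary keys and variable names are illustrative only.

```python
# Organization counts quoted in the essay.
ORGANIZATIONS = {
    "registered companies": 6_102_412,
    "schools": 128_961,
    "federal agencies": 274,          # average of eight differing reports
    "local governments": 90_126,
}

# Estimated material incidents for 2021 from the IC3-based Fermi estimate.
ESTIMATED_MATERIAL_INCIDENTS = 2_047_825

total_organizations = sum(ORGANIZATIONS.values())
prior = ESTIMATED_MATERIAL_INCIDENTS / total_organizations

print(f"Total organizations: {total_organizations:,}")
print(f"Outside-in prior:    {prior:.0%}")
```

The result depends on the assumption, stated in the essay, that each incident lands on a different organization.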

Before we call that our official Bayesian prior though, let’s check our assumptions:

  • All of the just under a million complaints (847,376) to the IC3 were material. 
  • Only 25% of the estimated unreported incidents to the IC3 were material.
  • Any nation state attacks that caused material damage will be included in the IC3 stats.
  • No company gets hit more than once in any given year.
  • The number of employees or students of an organization doesn’t matter for the forecast.
  • The total number of companies listed by the US Census Bureau includes NGOs.
  • The average (274) of existing federal organizations taken from eight different reports is close enough.

Those are some big assumptions. But I would argue that for this first estimate, this first Bayesian prior, it’s probably good enough. This is us rolling the cue ball onto the billiard table and making a first guess as to where it is. Using Fermi’s outside-in forecast, a technique used by Dr. Tetlock’s superforecasters described in his book of the same name, for any organization in the United States, the probability of material impact due to a cyber incident this year is 32%, just under a one in three chance.

Let me say that again. In the general case, for any United States organization, there is a one in three chance of experiencing a material cyber event this year.

But wait, what about me? 

I hear what you’re saying. That’s all great and fine Rick, but I'm special. I work for a small startup making concrete. There is no way that there’s a 32% chance that the company will be materially impacted by a cyber event this year. It must be way lower than that. Or, I work for a Fortune 1000 company. There is no way that there is only a 32% chance. It has to be much bigger than that. Your 32% chance has no meaning to me. It doesn’t help me at all.

Well, OK, but remember, that first prior is just the assistant rolling the cue ball onto the table and asking us to make the first estimate about its placement.  The next thing we’re going to do is check our assumptions. We’ll be looking to collect new evidence about those assumptions and adjust our 32% forecast up or down depending on where the evidence leads us. For example, if we found sometime in the future that my assumption about unreported material events in the IC3 report was 10% vs 25%, we would adjust the probability down. On the other hand, if we found that the actual number of federal organizations was really 80 vs the average 274 that we used, then we would adjust the probability up. Just like Tetlock’s superforecasters do on a regular basis, keep your eye on your assumptions.
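To see how much each assumption matters, the prior can be recomputed with one input changed at a time. The sketch below is a hypothetical sensitivity check built from the figures already quoted; the function name and its defaults are assumptions made for illustration.

```python
REPORTED = 847_376                 # IC3 complaints received in 2021
REPORTING_RATE = 0.15              # share of incidents that get reported
# Companies, schools, and local governments, as quoted in the essay.
NON_FEDERAL_ORGS = 6_102_412 + 128_961 + 90_126


def prior(unreported_material_share=0.25, federal_agencies=274):
    """Recompute the outside-in prior with one or both assumptions changed."""
    unreported = REPORTED / REPORTING_RATE - REPORTED
    material = REPORTED + unreported * unreported_material_share
    return material / (NON_FEDERAL_ORGS + federal_agencies)


baseline = prior()
lower_material_share = prior(unreported_material_share=0.10)
fewer_federal_agencies = prior(federal_agencies=80)

print(f"baseline:              {baseline:.4%}")
print(f"10% material share:    {lower_material_share:.4%}")
print(f"80 federal agencies:   {fewer_federal_agencies:.4%}")
```

Under these inputs, the material-share assumption moves the forecast by about eleven points (down to roughly 21%), while the federal agency count barely registers because it is tiny next to six million companies. That suggests where to spend effort looking for better evidence.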

The next step is to continue to collect new evidence. We’re going to roll more balls onto the billiard table. Two research reports published by the Cyentia Institute will help us in this round:

  • “Information Risk Insights Study: A Clearer Vision for Assessing the Risk of Cyber Incidents”
  • “IRIS Risk Retina - Data for Cyber Risk Quantification” 

I’ve been thinking about how to calculate cyber risk for a while now, and these two Cyentia reports are the closest thing I have found that matches my thinking around superforecasters, Bayesian philosophy, and Fermi estimates. In the first paper, Cyentia partnered with Advizen (a Zywave company) who provided the breach data set for Fortune 1000 companies in the past decade. I have high confidence in the data set since it’s public knowledge who all the Fortune 1000 companies are and because of compliance reasons, the data breach reporting is robust.

The first finding that is important to our study is that for the past five years, just under one in four Fortune 1000 companies got hit each year by a material cyber event. That number is slightly lower than our first Bayesian prior of one in three. But Cyentia pulled their analysis apart by looking at the odds of ranked quartiles. In other words, they looked at the odds for top 250 firms, then the next 250, etc. It turns out that if your company is in the Fortune 250, you are five times as likely to have a material breach as you are if you are in the bottom 250. From their report:

Fortune 1000 chances of getting breached:

  • Fortune 250: a one in two chance
  • Fortune 251 to 500: a one in three chance
  • Fortune 501 to 750: a one in five chance
  • Fortune 751 to 1000: a one in ten chance

They did a similar analysis for calculating the chances of a Fortune 1000 company experiencing multiple attacks in the same year. This goes to answer one of our Bayesian assumptions.

Fortune 1000 chances of multiple breaches in the same year:

  • Fortune 250: a one in three chance
  • Fortune 251 to 500: a one in seven chance
  • Fortune 501 to 750: a one in twelve chance
  • Fortune 751 to 1000: a one in twenty four chance

Fortune 1000 Loss Exceedance Curves

The last thing from their report to consider is that they calculated different probabilities for different loss scenarios. They use a graph called a Loss Exceedance Curve, which according to Bryan Smith at the FAIR Institute, “... is a way to visualize the probability of the loss exceeding a certain amount … The x-axis plots the annualized loss exposure for the given risk scenario considered in the analysis. The y-axis plots the probability of a loss being greater than the intersection with the x-axis, from 0 to 100%.” What that means is that there is a different probability for different values of loss. From the Cyentia report:

  • 25% chance of any loss whatsoever
  • 14% chance of losing $10 Million or more
  • 6% chance of losing $100 Million or more.

This is important when it comes to risk tolerance. For some Fortune 1000 companies, a seven in fifty chance of losing $10 million is an acceptable risk. For a handful of them, that’s just couch cushion money. For others though, that 14% chance of losing $10 million might be too much to bear compared to all the other risks their leadership team is trying to avoid. The reason to use loss exceedance curves is to give the leadership the option to choose. When we were using qualitative heat maps with our high, medium, and low assessments, there was no way for company leadership to evaluate whether or not the risk was within their tolerance. Loss exceedance curves give them a visual reference of where their tolerance falls.
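A tolerance check against the three quoted points might look like the Python sketch below. It is not Cyentia's curve, which is continuous. It is a conservative step function over the three numbers above, and it treats "any loss" as a threshold of zero, which is an assumption. The function names and the 10% tolerance in the usage lines are hypothetical.

```python
# (loss threshold in USD, chance of losing at least that much), as quoted above.
# Treating "any loss" as a threshold of 0 is an assumption of this sketch.
QUOTED_POINTS = [
    (0, 0.25),
    (10_000_000, 0.14),
    (100_000_000, 0.06),
]


def exceedance_upper_bound(loss_threshold):
    """Conservative estimate of the chance of losing at least loss_threshold.

    Exceedance probability can only fall as the threshold rises, so the
    probability quoted at the largest threshold at or below the one asked
    about is a safe (high) estimate.
    """
    if loss_threshold < 0:
        raise ValueError("loss_threshold must be non-negative")
    bound = QUOTED_POINTS[0][1]
    for threshold, probability in QUOTED_POINTS:
        if threshold <= loss_threshold:
            bound = probability
    return bound


def within_tolerance(material_loss, max_acceptable_probability):
    """True if the chance of a material loss is at or below what leadership accepts."""
    return exceedance_upper_bound(material_loss) <= max_acceptable_probability


# Hypothetical leadership tolerance of 10% for a material loss.
print(within_tolerance(10_000_000, 0.10))    # materiality set at $10 million
print(within_tolerance(100_000_000, 0.10))   # materiality set at $100 million
```

The same 10% tolerance fails when materiality is set at $10 million and passes at $100 million, which is exactly the kind of choice a heat map cannot express.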

All companies’ chances of getting breached

Cyentia then combined three data sets from Advizen, Dun & Bradstreet, and the U.S. Census Bureau for breaches reported for all companies in the United States, not just the Fortune 1000. They admit in the report that, compared to the Fortune 1000 data set, it’s not as robust, but they still have high confidence in it being the best available. The report has a section where they forecast the probability of a material breach for each commercial sector (Construction, Agriculture, Trade, etc.). They conclude that there is a less than 1 in 100 chance for any company regardless of sector to have a material breach this year but with caveats. In an email conversation with Wade Baker, the co-founder of The Cyentia Institute, he said that “Since each sector is composed of mostly smaller firms, it pulls the typical probability down dramatically.”

I’ll say. The contrast between Cyentia’s one percent compared to my IC3 forecast of 32% is quite large. But Wade says that the more accurate forecast comes from the size of the organization, not the sector. In the report, they show quite the large probability gap among revenue groupings:

Revenue vs Probability

  • Less than $1B in annual revenues (where most organizations live): < 2%
  • Between $1B and $10B: 9.6%
  • Between $10B and $100B: 22.6%
  • Greater than $100B: 75%

But they also point out that larger organizations are more likely to report a breach, over 1,000 times more likely compared to small (<$10M) businesses, so the probabilities are probably skewed in that direction.
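The revenue table lends itself to a simple lookup. The Python sketch below assumes each band includes its lower bound, and it uses 2% as a ceiling for the lowest band since the report only says "less than 2%". Both choices, and the function name, are assumptions for illustration.

```python
# (lower revenue bound in USD, probability) from the Cyentia bands quoted above.
# The lowest band is reported only as "< 2%", so 0.02 is used as a ceiling.
REVENUE_BANDS = [
    (0, 0.02),
    (1_000_000_000, 0.096),
    (10_000_000_000, 0.226),
    (100_000_000_000, 0.75),
]


def breach_probability_by_revenue(annual_revenue):
    """Return the quoted probability for the band containing annual_revenue."""
    if annual_revenue < 0:
        raise ValueError("annual_revenue must be non-negative")
    probability = REVENUE_BANDS[0][1]
    for lower_bound, band_probability in REVENUE_BANDS:
        if annual_revenue >= lower_bound:
            probability = band_probability
    return probability


# Two hypothetical organizations: one with $500 million and one with $35 billion
# in annual revenue.
print(breach_probability_by_revenue(500_000_000))
print(breach_probability_by_revenue(35_000_000_000))
```

Given the reporting skew noted above, the lower bands are probably understated, so the lookup is best read as one more piece of evidence rather than a final answer.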

How do you incorporate this new data?

Which raises the question, how do you incorporate this data into your forecast? How do you use the prior forecast, 32%, with this report? First things first, if you’re working for a Fortune 1000 company, I would throw out the generic forecast that I just did from the FBI’s reporting. Cyentia’s report on Fortune 1000 companies is way more precise, and the data set is so robust, that I feel confident that those forecasts are more accurate for Fortune 1000 companies than my generic forecast for any and all companies using FBI data. Also, the second report I listed, “IRIS Risk Retina - Data for Cyber Risk Quantification” is all about non-profits. If I was working for a non-profit, I would use that report to establish my prior.

But, if you don’t work for a Fortune 1000 company or a non-profit, say you’re Marvel Studios, how do you absorb this new data about revenue size into your forecast? If we were inclined to throw this into Bayes’ rule and do the math, we could. But we’re doing Fermi estimates here. They will likely be good enough.

According to Zippia (a company that tracks analytics about companies), Marvel Studios made almost $116 million ($115.7 million) in revenue in 2021. That puts them in the “Less than $1B” in annual revenues band (where most of us live). According to Cyentia, that type of company has less than a 2 in 100 chance of having a material breach. That’s a big gap compared to my IC3 prior of 32%.

Does that make you want to reduce the prior or increase it? Since Cyentia’s forecast is lower than my IC3 forecast, logic says that I would lower it. But how much? Do you lower it all the way down to 2 percent? You could if you feel that the Cyentia report is so strong that it overwhelms the IC3 analysis like it did for the Fortune 1000 companies or the non-profits. You could absolutely do that. But, the authors of this analysis say in the report that the data is not as robust as the Fortune 1000 data. And, I like my IC3 analysis. I feel confident in it.

Remember, the concept behind Bayes is that it’s a measure of your belief, your personal confidence. For me then, it’s not a complete replacement. I would adjust the IC3 prior down some, say to 15 percent, and start looking for more evidence to help support the change.

One technique used by Tetlock’s superforecasters when making these adjustments is asking themselves how confident they are in the change. In their minds, they want to be at least 95% confident that the adjustment is correct, not 100%, but mostly. I know that’s an abstract way to think about it. How can you be 95% confident about something? How would you rate the difference between 95% and 85%? I know I can’t do that. One trick they use is asking themselves to make a bet. Would they bet $100 that this adjustment was correct? A bet implies some risk. You may be totally sure about something when you make a bet but you’re not 100% sure. So, if you are so positive about your adjustment that you’re willing to bet $100 on it, that’s a good approximation for being 95% confident. If not, back the adjustment off a point or two. With my new prior, 15%, I wouldn’t bet $100 of my own money that 15% is the correct number. What about 17%? Ok, I would bet $100 on that.

To recap then, I used two different frequentist data sets. I used the FBI IC3 data and some Fermi estimations to find the initial prior. I then used the Cyentia report to make an adjustment to that initial forecast. The bottom line is that, for Marvel Studios, I'm forecasting the probability of material impact this year as 17 percent, or just under a one in five chance.
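The adjustment from 32% to 17% is a gut call, but it can be made explicit as a weighted average of the two estimates. To be clear, the Python sketch below is not Bayes' rule and not a method the essay prescribes. It is a hypothetical bookkeeping device that shows how much weight a given adjustment implicitly puts on the new evidence. The function names are made up, and 2% stands in for Cyentia's "less than 2%".

```python
def blend(prior, evidence, weight_on_evidence):
    """Weighted average of a prior and a new estimate (weight between 0 and 1)."""
    if not 0.0 <= weight_on_evidence <= 1.0:
        raise ValueError("weight_on_evidence must be between 0 and 1")
    return prior + weight_on_evidence * (evidence - prior)


def implied_weight(prior, evidence, adjusted):
    """How much weight an adjusted forecast implicitly gives the new evidence."""
    if prior == evidence:
        raise ValueError("prior and evidence are identical; weight is undefined")
    return (prior - adjusted) / (prior - evidence)


IC3_PRIOR = 0.32          # outside-in prior from the IC3 estimate
CYENTIA_SMALL_FIRM = 0.02 # ceiling for the "less than 2%" revenue band

print(f"{blend(IC3_PRIOR, CYENTIA_SMALL_FIRM, 0.5):.2f}")
print(f"{implied_weight(IC3_PRIOR, CYENTIA_SMALL_FIRM, 0.17):.2f}")
print(f"{implied_weight(IC3_PRIOR, CYENTIA_SMALL_FIRM, 0.15):.2f}")
```

Read this way, settling on 17% amounts to trusting the two sources about equally, while 15% would lean slightly toward Cyentia.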

It’s a gut call but remember also that this is still outside-in analysis; a Fermi prediction. This forecast has nothing to do with the Marvel Studios actual defensive posture (inside-out). In other words, it doesn’t take into consideration any defensive measures that Marvel Studios has deployed to strengthen its posture in terms of cybersecurity first principles.  We’ll look at that next.

An inside-out analysis: the first principles.

With outside-in analysis, I have demonstrated how network defenders can take an initial estimate and adjust it as new evidence comes in. We took the IC3 prior and adjusted it with the Cyentia data. We can repeat the process now with inside-out analysis. In other words, we can use our outside-in forecast as the new prior and then estimate how well we have deployed our first principle strategies in turn and adjust the prior up or down based on that new evidence. That means we have to assume some things. 

Let’s assume that if we fully deploy each of our first principle strategies, then the impact is a reduction in risk probability to our organization by some amount. Let’s assume these values:

  • Zero Trust: 10%
  • Intrusion Kill Chain Prevention: 10%
  • Resilience: 15%
  • Automation: 5%

These are best guesses on my part and that’s why they’re assumptions. You might use different numbers and that’s perfectly fine. Over time, the superforecaster in me will look for new evidence that will validate or invalidate those values. But for now, the Fermi analyst in me says they are close enough. And remember, in this model, you only get the full probability reduction if you have completely deployed each strategy. Most network defenders, even those that work for robust security organizations, don’t have any of these strategies fully deployed.

An inside-out analysis: the Contoso Corporation.

To see how this works, let’s analyze a company through this first principle lens. My good friend, Todd Inskeep, has been reading these CSO Perspectives essays, listening to the accompanying podcasts, and providing me feedback since we started some two years ago. He suggested that we use the Contoso Corporation as the case study.

For those that don’t know, the Contoso Corporation is an imaginary company that Microsoft uses to explain to potential customers how to deploy its set of products. They explain that the company “is a fictional but representative global manufacturing conglomerate with its headquarters in Paris.” Think Fujitsu, but French. Since Microsoft analysts have put a lot of work into the back story of how the Contoso Corporation is architecturally deployed, I don’t have to make one myself that has enough detail to be useful. Further, I don't have to pick on a real company like Marvel Studios for this analysis.

Here’s a summary of the Contoso Corporation (See the references section for a link to the Contoso Corporation web page).

Contoso Corporate Overview

  • The Paris office has 25,000 employees; each regional office has 2,000 employees. 
  • It has a large sales and support organization for more than 100,000 products.
  • It has an annual revenue of $35 Billion (Similar to Fujitsu).
  • It is not a Fortune 1000 company or a non-profit organization.

Contoso Tech Overview

  • Uses Microsoft 365 for office applications (email, word processing, spreadsheets, etc)
  • Is currently transitioning from data center operations to cloud based operations but it's years away from completing the transition. 
  • Customers use their Microsoft, Facebook, or Google Mail accounts to sign in to the company's public web site.
  • Vendors and partners use their LinkedIn, Salesforce, or Google Mail accounts to sign in to the company's partner extranet.
  • Has deployed an SD-WAN to optimize their connectivity to Microsoft services in the cloud. 
  • Has deployed regional application servers that synchronize with the centralized Paris campus datacenters. 

Contoso zero trust overview

  • Uses on-premises AD DS forest for authentication to Microsoft 365 cloud resources with password hash synchronization (PHS), but it also uses third party tools in the cloud for federation services.
  • Has deployed special rules for senior leadership, executive staff, and specific users in the finance, legal, and research departments who have access to highly regulated data.
  • Collects system, application, and driver data from devices for analysis and can automatically block access or patch with suggested fixes. 
  • Requires MFA (multi factor authentication) for their sensitive data. 
  • Categorizes data into three levels of access. 
  • Deploys DLP (Data Loss Prevention) services for Exchange Online, SharePoint, and OneDrive.
  • Designated people execute global system administrator changes and only receive time-based temporary passwords with their AD DS Privileged Identity Management (PIM) system.

Contoso resilience overview

  • Data is encrypted at rest and available only to authenticated users. 

Contoso intrusion kill chain overview

  • Contoso uses Microsoft Defender Antivirus on the endpoint.

Adjusting the Outside-In Forecast first.

Since the Contoso Corporation is a global manufacturing conglomerate and not an entertainment company like Marvel, we need to start over with our outside-in Fermi estimate using the FBI’s IC3 data. Our first prior is 32%. But, according to Cyentia, there is a 22.6% chance that Contoso (annual revenue of $35 Billion) will be impacted by a material breach this year; just over a 1 in 5 chance.

The question is then, how far down do you adjust the 32% prior with this new information? I still have high confidence in my own IC3 outside-in analysis. I have less confidence in the Cyentia data with the caveats I have already explained, but it’s still a good forecast. I would bet $100 of my own money that the actual probability of material impact is a good 5 points below my generic prior. So let’s set the prior to 27%.

Using the 27 percent as our current prior, the next step of incorporating new evidence (more balls on the billiard table) is to assess how well the Contoso Corporation is doing in implementing our cybersecurity first principle strategies. How well or poorly they are deployed will move our forecast up or down.

An inside-out analysis: First principle strategies.

Zero Trust: 8% out of a possible 10% reduction adjustment. The Contoso Corporation as described has a strong IAM (Identity and Access Management) program that consists of IGA (Identity Governance and Administration), PIM (Privileged Identity Management) and PAM (Privileged Access Management). They provide their customers, contractors, and employees with single sign on capability and MFA (multi factor authentication) for sensitive data. For vulnerability management, they have a strong program for Microsoft products but it’s a lot less strong for any third party applications. There is no mention of an SBOM (Software Bill of Materials) program but they do track devices, applications, and operating system patch levels for Microsoft products. There is no discussion of a software defined perimeter. With all of that, the Contoso Corporation is well along its zero trust journey. They still have a ways to go, but it’s mature.

Intrusion Kill Chain Prevention:  1% out of a possible 10% reduction adjustment. The Contoso Corporation doesn’t really think about specific adversary tactics. It has a security stack of mostly Microsoft Security products and it has the capability to deliver telemetry from that stack to a SOC (Security Operations Center), but there is no mention that Contoso has a SOC, an intelligence group, a red/blue/purple team, or a desire to share adversary playbook intelligence with its peers. I'm giving them a 1% reduction since Contoso uses Microsoft Defender Antivirus for automatic endpoint protection from malware, but really, they have no intrusion kill chain prevention program.

Resilience:  1% out of a possible 15% reduction adjustment. The Contoso Corporation does have a healthy encryption program that works with its multi-level zero trust program. That said, I found no mention of any crisis planning, backup programs, incident response capability, or even the incipient beginnings of a Chaos Engineering capability. The Contoso Corporation might well deflect an inexperienced ransomware crew, but any attack from a professional crew will likely materially impact it. 

Automation:  0% out of a possible 5% reduction adjustment. The Contoso Corporation doesn’t mention anything about its  SRE (Site Reliability Engineering), DevSecOps, or even its Agile development program. It mentions nothing about securing its own code nor even trying to track the components it's using from open source. They are getting no benefit from automation that I can see.

With all of those adjustments, I would bet $100 that the Contoso Corporation has a 17% chance of being materially impacted by a cyber attack this year; just under a 1 in 5 chance.
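The bookkeeping behind that 17% can be sketched as follows. The maximum reductions and the Contoso scores are the essay's own best guesses. The function name, the validation, and the floor at zero are assumptions added for this illustration. All values are in percentage points.

```python
# Maximum reduction (percentage points) for a fully deployed strategy,
# using the essay's assumed values.
MAX_REDUCTION = {
    "zero trust": 10,
    "intrusion kill chain prevention": 10,
    "resilience": 15,
    "automation": 5,
}

# Credit given to Contoso for each strategy in the analysis above.
CONTOSO_CREDIT = {
    "zero trust": 8,
    "intrusion kill chain prevention": 1,
    "resilience": 1,
    "automation": 0,
}


def inside_out_forecast(prior_points, credits):
    """Subtract the earned reductions from the prior, never going below zero."""
    total_reduction = 0
    for strategy, credit in credits.items():
        cap = MAX_REDUCTION[strategy]
        if not 0 <= credit <= cap:
            raise ValueError(f"{strategy}: credit must be between 0 and {cap}")
        total_reduction += credit
    # Floor at zero: the four caps sum to 40 points, which can exceed the prior.
    return max(0, prior_points - total_reduction)


print(inside_out_forecast(27, CONTOSO_CREDIT))   # Contoso as described
print(inside_out_forecast(27, MAX_REDUCTION))    # every strategy fully deployed
```

The second call shows a limit of a purely additive model: with these assumed values, a fully deployed program would more than wipe out a 27% prior, which is why the sketch floors the result at zero.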

What now? Are we within the risk tolerance of the business?

If I was the Contoso CSO, there are several next steps to consider, assumptions to validate. The first thing to do is to confirm the dollar amount of what is material for the company. With annual revenues of $35 billion, is a $10 million loss material? $100 million? Something bigger? Something smaller? And how do you determine that number? That would be several one-on-one conversations with the CFO, the CEO, and perhaps key members of the board. And, by the way, that number will likely change over time as the fortunes of the company go up and down. Make sure you're checking in with senior leadership annually to confirm the number.

I would definitely take the Cyentia loss exceedance curve for Fortune 1000 companies as a baseline, find the value on the curve, and adjust my forecast up or down depending. For example, Cyentia says that for Fortune 1000 companies, there is a 14% chance of losing $10 million or more. If $10 million is the Contoso indicator for materiality, I would adjust our current prior of 17% down one or two points to 15%, a 3 in 20 chance. 

The next step is to determine if the current forecast is in the tolerance of the leadership chain. If it is, if they think that a 3 in 20 chance is an acceptable risk to the business, then nothing needs to be done here in terms of significant new investment in people, process, and technology. The infosec team needs to maintain and perhaps become more efficient in executing its zero trust, intrusion kill chain, resilience, and automation tactics, but we’re not going to roll out some new initiative. On the other hand, if senior leadership is uncomfortable with the 3 in 20 chance and demands that I get it under 10%, or a 1 in 10 chance, I have some planning to do.

I would look at resilience first. Contoso’s resilience plan is weak and some improvements in basic meat and potatoes IT functionality (like automated backups, practice restorations, crisis planning, and incident response) could significantly reduce their risk compared to the other first principle strategies that might cost a lot more to implement. After all, getting good at intrusion kill chain prevention is not cheap. That said, let’s not forget to keep track of the cost for reducing risk to below 10%. If the spend to accomplish that task is greater than the $10 million loss we were trying to prevent, perhaps we should go back to the drawing board and come up with a cheaper plan.
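The cost check in that last point can be written down directly. In the Python sketch below, the first function is the essay's own test (is the spend larger than the loss it is meant to prevent?). The second, an expected-loss comparison, is an additional and stricter yardstick that the essay does not propose. The planned spend figure and the function names are hypothetical.

```python
def spend_exceeds_loss(planned_spend, material_loss):
    """The essay's test: is the plan more expensive than the loss it prevents?"""
    return planned_spend > material_loss


def expected_loss_avoided(current_points, target_points, material_loss):
    """Drop in expected annual loss when the forecast falls by some points.

    Forecasts are given in whole percentage points. This is an extra
    yardstick assumed for illustration, not part of the essay's method.
    """
    return (current_points - target_points) * material_loss / 100


MATERIAL_LOSS = 10_000_000     # materiality threshold used in the essay
PLANNED_SPEND = 12_000_000     # hypothetical cost of the improvement plan

print(spend_exceeds_loss(PLANNED_SPEND, MATERIAL_LOSS))
print(expected_loss_avoided(15, 10, MATERIAL_LOSS))
```

With these made-up numbers the plan fails the essay's test, and moving the forecast from 15% to 10% trims the expected annual loss by only $500,000, which is one more argument for starting with the cheap resilience basics.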

Risk forecasting wrap up.

I have been thinking about finding a better way to convey cyber risk to the board for a long time, almost a decade. I kept struggling with my lack of knowledge about statistics and kept trying to rely on the frequentist view that I needed more data, that I needed to count all the things. But I knew deep down that this wasn’t the path, that there had to be a better way.

Dr. Tetlock’s book on superforecasting opened my mind to the idea that infosec professionals didn’t need precision answers to make resource decisions about security improvements. We could make good-enough estimates, Fermi estimates, back-of-the-envelope estimates, that would take less time and the answers would be close enough to be impactful. And then I learned that Bayes Rule was the mathematical foundation that explained why superforecasting techniques worked.

After working through the examples in this essay for Marvel Studios and the Contoso company, you may feel queasy that I am basing cyber risk forecasts for multi-million dollar companies on Kentucky windage. I get it. It’s tough to let go of the frequentist mindset. But I will just remind you that way smarter people than you and me, like Alan Turing, used these techniques to solve more complex problems than calculating cyber risk. Maybe you should try it. Besides, the old way of collecting all the data and using qualitative heat maps hasn’t really worked since we started doing it some 20 years ago. Perhaps it’s time to consider a change.


“2019 SUSB Annual Data Tables by Establishment Industry,” by US Census Bureau, 27 May 2022.

“A Guide to the Changing Number of U.S. Universities,” by Josh Moody, US News & World Report, 27 April 2021.

“Announcing Loss Exceedance Charts in the FAIR-U Training App,” by Bryan Smith, Fairinstitute.org, 2022.

"Author Interview: 'Security Metrics: A Beginner’s Guide’ Review'," by Rick Howard, The Cyberwire, the Cybersecurity Canon Project, Ohio State University, 2021.

"Book Review: 'How to Measure Anything in Cybersecurity Risk'," by Steve Winterfeld, the Cybersecurity Canon Project, Ohio State University, 2021.

"Book Review: 'How to Measure Anything: Finding the Value of ‘Intangibles’ in Business'," by Rick Howard, Cybersecurity Canon Project, Palo Alto Networks, 19 July 2017.

"Book Review: 'Measuring and Managing Information Risk: A FAIR Approach'," by Ben Rothke, the Cybersecurity Canon Project, Ohio State University, 2021.

"Book Review: 'Security Metrics: A Beginner’s Guide’ Review," by Ben Smith, the Cybersecurity Canon Project, Ohio State University, 2021.

"Book Review: 'Security Metrics: Replacing Fear, Uncertainty and Doubt'," by Rick Howard, The Cybersecurity Canon Project, Ohio State University, 2021.


“Census Bureau Reports There Are 89,004 Local Governments in the United States,” by the U.S. Census Bureau, 30 August 2012.

“Comments and Observations about Risk,” by Todd Inskeep, LinkedIn, 2022.

“Fermi Estimations,” by Bryan Braun, 4 December 2011.

“Fermi Problems: Estimation,” by TheProblemSite.com, 2022.

“From Municipalities to Special Districts, Official Count of Every Type of Local Government in 2017 Census of Governments,” by the US Census Bureau, 23 January 2020.

"Information Risk Insights Study: A Clearer Vision for Assessing the Risk of Cyber Incidents," by the Cyentia Institute, 2020.

“How Many Companies Are There in the United States That Fall into Each of the Following Revenue Categories: <26M, 26M - 100M, 100M - 250M, 250M - 500M, 500M - <1B, 1B - 2B, 2B - 5B, 5B - 10B, 10B & Above?,” by Wonder, 2017.

“How Many Federal Agencies Exist? We Can’t Drain the Swamp until We Know,” by Clyde Wayne, Forbes, 10 December 2021.

"How to Measure Anything in Cybersecurity Risk," by Douglas W. Hubbard, Richard Seiersen, Published by Wiley, 25 April 2016.

"Internet Crime Report," by the FBI, 2021.

“IRIS Risk Retina - Data for Cyber Risk Quantification,” by Cyentia Institute, 9 August 2022.

“Kentucky Windage,” by Thjothvitnir, Urban Dictionary, 11 September 2007.

“Marvel Entertainment Revenue,” by Zippia, 14 December 2021.

"Measuring and Managing Information Risk: A Fair Approach," by Jack Freund and Jack Jones, Published by Butterworth-Heinemann, 22 August 2014.

“Metrics and risk: All models are wrong, some are useful,” by Rick Howard, CSO Perspectives, the CyberWire, 30 March 2020.

“Microsoft 365 for Enterprise for the Contoso Corporation,” by Microsoft 365 Enterprise, September 2022.

“Microsoft Shows How Contoso Corporation Is Implementing Azure Services,” by Pradeep, MSPoweruser, 17 October 2016.

"Security Metrics: A Beginner’s Guide," by Caroline Wong, Published by McGraw-Hill Companies, 10 November 2011.

"Security Metrics: Replacing Fear, Uncertainty, and Doubt," by Andrew Jaquith, Published by Addison-Wesley Professional, 1 March 2007.

“Superforecasting II: Risk Assessment Prognostication in the 21st Century [Paper], [Presentation]” by Rick Howard and Dave Caswell, RSA Conference, 5 March 2019.

"Superforecasting: The Art and Science of Prediction,” by Philip E. Tetlock and Dan Gardner, 29 September 2015, Crown.

“The NCES Fast Facts Tool Provides Quick Answers to Many Education Questions,” by the National Center for Education Statistics, 2019.

"The Theory That Would Not Die," by Sharon Bertsch McGrayne, Talks at Google, 23 August 2011.

“The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy,” by Sharon Bertsch McGrayne, Published by Yale University Press, 14 May 2011.

Wade Baker, PhD, Partner, Cyentia Institute email to author. 13 September 2022.

“What Counts as a ‘Business’? It Might Not Be What You Think It Is,” by Todd Kehoe, Albany Business Review, 11 April 2019.

“Why Businesses Don’t Report Cybercrimes to Law Enforcement,” by Dan Swinhoe, CSO Online, 30 May 2019.