Open Data, and OSINT as an attack vector.
Open data and OSINT
By Dr. Jan Kallberg
Nov 29, 2021

Democratic governments face a dilemma: on the one hand open data are invaluable to creating a robust, informed civil society; on the other hand, they're an invaluable source of open source intelligence for hostile intelligence services. It's an unusually difficult risk management challenge.

Democratic governments globally, at both national and local levels, are pursuing open government. This means they’re providing access to public information and delivering it to their constituents as machine-readable data. Open data are freely available, machine-readable, and cover large sections of the public entities’ operations. While this can contribute to governmental transparency, it also increases a society’s attack surface: if everyone can access the data, so can the adversary.

Open government data can be aggregated, mined, and analyzed to create insights into the inner workings of government. Open Source Intelligence (OSINT) has grown rapidly over the last decade, and that growth has accelerated in recent years. Increasing volumes of released public data, more information published online, open geodata, and social media, combined and aggregated, give an adversary an unprecedented opportunity.

The tension between democratic openness and security against OSINT is a wicked problem with no straightforward solution. Whichever way the government leans, it must choose between exposing an attack surface and limiting an open democratic society.

Democracies need openness to survive as institutions. The government’s voluntary dissemination of public sector information, including open data initiatives, is intended to strengthen democracy, to lower costs by inviting higher levels of competition, and to increase a society’s understanding of the public sector through transparency and accountability. On the flip side, releasing massive datasets allows the government to be studied in detail, to the unintended benefit of the adversary’s OSINT. The adversary can add data from covert offensive cyber operations and from the deep and dark web, and combine these with OSINT from public releases to reach the granularity needed to produce high-quality intelligence.

The core cybersecurity issue is the creation of a national attack surface when thousands of entities release data without coordination or thorough security review. The United States has 50 states, about 3,200 counties, and thousands of local government entities of significant size: cities, districts, utilities, and transit authorities. There is considerable inconsistency across the nation in the ways in which state and local governments release data.

Open data vary in their level of detail, in how easily they can be extracted in a machine-readable format, in the accompanying security features, and in the level of access. This variation should raise genuine concerns about the avenues of approach available to attackers, especially given the federated nature of government IT systems, which are linked together on the back end.

The standard mitigation is to assess and classify data before public release. But because the attack surface is built from aggregated elements of the available data, this requires the classifying authority to understand how a future adversary will use the data, which is unlikely. The alternative is to limit the volume of data disseminated to the public, but that has a “democratic cost.” Democratic doctrine assumes that, by default, it is beneficial for the constituency to be well-informed and to have access to primary knowledge of the public sector, which among other things lets citizens judge the appropriateness of a publicly funded program.
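
To make the aggregation problem concrete, here is a minimal Python sketch of what a pre-release review might look for. Everything in it is hypothetical: the two datasets, the field names, the shared `facility_id` key, and the rule about which combined fields count as sensitive are invented for illustration and do not describe any actual government release or review process. Each dataset looks harmless alone; the concern appears only once they are joined.

```python
from collections import defaultdict

# Two hypothetical, separately published datasets (all values invented).
facility_registry = [
    {"facility_id": "F-001", "name": "North Pump Station", "location": "Site A"},
    {"facility_id": "F-002", "name": "South Reservoir", "location": "Site B"},
]
procurement_awards = [
    {"facility_id": "F-001", "item": "control system upgrade", "vendor": "Vendor X"},
]


def aggregate_by_key(datasets, key):
    """Merge records from several datasets that share the same key value.

    Fields are prefixed with the dataset name. If a dataset holds several
    records for one key, later values overwrite earlier ones (a simplification).
    """
    merged = defaultdict(dict)
    for dataset_name, records in datasets.items():
        for record in records:
            for field, value in record.items():
                if field != key:
                    merged[record[key]][f"{dataset_name}.{field}"] = value
    return dict(merged)


def flag_sensitive_combinations(merged, sensitive_sets):
    """Return keys whose merged profile contains every field of a sensitive set."""
    flagged = {}
    for key_value, profile in merged.items():
        hits = [sorted(s) for s in sensitive_sets if s.issubset(profile.keys())]
        if hits:
            flagged[key_value] = hits
    return flagged


# Assumed rule: location plus equipment vendor together is more revealing
# than either field alone.
SENSITIVE_SETS = [frozenset({"registry.location", "procurement.vendor"})]

merged = aggregate_by_key(
    {"registry": facility_registry, "procurement": procurement_awards},
    key="facility_id",
)
flagged = flag_sensitive_combinations(merged, SENSITIVE_SETS)
print(flagged)
```

The sketch also shows the limit of this approach: the reviewer has to write down the sensitive combinations in advance, and has to know which other datasets exist to be joined, which is exactly what a classifying authority cannot reliably anticipate.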

The democratic assumption is correct, as it supports the pillars of an equal society and enables acceptance and buy-in from the governed. A democratic government that limits access to information that was previously public will face challenges to its legitimacy, its authority, and confidence in how it performs its duties. The balance between governmental openness and cyberdefense relies on a risk assessment of open data policy itself, and on how to mitigate these risks while maintaining transparency and openness. In reality, the government must accept a level of risk by disseminating open data to ensure democratic longevity and vitality; the residual risk will therefore inevitably be significant. In turn, a high risk tolerance means accepting damage from cyber breaches. So the critical question is how to keep the accepted damage at a manageable level. To add to the wicked problem, adversaries are not easily managed. The balance between openness and security is not easily struck.

Note on the author: Jan Kallberg, Ph.D., is a research scientist at the Army Cyber Institute. The views expressed are those of the author and do not reflect the official policy or position of the Army Cyber Institute, the Department of Defense, or the U.S. Government.