The US Federal Aviation Administration grounded all domestic flights yesterday morning following a system outage that appears to stem from an IT failure, rather than malicious activity.
Not a cyberattack, but an IT failure: the FAA's NOTAM outage.
The US Federal Aviation Administration (FAA) grounded all domestic flights early yesterday morning after an outage of the Notice to Air Missions (NOTAM) system. The approximately 90-minute outage appears to have been caused by a technical failure rather than by nefarious actors.
The reality behind the NOTAM outage.
The FAA first reported the outage at 7:15 AM ET Wednesday, saying it was “working to fully restore” the NOTAM system and ordering a pause on all domestic departures until 9:00 AM ET. An update an hour later resumed departures at Newark Liberty (EWR) airport in New Jersey and Atlanta Hartsfield-Jackson (ATL) airport in Georgia, due to “air traffic congestion in those areas.” Bloomberg explains that an update released at 8:50 AM ET officially lifted the ground stop, with normal air traffic operations “gradually” returning.
The New York Times reports that a later FAA update said the preliminary investigation had linked the outage to a damaged database file. The Wall Street Journal writes that the Canadian provider NAV Canada also saw an outage of its NOTAM system, which began just after 10:00 AM ET and was restored at roughly 1:15 PM ET. While the cause of the Canadian outage has not yet been identified, a spokeswoman for NAV Canada, Vanessa Adams, said that she did not believe there was a connection to the FAA outage, despite the coincidence, according to the New York Times.
While it may not be a cyberattack, there are other implications.
The Washington Post reports that this incident is a prime example of a growing tendency to jump to conclusions and equate outages with cyberattacks, in some cases even when presented with evidence to the contrary. “If the media and the general public are speculating, there’s no harm in that other than perhaps unnecessarily getting people agitated and adding some anxiety to people’s lives,” Shawn Henry, chief security officer at CrowdStrike, told the Washington Post. “But that’s what happens with people and the media.”
Although this may not have been a cyberattack, the incident has other, less welcome implications. Moody’s Investors Service wrote yesterday that while the incident may be credit neutral, it clearly showed the system’s exposure to cyber risk and to factors outside the FAA’s control. FAA systems, airline staffing systems, and Transportation Security Administration (TSA) systems, including its passenger screening system, are all third-party systems that US air operations rely on. Passenger data and operational aircraft support are rarely the responsibility of the airports, so the incident is not expected to “materially affect airlines' finances,” but the cyber risk posed by third parties may be jarring to some.
Industry commentary on the NOTAM problem.
Neil Jones, director of cybersecurity evangelism at Egnyte, discusses the impact of the technical debt and lack of cyber preparedness from this incident:
“Just like consumers who overextend themselves through the use of credit cards, the more technical debt the mission-critical airline industry accrues, the more difficult it will be to successfully resolve the mounting debt. The most recent example is the outage to the Federal Aviation Administration’s (FAA’s) Notice to Air Missions Systems (NOTAMS), which delayed flight departures across the United States.
“A great deal of attention has been paid to the impact of technical debt on specific airlines’ flight schedules and their customers’ experience. Those are extremely important factors, but overlooked areas include: 1) The crucial need for viable incident response plans and 2) The additional impact of technical debt on cybersecurity.
“For every month’s worth of technical debt that the airline industry accrues, potential cyberattackers have more time on their hands to detect flaws in existing software and develop new vulnerabilities that can jeopardize critical infrastructure. And, every technical incident that lacks a hot backup to a secondary system gives cyberattacks even more time and bargaining power. The result is that airlines face a ‘perfect storm’ of operational, customer satisfaction and cybersecurity impacts.
“Customers are increasingly viewing cyber-preparedness as a key metric to assess whether they want to expand their business relationships with a particular company. Accruing massive amounts of technical debt can harm your customer relationships way beyond a single operational incident and ultimately affect customers’ travel decisions.”
An outdated system and a damaged database file: a recipe for disaster.
Added 10:00 AM, January 13th, 2023. Computing wrote Friday morning that the FAA continues to attribute Wednesday’s NOTAM outage to a damaged database file. A source speaking to CNN claimed that air traffic controllers recognized the system issue on Tuesday afternoon and planned to reboot the system during less congested hours on Wednesday morning. The reboot took place as planned, but the system still "wasn't completely pushing out the pertinent information that it needed for safe flight, and it appeared that it was taking longer to do that," according to CNN’s source, and that led to the eventual grounding order. A senior government official cited aging infrastructure as a contributing factor, noting that the system is “30 years old and not scheduled to be updated for another six years,” according to NBC News.
Update: The FAA attributes its January NOTAM outage to a contractor error.
Added, 11:30 PM, January 20th, 2023. The Wall Street Journal reports that the FAA has traced the cause of the NOTAM outage to an error committed by IT contractors during synchronization of backup files. "The Federal Aviation Administration said Thursday that a contractor working for the air-safety regulator had unintentionally deleted computer files used in a pilot-alert system, leading to an outage that disrupted U.S. air traffic last week," the Journal wrote. "The agency, which declined to identify the contractor, said its personnel were working to correctly synchronize two databases—a main one and a backup—used for the alert system when the files were unintentionally deleted."