CrowdStrike Was a Godsend
I think the CrowdStrike incident of July 19, 2024 was a godsend. We got to experience the warning of a devastating cyber incident, and we got an almost instantaneous fix. It was a taste of what could happen. It could have been much worse, and I think that is a lesson we should explore.
In the early hours of July 19, 2024, Windows computers around the world suffered the dreaded Blue Screen of Death. The issue seemed random, and it crossed organizational and national boundaries. In the first couple of hours, it felt like the “Big One” had arrived.
Many of us in information technology live with the fear that one day, a zero-day attack will catch us in a way that could disastrously affect life as we know it. (OK, maybe not all of us think this way, but bear with me.) Certainly, for those of us with any responsibility for the resilience of our organizations, this fear was magnified on this day.
We dodged the Big One on July 19, 2024.
Fortunately, in this case, the problem stemmed from an unforeseen condition in the way that CrowdStrike (a well-respected information security provider) distributed its content updates. And more fortunately, CrowdStrike owned up to the problem, promptly published a workaround, and then corrected the root cause so that it won’t happen again.
Personally, I have nothing but respect for CrowdStrike and its leaders. They saved us countless unknowns by their prompt and courageous actions.
But I’d like to explore this cyber incident from a different angle:
What if the perpetrator of this cyber issue was intent on harm? How would we have dealt with it?
How long before we found the cause? The real cause?
How much noise and panic would we have had to deal with?
Might too much elapsed time have exposed us to further attacks?
How long after we knew the cause would we have had a workaround or recovery approach for everyone to implement?
Who would the coordinator of the resolution have been?
I could go on with the hypothetical questions, but let’s stop there for now.
(Separately, we need to recognize that the CrowdStrike incident highlights a different attack vector for bad actors – another avenue for zero-day attacks: tainting software updates so that the distributed content causes the systems receiving it to blue-screen.)
For now, let’s explore the likely responses to some of those questions above:
What if the perpetrator of this cyber issue was intent on harm?
You can be sure that if this were the case, the perpetrator would have covered their tracks, or at least laid down some false indicators with the intent of delaying a resolution.
How long before we found the cause?
The CrowdStrike incident was the result of a problem embedded in content – not a change to code or system configuration. It occurred through a normal operational procedure (the distribution of updated content).
We learned that was the case within the first couple of hours of the incident’s appearance. Additionally, we were instructed on how to avoid further issues and were able to start fixing impacted systems almost immediately.
A nefarious actor would not be so genteel. We would have been left to find the cause and the fix on our own, and that delay would have had the dual effect of impacting more systems and people, and prolonging recovery. One could imagine this snowballing very quickly – potentially grinding major functions, businesses and possibly critical infrastructure to a halt. We dodged the cascading effect because of the prompt response from CrowdStrike.
It’s safe to assume that a bad actor (depending on their motive) would let the bad times roll.
How much noise and panic would we have had to deal with?
Let’s not get complacent because we survived the CrowdStrike issue. The longer it took us to find a way to stop the pain, the more noise, confusion and possibly outright panic we would have had to endure. And (again depending on the motive) it is quite possible that a malevolent hacker would have added further noise and misinformation/disinformation to fan the flames to suit their purpose. It is conceivable that the noise, disinformation, and political realities might have delayed the availability of a solution by days.
Who would the coordinator of the resolution have been?
The impact of the CrowdStrike issue was global. Without somebody stepping up and definitively owning the problem and the solution, it would have been challenging to identify the culprit and the fix. It would also have taken countless hours or days to get a coordinated resolution approach in place.
The natural assumption might have been that the problem was in the operating system, but there is no telling how that might have gone, given that nothing had changed in the OS to instigate this incident.
One reason I said that CrowdStrike’s leadership showed courage is that it would have been very easy to hide behind a cadre of lawyers. And it is not inconceivable that some organizations might do just that, particularly when there is so much uncertainty in the air. If the incident had been perpetrated by a bad actor, we could have wound up fighting amongst ourselves instead of finding a productive solution.
Several governments have cyber defense functions that would likely have been engaged, but our ability to form a collective solution is not yet well enough developed. During a real cyber-attack, we would likely waste a lot of time and political capital before we had a cohesive response to a problem that could continue to grow.
Millions of systems were impacted by the CrowdStrike issue on July 19, 2024, and fixing the impacted systems went on for many days after that. But we were able to contain the issue quickly, and after the first few hours there was no uncertainty about its cause.
I think the CrowdStrike incident was a godsend.
Let this cyber incident serve a cause.
Get your organizations started on planning to survive the bad version of this scenario. That means making sure the organization can continue to support its most critical functions despite the loss of systems that are vital to their delivery.