As the world recovers from the largest IT outage in history, the incident shows the danger of a single point of failure in IT infrastructure
A global IT failure wreaked havoc on Friday, grounding flights and disrupting everything from hospitals to government agencies. Over all the chaos hung a question: how did a flawed update to Microsoft Windows software bring large swaths of society to a screeching halt?
The problem originated with an Austin, Texas-based cybersecurity firm called CrowdStrike, relied upon by most of the global technology industry, including Microsoft, for its Falcon program, which blocks the execution of malware and cyber-attacks. Falcon protects devices by securing access to a wide range of internal systems and automatically updating its defenses – a level of integration that means if Falcon falters, the computer is close behind. After CrowdStrike updated Falcon on Thursday night, Microsoft systems and Windows PCs were hit with a “blue screen of death” and rendered unusable as they were trapped in a recovery boot loop.
Microsoft is a juggernaut with significant market power, dominating cloud-computing infrastructure across Europe and the United States. So it wasn’t just computers that were affected, but servers and a host of other systems as well. Overwhelming requests from users, devices, services and businesses ushered in a cascading series of failures with Microsoft products – namely Azure Cloud and Microsoft 365. Failures plaguing Azure led to additional but separate disruptions with 365 services. A giant clusterfuck ensued.
So does Google. And yet they’re still being taken to court for monopolistic practices by the U.S. Department of Justice.
Monopoly doesn’t mean literally no competitors in the real world. It means no competitors worth noting because everyone has been corralled into using a single company.
CrowdStrike is big, but not that big.
About half of my clients use them, and of those, about a third are halfway through ripping them out in favor of MS Defender.
(MS is definitely “that big”)
Right. That’s not the case here. CrowdStrike competes with a dozen other EDR products. Going by the number of ratings as a proxy for popularity, it’s not even the most popular.
Those dozen don’t seem to add up to much together, considering the massive global nature of this outage.
Because “we didn’t crash” doesn’t make the news. My company wasn’t affected, so nobody cared about us.
That’s great that your company wasn’t affected.
I hope no one was trying to fly a plane to get to your company yesterday.