this post was submitted on 21 Aug 2024
551 points (98.9% liked)

[–] [email protected] 77 points 4 weeks ago* (last edited 4 weeks ago) (6 children)

It's not really criticism, it's competitors claiming they will never fuck up.

Like, if you found a mouse in your hamburger at McDonald's, that's a massive fuckup. If Burger King then started saying "you'll never find anything gross in Burger King food!" that would be both crass opportunism and patently false.

It's reasonable to criticize CrowdStrike. They fucked up huge. The incident was a fuckup, and creating an environment where one incident could cause total widespread failure was a systemic fuckup. And it's not even their first fuckup, just the most impactful and public.

But also Microsoft fucked up. And the clients, those who put all of their trust into Microsoft and CrowdStrike without regard to testing, backups, or redundancy, they fucked up, too. Delta shut down, cancelling 4,600 flights. American Airlines cancelled 43 flights, 10 of which would have been cancelled even without the outage.

Like, imagine if some diners at McDonald's connected their mouths to a chute that delivers pre-chewed food sight-unseen into their gullets, and then got mad when they fell ill from eating a mouse. Don't do that, not at any restaurant.

All that said, if you fuck up, you don't get to complain about your competitors being crass opportunists.

[–] [email protected] 43 points 4 weeks ago (2 children)

Even if that's the case, how is it Crowdstrike's place to call these other companies out for claiming something similar will never happen to them? Thus far, it had only ever happened to CS.

[–] [email protected] 8 points 4 weeks ago

It feels like a pattern though. I’ve not seen too much from them but they seem to be saying factually correct stuff. But neither worded correctly nor at the right time.

[–] [email protected] 5 points 4 weeks ago

Even if that's the case, how is it Crowdstrike's place to call these other companies out for claiming something similar will never happen to them?

I agree completely, which is why I added that last sentence in an edit. This is a bad look for CrowdStrike, even if I agree with the sentiment.

Thus far, it had only ever happened to CS.

Everybody fucks up now and then. That's my point. It's why you shouldn't trust one company to automatically push security updates to critical production servers without either a testing environment or disaster recovery procedures in place.

I doubt you'll find any software company, or any company in any industry, that has not fucked up something really important. That's the nature of commerce. It's why many security protocols exist in the first place. If everyone could be trusted to do their jobs right 100% of the time, you would only need to worry about malicious attacks which make up only a small fraction of security incidents.

The difference here is that CrowdStrike sold a bunch of clients on the idea that they could be trusted to push security updates to production servers without testing environments. I doubt they told Delta that they didn't need DRP or any redundancy, but either way, the failure was amplified by a collective technical debt that corporations have been building into their budget sheets to pad their stock prices.

By all means, switch from CrowdStrike to a competitor. Or sue them for the loss of value resulting from their fuckup. Sort that out in the contracts and courts, because that's not my area. But we should all recognize that the lesson learned is not to switch to another threat prevention software company that won't fuck up. Such a company does not exist.

If you stub your toe, you don't start walking on your hands. You move the damn coffee table out of the pathway and watch where you're walking. The lesson is to invest in your infrastructure, build in redundancy, and protect your critical systems from shit like this.

[–] [email protected] 31 points 4 weeks ago (2 children)

Resiliency and security have a lot of layers. The CrowdStrike bungle was very bad, but more than anything it shined a bright spotlight on the fact that certain organizations' IT orgs are just a house of cards waiting to get blown away.

I'm looking at Delta in particular. Airlines are a critical transportation service and to have issues with one software vendor bring your entire company screeching to a halt is nothing short of embarrassing.

If I were on the board, my first question would be, "where's our DRP and why was this situation not accounted for?"

[–] [email protected] 23 points 4 weeks ago

House of cards is exactly right. At every IT job I've worked, the bosses want to check the DRP box as long as it costs as close to zero dollars as possible, and a day or two of 1-2 people writing it up. I do my best to cover my own ass, and regularly do actual restores, limit potential blast radii, and so on. But at a high level, bosses don't give AF about defense, they are always on offense (i.e. make more money faster).

[–] [email protected] 7 points 4 weeks ago

This is the first time I've heard someone call it a house of cards and I think that fits it perfectly!

[–] [email protected] 21 points 4 weeks ago (1 children)

you'll never find anything gross in Burger King food!

[–] [email protected] 12 points 4 weeks ago (1 children)
[–] [email protected] 6 points 4 weeks ago

That’s the first thing I heard in my head lmao

[–] [email protected] 9 points 4 weeks ago (1 children)

In what way did Microsoft fuck up? They don't control Crowdstrike updates. Short of the OS files being immutable it seems unlikely they can stop things like this.

[–] [email protected] -5 points 4 weeks ago (2 children)

Microsoft gave CrowdStrike unfettered access to push an update that can BSOD every Windows machine without a bypass or failsafe in place. That turned out to be a bad idea.

CrowdStrike pushed an errant update. Microsoft allowed a single errant update to cause an unrecoverable boot loop. CrowdStrike is the market leader in their sector and brings in hundreds of millions of dollars every year, but Microsoft is older than the internet and creates hundreds of billions of dollars. CrowdStrike was the primary cause, but Microsoft enabled the meltdown.

[–] [email protected] 10 points 4 weeks ago (1 children)

Microsoft did not "give Crowdstrike access to push updates". The IT departments of the companies did.

The security features that Crowdstrike has force them to run in kernel-space, which means that they will have code running that can crash the OS. They crashed Debian in an almost identical way (forced boot loop) about a month before they did the same to Windows.

Yes, there are ways that Microsoft could rewrite the Windows kernel architecture to make it resistant to this type of failure. But I don't think there are very many other commercial OS's that could stop this from happening.

[–] [email protected] 3 points 4 weeks ago

You're absolutely right, here is an in-depth explanation from Dave Plummer, the guy who wrote the task manager: https://youtu.be/ZHrayP-Y71Q

[–] [email protected] 3 points 4 weeks ago

Microsoft gave CrowdStrike unfettered access to push an update that can BSOD every Windows machine without a bypass or failsafe in place. That turned out to be a bad idea.

They have to give that access by EU ruling:

Microsoft software licensing expert Rich Gibbons said: “Microsoft has received some criticism for the fact that a third party was able to affect Windows at such a deep technical level. It’s interesting that Microsoft has pointed out the fact this stems from a 2009 EU anti-competition ruling that means Microsoft must give other security companies the same access to the Windows kernel as they have themselves.”

[–] [email protected] 8 points 4 weeks ago

Well there's a provocative anecdote if I've ever seen one. Well done.

[–] [email protected] 3 points 4 weeks ago* (last edited 4 weeks ago)

It's not really criticism, it's competitors claiming they will never fuck up.

Not in all cases [podcast warning], sometimes it's just them pointing out they're doing silly things like how they test every update and don't let it out the door with <98% positive returns or having actual deployment rings instead of yeeting an update to millions of systems in less than an hour.
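
For anyone wondering what "deployment rings" plus a pass-rate gate might look like, here's a rough Python sketch. Everything in it is made up for illustration: `Ring`, `staged_rollout`, and the `apply_update` callback are hypothetical names, and the 98% threshold is just the figure mentioned above. It is not any vendor's actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Ring:
    """A hypothetical deployment ring: a named group of hosts."""
    name: str
    hosts: list[str]


def pass_rate(results: list[bool]) -> float:
    """Fraction of hosts that reported a healthy update (0.0 if the ring is empty)."""
    return sum(results) / len(results) if results else 0.0


def staged_rollout(
    rings: list[Ring],
    apply_update: Callable[[str], bool],
    threshold: float = 0.98,  # assumed gate, taken from the "<98%" figure above
) -> list[str]:
    """Push the update ring by ring, smallest first.

    Stops at the first ring whose pass rate falls below the threshold,
    so later (larger) rings never receive the update.
    Returns the names of the rings that passed the gate.
    """
    completed = []
    for ring in rings:
        results = [apply_update(host) for host in ring.hosts]
        if pass_rate(results) < threshold:
            break  # halt the rollout; the remaining rings stay untouched
        completed.append(ring.name)
    return completed


# Example: one host in the second ring fails, so the broad ring is never touched.
rings = [
    Ring("canary", ["c1", "c2"]),
    Ring("early", ["e1", "e2", "e3"]),
    Ring("broad", ["b1", "b2", "b3", "b4"]),
]
touched = []


def flaky_update(host: str) -> bool:
    touched.append(host)
    return host != "e2"  # pretend e2 ends up in a boot loop


passed = staged_rollout(rings, flaky_update)
```

The point of the small early rings is blast radius: a bad update tips over a handful of canary boxes instead of millions of systems.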

It's reasonable to criticize CrowdStrike. They fucked up huge. The incident was a fuckup, and creating an environment where one incident could cause total widespread failure was a systemic fuckup. And it's not even their first fuckup, just the most impactful and public.

Clownstrike deserves every bit of shit they're getting, and it amazes me that people are buying the bullshit they're selling. They had no real testing or quality control in place, because if that update had touched test Windows boxes it would have tipped them over and they'd have actually known about it ahead of time. Fucking up is fine, we all do it. But when your core practices are that slapdash, bitching about criticism just brings more attention to how badly your processes are designed.

But also Microsoft fucked up.

How did Microsoft fuck up? Giving a security vendor kernel access? Like they're obligated to from previous lawsuits?

And the clients, those who put all of their trust into Microsoft and CrowdStrike without regard to testing, backups, or redundancy, they fucked up, too

Customers can't test clownstrike updates ahead of time or in a nonprod environment, because clownstrike knows best lol.

Redundancy is not relevant here because what company is going to use different IDR products for primary and secondary tech stacks?

Backups are also not relevant (mostly) because it's quicker to remediate the problem than restore from backup (unless you had super regular DR snaps and enough resolution to roll back from before the problem).

IMO, clownstrike is the issue, and customers have only the slightest blame for using clownstrike and for not spending extra money on a second IDR on redundant stacks.