Repeated mistakes

When you have an incident and do a Root Cause Analysis (RCA), the ‘What?’ and ‘How?’ are usually easy to determine. Usually. But the most important question, the one that rebuilds trust and shows that you and your teams are in control, is ‘Why?’.

You need to understand and communicate the ‘Why?’ and explain how it won’t happen again. Sometimes the hardest incidents to report are the ones where someone just made a stupid mistake. We’re human.

In reality, ‘Bob’ uploaded/deleted/turned off the wrong thing. He’s mortified, and everybody on the team is poking fun and trying to make “Doing a Bob” stick.

That turns into:

“Human error caused the issue. Additional training will be provided and second-eye checks performed.”

It becomes a real problem when it’s the third time this quarter that ‘Bob’ has made a mistake…

In April, CrowdStrike released an update that wrecked some Linux systems (Debian and Rocky Linux).

CrowdStrike broke Debian and Rocky Linux months ago, but no one noticed
CrowdStrike recently caused a widespread Blue Screen of Death (BSOD) issue on Windows PCs, disrupting various sectors. However, this was not an isolated incident, CrowdStrike affected Linux PCs also.

In June, CrowdStrike released an update that caused a CPU core on Windows machines to max out at 100% and required a reboot.

CrowdStrike bug maxes out 100% of CPU, requires Windows reboots
“Note: This is 100% of a single core. In an 8-core system for example, an additional 12.5% of unexpected total CPU load would be experienced...”

We all know what happened this month.

EDR (Endpoint Detection and Response) vendors have a lot of unfettered access to systems through their products. It’s not exactly a transparent part of the sector either. Nobody wants to turn off daily updates, but customers and boards need to know the why.

CrowdStrike aren’t alone in the industry, but with that recent record they had better start doing some deep and transparent explaining soon. For their own sake. Other vendors also need to learn lessons from this.

Subscribe to Gary P Shewan
