Small 2025 work recap

Just like in the posts Camembert and Camembert pt. 2, I’ll talk about the work I did at my last employer, Staedion. I left Staedion in 2025, so this is more of a work recap than a fair recap of 2025.

I joined Staedion in 2024 together with a new colleague, after two other system administrators had left. Unfortunately, there wasn’t a proper handover, so it quickly became a deep dive, learning as we went.

Together with a database administrator, we were responsible for Staedion’s two hardware clusters, Azure, and two office networks. A pretty big bag.

The Jump-scare CAU runs

The first jump-scare was updating the hardware stacks. I’ve written some articles about working on this stack, but running into major issues during what should’ve been a simple update made us pretty anxious about continuing, especially after hearing, after the fact, that the previous sysadmins had run into similar issues. All of this happened while we were still in our probation month.

Still, instead of saying goodbye after getting one cold present too many and being expected by the organization to know everything about a largely undocumented environment, I chose to continue and take it as a challenge. I don’t back off from challenges, and from day one I noticed there was a lot I could improve.

I quickly learned that CAU (Cluster-Aware Updating) runs weren’t an option for updating the clusters, so everything had to be done manually. That gave us full control over the update process and enough time for the nodes to resync their storage.

We also tackled firmware compliance: we used Dell OpenManage Integration to check compliance, then downloaded the drivers and firmware manually. Putting each node into maintenance mode and draining it was done by hand as well.

"Fire"walls

The second jump-scare was the vulnerable firewalls, which hadn't been restarted or updated for 500+ days. So it wasn't surprising that our upgrade path broke the sync between the two firewalls in the HA stack.

So we had to plan a trip to the data center to bring the second one back online.

Hack of "the brievenboer"

Don’t get me started on this one. It's easily one of the biggest blunders I’ve seen in my short professional life. Questionable choices by both previous and current Staedion colleagues made the incident far more impactful than initially expected. A nice welcome present. I won't go into detail, as I still think they don't understand that the "problems" aren't over.

Building dashboards

As part of my containerization efforts, I started building active dashboards so we could keep a close eye on the infrastructure.

It’s a good thing they didn’t ask me to build dashboards for financial purposes. Haha

Second hardware update run

This went sideways fast. I wrote an article about it:

Not so resilient filesystem
Azure Stack HCI - or Azure Local, as it's now called - has been quite the ride. Yesterday we went through what should have been a fairly routine task: upgrading the firmware on our physical Dell nodes across all stacks. Now, you’d think that with Dell and Microsoft…
Losing multiple virtual disks
Slowly recovering

Fully documenting the infra

I took the time to write a multipart document so that current and new colleagues can understand what they are working with.

This included a new documentation environment.

Many, many more...

I’m done writing about it. Too many things were changed, upgraded, fixed, you name it.

I’m glad I don’t work at Staedion anymore. My colleagues and boss were great and we had a lot of fun, and I’m still in touch with them. But the rest of the organization is a mess. Money burned left and right.

We saved around €100k per year by improving efficiency in Azure alone. In the end, it’s a culture issue, and it wasn’t the right place for me.

What a worthless recap, actually. Here are some nice & funny pictures:

And a fair share of protests:

Blergh,