IT administrators are struggling to deal with the ongoing fallout from the faulty CrowdStrike update. One spoke to The Register to share what it is like at the coalface.

Speaking on condition of anonymity, the administrator, who is responsible for a fleet of devices, many of which are used within warehouses, told us: “It is very disturbing that a single AV update can take down more machines than a global denial of service attack. I know some businesses that have hundreds of machines down. For me, it was about 25 percent of our PCs and 10 percent of servers.”

He isn’t alone. An administrator on Reddit said 40 percent of servers were affected, along with 70 percent of client computers stuck in a bootloop, or approximately 1,000 endpoints.

Sadly, for our administrator, things are less than ideal.

Another Redditor posted: "They sent us a patch but it required we boot into safe mode.

"We can’t boot into safe mode because our BitLocker keys are stored inside of a service that we can’t login to because our AD is down.

  • db0@lemmy.dbzer0.com
    1 year ago

    Pity the administrators who dutifully kept a list of those keys on a secure server share, only to find that the server is also now showing a screen of baleful blue.

    Lol, can you imagine? It hurts me, out of sheer empathy, to even think about this situation. Enter the brave hero who kept the fileshare decryption key in a local KeePass database :D

    • sugar_in_your_tea@sh.itjust.works
      1 year ago

      That’s why the 3-2-1 rule exists:

      • 3 copies of everything on
      • 2 different forms of media with
      • 1 copy off site

      For something like keys, that means:

      1. secure server share
      2. server share backup at a different site
      3. physical copy (a USB drive, a printout kept in a safe, etc.)

      Any IT pro should be aware of this “rule.” Oh, and periodically test restoring from a backup to make sure the backup actually works.
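For what it's worth, the 3-2-1 rule above is simple enough to check mechanically. The following is only a minimal sketch: the `BackupCopy` fields, the media and site labels, and the `check_3_2_1` helper are all hypothetical names made up for illustration, not part of any real backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One stored copy of the recovery keys (all fields are illustrative)."""
    label: str
    media: str  # assumed labels, e.g. "server-share", "usb", "paper"
    site: str   # assumed labels, e.g. "hq", "dr-site"


def check_3_2_1(copies, primary_site):
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    if len({c.media for c in copies}) < 2:
        problems.append("all copies share one form of media, need at least 2")
    if not any(c.site != primary_site for c in copies):
        problems.append("no copy is stored off site")
    return problems


# The three key locations from the list above, with assumed media/site labels.
key_copies = [
    BackupCopy("secure server share", media="server-share", site="hq"),
    BackupCopy("server share backup", media="server-share", site="dr-site"),
    BackupCopy("printout in a safe", media="paper", site="hq"),
]
print(check_3_2_1(key_copies, primary_site="hq"))  # prints []
```

A check like this only confirms that the copies exist on paper. It does not replace actually restoring from each one.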

      • IphtashuFitz@lemmy.world
        1 year ago

        We have a cron job that, once a quarter, files a ticket with whoever is on call that week to test all our documented emergency access procedures and make sure they're all working, accessible, and up to date.
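A rough sketch of what the script behind such a cron job might look like. Everything here is an assumption for illustration: the procedure names, the `build_drill_ticket` helper, the `oncall@example.com` address, and the ticket fields are hypothetical, and the sketch only prints the ticket as JSON rather than calling any real ticketing API. A crontab schedule such as `0 9 1 1,4,7,10 *` would run it at 09:00 on the first day of each quarter.

```python
import json
from datetime import date

# Hypothetical list of documented emergency access procedures to exercise.
PROCEDURES = [
    "Retrieve a BitLocker recovery key while AD is unavailable",
    "Open the offline password database",
    "Restore the key share from the off-site backup",
]


def quarter_of(day):
    """Map a date to its calendar quarter (1 to 4)."""
    return (day.month - 1) // 3 + 1


def build_drill_ticket(on_call, today):
    """Build a ticket payload assigning the quarterly drill to the on-call person."""
    checklist = "\n".join(f"- [ ] {p}" for p in PROCEDURES)
    return {
        "assignee": on_call,
        "title": f"Q{quarter_of(today)} {today.year} emergency access drill",
        "body": "Confirm each procedure is working, accessible, and up to date:\n"
        + checklist,
    }


if __name__ == "__main__":
    # A real job would look up the on-call person and submit the ticket;
    # this sketch just prints the payload.
    ticket = build_drill_ticket("oncall@example.com", date.today())
    print(json.dumps(ticket, indent=2))
```

Assigning the drill to whoever happens to be on call means the procedures are exercised by someone other than their author, which is the situation that matters in a real outage.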

    • kescusay@lemmy.world
      1 year ago

      Seems like an argument for a heterogeneous environment, perhaps a solid and secure Linux server to host important keys like that.

        • Voroxpete@sh.itjust.works
          1 year ago

          Their point is not that Linux can't fail; it's that a mix of Windows and Linux is better than relying on just one. That's what "heterogeneous environment" means.

          You should think of your network environment like an ecosystem; monocultures are vulnerable to systemic failure. Diverse ecosystems are more resilient.

          • Avatar_of_Self@lemmy.world
            1 year ago

            Yes, but has it taken both OSes out at the same time? It hasn't. It could happen, but the chances are much lower. Mixing vendors across your infrastructure, for both hardware and software, is an obvious form of risk mitigation in the enterprise.

            If your enterprise lost some critical services last time, until RH updated their kernel, then you could have benefited from running those services on Windows as well. Now the reverse is true. You could add another DC to your forest via Samba on Linux if you wanted to, so that you still have a working AD, for example. The same goes for file share servers, intermediate certificate servers (hopefully your root CA is not always on the network) and pretty much any critical service.

            Most enterprises run a lot of services off of a hypervisor and have headroom to scale (or they are already on a sinking ship), so you can just spin up VMs to do that. It isn't unreasonably labor intensive compared to other, similar risk mitigation measures. Any sane CCB (obviously there are edge cases, but we are talking in general here) will even let you get away without a vendor support contract for those VMs, since they exist only for emergency redundancy and are nowhere near critical unless the critical services have already shit the bed.
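To make the monoculture point concrete, here is a small sketch that flags critical services whose replicas all run on a single OS. The inventory format, the service names, and the `single_platform_services` helper are hypothetical; a real inventory would come from whatever asset database you already keep.

```python
def single_platform_services(inventory):
    """Return, sorted by name, the services whose replicas all share one OS."""
    return sorted(
        name
        for name, platforms in inventory.items()
        if len(set(platforms)) < 2
    )


# Hypothetical inventory: service name -> OS of each replica.
inventory = {
    "directory (AD)": ["windows", "windows"],
    "file share": ["windows", "linux"],
    "intermediate CA": ["linux"],
}
print(single_platform_services(inventory))
# prints ['directory (AD)', 'intermediate CA']
```

Anything on that list is a candidate for an emergency-only replica on the other platform, along the lines of the Samba DC example above.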