• JohnAnthony@lemmy.dbzer0.com
      9 upvotes · 1 day ago

      I tried telling this to my manager for years. He treated it as an “X days since we last had a problem and needed to reboot the server” counter and took pride in it.

      We finally shut it down at over 5 years of uptime. Some Docker containers had been running for 4 years straight.

      Yes, that means what you think it does concerning update policies. Yes, the server and some containers were exposed to the internet. No, the backups were never tested.

      • Possibly linux@lemmy.zip
        6 upvotes · 1 day ago

        If a device hasn’t been rebooted in a long time, there is a much higher chance that it won’t come back up after a reboot. This is made worse by the fact that power loss is sometimes unexpected, so the failed reboot and the resulting outage can happen at a bad time.

        The other issue is that a high-uptime device usually doesn’t have the latest updates installed. Delaying updates creates security issues, and when you do get around to updating, more things change at once.
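To make the uptime point concrete, here is a minimal sketch that reads the uptime on a Linux host and flags it when it exceeds a threshold. It assumes the standard `/proc/uptime` file, whose first field is the uptime in seconds. The function names and the 30-day default are hypothetical choices for illustration, not a recommendation from anyone in this thread.

```python
SECONDS_PER_DAY = 86400.0


def uptime_days(proc_uptime_text):
    """Convert the contents of /proc/uptime into days of uptime.

    The first whitespace-separated field is the uptime in seconds.
    """
    first_field = proc_uptime_text.split()[0]
    return float(first_field) / SECONDS_PER_DAY


def is_overdue(days_up, max_days=30.0):
    """Return True when the host has been up longer than max_days.

    max_days is an arbitrary illustrative threshold.
    """
    return days_up > max_days


if __name__ == "__main__":
    try:
        with open("/proc/uptime") as fh:
            days = uptime_days(fh.read())
    except OSError:
        print("could not read /proc/uptime (not a Linux host?)")
    else:
        status = "overdue for a reboot test" if is_overdue(days) else "ok"
        print(f"up {days:.1f} days: {status}")
```

A high number here says nothing about whether updates were applied. It only tells you how long it has been since a reboot was last shown to work.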

        • Redjard@reddthat.com
          1 upvote · 9 hours ago

          The reverse is that if you really know your stuff, you can get away with fewer restarts, or even none. But you pretty much have to know every component and update you run while in that untested state.
          This is similar to bugs that go away on a restart. If you don’t know why, then you haven’t really fixed it. You have just rolled the dice again, hoping it won’t recur.

          As for updates, on regular systems you can update everything but the kernel. You do have to restart affected services afterwards (often done automatically).
          Even on atomic systems you can switcheroo the subvolume underneath a running system.
          Unfortunately the kernel is a major component, so a kernel update is a valid reason to reboot. It is definitely not as pressing as, say, nginx, sshd, or sudo, though. Kernel bugs bubbling up to an exposed attack surface are still quite unusual.
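One way to see which services still need a restart after an update is to look for processes that still map shared libraries whose files have since been replaced. On Linux, such mappings appear in `/proc/<pid>/maps` with a ` (deleted)` suffix. The following is a rough sketch of that idea, not how any particular distribution tool implements it. The function names are hypothetical, the `.so` filter is a heuristic to skip deleted temporary files, and reading other users' processes generally requires root.

```python
import os

DELETED_SUFFIX = " (deleted)"


def deleted_libraries(maps_text):
    """Return sorted paths of shared libraries mapped from deleted files.

    maps_text is the content of a /proc/<pid>/maps file. Each line has
    up to six fields; the sixth is the mapped path, if any.
    """
    found = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no path
        path = fields[5].strip()
        if not (path.startswith("/") and path.endswith(DELETED_SUFFIX)):
            continue
        path = path[: -len(DELETED_SUFFIX)]
        # Heuristic: only report shared objects, not deleted temp files.
        if ".so" in os.path.basename(path):
            found.add(path)
    return sorted(found)


def stale_processes(proc_root="/proc"):
    """Map each readable PID to the deleted libraries it still uses."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "maps")) as fh:
                stale = deleted_libraries(fh.read())
        except OSError:
            continue  # process exited or permission denied
        if stale:
            result[int(entry)] = stale
    return result


if __name__ == "__main__":
    for pid, libs in sorted(stale_processes().items()):
        print(pid, ", ".join(libs))
```

Any process listed is still running the old code in memory and needs a restart to pick up the update. The running kernel never shows up in such a list, which is why it remains the one component that needs a reboot in this setup.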