• AmbiguousProps@lemmy.today
    4 days ago

    I work in DevOps, and this is one of the easier things to automate. It’s common for certs to be issued on a 90-day basis these days; there’s no way that would be maintainable without automation.

      • dgdft@lemmy.world
        4 days ago

        Have you had Certbot or LE fail on prod for you before?

        I’m sure stuff happens, but I usually view them as one of the most robust moving parts on a server.

        E: I don’t mean to express disbelief at all; just curious to learn about possible footguns.

        • four@lemmy.zip
          4 days ago

          Certbot / LE has to be running on some machine, and that machine can be accidentally turned off, have its payments lapse, be migrated to a new instance that doesn’t work, have its gateway configuration changed, etc.

          Automation requires maintenance, and that introduces human error.

          • AmbiguousProps@lemmy.today
            4 days ago

            Like dgdft said, if you’re using certbot, it should typically be running on the machine that your endpoints are hosted on. Enterprise solutions don’t require this, but they have other means of deploying certificates automatically and of alarming well before expiry if a deployment fails. My organization has dashboards showing which certs expire and when, and they trigger alarms at least a month before anything goes wrong.

            High stakes automation should always have alarms on error, and since certs have set expiration dates baked into them, you can alarm far before anything goes wrong. Apparently, Riot didn’t have that.
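For anyone wondering what that kind of alarm can look like, here is a minimal Python sketch. It is not any particular vendor's tooling: the function names and the 30-day and 7-day thresholds are assumptions picked for illustration, and only the standard library is used.

```python
import socket
import ssl
from datetime import datetime, timedelta, timezone
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> datetime:
    """Read the notAfter timestamp of the certificate served by host:port.

    Note: the default context verifies the chain, so an already-expired
    certificate raises ssl.SSLCertVerificationError instead of returning.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    seconds = ssl.cert_time_to_seconds(cert["notAfter"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_left(not_after: datetime, now: Optional[datetime] = None) -> float:
    """Days remaining until not_after (negative once it has passed)."""
    now = now or datetime.now(timezone.utc)
    return (not_after - now) / timedelta(days=1)


def alarm_level(days: float, warn_days: float = 30, critical_days: float = 7) -> str:
    """Map remaining lifetime to a severity; thresholds are illustrative."""
    if days <= 0:
        return "expired"
    if days <= critical_days:
        return "critical"
    if days <= warn_days:
        return "warning"
    return "ok"
```

Something like `alarm_level(days_left(fetch_not_after("example.org")))` run on a schedule, with the result fed into whatever paging system you already have, covers the basic case. The point is that the expiry date is known in advance, so the check can fire weeks early.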

            Also, more frequent renewals make it less likely that people forget the process exists. Because of that, along with the possible security ramifications, 2- to 10-year certs should never be used, in my opinion. A 10-year cert will always get kicked on to the next team, and it’s very possible for things to fall through the cracks.

          • dgdft@lemmy.world
            4 days ago

            Certbot/LE should typically be running on the box that’s terminating TLS for you, right? If the box handling your traffic is down, shouldn’t that be a self-evident problem?

            I’ve been running Caddy and certbot for nearly a decade and never found a way for them to break without it being 100% my fault. They’re more or less self-healing too. I’m with AmbiguousProps; cert renewals have been pretty damn reliable to automate compared to any other piece of tech, IME.

        • phx@lemmy.world
          3 days ago

          Yeah, I’ve had certbot mess up a few times, though more often it was the scripts that actually shuttle the updated certs to their proper locations and restart services after updating.
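One way to make that copy-and-restart step less fragile is sketched below in Python. This is an assumption about how such a script could be structured, not what anyone in this thread actually runs: `install_atomically`, `deploy`, and the `reload_service` callback are hypothetical names. The file names `fullchain.pem` and `privkey.pem` match what certbot writes into a lineage directory.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable

CERT_FILES = ("fullchain.pem", "privkey.pem")


def install_atomically(src: Path, dest: Path, mode: int = 0o600) -> None:
    """Copy src to dest through a temp file in the same directory.

    os.replace is atomic on POSIX within one filesystem, so a reader never
    sees a half-written certificate or key.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
        os.chmod(tmp, mode)
        os.replace(tmp, dest)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def deploy(lineage: Path, target_dir: Path, reload_service: Callable[[], None]) -> None:
    """Install renewed files, then reload; fail loudly before touching anything."""
    missing = [name for name in CERT_FILES if not (lineage / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {lineage}: {', '.join(missing)}")
    for name in CERT_FILES:
        install_atomically(lineage / name, target_dir / name)
    # Only reload once every file is in place.
    reload_service()
```

Certbot's `--deploy-hook` sets the `RENEWED_LINEAGE` environment variable for the hook, so a wrapper could pass `Path(os.environ["RENEWED_LINEAGE"])` as `lineage`. The target directory and the reload command depend entirely on the service and are left to the caller here. Letting the script exit non-zero on failure is what gives monitoring something to alarm on.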

    • vodka@feddit.org
      3 days ago

      Where I work, we went down to 60 days as of 01.01.2026. By the end of the year, the plan is that no cert should be valid for more than 30 days.

      I hate it. Please make it stop.

    • ramble81@lemmy.zip
      4 days ago

      Cool story if everything you have has an API or is code-based. Try doing it on hundreds of switches and other embedded devices. The whole 42-day thing they’re floating is gonna be a massive nightmare, because they don’t realize how many other things out there use certificates.

      • AmbiguousProps@lemmy.today
        4 days ago

        What makes you think I don’t do this on embedded devices? I’m not about to dox myself with specifics, but I do this exclusively for embedded hardware as my job. We even do it for devices not directly attached to our network. It’s really not difficult so long as you have control of your enterprise hardware (which you should, unless your management is terrible at their jobs). Hell, even the routers we use have this functionality built in, failure alarms and all.

        If this is a problem for you, it’s probably at an organizational level, and not a technical issue.