• Pennomi@lemmy.world
    17 upvotes · 3 days ago

    Only for a year or so. Any company still vulnerable after these tools have been out long enough deserves it.

    • Andrew Beveridge@sh.itjust.works
      17 upvotes · 2 days ago

      Most people on Lemmy seem to condemn any use of LLMs for anything. I wonder what those folks’ opinion of this stance is: should companies use the tools or not?

      • village604@adultswim.fan
        18 upvotes · 2 days ago

        Cybersecurity is actually one of the few fields that can benefit from AI. There are companies like Horizon3 who are using it alongside their other threat models to do continuous pen testing.

        • Chronographs@lemmy.zip
          15 upvotes · 2 days ago

          Yeah imo the one thing ai is legitimately useful for is finding answers to difficult problems that can be trivially verified as correct.

        • 🦄🦄🦄@feddit.org
          4 upvotes · 2 days ago

          Gonna take a guess here that what is used in cybersecurity is not LLMs but one of the more useful machine learning applications. Just a nitpick cause today “ai” and “LLM” are sadly synonymous.

          • boonhet@sopuli.xyz
            10 upvotes · 2 days ago

            No, LLMs can definitely be useful for cyber too. It’s the whole reason the US government banned Claude Fable for export.

            An LLM can do more than just try existing exploits like a script kiddie: with iteration it can try variations and, if you know what runs on the server, inspect the source for potential exploits.

            They can also look at your setup and say what issues they see (reverse proxy config, etc).

            Doesn’t replace an expert, but can be useful for a first pass before you get the highly paid people involved.

              • boonhet@sopuli.xyz
                1 upvote · 18 hours ago

                I do. I reverse engineered some proprietary software using an agent. A pro could’ve maybe done it faster, but I did it AFK with little knowledge about reverse engineering.

                An agent could similarly try tons of attacks against online targets. Fairly sure some are doing it.

      • marzhall@lemmy.world
        6 upvotes · 2 days ago

        Finding holes in software has long employed “fuzzing”, where you send completely random payloads, as a research tactic (and it has found exploits). LLMs just seem like “educated” fuzzing; I don’t see why anyone would complain about updating your suite with them.
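
        For anyone unfamiliar, a minimal sketch of that kind of “dumb” fuzzing might look like the following. The `parse_record` target and its planted bug are made up for illustration; real fuzzers are far more sophisticated than this.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy target: the first byte declares the payload length."""
    if not data:
        raise ValueError("empty input")  # clean, expected rejection
    length = data[0]
    payload = data[1:]
    # Planted bug: the declared length is trusted without a bounds check.
    return bytes([payload[i] for i in range(length)])


def fuzz(target, iterations=2000, max_len=8, seed=0):
    """Feed random byte strings to `target` and collect inputs that crash it."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # malformed input rejected cleanly: not a finding
        except Exception as exc:
            # Any other exception is an unhandled crash worth recording.
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record)
print(f"{len(crashes)} crashing inputs, first: {crashes[:1]}")
```

        Each recorded input doubles as a reproducer, which is what makes fuzzer findings cheap to verify.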

        • borari@lemmy.dbzer0.com
          4 upvotes · edited 2 days ago

          I’ve been fucking around with using Claude to solve CTF challenges. I’m using a harness built out of a custom agent I wrote that progressively loads a specific skill for the challenge category: cryptography, binary exploitation, reverse engineering, forensics, etc.
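
          As a rough sketch of what “progressively loading a skill” could mean, the snippet below picks one category-specific instruction block based on the challenge path and leaves the rest out of the context. Everything here (the `SKILLS` table, `BASE_PROMPT`, the category names, the path layout) is a hypothetical stand-in, not the actual harness.

```python
from pathlib import PurePosixPath

# Hypothetical skill registry: category directory name -> instructions to load.
SKILLS = {
    "crypto": "Skill: cryptography. Identify the scheme, then look for weak parameters.",
    "pwn": "Skill: binary exploitation. Check mitigations before choosing a technique.",
    "rev": "Skill: reverse engineering. Start with strings and the entry point.",
    "forensics": "Skill: forensics. Identify file types and carve embedded data.",
}

BASE_PROMPT = "You are solving a CTF challenge."  # assumed always-loaded preamble


def category_of(challenge_path: str) -> str:
    """Return the category component of a path like ./crypto/some-challenge/."""
    parts = [p for p in PurePosixPath(challenge_path).parts if p != "."]
    if len(parts) < 2:
        raise ValueError(f"expected ./category/challenge/, got {challenge_path!r}")
    return parts[0]


def build_context(challenge_path: str) -> str:
    """Assemble the prompt, loading only the skill for this challenge's category."""
    sections = [BASE_PROMPT]
    skill = SKILLS.get(category_of(challenge_path))
    if skill is not None:
        sections.append(skill)  # only one skill ever enters the context
    sections.append(f"Challenge directory: {challenge_path}")
    return "\n\n".join(sections)


print(build_context("./crypto/some-challenge/"))
```

          Keeping unrelated skills out of the prompt is one plausible way to save context and tokens.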

          It’s solving the simple shit in <1m using sonnet. In ~20 minutes it’s solved some shit that I couldn’t figure out at all within the time limit we had during the CTF. There have been 2 challenges where, after about 25 minutes, I killed the agent working on it and changed to opus, which then solved them in about 20m. One crypto challenge was so math heavy I never would have figured it out. One bin exp challenge didn’t provide a local binary; everything was remote. There was a catch that I never would have solved bc it was remote only and I couldn’t locally debug it.

          It’s fucking scary good at solving these things. I just prompt with “use <agent> to solve ./category/challenge/” and it fully just does everything. It’s definitely akin to a fuzzer that can be used for way more than just finding crashes and memory leaks. It takes some work and understanding to make it context/token efficient, I think, but it lowers the bar so tremendously that I definitely see why there’s concern here. And again, it’s solving most of these things with sonnet, not even opus and definitely not fable.

          All told, this feels like the same panic that happened when metasploit first got released/demo’d at defcon back in the day.

        • ozymandias117@lemmy.world
          2 upvotes · 2 days ago

          As long as they produce a PoC like fuzzing tools do, I don’t think anyone is complaining.

          The problem with slop reports is the theoretical attacks that nearly always turn out to be impossible, wasting time and making it harder to find the real issues that need investigation.

      • DeadDigger@lemmy.zip
        8 upvotes · 2 days ago

        Well, the problem is that curl, for example, got flooded with generated security reports where only 5% had some true security potential. So your LLM will basically flood you with false positives.

        • ByteJunk@lemmy.world
          6 upvotes · 2 days ago

          If 5% of the reports are genuine security vulnerabilities that they wouldn’t have found otherwise, that’s looking like a big win to me, not sure how you see it differently.

          • DeadDigger@lemmy.zip
            1 upvote · 1 day ago

            No, 5% is very low compared to before AI, and it still does not mean the absolute number of bugs found has risen. From my understanding it didn’t for curl. Further, it is unlikely that bugs in curl would otherwise go unfound: basically everything works with curl, and it has a paid bug bounty program, so a lot of security researchers are looking at it.

          • frongt@lemmy.zip
            7 upvotes · 2 days ago

            The problem is identifying which 5%. Nobody wants to filter that much AI slop.
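
            To put rough numbers on the filtering cost, here is a back-of-the-envelope sketch. The 5-in-100 ratio is the figure quoted in this thread; the 30 minutes of triage per report is purely an assumed placeholder, not a number from curl.

```python
def triage_cost(total_reports: int, valid_reports: int, minutes_per_report: float) -> dict:
    """Estimate how much triage effort each genuine finding costs."""
    total_minutes = total_reports * minutes_per_report
    return {
        # How many reports must be read to surface one real issue.
        "reports_per_valid": total_reports / valid_reports,
        # Triage hours spent per real issue, noise included.
        "hours_per_valid": total_minutes / valid_reports / 60,
    }


# 5 valid reports out of 100 (from the thread); 30 min/report is hypothetical.
cost = triage_cost(total_reports=100, valid_reports=5, minutes_per_report=30)
print(cost)  # {'reports_per_valid': 20.0, 'hours_per_valid': 10.0}
```

            Under those assumptions each real bug costs about ten hours of reading, which is a very different proposition for a paid security team than for a solo volunteer maintainer.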

            • AwesomeLowlander@sh.itjust.works
              11 upvotes · 2 days ago

              If you’re working for a company’s cybersec, that’s your job. And a much preferable one to waiting for an attacker to do it for you.

              • borari@lemmy.dbzer0.com
                6 upvotes · 2 days ago

                If you’re submitting a vulnerability to a public repo, that’s also your job. These slop reports that are wasting maintainers’ time should never have been submitted. The person tasking the LLM is out of their depth and can’t be the human in the loop who verifies the vulnerability report before submitting, because they don’t have the required knowledge to do that. It’s a shame, because if people who had the requisite knowledge were the ones submitting, the ratio of valid reports to noise would be way higher than 5% and open source maintainers wouldn’t be feeling burned the fuck out.

              • ByteJunk@lemmy.world
                5 upvotes · 2 days ago

                Exactly. If you go through 100 tickets and find 5 real vulnerabilities to patch, that sounds incredibly good…

              • frongt@lemmy.zip
                2 upvotes · 2 days ago

                Sure, but nobody wants to do that, even at fair pay. Unpaid open source volunteer projects REALLY don’t want to do that, and risk burning out what is typically a solo main dev.