Low-skilled attacker used Claude, Codex to breach 14 companies

sanitation@lemmy.today · 3 days ago

Low-skilled attacker used Claude, Codex to breach 14 companies

Pennomi@lemmy.world · 3 days ago

Only for a year or so. Any company still vulnerable after these tools have been out long enough deserves it.

Andrew Beveridge@sh.itjust.works · 2 days ago

Most people on Lemmy seem to condemn use of LLMs in any way for anything. I wonder what those folks' opinion of this stance is: should companies use the tools or not?

village604@adultswim.fan · 2 days ago

Cybersecurity is actually one of the few fields that can benefit from AI. There are companies like Horizon3 who are using it alongside their other threat models to do continuous pen testing.

Chronographs@lemmy.zip · 2 days ago

Yeah imo the one thing ai is legitimately useful for is finding answers to difficult problems that can be trivially verified as correct.

MalReynolds@slrpnk.net · 2 days ago

In this case hallucinations actually help…

🦄🦄🦄@feddit.org · 2 days ago

Gonna take a guess here that what is used in cybersecurity is not LLMs but one of the more useful machine learning applications. Just a nitpick cause today “ai” and “LLM” are sadly synonymous.

boonhet@sopuli.xyz · 2 days ago

No, LLMs can definitely be useful for cyber too. It’s the whole reason the US government banned Claude Fable for export.

An LLM can not only try existing exploits like a script kiddie, but with iteration it can try variations and, if you know what runs on the server, inspect the source for potential exploits.

They can also look at your setup and say what issues they see (reverse proxy config, etc).

Doesn’t replace an expert, but can be useful for a first pass before you get the highly paid people involved.

🦄🦄🦄@feddit.org · 2 days ago

You know what, fair enough. I don’t know enough about that particular one.

boonhet@sopuli.xyz · 18 hours ago

I do. I reverse engineered some proprietary software using an agent. A pro could’ve maybe done it faster, but I did it AFK with little knowledge about reverse engineering.

An agent could similarly try tons of attacks against online targets. Fairly sure some are doing it.

marzhall@lemmy.world · 2 days ago

Finding holes in software has employed “fuzzing”, where you send completely random payloads, as a research tactic for quite a while (and it has found exploits). LLMs just seem like “educated” fuzzing, I don’t see why anyone would complain about updating your suite with them.
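For anyone who hasn't seen it, the "completely random payloads" idea can be sketched in a few lines of Python. This is a minimal illustration, not how any real fuzzer is built: `parse_record` is a made-up toy parser with a planted bug, and `fuzz`, its parameters, and the `0x7E` header byte are all assumptions for the example.

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy parser with a planted bug: it trusts the length byte."""
    if len(data) < 2 or data[0] != 0x7E:
        raise ValueError("bad header")  # expected, clean rejection
    length = data[1]
    payload = data[2:2 + length]
    # Planted bug: never checks that the payload really has `length` bytes.
    _last = payload[length - 1]  # IndexError when the payload is too short
    return payload


def fuzz(target, iterations=10_000, seed=0, max_len=8):
    """Throw random byte strings at `target` and collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    crashes = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            continue  # the parser rejected the input on purpose
        except Exception as exc:  # anything else counts as a finding
            crashes.append((data, exc))
    return crashes


if __name__ == "__main__":
    found = fuzz(parse_record)
    print(f"{len(found)} crashing inputs found")
```

Almost all random inputs die at the header check, which is the waste an "educated" generator would try to avoid by proposing inputs that get past the first branch.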

borari@lemmy.dbzer0.com · edit-2 2 days ago

I’ve been fucking around with using Claude to solve CTF challenges. I’m using a harness built out of a custom agent I wrote that progressively loads a specific skill for the challenge category: cryptography, binary exploitation, reverse engineering, forensics, etc.

It’s solving the simple shit in <1m using sonnet. In ~20 minutes it’s solved some shit that I couldn’t figure out at all within the time limit we had during the CTF. There have been 2 challenges where, after about 25 minutes, I killed the agent working on it and changed to opus, and opus solved them in about 20m. One crypto challenge was so math heavy I never would have figured it out. One bin exp challenge didn’t provide a local binary, everything was remote. It had a catch that I never would have solved because it was remote only and I couldn’t debug it locally.

It’s fucking scary good at solving these things. I just prompt with “use <agent> to solve ./category/challenge/” and it fully just does everything. It’s definitely akin to a fuzzer that can be used for way more than just finding crashes and memory leaks. It takes some work and understanding to make it context/token efficient I think, but it lowers the bar so tremendously that I definitely see why there’s concern here. And again it’s solving most of these things with sonnet, not even opus and definitely not fable.

All told, this feels like the same panic that happened when metasploit first got released/demo’d at defcon back in the day.
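The "load only the skill for this challenge category" idea from the comment above might look roughly like the Python sketch below. It is a guess at the general shape, not the commenter's harness: `BASE_PROMPT`, `SKILLS`, `category_of`, and `build_prompt` are hypothetical names, the skill texts are placeholders, and the only thing taken from the comment is the `./category/challenge/` path layout.

```python
from pathlib import PurePosixPath

# Hypothetical base instructions that every run would get.
BASE_PROMPT = "You are solving a CTF challenge. Work only inside the given directory."

# Hypothetical skill texts, keyed by the category directory name.
SKILLS = {
    "crypto": "Skill: cryptography. Check key sizes, padding, reused nonces.",
    "pwn": "Skill: binary exploitation. Check mitigations before building a chain.",
    "rev": "Skill: reverse engineering. Start from strings and entry points.",
    "forensics": "Skill: forensics. Identify file types and carve embedded data.",
}


def category_of(challenge_path: str) -> str:
    """Return the category directory from a ./category/challenge/ style path."""
    parts = PurePosixPath(challenge_path).parts
    if len(parts) < 2:
        raise ValueError(f"expected category/challenge, got {challenge_path!r}")
    return parts[-2]


def build_prompt(challenge_path: str) -> str:
    """Assemble the prompt with only the one skill that matches the category."""
    skill = SKILLS.get(category_of(challenge_path))
    sections = [BASE_PROMPT]
    if skill is not None:
        sections.append(skill)  # the other skills never enter the context
    sections.append(f"Challenge directory: {challenge_path}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(build_prompt("./crypto/rsa-baby/"))
```

Keeping unrelated skills out of the prompt is presumably where the context/token savings come from.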

ozymandias117@lemmy.world · 2 days ago

As long as they produce a PoC like fuzzing tools, I don’t think anyone is complaining

The problem with slop reports is the theoretical attacks that nearly always turn out to be impossible. They waste time and make it harder to find the real issues that need investigation.
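A "no PoC, no report" gate is simple to express. The Python sketch below assumes each report carries an optional callable that returns True when the issue reproduces. `Report`, `reproduces`, and `triage` are hypothetical names for illustration, and a real setup would run the PoC in a sandbox rather than call it directly.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Report:
    title: str
    # Hypothetical PoC hook: returns True if the issue reproduces.
    poc: Optional[Callable[[], bool]] = None


def reproduces(report: Report) -> bool:
    """A report only counts if it ships a PoC and that PoC actually succeeds."""
    if report.poc is None:
        return False  # purely theoretical claim, nothing to verify
    try:
        return bool(report.poc())
    except Exception:
        return False  # a PoC that blows up did not demonstrate anything


def triage(reports: List[Report]) -> List[Report]:
    """Keep only the reports a human should spend time on."""
    return [r for r in reports if reproduces(r)]


if __name__ == "__main__":
    inbox = [
        Report("theoretical overflow, no PoC"),
        Report("PoC that fails", poc=lambda: False),
        Report("PoC that reproduces", poc=lambda: True),
    ]
    for kept in triage(inbox):
        print(kept.title)
```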

DeadDigger@lemmy.zip · 2 days ago

Well, the problem is that curl, for example, got flooded with generated security reports where only 5% had some true security potential. So your LLM will basically flood you with false positives.

ByteJunk@lemmy.world · 2 days ago

If 5% of the reports are genuine security vulnerabilities that they wouldn’t have found otherwise, that’s looking like a big win to me, not sure how you see it differently.

DeadDigger@lemmy.zip · 1 day ago

No, 5% is very low compared to before AI, and it still doesn’t mean the absolute number of bugs found has risen. From my understanding it didn’t for curl. Further, it’s unlikely that bugs in curl would go unfound otherwise. Basically everything works with curl and it has a paid bug bounty program, so a lot of security researchers are looking at it.

frongt@lemmy.zip · 2 days ago

The problem is identifying which 5%. Nobody wants to filter that much AI slop.

AwesomeLowlander@sh.itjust.works · 2 days ago

If you’re working for a company’s cybersec, that’s your job. And a much preferable one to waiting for an attacker to do it for you.

borari@lemmy.dbzer0.com · 2 days ago

If you’re submitting a vulnerability to a public repo, that’s also your job. These slop reports that are wasting maintainers’ time should never have been submitted. The person tasking the LLM is out of their depth and can’t be the human in the loop who verifies the vulnerability report before submitting, because they don’t have the required knowledge to do that. It’s a shame, because if people who had the requisite knowledge were the ones submitting, the ratio of valid reports to noise would be way higher than 5% and open source maintainers wouldn’t be feeling burned the fuck out.

ByteJunk@lemmy.world · 2 days ago

Exactly. If you go through 100 tickets and find 5 real vulnerabilities to patch, that sounds incredibly good…
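Whether 5 in 100 is good depends on what each ticket costs to check. The Python sketch below puts rough numbers on that. The 100 tickets and the 5% rate come from the thread, but the 30 minutes per ticket is purely an assumed figure, and `triage_cost` is a hypothetical helper.

```python
def triage_cost(reports: int, valid_percent: int, minutes_per_report: int):
    """Return (valid reports, total triage hours, triage hours per valid report)."""
    valid = reports * valid_percent / 100
    total_hours = reports * minutes_per_report / 60
    hours_per_valid = total_hours / valid
    return valid, total_hours, hours_per_valid


if __name__ == "__main__":
    # 100 tickets at a 5% hit rate, assuming 30 minutes to check each one.
    valid, total_hours, per_valid = triage_cost(100, 5, 30)
    print(f"{valid:.0f} real bugs, {total_hours:.0f} hours of triage, "
          f"{per_valid:.0f} hours per real bug")
```

Under that assumption it works out to 50 hours of triage, or 10 hours per real bug. A paid security team may find that cheap, while a solo volunteer maintainer likely won't.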

frongt@lemmy.zip · 2 days ago

Sure, but nobody wants to do that, even at fair pay. Unpaid open source volunteer projects REALLY don’t want to do that, and risk burning out what is typically a solo main dev.

Low-skilled attacker used Claude, Codex to breach 14 companies - Help Net Security