You Actually Do Need to Understand Mythos | Hank Green

qaz@lemmy.world · edit-2 3 months ago

Lemmilicious@feddit.nu · 3 months ago

I’m always a little picky with AI news, so I’m curious how much is actually confirmed to be true. I only did a little bit of digging, but it seems that all of Mythos’ capabilities are simply claimed by Anthropic and not verified? I also couldn’t find information about any new vulnerabilities they say it managed to find. As far as I can tell, it only found already discovered and patched ones…?

The claim seems to be that it found them by looking at old unpatched code, without a connection to the internet and without being “explicitly” trained to find them. To me this sounds like it was implicitly trained to find them, since they were known about at the time of training, but I don’t know this for sure. It sure does feel a lot like marketing and very little like facts at this time!

MissesAutumnRains@lemmy.blahaj.zone · edit-2 3 months ago

In their paper, they post keys that can be verified once the vulnerabilities are patched (so they aren’t just revealing exploitable issues to the world) but in the few that they demonstrated (ones that were quickly patched), it demonstrated a pretty sophisticated ability to find and exploit multiple vulnerabilities. The patches that you saw them mention are a direct result of Anthropic reporting those vulnerabilities.
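For anyone wondering how that kind of check could work, here is a minimal sketch. I’m assuming the published “keys” are cryptographic hashes of the private write-ups, which I haven’t confirmed is exactly what they use. The function names and the sample report text are made up for illustration.

```python
import hashlib
import hmac


def commit(report: bytes) -> str:
    """Hash a private write-up; only this digest is published up front."""
    return hashlib.sha256(report).hexdigest()


def verify(report: bytes, published_digest: str) -> bool:
    """After the patch ships, check the revealed write-up against the digest."""
    return hmac.compare_digest(commit(report), published_digest)


# Hypothetical write-up, kept private until the vulnerability is patched.
report = b"hypothetical write-up of a vulnerability, withheld until patched"
published_digest = commit(report)  # this is the only part made public early

# Later, once the full write-up is released, anyone can run the check.
assert verify(report, published_digest)
# A write-up that was changed after the fact would not match.
assert not verify(report + b" (edited later)", published_digest)
```

The digest reveals nothing useful about the write-up itself, but it pins down its exact contents, so the claim can’t quietly be rewritten after the patch is out.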

The method they describe basically means they weren’t looking at old, patched code (where the model could have picked up mentions of the vulnerability that others had already posted on the web) but at current, actively used software. The vulnerabilities and exploits that the model found were novel zero-days (meaning problems not yet discovered by the people being attacked).

I’m not a researcher though, so someone can correct any information I’ve gotten wrong here, but this is definitely not solely hype. It’s not exciting stuff (unless you just look at headlines) but the vulnerabilities they discovered are like actual problems, especially if a model like this gets into the hands of bad actors.

Lemmilicious@feddit.nu · 3 months ago

Ah thanks, I didn’t find their paper but you led me on the correct path to find some nice info on their blog! Great idea with the keys they had, it’s good that we will be able to verify if their claims are true in the future at least. The bugs that were solved already did indeed seem cool, but they write the blog in a slightly odd way, and I didn’t find confirmation that those were also zero-day vulnerabilities. Either way, we should get plenty of confirmation with the keys. Thanks for the details!