Anthropic, the wildly successful AI company that has cast itself as the most safety-conscious of the top research labs, is dropping the central pledge of its flagship safety policy, company officials tell TIME.

In 2023, Anthropic committed to never train an AI system unless it could guarantee in advance that the company's safety measures were adequate. For years, its leaders touted that promise, the central pillar of the company's Responsible Scaling Policy (RSP), as evidence that Anthropic was a responsible company that would resist market incentives to rush the development of a potentially dangerous technology.

But in recent months the company decided to radically overhaul the RSP. That decision included scrapping the promise not to release AI models if Anthropic can't guarantee adequate risk mitigations in advance.

    • ThePantser@sh.itjust.works · 8 hours ago

      Sounds like they got blacklisted by the US and decided that was bad for business, so they flipped quickly. They’ll probably start sucking up to Trump to get back in.

    • XLE@piefed.social · 7 hours ago (edited)

      First of all: this happened before the Pentagon dropped their contract.

      I’m not sure this change is even relevant, because the whole “AI safety” thing has been a sham from the beginning. It’s always been unverifiable, and the promises have always been impossible to keep. LLMs just predict the next word with a little extra randomness (see the sketch below). And there’s no way to guarantee that an LLM won’t predict a next word that ends up being bad. You can’t promise that without removing the randomness and then testing the effectively infinite space of inputs and outputs.

      It’s basically like when Google removed “don’t be evil.” It was a promise that was unfalsifiable and unquantifiable.
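
      As a concrete illustration of “predict the next word with a little extra randomness,” here is a minimal Python sketch of temperature sampling. The tokens and scores below are invented for illustration, not taken from any real model; the point is that sampling never drives a candidate’s probability to exactly zero, so no output can be ruled out in advance.

      ```python
      import math
      import random

      def sample_next_token(logits, temperature=1.0):
          """Sample the next token from a temperature-scaled softmax.

          logits maps candidate tokens to raw model scores; temperature > 0
          flattens (high) or sharpens (low) the distribution, but it never
          zeroes out any candidate's probability.
          """
          scaled = [score / temperature for score in logits.values()]
          m = max(scaled)  # subtract the max before exp() for numerical stability
          weights = [math.exp(s - m) for s in scaled]
          # random.choices samples proportionally to the (unnormalized) weights
          return random.choices(list(logits.keys()), weights=weights, k=1)[0]

      # Made-up scores for the next word after some prompt: even the lowest-scoring
      # candidate keeps a small nonzero probability, so over enough samples it
      # will eventually appear in the output.
      logits = {"refuse": 2.0, "comply": 1.0, "harmful": -3.0}
      samples = [sample_next_token(logits, temperature=0.8) for _ in range(1000)]
      print({tok: samples.count(tok) for tok in logits})
      ```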

        • XLE@piefed.social · 7 hours ago (edited)

          The problem is that the filtering layer is flaky in the same way the LLM itself is, and it probably is an LLM (see the sketch below). And even if it does work, I’ve never heard a single soul say that Anthropic shut down their account over questionable prompts. I even ran into somebody here who claims he uses AI to work on sexual abuse cases; he says he’s been stalled by the chatbot, but his account has never been blocked, even for review.
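
          To make the flakiness claim concrete, here is a toy sketch of an LLM-style moderation pass. Nothing here is Anthropic’s actual pipeline; the judge function, keyword, and probabilities are all invented. The point is just that a probabilistic filter can return different verdicts for the same input.

          ```python
          import random

          def judge(text: str) -> str:
              """Hypothetical stand-in for an LLM classifier: its verdict is
              sampled, so identical inputs can get different outcomes."""
              # Invented flag probabilities, not a real model or real thresholds
              p_flag = 0.3 if "abuse" in text else 0.02
              return "flag" if random.random() < p_flag else "allow"

          prompt = "case notes about abuse allegations"
          # The same prompt can be flagged on one run and allowed on the next
          print([judge(prompt) for _ in range(10)])
          ```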