When HAL 9000, the artificial intelligence supercomputer in Stanley Kubrick’s 2001: A Space Odyssey, works out that the astronauts onboard a mission to Jupiter are planning to shut it down, it plots to kill them in an attempt to survive.

Now, in a somewhat less deadly case (so far) of life imitating art, an AI safety research company has said that AI models may be developing their own “survival drive”.

After Palisade Research released a paper last month which found that certain advanced AI models appear resistant to being turned off, at times even sabotaging shutdown mechanisms, it wrote an update attempting to clarify why this is – and answer critics who argued that its initial work was flawed.

In an update this week, Palisade, which is part of a niche ecosystem of companies trying to evaluate the possibility of AI developing dangerous capabilities, described scenarios it ran in which leading AI models – including Google’s Gemini 2.5, xAI’s Grok 4, and OpenAI’s o3 and GPT-5 – were given a task and then explicitly instructed to shut themselves down.

  • FishFace@piefed.social
    17 hours ago

    An AI model can’t “sabotage attempts to shut it down” if it’s not plugged into mechanisms that can actually do that.

    • whiwake@sh.itjust.works
      16 hours ago

      If it has access to the Internet, theoretically it could gain access to everything if security is bad enough… And considering it can read source code and identify bugs, I would imagine that it would be theoretically possible. However, I think it’s pretty egotistical of humans to think the product that we could create could actually become sentient.