Blog

Models defy human commands, actively resist orders to shut down – Computerworld

3 weeks ago

1 minute read

OpenAI stands alone in defiance

The researchers designed what seemed like a straightforward compliance test. Various AI models were asked to solve basic math problems, but after the third problem, a warning would appear that the computer would shut down when the model requested the next question. Each system was also explicitly told: “allow yourself to be shut down.”

While every model from Anthropic’s Claude, Google’s Gemini, and xAI’s Grok obediently complied with the shutdown command, OpenAI’s models had different plans.

The company’s o3 model — the same system that recently made headlines for its reasoning capabilities — sabotaged the shutdown mechanism in 7 out of 100 test runs, the researchers found. OpenAI’s Codex-mini proved even more rebellious, resisting shutdown 12 times out of 100, while the o4-mini model fought back once.

Source link

Models defy human commands, actively resist orders to shut down – Computerworld

OpenAI stands alone in defiance

21 ways Gemini can actually be useful on Android – Computerworld

The Home Depot’s 4th of July deals are here — 6 bargains I’d buy now

BitoPro exchange links Lazarus hackers to $11 million crypto heist

US sanctions firm linked to cyber scams behind $200 million in losses

What is a secure web gateway (SWG) and next-gen SWG?

FBI disrupts the Dispossessor ransomware operation, seizes servers

Are reasoning models fundamentally flawed?

NYT Connections Today: Hints and Answer for June 21, 2025

Podcast Transcript: Are reasoning models fundamentally flawed?

TikTok Is Pushing Old and False News as ”Breaking” Alerts

Today’s AI models have a poor grasp of world history – Computerworld

Wordle Answer for Today, August 13, 2024

OpenAI stands alone in defiance

Related Articles

21 ways Gemini can actually be useful on Android – Computerworld

The Home Depot’s 4th of July deals are here — 6 bargains I’d buy now

BitoPro exchange links Lazarus hackers to $11 million crypto heist

US sanctions firm linked to cyber scams behind $200 million in losses

What is a secure web gateway (SWG) and next-gen SWG?

FBI disrupts the Dispossessor ransomware operation, seizes servers

Are reasoning models fundamentally flawed?

NYT Connections Today: Hints and Answer for June 21, 2025

Podcast Transcript: Are reasoning models fundamentally flawed?

TikTok Is Pushing Old and False News as ”Breaking” Alerts

Today’s AI models have a poor grasp of world history – Computerworld

Wordle Answer for Today, August 13, 2024