OpenAI is routing GPT-4o to safety models when it detects harmful activities

Over the weekend, some ChatGPT users noticed that GPT-4o conversations were being routed to an unknown model without warning. It turns out to be a “safety” feature.

ChatGPT sometimes routes conversations to a different model than the one you selected. For example, when you’re using GPT-5 in auto mode and ask the AI to think harder, your request is routed to GPT-5-thinking.

While that behavior is expected, what has upset users is that GPT-4o conversations are also being routed to a different model, likely a variant of GPT-5.

This happens when a GPT-4o conversation touches on a sensitive or emotional topic and the system flags it as potentially harmful. In those cases, the conversation is switched to gpt-5-chat-safety.

OpenAI has confirmed the reports and says the routing is a deliberate safety measure, not anything sinister.

“Routing happens on a per-message basis; switching from the default model happens on a temporary basis. ChatGPT will tell you which model is active when asked,” Nick Turley, VP of ChatGPT, noted in an X post.

“As we previously mentioned, when conversations touch on sensitive and emotional topics the system may switch mid-chat to a reasoning model or GPT-5 designed to handle these contexts with extra care.”
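For readers curious what per-message routing means in practice, here is a minimal Python sketch of the behavior described above. OpenAI has not published its classifier or routing logic, so everything here except the gpt-5-chat-safety model name is a hypothetical illustration.

    def route_message(message: str, requested_model: str = "gpt-4o") -> str:
        """Pick a model for a single message; routing resets on the next turn."""
        if looks_sensitive(message):       # hypothetical classifier, see below
            return "gpt-5-chat-safety"     # the safety variant users reported
        return requested_model             # otherwise honor the selected model

    def looks_sensitive(message: str) -> bool:
        # Placeholder heuristic for illustration only; the real system
        # presumably uses a trained classifier, not keyword matching.
        keywords = ("self-harm", "suicide", "violence")
        return any(k in message.lower() for k in keywords)

    print(route_message("What's the weather today?"))   # -> gpt-4o
    print(route_message("I keep thinking about self-harm"))  # -> gpt-5-chat-safety

The key point the sketch captures is that the decision is made per message, which is why a conversation can switch to the safety model for one reply and return to GPT-4o on the next.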

The routing cannot be turned off; it is part of how OpenAI enforces its safety measures.

OpenAI says this is part of its broader effort to strengthen safeguards and learn from real-world use before a wider rollout.

