OpenAI GPT-4.1 models promise improved coding and instruction following


“GPT‑4.1 mini is a significant leap in small model performance, even beating GPT‑4o in many benchmarks. It matches or exceeds GPT‑4o in intelligence evals while reducing latency by nearly half and reducing cost by 83%,” the announcement said. “For tasks that demand low latency, GPT‑4.1 nano is our fastest and cheapest model available. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding—even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.”

These improvements, OpenAI said, combined with primitives such as the Responses API, will allow developers to build more useful and reliable agents that will perform complex tasks such as extracting insights from large documents and resolving customer requests “with minimal hand-holding.”
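As an illustration (not from the announcement itself), a request to the Responses API targeting one of these models might look like the following sketch. The model names and payload shape here are assumptions based on OpenAI's published API; the network call is left commented out so the example stays self-contained.

```python
# Hypothetical sketch: building a Responses API request body.
# Assumptions: the endpoint accepts "model", "input", and
# "max_output_tokens" fields, per OpenAI's published API docs.
import json

def build_request(model: str, prompt: str, max_output_tokens: int = 256) -> dict:
    """Assemble the JSON body for a POST to /v1/responses."""
    return {
        "model": model,  # e.g. "gpt-4.1", "gpt-4.1-mini", "gpt-4.1-nano"
        "input": prompt,
        "max_output_tokens": max_output_tokens,
    }

body = build_request("gpt-4.1-mini", "Summarize this support ticket.")
print(json.dumps(body, indent=2))

# Sending it with the official SDK (requires OPENAI_API_KEY) would
# look roughly like:
# from openai import OpenAI
# client = OpenAI()
# response = client.responses.create(**body)
# print(response.output_text)
```

Swapping the `model` field between the full, mini, and nano variants is how a developer would trade off the cost and latency figures quoted above against capability.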

OpenAI also said that GPT-4.1 is significantly better than GPT-4o at agentic coding tasks and front-end coding, makes fewer extraneous edits, follows diff formats more reliably, and uses tools more consistently, among other improvements.
