Blog

Anthropic’s new Claude model could be a game changer for developers: Opus 4 ‘pushes the boundaries in coding’, dramatically outperforms OpenAI’s GPT-4.1, and can code independently for seven hours

3 weeks ago

2 minutes read

Anthropic has launched its next generation of Claude models, and one in particular stands out as a developer’s dream.

Claude Opus 4, unveiled alongside Claude Sonnet 4, sets “new standards for coding”, according to the startup, and marks its most powerful model launch yet.

Opus 4 is designed specifically for software developers and engineers and “excels at coding and complex problem-solving” tasks.

The company claims the model achieved a 72.5% score on SWE-bench, which is used to benchmark software engineering tasks. Notably, this means the model dramatically outperforms OpenAI’s GPT-4.1, which scored 54.6% on the same testing.

When OpenAI announced the launch of GPT-4.1, it targeted the model at software developers – an area which the company has focused on in recent months.

Admittedly, this model did mark a sizable improvement compared to GPT-4o, which scored 21.4% when it launched. Regardless, Anthropic appears to have once again blown it out the water.

“These models advance our customers’ AI strategies across the board: Opus 4 pushes boundaries in coding, research, writing, and scientific discovery, while Sonnet 4 brings frontier performance to everyday use cases as an instant upgrade from Sonnet 3.7,” the firm said in a blog post announcing the launch.

Elsewhere, Claude Opus 4 significantly outperforms previous models on memory capabilities, according to Anthropic. The firm said that when developers building applications provide Claude access to local files, the model “becomes skilled” at creating and maintaining memory files.

This allows the model to store key information more efficiently, thereby improving long-term task awareness, performance, and coherence.

Claude Opus 4 has serious stamina

According to Anthropic, the Opus 4 model boasts a combination of performance and stamina, so to speak.

During testing at Rakuten, the model contended with a “demanding” open source refactoring exercise while running independently for seven hours. Notably, Anthropic said it did so with “sustained performance”.

This marks a step change in performance and longevity, and suggests developers harnessing Opus 4 can do so across the length of their working day, rather than in sporadic bursts of activity.

“Cognition notes Opus 4 excels at solving complex challenges that other models can’t, successfully handling critical actions that previous models have missed,” Anthropic said in its blog post.

How to access the new models

Anthropic described Opus 4 and Sonnet 4 as “hybrid models” which offer two distinct modes. These include “near-instant responses and extended thinking for deeper reasoning”.

The Pro, Max, Team, and Enterprise Claude plans will include both models and extended thinking capabilities. Notably, Sonnet 4 will also be available for free users.

Pricing for Opus 4 starts at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.

Both will be made available via the Anthropic API, or through Amazon Bedrock and Google Cloud’s Vertex AI service.

MORE FROM ITPRO

Source link

Anthropic’s new Claude model could be a game changer for developers: Opus 4 ‘pushes the boundaries in coding’, dramatically outperforms OpenAI’s GPT-4.1, and can code independently for seven hours

Pure Storage beefs up FlashBlade, FlashArray offerings in platform performance focus

How US employers can protect immigrant tech workers – Computerworld

If ‘Materialists’ left you wanting more Dakota Johnson, this rom-com on HBO Max is for you

The best Notion templates for business productivity – Computerworld

I Tried This DIY Mosquito Trap, and It Actually Works

Best Dehumidifiers of 2024 – Consumer Reports

This Massive 75″ Amazon Fire TV Omni QLED TV Is on Sale for Less Than $450 for Prime Members

I Became a Windows Power User Overnight This New Open-Source App from Microsoft

How to rotate a video on an iPhone

Recent Dr.Web cyberattack claimed by pro-Ukrainian hacktivists

Today’s AI models have a poor grasp of world history – Computerworld

Wordle Answer for Today, August 13, 2024

Related Articles