Chinese AI firm DeepSeek has Silicon Valley flustered
Chinese AI firm DeepSeek has released a range of models capable of competing with OpenAI in a move experts told ITPro showcases the strength of open source AI.
The announcement appears to have taken big tech players by surprise, with commentators noting that it highlights the growing capabilities of China-based firms operating in the space.
In a post on LinkedIn over the weekend, Meta’s chief AI scientist Yann LeCun said those seeing the DeepSeek news as part of a geopolitical conversation between China and the US are looking at it incorrectly.
Instead, the firm’s success underlines the critical role open source development plays in the broader generative AI race.
“To people who see the performance of DeepSeek and think: ‘China is surpassing the US in AI’ – You are reading this wrong. The correct reading is: ‘Open source models are surpassing proprietary ones,’” LeCun wrote.
DeepSeek has benefited from open research and other open source AI applications, LeCun said, including Meta’s Llama.
“They came up with new ideas and built them on top of other people’s work. Because their work is published and open source, everyone can profit from it. That is the power of open research and open source,” he said.
Under the hood of DeepSeek’s new model
The release is called DeepSeek R1, a fine-tuned variation of DeepSeek’s V3 model, which has 671 billion total parameters, of which 37 billion are active, according to the firm’s website.
DeepSeek has published some of its benchmark results, in which R1 appears to outpace both Anthropic’s Claude 3.5 and OpenAI’s GPT-4o on some tests, including several related to coding.
The Chinese challenger models are free to access, and the DeepSeek app has ousted ChatGPT from the top free application spot on Apple’s App Store.
The model itself was also reportedly much cheaper to build and is believed to have cost around $5.5 million. This comes in stark contrast to the $100+ million price tags associated with big tech projects.
“It’s clever engineering and architecture, not just raw computing power, which is huge because it shows you don’t need Google or OpenAI’s resources to push the boundaries,” Camden Woollven at GRC International Group told ITPro.
Costs for users could also have providers such as OpenAI sweating. According to the company’s pricing, API access costs just $0.14 per million tokens.
By contrast, OpenAI’s comparable pricing stands at $7.50 per million tokens.
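To put that gap in concrete terms, the short Python sketch below works through the arithmetic using the two per-million-token prices quoted above. The 100 million token workload is a hypothetical figure chosen only for illustration, the `api_cost` helper is an illustrative name rather than part of any vendor's API, and the sketch assumes a single flat rate per token, whereas real pricing may vary by model and by token type.

```python
# Per-million-token prices quoted in the article (USD).
DEEPSEEK_PRICE_PER_MILLION = 0.14
OPENAI_PRICE_PER_MILLION = 7.50


def api_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in USD of processing `tokens` tokens at a flat rate."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload, assumed purely for illustration.
monthly_tokens = 100_000_000

deepseek_cost = api_cost(monthly_tokens, DEEPSEEK_PRICE_PER_MILLION)
openai_cost = api_cost(monthly_tokens, OPENAI_PRICE_PER_MILLION)

# How many times more expensive the higher quoted price is.
price_ratio = OPENAI_PRICE_PER_MILLION / DEEPSEEK_PRICE_PER_MILLION

print(f"DeepSeek: ${deepseek_cost:,.2f}")
print(f"OpenAI:   ${openai_cost:,.2f}")
print(f"Ratio:    {price_ratio:.1f}x")
```

On those assumptions, the same workload would cost roughly $14 at DeepSeek's quoted rate and $750 at OpenAI's, a difference of more than 50 times.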
Open source vs closed source
While many of the big-name models from the likes of OpenAI and Google are proprietary, firms such as Meta and now DeepSeek are championing an open approach, and there is an argument for the benefits this can bring to the industry.
Research suggests, for example, that companies using open source AI are seeing a better return on investment (ROI), with 60% of firms looking to open source ecosystems as a source for their tools.
“Open source models have a compounding effect where different organizations are able to build on top of each other’s work and achieve greater results than they would otherwise be able to on their own,” Komninos Chatzipapas, founder of HeraHaven.AI, told ITPro.
Chatzipapas said high-profile success stories such as this could help turn the tide in favour of open source in the LLM space.
There are also many benefits from the end user perspective, Chatzipapas said, such as lower costs through the ability of organizations to self-host, and enhanced privacy as third-party reliance is less of a necessity.
As Woollven added, though, it’s not as simple as one being better than the other. While open source has its advantages for innovation and transparency, closed source has value in other ways.
“Companies like OpenAI can pour massive resources into development and safety testing, and they’ve got dedicated teams working on preventing misuse, which is important,” Woollven said.
Peter van der Putten, director of Pegasystems’ AI Lab and assistant professor in AI at Leiden University, said this marks the latest in a string of interesting releases by Chinese companies in the AI space.
“The last couple of months a lot of powerful or interesting AI systems have come out of Chinese labs, not just DeepSeek R1, but also for instance Tencent’s Hunyuan text2video model, and Alibaba’s QWQ reasoning/questioning models, and they are in many cases open source,” he said.
“As these are mostly challengers with a ‘side business’, for instance DeepSeek came out of a hedge fund, it makes sense to make models available open source, just as in the early generative AI days.”
Long-term, however, DeepSeek and others could make the shift toward a closed model approach.
“As their markets and offerings mature this may change to more closed models,” he said. “Or DeepSeek could be making a bet that given their know-how they are best positioned to provide low-cost inference services, it doesn’t hurt to make earlier versions of these models available open source and learn from feedback.
“Their most promising model from a scientific point of view is DeepSeek-R1-Zero, and relies considerably less on human fine-tuning feedback, but it didn’t perform as good yet. So this won’t be the last we heard from DeepSeek.”