DeepSeek’s R1 Model Outperforms OpenAI’s o1 on Key Benchmarks

Chinese AI lab DeepSeek has introduced its reasoning model, DeepSeek-R1, which it claims outperforms OpenAI’s o1 on several important AI benchmarks.
The model, available on the Hugging Face platform under an MIT license, can be used commercially without restriction. DeepSeek asserts that R1 surpasses o1 on benchmarks such as AIME, MATH-500, and SWE-bench Verified. AIME draws on problems from the American Invitational Mathematics Examination, a demanding high-school math competition; MATH-500 is a collection of mathematical word problems; and SWE-bench Verified tests real-world programming ability.
R1 is designed as a reasoning model, meaning it can fact-check its own responses, reducing the risk of common errors. Though reasoning models tend to take longer to produce answers, ranging from seconds to minutes, they are generally more reliable than typical models in fields such as physics, mathematics, and other sciences.
The full version of DeepSeek’s R1 model contains 671 billion parameters. In AI, the parameter count is a rough indicator of a model’s problem-solving ability, and models with more parameters generally perform better. To ensure broader accessibility, DeepSeek also offers smaller, “distilled” versions of R1, ranging from 1.5 billion to 70 billion parameters. These smaller models can run on less powerful hardware, such as laptops. Meanwhile, the full R1 model, requiring more robust infrastructure, is available via DeepSeek’s API at a price 90%-95% lower than OpenAI’s o1.
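For readers who want to try one of the distilled checkpoints locally, here is a minimal sketch using the Hugging Face transformers library. The repository ID, prompt, and generation settings are illustrative assumptions rather than official DeepSeek instructions; check the model cards on Hugging Face for the exact names and hardware requirements of the distilled variants.

```python
# Minimal sketch: running a small distilled R1 variant locally with Hugging Face
# transformers. The repo ID below is an illustrative assumption; confirm the exact
# distilled model names on DeepSeek's Hugging Face page before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed/illustrative ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Reasoning models are prompted like any other causal language model; they simply
# spend extra output tokens working through the problem before answering.
prompt = "If 3x + 7 = 22, what is x? Explain your reasoning step by step."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The full 671-billion-parameter model, by contrast, is too large for consumer hardware and is meant to be accessed through DeepSeek’s hosted API.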
However, DeepSeek’s model has its limitations. As a Chinese-developed system, R1 is subject to regulations enforced by China’s internet authorities, which require AI responses to align with “core socialist values.” This means the model will not engage with politically sensitive topics, such as the Tiananmen Square protests or the status of Taiwan, reflecting the strict information controls that Chinese AI systems must comply with.
DeepSeek’s release comes amid ongoing discussions around AI technology and its regulation. Recently, the U.S. government proposed stricter export controls targeting AI technologies and semiconductor components, which could further limit the ability of Chinese companies to develop advanced AI systems.
DeepSeek is not the only Chinese company claiming to rival OpenAI’s o1. Other labs, including Alibaba and Moonshot AI (maker of the Kimi assistant), have also produced models they say compete with OpenAI’s offerings. This growing competition highlights the accelerating pace of AI development in China.
As DeepSeek’s R1 model makes waves in the AI space, it signifies the increasing capacity of Chinese labs to produce cutting-edge models that compete on the global stage. Despite regulatory and political challenges, these models are becoming more accessible and capable, broadening the scope for AI innovation worldwide.

Amelia is a senior writer at Blockiance, focusing on the cultural implications of NFTs and digital ownership. Holding a master’s in media studies, she combines her academic background with a passion for storytelling to explore how Web3 technologies reshape creative industries.