DeepSeek, a Chinese startup, has made an impression on the technology industry with its powerful large language model, built on an open-source foundation.
DeepSeek has also shaken the AI industry by showing that it could develop a powerful AI model for only $6 million in hardware costs, while companies like OpenAI, Google, and Microsoft have invested billions of dollars.
DeepSeek is a project of investor and entrepreneur Liang Wenfeng, born in 1985, who studied electronic information engineering and communication at Zhejiang University. Liang began his AI career by applying the technology to quantitative trading, co-founding the Hangzhou-based High-Flyer Quantitative Investment Management hedge fund in 2015. In 2023, Liang founded DeepSeek with the goal of advancing Artificial General Intelligence (AGI).
DeepSeek released its first large language model, DeepSeek-Coder, on November 29, 2023.
However, it was not until January 20, 2025, when DeepSeek-R1 was announced, that the company truly shook the AI industry.
With a team of just under 200 people and a budget of only $6 million, DeepSeek released a free, open-source model that matches the quality of OpenAI's o1, a project that cost $600 million and took about two years and roughly 3,500 personnel to develop.
Unlike Western tech giants with massive workforces, DeepSeek optimizes recruitment by focusing on fresh graduates: "Work experience of 3 to 5 years is the maximum, and those with more than 8 years of experience are almost eliminated," a recruitment expert revealed to 36kr, China's leading technology news platform.
Furthermore, while OpenAI's top models and those of other leading labs are primarily offered as paid subscription products, DeepSeek's model is entirely open: its weights are publicly available for inspection, can be downloaded and run locally through the Hugging Face platform, or used for free through a mobile app.
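For readers who want to try this, the sketch below shows roughly how one of DeepSeek's smaller distilled checkpoints can be loaded locally with the Hugging Face transformers library (the checkpoint and prompt here are chosen for illustration; the full-size models require far more memory than a typical personal computer has):

```python
# Minimal sketch: load a small DeepSeek checkpoint from Hugging Face and
# generate text locally. Assumes the transformers library is installed and
# uses one of the distilled R1 checkpoints as an example.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # small distilled variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain mixture-of-experts in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```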
DeepSeek's underlying technology is considered a major breakthrough in the field of AI. The release of this model has shocked the American tech community, causing the market capitalization of major companies to plummet by $1 trillion in a single day.
DeepSeek's success stems from its unique approach to model design and training. Like a massive parallel supercomputer that divides tasks for simultaneous processing, DeepSeek's Mixture-of-Experts (MoE) system activates only about 37 billion of its 671 billion parameters for each token. This approach significantly reduces computational cost while maintaining top-tier performance across many applications.
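The routing idea can be illustrated in a few lines (a toy sketch, not DeepSeek's actual architecture; the dimensions and gating scheme below are invented for clarity):

```python
# Toy mixture-of-experts layer: a router picks top_k of n_experts experts
# per input, so only a small fraction of the total parameters do any work.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

gate_w = rng.normal(size=(d_model, n_experts))            # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x):
    """Route token vector x to only top_k of n_experts experts."""
    scores = x @ gate_w                                   # router logits
    chosen = np.argsort(scores)[-top_k:]                  # indices of top-k experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()                              # softmax over chosen experts
    # Only the top_k expert matrices are touched; the rest stay idle, which is
    # how MoE keeps per-token compute far below the total parameter count.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

y = moe_forward(rng.normal(size=d_model))
print(y.shape)  # (16,)
```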
DeepSeek has also improved the training process through Group Relative Policy Optimization (GRPO), a reinforcement learning technique that scores each of the model's responses relative to a group of other responses sampled for the same prompt, rather than relying on a separate critic model. This helps the AI refine its reasoning more effectively and yields a higher-quality training signal.
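The group-relative idea at the heart of GRPO can be shown in a few lines (a toy illustration with made-up reward values; the actual training pipeline is far more involved):

```python
# Toy GRPO advantage computation: several responses are sampled for the same
# prompt and scored; each response's advantage is its reward relative to the
# group mean, normalized by the group's standard deviation.
import numpy as np

rewards = np.array([0.2, 0.9, 0.4, 0.7])   # scores for 4 sampled responses
advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
print(advantages)
# Responses better than the group average get a positive advantage and are
# reinforced; worse-than-average responses are pushed down. No separate
# learned critic/value network is needed, which cuts training cost.
```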
Additionally, DeepSeek is committed to transparency and open-source accessibility when releasing its models under the MIT license. This allows users to download, deploy, and customize the AI model, setting it apart from competitors who maintain proprietary systems. The open-source model also enables developers to improve and share the technology, creating a continuous cycle of evolution and upgrades.
DeepSeek's infrastructure reportedly combines a large number of Nvidia A100 chips with cheaper hardware. Some estimates suggest that DeepSeek has access to around 50,000 Nvidia GPUs, compared with the 500,000 GPUs that OpenAI used to train ChatGPT.
Many AI technology experts praise DeepSeek as a powerful, efficient, and cost-effective model, while some critical voices express concerns about privacy and data security.
"We are living in a time when a non-US company is holding true to the original mission of OpenAI—open research, pioneering, and empowering everyone. This is hard to believe," wrote Jim Fan, Nvidia's Senior Director of Research, on X. "The most interesting outcome is also the most likely one."
Even OpenAI's CEO, Sam Altman, acknowledged DeepSeek as a formidable competitor:
"We will certainly create better models, but it's really exciting to have a new competitor!" Altman shared on X.
However, just a few days later, OpenAI announced that it had found evidence that DeepSeek used outputs from OpenAI's proprietary models to train its own AI through a process called distillation.
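Distillation, in general terms, trains a smaller "student" model to imitate the output distribution of a larger "teacher." The sketch below shows a standard soft-label distillation loss; it is a generic illustration of the technique, not DeepSeek's or OpenAI's code:

```python
# Generic knowledge-distillation loss: the student is trained to match the
# teacher's softened output distribution over the vocabulary.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t * t

# Toy usage with random logits standing in for real model outputs.
student = torch.randn(4, 32000)   # batch of 4, illustrative 32k vocabulary
teacher = torch.randn(4, 32000)
print(distillation_loss(student, teacher))
```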
Additionally, DeepSeek has faced criticism over its terms of service, cybersecurity practices, and potential ties to the Chinese government. Some experts also express concerns about the amount of user data DeepSeek collects, including device models, operating systems, keyboard patterns, and IP addresses—all stored on servers located in China according to the company's privacy policy.
"Privacy issues have always existed when it comes to China. There is always data collection from users, so be cautious," said Kevin Surace, CEO of Appvance. "This will force all of us to rethink how we train models and the resources needed to operate AI."
DeepSeek's rapid rise is challenging the dominant position of Western tech giants and raising big questions about the future of AI—who will build it, who will control it, and whether AI should be open and accessible to all.
However, many questions about DeepSeek's long-term impact remain unanswered. Will US President Donald Trump react to China's unexpected advance in AI with a TikTok-style ban? Did High-Flyer overstate the performance it wrung from its GPUs to make DeepSeek appear more efficient than it really is? Was DeepSeek's sudden public launch a ploy to drive down Nvidia's stock price for the benefit of well-positioned investors?