Author: Sam Gao, author of ElizaOS
0. Preface
The back-to-back releases of DeepSeek V3 and R1 have given American AI researchers, entrepreneurs, and investors a serious case of FOMO. The moment is as startling as the emergence of ChatGPT at the end of 2022.
Riding on the complete open-sourcing of DeepSeek R1 (the weights can be downloaded from HuggingFace for free local inference) and its extremely low price (about 1/100 of OpenAI o1's), DeepSeek shot to the top of the US Apple App Store in just 5 days.
So, where does this mysterious new AI force, incubated by a Chinese quantitative company, come from?
1. The Origin of DeepSeek
I first heard about DeepSeek in 2021, when Luo Fuli, the "genius girl" and Peking University master's student who had published 8 ACL papers in a single year, left her job at Alibaba's DAMO Academy and joined High-Flyer Quant (Huanfang). At the time, everyone wondered why a highly profitable quantitative fund would recruit AI talent: did High-Flyer need to publish papers too?
As far as I know, the AI researchers High-Flyer recruited mostly worked independently, exploring frontier directions, the core ones being large language models (LLMs) and text-to-image models (then represented by OpenAI's DALL-E).
By the end of 2022, High-Flyer had gradually attracted more and more top AI talent (mostly current students from Tsinghua and Peking University). Inspired by ChatGPT, High-Flyer CEO Liang Wenfeng decided to venture into artificial general intelligence: "We have set up a new company, starting with large language models, and later we will add vision and other capabilities."
Yes, that company is DeepSeek. In early 2023, companies like Zhipu AI, Moonshot AI, and Baichuan Intelligence gradually took center stage, and in the bustling Zhongguancun and Wudaokou area, DeepSeek was largely overshadowed by these hot-money darlings.
So in 2023, as a pure research outfit without a star founder, DeepSeek (unlike Kai-Fu Lee's 01.AI, Yang Zhilin's Moonshot AI, or Wang Xiaochuan's Baichuan Intelligence) found it hard to raise money from the market on its own, and High-Flyer decided to spin DeepSeek off and fully fund it. Even in the overheated climate of 2023, no venture capital firm was willing to back DeepSeek: first, because most of its people were freshly graduated PhDs rather than famous top researchers; second, because any capital exit looked far away.
Amid all that noise and restlessness, DeepSeek began to write its own story of AI exploration:
November 2023: DeepSeek released DeepSeek LLM, a 67-billion-parameter model with performance approaching GPT-3.5.
May 2024: DeepSeek-V2 officially launched.
December 2024: DeepSeek-V3 was released, with benchmarks showing it outperforming Llama 3.1 and Qwen 2.5 and standing on par with GPT-4o and Claude 3.5 Sonnet, igniting attention across the industry.
January 2025: DeepSeek-R1, the company's first-generation reasoning model, was released, matching OpenAI o1 in performance at less than 1/100 of the price (a cost-performance advantage of more than 100x) and making the entire tech world tremble: the world truly realized that the strength of China has arrived... Open source always wins!
2. Talent Strategy
I got to know some DeepSeek researchers quite early, mainly on the AIGC side: for example, the authors of Janus (released in November 2024) and of DreamCraft3D, including @xingchaoliu, who helped me polish my latest paper.
From what I have seen, the researchers I know are very young: most are current doctoral students or within 3 years of graduation, largely graduate or doctoral students from the Beijing area with strong academic records; many have published 3-5 top-conference papers.
I asked my DeepSeek friends why Liang Wenfeng recruits only young people. They relayed the thinking of High-Flyer CEO Liang Wenfeng, as follows:
The veil of mystery over the DeepSeek team has aroused curiosity: what is its secret weapon? Foreign media say the secret weapon is "young geniuses" capable of taking on the deep-pocketed American giants.
In the AI industry, hiring experienced veterans is the norm, and many domestic AI startups tend to recruit senior researchers or talents with overseas doctoral degrees. However, DeepSeek goes against the grain and prefers young people without work experience.
A headhunter who has worked with DeepSeek revealed that DeepSeek does not recruit seasoned technical staff: "three to five years of work experience is already the ceiling, and candidates with more than eight years are basically passed over." In a May 2023 interview with 36Kr, Liang Wenfeng likewise said that most of DeepSeek's developers are either fresh graduates or people just starting out in artificial intelligence, emphasizing: "Most of our core technical positions are held by fresh graduates or people with one or two years of work experience."
Without a work history, how does DeepSeek select people? The answer is, by looking at potential.
Liang Wenfeng once said that for a long-term endeavor, experience is not that important, and basic capabilities, creativity, and passion are more important. He believes that perhaps the top 50 AI talents in the world are not yet in China, "but we can cultivate such people ourselves."
This strategy reminds me of OpenAI's early days. When OpenAI was founded at the end of 2015, Sam Altman's core idea was to find young, ambitious researchers, so apart from President Greg Brockman and Chief Scientist Ilya Sutskever, the remaining four core founding technical members (Andrej Karpathy, Durk Kingma, John Schulman, Wojciech Zaremba) were all fresh PhD graduates, from Stanford University, the University of Amsterdam, UC Berkeley, and New York University respectively.
From left to right: Ilya Sutskever (former Chief Scientist), Greg Brockman (former President), Andrej Karpathy (former Technical Lead), Durk Kingma (former Researcher), John Schulman (former Reinforcement Learning Team Lead), and Wojciech Zaremba (current Technical Lead)
This "young wolf strategy" has already allowed OpenAI to taste the sweetness, incubating talents like Alec Radford (the father of GPT, equivalent to a private third-tier university graduate), Aditya Ramesh (NYU undergraduate) the father of the text-to-image model DALL-E, and Prafulla Dhariwal, the multi-modal lead for GPT-4o and a three-time gold medalist in the International Mathematical Olympiad. This has allowed the initially unclear "save the world" plan of OpenAI to carve out a path through the reckless charge of the young people, growing from a nameless underdog next to DeepMind to a giant.
Liang Wenfeng saw the success of Sam Altman's strategy clearly and committed to the same path. But where OpenAI had to wait 7 years for ChatGPT, Liang Wenfeng's investment bore fruit in just over 2 years: call it China speed.
3. Speaking Up for DeepSeek
The DeepSeek R1 report shows surprisingly strong results across the board, but it has also drawn doubts. There are two points of suspicion:
① The Mixture of Experts (MoE) architecture it uses (sketched in the toy example after this list) is demanding to train and hungry for high-quality data, which is why some question whether DeepSeek trained on data generated by OpenAI models.
② DeepSeek relies on reinforcement learning (RL), which is hardware-hungry; yet next to the massive GPU clusters at Meta and OpenAI, DeepSeek's training run used only 2,048 H800 GPUs.
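To make point ① concrete, here is a toy sketch of the MoE idea, for illustration only (the layer sizes, router, and expert matrices below are made-up assumptions, not DeepSeek's architecture): a router scores a set of expert networks for each token, and only the top-k experts actually run, so the model carries many parameters but activates only a few per token.

```python
# Toy illustration of a Mixture-of-Experts layer (not DeepSeek's code):
# a router picks the top-k experts per token; only those experts run.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
router = rng.normal(size=(d_model, n_experts))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector through its top-k experts."""
    logits = x @ router                # one score per expert
    top = np.argsort(logits)[-top_k:]  # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the chosen experts
    # Only top_k of n_experts matrices are used: sparse activation.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=d_model)
print(moe_layer(token).shape)  # (16,)
```

The flip side of this sparsity is that the router and every expert still have to be trained together and fed well, which is exactly why MoE models are considered demanding on training infrastructure and data.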
Given the compute constraints and the complexity of MoE, the claim that DeepSeek R1 succeeded on only about $5 million does look a little suspicious. But whether you worship its "low-cost miracle" or dismiss it as "empty hype", you cannot ignore its dazzling innovation.
BitMEX co-founder Arthur Hayes wrote: Will the rise of DeepSeek cause global investors to question American exceptionalism? Is the value of American assets severely overestimated?
Stanford University professor Andrew Ng said publicly at this year's Davos Forum: "I am impressed by DeepSeek's progress. I believe they can train models in a very economical way. Their latest reasoning model is excellent... Keep it up!"
A16z founder, Marc Andreessen said, "DeepSeek R1 is one of the most astonishing and impressive breakthroughs I've ever seen - and as open source, it's a profound gift to the world."
Standing in a corner of the stage in 2023, DeepSeek finally reached the pinnacle of world AI in 2025, just before the Lunar New Year.
4. Argo and DeepSeek
As a technical developer of Argo and an AIGC researcher, I have moved some of Argo's important functions onto DeepSeek. As a workflow system, Argo's rough, first-pass workflow generation now runs on DeepSeek R1. Argo has also adopted DeepSeek R1 as its standard integrated LLM and abandoned the closed-source, expensive OpenAI models: workflow systems burn through large numbers of tokens and carry long contexts (on average >= 10k tokens), so running on high-priced OpenAI or Claude 3.5 models would make every execution costly, and that up-front spend would hurt the product before web3 users capture real value.
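Because DeepSeek exposes an OpenAI-compatible API, a swap like this is mostly a change of endpoint and model name. Below is a minimal, hypothetical sketch of such a call (the draft_workflow helper and its prompts are illustrative, not Argo's actual code), assuming the official https://api.deepseek.com endpoint and the deepseek-reasoner model ID for R1:

```python
# Minimal sketch: calling DeepSeek R1 through its OpenAI-compatible API.
# Hypothetical helper, not Argo's real implementation.
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",                     # your DeepSeek API key
    base_url="https://api.deepseek.com",  # OpenAI-compatible endpoint
)

def draft_workflow(task: str) -> str:
    """Ask DeepSeek R1 for a rough, first-pass workflow draft."""
    response = client.chat.completions.create(
        model="deepseek-reasoner",  # model ID for DeepSeek R1
        messages=[
            {"role": "system", "content": "You design step-by-step workflows."},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

print(draft_workflow("Swap ETH to USDC, then bridge the USDC to Arbitrum."))
```

With contexts of >= 10k tokens per run, the per-token price difference compounds on every execution, which is the economics behind the switch described above.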
As DeepSeek continues to improve, Argo will work more closely with the Chinese AI forces DeepSeek represents, including but not limited to moving its Text2Image/Video interfaces and its LLM onto Chinese models.
On the cooperation front, Argo will invite DeepSeek researchers to share their technical work and will provide grants for top AI researchers, helping web3 investors and users follow the progress of AI.