Liang Wenfeng, the founder of DeepSeek, has quickly emerged as a transformative force in the global AI landscape. On January 20, 2025, DeepSeek, an AI research lab from China, stunned Silicon Valley with the release of its cutting-edge open-source model, DeepSeek-R1, which has since been lauded for surpassing even the most dominant AI models, including OpenAI’s GPT series, in several critical benchmarks. With this bold move, Liang and his team are challenging the long-standing dominance of Western AI giants like OpenAI and Meta, while offering an efficient, cost-effective alternative in the process.
A Story of Scientific Curiosity Over Commercial Gain
DeepSeek’s ascent in the competitive AI sector is nothing short of extraordinary. Liang, who has a master's degree in computer science, told 36Kr, a Chinese tech publication, that his motivation for founding DeepSeek was not commercial. His remarks, translated by the American publication WIRED, reveal a scientifically minded founder who sees AI less as a vehicle for commercial gain than as a tool for solving deep human problems.
"I wouldn’t be able to find a commercial reason [for founding DeepSeek] even if you ask me to," Liang remarked in the interview. “Basic science research has a very low return-on-investment ratio. When OpenAI’s early investors gave it money, they sure weren’t thinking about how much return they would get. Rather, they really wanted to do this thing.”
For Liang, the pursuit of knowledge and the drive to address the most difficult questions in AI were the primary catalysts. “I created DeepSeek to solve the hardest questions in the world,” he explained. His passion for scientific advancement over monetary rewards has propelled DeepSeek to the forefront of AI innovation in China and globally.
The Unconventional Path to AI Excellence
DeepSeek's origins are deeply intertwined with the success of High-Flyer, one of China’s premier quantitative hedge funds. Founded in 2015, High-Flyer became the first quantitative hedge fund in China to raise over 100 billion RMB (approximately $15 billion), although its capital has since dipped to around $8 billion. While its core business was financial data analysis, High-Flyer had quietly amassed a large stockpile of GPUs and built supercomputers for deep-learning research. It was within this environment that Liang saw an opportunity to create something bigger.
In 2023, after years of stockpiling computational power, Liang decided to pivot, pouring the hedge fund’s resources into a new venture—DeepSeek. The firm would focus on creating its own advanced AI models and working towards the development of artificial general intelligence (AGI). "It was as if Jane Street decided to become an AI startup and burn its cash on scientific research," Liang quipped, acknowledging the unconventional nature of the transition.
DeepSeek: A Different Kind of Research Culture
Liang’s approach to building DeepSeek was starkly different from the methods employed by China’s more prominent tech companies, such as Baidu, Alibaba, or ByteDance. Instead of hiring experienced engineers to develop consumer-facing products, Liang focused on attracting the brightest PhD students from top Chinese universities like Peking University and Tsinghua University. These researchers, though highly accomplished in their academic fields, had little industry experience—an unorthodox choice for most startups.
“Our core technical positions are mostly filled by people who graduated this year or in the past one or two years,” Liang shared. This strategic decision has fostered a collaborative and innovative culture at DeepSeek, one where young, driven researchers are encouraged to explore unconventional avenues without the pressures of consumer demands or immediate profits.
This environment, where ample computing resources are devoted to long-term research goals, stands in stark contrast to China’s established internet giants, where competition for resources often leads to internal strife and inefficiencies. DeepSeek’s culture emphasizes collective advancement, and Liang believes that this is what makes his team unique. “Most people, when they are young, can devote themselves completely to a mission without utilitarian considerations,” he noted.
Innovation Under Constraint
In October 2022, the U.S. government imposed export controls that significantly restricted Chinese AI companies’ access to advanced chips like Nvidia’s H100. For DeepSeek, which had begun its research with a stockpile of A100 chips, this presented a challenge. However, Liang was undeterred. “The problem we are facing has never been funding, but the export control on advanced chips,” he explained in a follow-up interview.
Rather than abandoning their ambitions, DeepSeek’s engineers worked to make their models more efficient by optimizing the architecture and finding ways around the limits these restrictions imposed. By using engineering tricks like custom communication schemes between chips, reducing the size of fields to save memory, and employing a mix-of-models approach, DeepSeek was able to significantly reduce the resources needed to train its models.
“They optimized their model architecture using a battery of engineering tricks,” explained Wendy Chang, a software engineer turned policy analyst. “Combining these approaches successfully to produce a cutting-edge model is a remarkable feat.”
As a result, DeepSeek's latest model required just one-tenth the computing power of Meta’s comparable Llama 3.1 to train, making it an incredibly efficient and cost-effective alternative to existing AI models.
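For readers curious about how two of the tricks mentioned above save resources, here is a minimal Python sketch. It is an illustration under stated assumptions, not DeepSeek's code: the function names (storage_gib, route_top_k, mixture_output), the storage widths, and the toy "experts" are all hypothetical, and it assumes the "mix-of-models approach" refers to routing each input to only a few specialised sub-models rather than running all of them.

```python
import math

# Illustrative storage widths in bytes. The article does not name the
# number formats DeepSeek used, so these are assumptions.
BYTES_PER_VALUE = {"32-bit": 4, "16-bit": 2, "8-bit": 1}


def storage_gib(num_values, bytes_per_value):
    """Memory needed to hold num_values numbers, in GiB."""
    return num_values * bytes_per_value / 2**30


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k best-scoring experts and renormalise their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def mixture_output(x, experts, gate_scores, k):
    """Run only the k selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


if __name__ == "__main__":
    # Smaller fields: the same count of numbers needs less memory.
    num_values = 10**9  # hypothetical count of stored numbers
    for name, width in BYTES_PER_VALUE.items():
        print(f"{name}: {storage_gib(num_values, width):.2f} GiB")

    # Routing: four toy experts exist, but only two run per input.
    experts = [lambda x, m=m: m * x for m in (1, 2, 3, 4)]
    gate_scores = [0.1, 2.0, 1.5, -0.3]
    print(route_top_k(gate_scores, k=2))
    print(mixture_output(3.0, experts, gate_scores, k=2))
```

The point of the sketch is proportional: halving the width of each stored number halves the memory it occupies, and running two of four sub-models does roughly half the work for that step. Real training systems are far more involved.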
Embracing Open-Source to Foster Global Collaboration
DeepSeek’s commitment to openness and collaboration has been a key factor in its rise. Unlike many Chinese tech companies, DeepSeek has embraced the open-source model, sharing its innovations with the global AI research community. This move has garnered considerable goodwill, particularly as it offers a viable alternative to the Western-dominated AI ecosystem.
"Cutting-edge models can be built using less, though still a lot of, money, and the current norms of model-building leave plenty of room for optimization," noted Wendy Chang. "We are sure to see a lot more attempts in this direction going forward."
Liang, for his part, remains focused on the bigger picture. He sees DeepSeek's open-source contributions as a way to advance the global conversation around AI, offering insights and models that could inspire future breakthroughs across the world.
A New Era in AI Innovation
DeepSeek’s approach has disrupted the status quo, showing that efficiency, collaboration, and scientific curiosity can lead to groundbreaking innovations even without the vast financial resources available to companies like OpenAI or Meta. Liang’s vision is clear: to solve the hardest problems in AI and push the boundaries of what’s possible.
“Solving the hardest questions in the world, that's what drives us,” Liang said in the interview with the Chinese publication, adding, “And we’re just getting started.”