What Is This New Chinese AI, DeepSeek?

DeepSeek’s AI Chatbot Tops US App Store, Shaking Faith in Big Tech’s AI Spending


In a sudden upset that’s sending ripples through the artificial intelligence industry, Chinese startup DeepSeek has seen its AI chatbot soar to the top of the Apple App Store’s free downloads list in the United States this week—unseating OpenAI’s ChatGPT from the No. 1 spot. The stunning climb follows the January 20th release of DeepSeek’s new R1 model, which the company claims can solve complex problems as efficiently as OpenAI’s own leading systems.

The tool behind DeepSeek’s chatbot, known simply as “DeepSeek,” relies on the startup’s open-source large language models. According to the company, training these models requires significantly fewer specialized chips than the hardware-heavy approach favored by incumbent AI titans. DeepSeek says it managed to develop its earlier V3 LLM, released in December, using only about 2,000 high-end Nvidia chips. By contrast, training top-tier models like GPT-4 can require upwards of 16,000 of these costly components.

If DeepSeek’s assertions prove accurate, they pose a direct challenge to the entrenched belief that “more compute equals better AI.” The cost-saving aspect is especially striking: DeepSeek claims its latest models have cost under $6 million to develop. OpenAI CEO Sam Altman, meanwhile, has previously stated that GPT-4 required more than $100 million to train. These reports have spooked investors, sparking a pre-market dip of over 12 percent in Nvidia’s share price and similar downward trends for Microsoft and other major firms betting heavily on larger, more resource-intensive models.

A Disruptive Newcomer

DeepSeek’s R1 model builds on the V3 LLM, which the company says is already competitive with GPT-4o and Anthropic’s Claude 3.5 Sonnet. The new R1 focuses on advanced reasoning tasks, performing at near-parity with OpenAI’s “o1” on certain benchmarks. These are bold claims for a startup that few in the West had heard of even a few months ago.

Industry insiders note that such efficiency gains could stem from innovations ranging from architectural tweaks to compressed training datasets—especially vital for Chinese AI labs dealing with US trade restrictions that limit access to cutting-edge chips. In other words, necessity may have driven DeepSeek’s engineers to discover new methods for training on minimal resources.

Billions on the Line

At the heart of the debate is whether the traditional, compute-intensive approach to AI will remain dominant. Nvidia, Microsoft, OpenAI, and Meta have collectively poured billions of dollars into AI data centers—reports project $500 billion for the so-called Stargate Project alone, with $100 billion earmarked for Nvidia hardware. If DeepSeek’s model outperforms or even rivals established chatbots, it could throw into question the return on those massive investments.

For now, DeepSeek’s sudden success presents a potential pivot point for the AI community. It highlights the possibility that smaller-scale, more cost-effective models could match or surpass the performance of large-scale projects that rely on thousands of top-tier GPUs and similarly vast budgets. The market response—an immediate hit to Nvidia’s stock price—suggests that investors are at least taking DeepSeek’s claims seriously.

Next Steps for AI

Whether DeepSeek can maintain its momentum remains to be seen. The company’s claims have yet to be independently verified, and many analysts remain cautious about the feasibility of scaling low-compute approaches for the toughest AI tasks. Nonetheless, the DeepSeek phenomenon underscores a growing appetite for AI models that can do more with less—an outcome that could reshape how researchers, investors, and businesses approach AI development.

For now, one thing is clear: in an industry often criticized for its high barriers to entry, DeepSeek’s rapid ascent is a reminder that major breakthroughs can come from unexpected places, potentially upending assumptions about the future of AI.