OpenAI Accuses DeepSeek of Distilling Its Models to Train Rival AI
The AI cold war between the U.S. and China just escalated. OpenAI has sent a formal memo to the House Select Committee on China, accusing Chinese startup DeepSeek of systematically stealing its intellectual property through a technique called distillation.
What Happened
According to OpenAI, DeepSeek employees routed traffic through third-party proxies and unauthorized resellers to hide their origin while querying OpenAI's models with millions of complex prompts. The high-quality responses were then used as training data for DeepSeek's R1 model, effectively cloning GPT-4's reasoning capabilities without investing billions in original R&D.
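The mechanism OpenAI describes is straightforward in outline: collect a teacher model's responses to many prompts, then fine-tune a smaller "student" model on those prompt-response pairs. A minimal illustrative sketch is below; `teacher_answer` is a stand-in for a real chat-completions API call, and nothing here represents DeepSeek's actual pipeline.

```python
import json

def teacher_answer(prompt: str) -> str:
    # Placeholder for querying the teacher model's API.
    # In a real distillation pipeline this would be a network call
    # returning the teacher's full generated response.
    return f"Answer to: {prompt}"

def build_distillation_set(prompts):
    # Pair each prompt with the teacher's response, producing the kind
    # of supervised dataset a student model would be fine-tuned on.
    return [{"prompt": p, "completion": teacher_answer(p)} for p in prompts]

prompts = [
    "Explain chain-of-thought reasoning.",
    "Prove that 2 + 2 = 4 step by step.",
]
dataset = build_distillation_set(prompts)
print(json.dumps(dataset[0]))
```

Scaled to millions of prompts, the resulting dataset transfers much of the teacher's behavior, which is why providers' terms of service typically forbid using outputs to train competing models.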
OpenAI says its internal forensic data shows 'statistically impossible similarities' in reasoning structures and error patterns between DeepSeek-R1 and its own proprietary models.
Why It Matters
This is the first time OpenAI has explicitly asked Congress to intervene. The company warns that unregulated distillation could undermine the entire Western AI industry. DeepSeek shocked the market in 2025 by releasing models rivaling GPT-4 at 80% lower training costs.
There is also a safety concern: when models are distilled, the original safety guardrails often don't transfer. DeepSeek may have captured the intelligence of U.S. models while stripping away restrictions designed to prevent misuse.
The Bigger Picture
Representative John Moolenaar, chair of the House China committee, called it 'the CCP playbook: steal, copy, then dominate.' OpenAI argues that software-level piracy is neutralizing U.S. hardware sanctions: even without access to the latest Nvidia chips, Chinese firms can simply query American models running on those chips in U.S. data centers and harvest the outputs.
DeepSeek has not formally responded, though the company has previously attributed its efficiency to algorithmic breakthroughs and better chip utilization. Open-source advocates have pushed back, accusing OpenAI of trying to maintain a monopoly on frontier AI.