In January 2025, the Chinese lab DeepSeek released R1, an open-weight (MIT-licensed) reasoning model that:
- Matches OpenAI o1 on many math and coding benchmarks
- Was reportedly trained for ~$6M in compute — a figure that covers only the final training run of its V3 base model — versus an estimated $100M+ for comparable US models
- Uses a notable training approach: large-scale RL with rule-based outcome rewards via GRPO rather than a learned process reward model; the R1-Zero variant skips SFT warm-up entirely, while R1 proper adds a small "cold-start" SFT phase before RL
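The core of GRPO is replacing a learned value model with group-relative advantages: sample several completions per prompt, score each with the rule-based reward, and normalize within the group. A minimal sketch of that normalization step (the function name and the zero-variance guard are illustrative, not from DeepSeek's code):

```python
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantages for one prompt's sampling group:
    each completion's reward, normalized by the group mean and std.
    No critic/value network is needed -- the group itself is the baseline."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0.0:  # illustrative guard: identical rewards give zero advantage
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Example: four sampled answers, two pass the rule-based check (reward 1), two fail (0).
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Completions scoring above their group's mean get positive advantage and are reinforced; below-mean ones are pushed down. This is why cheap, verifiable rewards (did the math answer match? did the code pass tests?) suffice without a process reward model.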
**Why it matters:** Suggests compute efficiency gains are far from exhausted. The assumption that frontier AI requires massive datacenter build-outs may be wrong.
**The policy angle:** Export controls on Nvidia chips may be less effective than assumed: DeepSeek reportedly trained on export-compliant H800 GPUs, suggesting capable models can be built efficiently on lower-tier hardware.