COMPANY

DeepSeek

company, topic-note

Overview

DeepSeek is a Chinese AI research lab known for producing frontier-class models at extremely low training costs. The company has established a pattern of cost-efficient large-scale model development and is notable for being among the first to deploy frontier AI on non-NVIDIA hardware, specifically Huawei’s Ascend chips.

Timeline

  • 2026-05-01-AI-Digest — The DeepSeek V4 / V4 Pro release crystallizes the canonical non-NVIDIA frontier-model story: 1M-token context, a Hybrid Attention Architecture, and explicit deployment on Huawei Ascend as a headline feature rather than a footnote. The digest frames the release as geopolitical bifurcation under export controls.

  • 2026-04-10-AI-Digest — DeepSeek V4 enters final pre-release validation as a 1 trillion parameter mixture-of-experts model with ~37B active parameters per token, handling text, image, and video natively. Reuters confirms it will be the first frontier AI model trained and deployed on Huawei Ascend 950PR chips. DeepSeek introduces “Fast Mode” and “Expert Mode” product tiers, formalizing a paid service for the first time. Estimated training cost is ~$5.2M. Release is expected in the last two weeks of April 2026.

  • 2026-04-28-AI-Digest — MIT Technology Review analysis of DeepSeek V4 frames it around long-horizon reasoning; model trails Gemini 3.1 Pro by ~3–7 points but offers competitive cost-vs-capability ratio.
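
As a rough sanity check on the mixture-of-experts figures quoted in the timeline, the sketch below computes what share of the weights is active at a time. It assumes exactly 1T total and 37B active parameters, which are the approximate reported figures rather than confirmed specs; the function name is illustrative only.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of a mixture-of-experts model's parameters that is active at once."""
    if total_params <= 0 or not 0 <= active_params <= total_params:
        raise ValueError("expected total_params > 0 and 0 <= active_params <= total_params")
    return active_params / total_params


# Approximate figures as reported in the digest (assumptions, not confirmed specs).
TOTAL_PARAMS = 1e12   # ~1T total parameters
ACTIVE_PARAMS = 37e9  # ~37B active parameters

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"active share of weights: {fraction:.1%}")
```

On these numbers only a few percent of the weights participate in any single forward pass, which is the mechanism behind the low inference cost the digest keeps returning to.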

Key Developments

  1. Extreme Cost Efficiency: DeepSeek’s estimated ~$5.2M training cost for a 1T-parameter frontier model is among the cheapest ever reported, consistent with the lab’s established pattern of doing more with less.

  2. Huawei Silicon Deployment: V4 as the first frontier model on Huawei Ascend 950PR chips represents a geopolitically significant proof point — if competitive, it demonstrates that US export controls on NVIDIA have shifted the supply chain rather than blocked Chinese frontier AI development.

  3. Business Model Evolution: The introduction of paid “Fast Mode” and “Expert Mode” tiers marks DeepSeek’s transition from a fully free research lab to a commercial entity, likely driven by rising inference costs.

  4. First Outside Capital Round (April 2026): The $300M raise at a $10B+ valuation — DeepSeek’s first ever external fundraise — is a structural concession that frontier training/talent costs have outgrown High-Flyer’s solo backing. It also sets the commercial baseline for Chinese open-weights labs more broadly: at current economics, even cost-efficient frontier labs need external capital to keep training.

Timeline (continued)

  • 2026-04-11-AI-Digest — DeepSeek formally launches “Fast Mode” and “Expert Mode” in the chat service, formalizing its first paid product tier as V4 launch preparation. V4 remains in final pre-release validation on Huawei Ascend 950PR chips, with release expected in the last two weeks of April. The r/LocalLLaMA community runs speculative performance comparisons against Gemma 4 and Qwen 3.5 at the 37B active-parameter tier.
  • 2026-04-12-AI-Digest — DeepSeek V4 nears late-April launch with 1M-token context window powered by “Engram” conditional memory system; test interface reveals three product tiers (Fast, Expert, Vision). DeepSeek reportedly gave Huawei exclusive early hardware access while denying NVIDIA early access — a deliberate geopolitical signal.
  • 2026-04-13-AI-Digest — DeepSeek V4 pre-release tracking continues on r/LocalLLaMA; debate centers on whether 37B active parameters on Huawei Ascend 950PR can match NVIDIA-optimized inference latency. Community skeptics cite historical Ascend throughput issues while optimists note the $5.2M training cost makes V4 the most cost-efficient frontier model ever trained.
  • 2026-04-14-AI-Digest — Final-stretch V4 speculation dominates r/LocalLLaMA: prevailing specs are ~1T total / 32–37B active MoE, 1M-token context, tiered Fast/Expert/Vision product surface with Expert likely the first paid SKU. Bulk orders placed by Alibaba, ByteDance, and Tencent pushed Ascend 950PR spot prices up ~20% in weeks — the community reads this as a leading indicator of launch imminence. Launch window: last two weeks of April.
  • 2026-04-15-AI-Digest — Founder Liang Wenfeng reconfirms late-April V4 window via internal communication; Reuters’ April 4 report that V4 runs on Huawei Ascend 950PR silicon continues to hold. Community logistics debate shifts from speculation to which quantizations will drop day-one (Q4_K_M and Q8_0 likely) and whether Huawei’s Ascend inference stack will be open-sourced alongside the weights. The strategic framing: if V4 hits 80% of Claude Opus 4.6 / GPT-5.4 at competitive latency on Ascend, the “export controls as capability cap” premise of US policy collapses.
  • 2026-04-16-AI-Digest — Final-stretch V4 watch continues on r/LocalLLaMA: consensus on 1T total / 32–37B active MoE, 1M-token context, Fast/Expert/Vision tiers with Expert as the first paid SKU. Alibaba/ByteDance/Tencent bulk Ascend 950PR orders and the ~20% spot-price jump remain the most credible leading indicator of launch imminence. Open questions unchanged: Ascend inference latency parity with NVIDIA, and whether 1M context is a real deployment spec or marketing. V4 would be the first frontier model released with explicit product-tier price discrimination built into launch day.
  • 2026-04-18-AI-Digest — DeepSeek opens to outside capital for the first time: it is in talks to raise at least $300M at a $10B+ valuation (per The Information, reported April 17), its first external fundraise since founding. Until now fully funded by High-Flyer Capital Management, DeepSeek had publicly rejected outside investors through 2024–2025. Domestic Chinese investors are the most likely participants; US venture firms face regulatory pressure and national-security review risk that effectively bars meaningful participation. The $10B valuation sits an order of magnitude below Anthropic/OpenAI/Cursor-class pricing and reflects DeepSeek’s deliberate under-pricing more than a market constraint. Strategic read: a concession that frontier-training compute and talent costs have moved past what High-Flyer alone can sustain, and the clearest sign yet that the “you don’t need $10B to build a frontier model” narrative DeepSeek embodied in early 2025 has reverted closer to the cohort median. The news lands two days after Stanford’s 2026 AI Index report showed the Arena-leaderboard US-China gap down to 2.7 points.
  • 2026-04-21-AI-Digest — DeepSeek V4 enters the actual launch window. The April 3 Reuters/Information “next few weeks” reporting has aged into the “latter-half April 2026” formal-release window that the community is now tracking as the single largest open-source event of Q2. Consolidated specs: ~1T MoE with ~37B active, 1M-token context via Engram conditional memory, native multimodal generation, 81% SWE-bench Verified, $0.30/MTok inference pricing, Apache 2.0 weights, and the technically significant finding of no CUDA dependency anywhere in the stack. The benchmark profile puts V4 within range of Claude Opus 4.7 on coding (87.6% SWE-bench Verified for Opus 4.7) while carrying a 16× cost advantage, and the CUDA independence decouples the model from the US export-control regime at a level no prior Chinese open model has achieved. The enterprise-procurement read: V4 forces a first-principles cost-quality reconsideration for every Fortune 500 engineering-tooling budget. The narrative contest between EmTech’s “Great Integration” frame and “the open Chinese frontier model ships on Huawei silicon” may collide in the same news cycle this week.
  • 2026-04-22-AI-Digest — V4 is now three missed forecast windows deep (April 3 Reuters, April 10 BigGo, April 14 DeepSeek V4 blog). r/LocalLLaMA’s consolidated reading: V4-Lite has been live-tested on API nodes, pre-training is confirmed done, and the CUDA-free Huawei Ascend 950PR production path is the single technical risk still unresolved. In other words, this is a Huawei-silicon production-yield story rather than a model-readiness story. The late-April window is now understood as “before end of April, or after Google Cloud Next if Google lands anything that reshuffles open-vs-closed positioning.” Paired with the Tencent Hunyuan 3.0 late-April launch reporting (~30B parameters, led by former OpenAI researcher Shunyu Yao), the two-week horizon could see two Chinese frontier-class open models ship in succession, a cadence that would retire the “Chinese labs are behind” framing decisively.
  • 2026-04-25-AI-Digest — The r/LocalLLaMA community demonstrates DeepSeek V4’s practical capability: single-shot generation of a 100KB self-contained HTML “web OS” from a single model invocation, enabled by the model’s 384K output window (the largest of any publicly available model to date). The demonstration shows that an output window of this size (not just input context) opens a new category of autonomous agent tasks, namely full-application generation and long-horizon code synthesis in a single turn, which previously required multi-turn stitching. The capability validates the cost-quality positioning: V4’s 16× cost advantage over Claude Opus 4.7 applies at frontier-level capability on a key dimension (output length) that enables architectural simplification on the agent side.
  • 2026-04-26-AI-Digest — A quiet Sunday brings confirmation that the open-weights frontier continues to move at exceptional pace: Xiaomi’s MiMo V2.5 Pro lands at #54 Artificial Analysis Index with weights queued for imminent release, and Qwen3.6-27B hits 80 tokens/sec throughput at 218K context on a single RTX 5090 with NVFP4 + MTP quantization under vLLM 0.19.1rc1. The pattern holds: the gap between closed and open-weights frontiers is no longer monotonic, and for several days at a time this month the open-weights frontier IS the closed frontier. DeepSeek V4’s 384K output window (r/LocalLLaMA’s 100KB self-contained HTML demo on April 25) remains the single largest open-weights capability jump on the inference-output dimension.
  • 2026-04-27-AI-Digest — DeepSeek V4-Pro launches a 75% promotional price cut ($0.43/Mtok input, down from $1.74 standard rate) alongside a 10× input-cache-hit discount (to ~$0.03625, from standard ~$0.36) through May 5, 2026. V4-Flash sees the same one-tenth cache treatment. The promotion is framed as a limited-time window rather than a permanent rate reset, signaling a strategic play to pull RAG/agentic/repeated-context workloads onto V4-Pro at price points that make the comparison against Claude Opus 4.7 ($5/$25 per million tokens) and GPT-5.5 a different-order-of-magnitude question through the promotional window.
  • 2026-05-09-AI-Digest — Reporting (originated by The Information, corroborated by SCMP) places DeepSeek’s first external round at up to RMB 50B (~$7.35B) at a $45–50B valuation. Tencent and China’s national AI fund are reportedly discussing $3–4B combined; founder Liang Wenfeng will anchor with the largest individual check. V4.1 is slated for next month. The structural moment is the shift from a self-financed lab (via Liang’s High-Flyer hedge fund) to an externally capitalised one; the dollar figure is the trailing indicator. State-adjacent participation mirrors the US hyperscaler posture toward OpenAI and Anthropic.
  • 2026-05-10-AI-Digest — The full DeepSeek V4 paper drops on r/MachineLearning, expanding the April preview with FP4 quantization-aware training applied during late-stage training to MoE expert weights (FP8 elsewhere in the stack), with real FP4 weights used during inference and RL rollout. The honest framing is that V4 is a Blackwell validator rather than a Blackwell threat: the model is an FP8+FP4 mix (not end-to-end FP4) and is built FOR Blackwell’s NVFP4 path, with NVIDIA’s own developer blog promoting the integration. V4 is the first open-weights frontier MoE with FP4 expert weights and a co-released FP4 train+serve stack, so cost-curve pressure lands on FP8-era incumbents, not on NVIDIA.
  • 2026-05-21-AI-Digest — DeepSeek is forming a Beijing “Harness” team focused on a coding-agent product, with PM and engineering roles posted on X by Deli Chen on May 20. The Decoder frames this as a Claude Code / Codex competitor, but the substantive point is there is no product, preview, or repo yet — this is a hiring signal that DeepSeek intends to compete on the harness layer (IDE/CLI surface and tool-orchestration loop) rather than only on the underlying model. Worth tracking the team size and the first commit out of the Harness repo when it lands.
  • 2026-05-24-AI-Digest — DeepSeek formalises the 75% promotional discount on V4-Pro as the permanent list rate: $0.435/M input (cache miss), $0.003625/M (cache hit), $0.87/M output. Against GPT-5.5’s $5/M input and $30/M output, that’s roughly 11.5× cheaper on input and 34× cheaper on output; the cache-hit input rate puts DeepSeek at sub-cent-per-million economics no US frontier lab is publishing. The signal is structural rather than promotional: the China-vs-US frontier-API pricing gap, which the broader Chinese frontier-lab cohort has been operating at through Q1 2026, is now locked in at the ~10–35× range rather than the 3–5× US analysts assumed would re-converge.
  • 2026-05-25-AI-Digest — DeepSeek surfaces as the economics enabler for the day’s HN front-page story rather than as a first-party launch. Reasonix, a community / third-party project from the esengine GitHub org (MIT-licensed, npm-shipped, ~5.5k★), is engineered specifically around V4-Pro’s prefix cache, claiming a 99.82% cache-hit rate and ~93% cost savings against Claude Code equivalents. The signal is demand-side: practitioners reacted to yesterday’s permanent V4-Pro pricing with a same-day working coding-agent build optimised for the cache economics. It does not show that DeepSeek itself is moving up-stack to own the agent layer. Read it as third parties building cheap coding-agent stacks on top of DeepSeek’s economics; don’t impute supply-side strategy from a community build.
  • 2026-06-06-AI-Digest — DeepSeek is in the final stages of its first external financing round: 50B yuan ($7.4B) targeting a $52–59B valuation, with term sheets signed but not closed. Founder Liang Wenfeng commits ¥20B ($2.8B), the largest single check and larger than any external participant, with Tencent ($1.5B) and CATL ($740M) the largest external backers and the National AI Industry Investment Fund alongside. Proceeds target training compute and domestic chip integration. The “Tencent-led” framing carried by early reporting overstates Tencent’s position; the more accurate read is the founder doubling down with the bulk of the capital while external strategics fill in around him. The round is “in talks / near close,” not closed; the “China’s largest-ever startup financing” framing is plausible at the headline number but not yet final.
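
To make the composition of the reported round concrete, the following sketch works out each named participant's share of the headline figure. It uses only the numbers quoted in the June 6 entry. The founder's share is computed in yuan and the external shares in dollars because the quoted dollar conversions are not mutually consistent, and treating the $52–59B range as a post-money valuation is an assumption made here, not something the reporting states.

```python
# Figures as quoted in the June 6 digest entry (in-talks numbers, not a closed round).
ROUND_TOTAL_CNY_B = 50.0  # round size, billions of yuan
FOUNDER_CNY_B = 20.0      # Liang Wenfeng's commitment, billions of yuan

ROUND_TOTAL_USD_B = 7.4   # quoted dollar equivalent of the round
EXTERNAL_USD_B = {"Tencent": 1.5, "CATL": 0.74}


def share(part: float, whole: float) -> float:
    """Fraction of the round represented by one participant."""
    return part / whole


founder_share = share(FOUNDER_CNY_B, ROUND_TOTAL_CNY_B)
external_shares = {
    name: share(amount, ROUND_TOTAL_USD_B) for name, amount in EXTERNAL_USD_B.items()
}

print(f"founder: {founder_share:.0%} of the round")
for name, value in external_shares.items():
    print(f"{name}: {value:.0%} of the round")

# ASSUMPTION: the valuation range is post-money, so round / valuation = stake sold.
for valuation_usd_b in (52.0, 59.0):
    stake = ROUND_TOTAL_USD_B / valuation_usd_b
    print(f"implied stake at ${valuation_usd_b:.0f}B: {stake:.1%}")
```

On these figures the founder's check is the largest single share at about 40% of the round, roughly double Tencent's.
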

Key Developments (continued)

  5. Harness Team Signals Coding-Agent Push at the IDE/CLI Layer: The May 20 Beijing “Harness” team hiring announcement is a strategic signal — DeepSeek intends to contest the harness layer where Anthropic has been compounding through Claude Code and OpenAI’s Codex relaunch has been catching up, not only the underlying model. With no product, preview, or repo yet, the read is intent rather than capability; the milestones to track are team size and the first public Harness commit.

  6. V4-Pro Discount Becomes Permanent List Pricing (May 24, 2026): The 75% promotional cut DeepSeek ran from late April becomes the list rate — $0.435/M input cache-miss, $0.003625/M cache-hit, $0.87/M output. Against GPT-5.5 ($5/M input, $30/M output) that’s roughly 11.5× cheaper on input and 34× cheaper on output. The structural read is that the China-vs-US frontier-API pricing gap is now locked at the ~10–35× range rather than the 3–5× US analysts assumed would re-converge once promo pricing ended.

  7. Reasonix as Community Demand-Side Signal (May 25, 2026): A third-party MIT-licensed terminal coding agent (esengine/reasonix, ~5.5k★ on GitHub) engineered around V4-Pro’s prefix cache claims 99.82% cache-hit rate and ~93% cost savings against Claude Code equivalents. Lands the day after permanent V4-Pro pricing — the demand-side practitioner read on yesterday’s price cut, not a DeepSeek-owned agent launch.
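
The pricing multiples and the cache economics quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the prices are the list rates quoted in this note, the 99.82% hit rate is Reasonix's own claim, and weighting hits and misses by token share is an assumption about how that rate is measured. It does not attempt to reproduce the ~93% savings figure, which depends on Claude Code pricing and workload details not given here.

```python
# USD per million tokens, as quoted in this note.
V4_PRO = {"input_miss": 0.435, "input_hit": 0.003625, "output": 0.87}
GPT_5_5 = {"input": 5.00, "output": 30.00}


def price_ratio(reference_price: float, candidate_price: float) -> float:
    """How many times cheaper the candidate is than the reference."""
    return reference_price / candidate_price


def blended_input_price(miss_price: float, hit_price: float, hit_rate: float) -> float:
    """Effective input price when a fraction hit_rate of input tokens hits the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


input_ratio = price_ratio(GPT_5_5["input"], V4_PRO["input_miss"])
output_ratio = price_ratio(GPT_5_5["output"], V4_PRO["output"])

# ASSUMPTION: Reasonix's claimed 99.82% cache-hit rate applies per input token.
blended = blended_input_price(V4_PRO["input_miss"], V4_PRO["input_hit"], 0.9982)

print(f"input:  {input_ratio:.1f}x cheaper")
print(f"output: {output_ratio:.1f}x cheaper")
print(f"blended input price at 99.82% hits: ${blended:.6f} per million tokens")
```

Under that assumption the effective input price sits close to the cache-hit rate rather than the cache-miss rate, which is why a build tuned for prefix caching changes the cost picture so sharply.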