MODEL

GLM-5.1

model · topic-note · open-source · zhipu · z-ai · china

Z.ai’s (formerly Zhipu AI) post-training upgrade to GLM-5, released March 27, 2026 under MIT license. A 744B-parameter mixture-of-experts model (40B active parameters per token) with a 200K context window, trained exclusively on Huawei Ascend 910B chips, with zero NVIDIA hardware involved. The first open-weights model to hold the top position on a major real-world software engineering benchmark (SWE-Bench Pro, 58.4%), and the specific open-source reference point the April 2026 r/LocalLLaMA community converged on for agentic coding workflows.

Key Specs

  • Architecture: 744B-parameter MoE, 40B active parameters per token, DeepSeek Sparse Attention
  • Context window: 200K tokens
  • License: MIT
  • Training hardware: 100K Huawei Ascend 910B chips (no NVIDIA GPUs) on MindSpore
  • SWE-Bench Verified: 77.8% (top open-source position)
  • SWE-Bench Pro: 58.4% (community-highest — above GPT-5.4’s 57.7% and Claude Opus 4.6’s 57.3%)
  • Coding eval vs Opus 4.6: 45.3 vs 47.9 (94.6% parity claim from Z.ai internal benchmark)
  • Relative improvement over GLM-5: 28% on coding via post-training only (unusually large for a post-train-only update)
  • Pricing (GLM Coding Plan): $3/month starting tier
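The headline ratios in the spec list can be sanity-checked with quick arithmetic. This is a sketch only; the underlying 35.4 / 45.3 / 47.9 scores are Z.ai’s internal eval numbers as reported above, not independent measurements:

```python
# Sanity-check the reported ratios from Z.ai's internal coding eval.
glm_5_score = 35.4    # GLM-5 baseline
glm_51_score = 45.3   # GLM-5.1 after post-training
opus_46_score = 47.9  # Claude Opus 4.6 on the same eval

# Relative improvement over GLM-5 (reported as 28%)
improvement = (glm_51_score - glm_5_score) / glm_5_score
print(f"improvement over GLM-5: {improvement:.1%}")   # -> 28.0%

# Parity with Claude Opus 4.6 (reported as 94.6%)
parity = glm_51_score / opus_46_score
print(f"parity with Opus 4.6: {parity:.1%}")          # -> 94.6%
```

Both reported figures reproduce to one decimal place, so the 28% and 94.6% claims are at least internally consistent with the raw scores given in the timeline.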

Timeline

  • 2026-03-30-AI-Digest — Z.ai ships GLM-5.1 on March 27 as a post-training upgrade to GLM-5. 28% coding improvement over GLM-5 (45.3 vs 35.4 on Z.ai internal eval), reaching 94.6% of Claude Opus 4.6’s score. Architecture unchanged — same 744B MoE, 40B active, 200K context, DeepSeek Sparse Attention. Not yet open-source (GLM-5 weights on Hugging Face under MIT; GLM-5.1 open release teased). GLM Coding Plan starts at $3/month.
  • 2026-04-16-AI-Digest — GLM-5.1’s quiet dominance on coding benchmarks. Now holding top open-source SWE-Bench Verified (77.8%) and community-highest SWE-Bench Pro (58.4%) — above GPT-5.4 (57.7%) and Claude Opus 4.6 (57.3%). First time an open-weights model has claimed top position on a major real-world software engineering benchmark. MIT license, zero NVIDIA hardware.
  • 2026-04-17-AI-Digest — Continues as top open-weights slot at 77.8% / 58.4%. Claude Opus 4.7’s new numbers (87.6% / 64.3%) extend the frontier-to-open-weights gap meaningfully, but community framing is that GLM-5.1 is now the specific open-source benchmark to beat, not a secondary model in a broader Qwen / Gemma 4 / Llama cohort. Recommended for agentic coding; Qwen 3.5 remains general-purpose default.
  • 2026-04-18-AI-Digest — Second-most-active r/LocalLLaMA thread of the week: GLM-5.1 vs Qwen 3.5 as best open-weights coding daily driver. Pro-GLM camp emphasizes SWE-Bench Pro lead and tool-use reliability on agentic loops; pro-Qwen camp points to broader language coverage, faster inference on commodity hardware, more mature tokenizer. Community working consensus: GLM-5.1 for agentic coding workflows, Qwen 3.5 for everything else, run both if you have the VRAM.
  • 2026-04-19-AI-Digest — Weekend “open-weights safety floor is a competitive moat” framing. GLM-5.1 (77.8% SWE-Bench Verified) and Qwen 3.5 can’t match Opus 4.7’s 87.6% / 64.3%, but also can’t match Claude Mythos Preview’s zero-day discovery or GPT-5.4-Cyber’s defensive-analysis profile. The r/LocalLLaMA framing: open-weights should benchmark against frontier labs’ gated models, not their shipping models, because the gap to shipping frontier is closing faster than the gap to the real frontier.
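The “run both if you have the VRAM” caveat can be put in rough numbers. Only the parameter counts (744B total, 40B active) come from this note; the quantization bit-widths are illustrative assumptions, and the estimate ignores KV cache and activation overhead:

```python
# Rough weight-storage estimate for hosting GLM-5.1 locally.
# Parameter counts are from the model spec; bit-widths are assumptions.
# Note: with a MoE, the FULL expert set must be resident (or offloaded),
# even though only ~40B parameters are active per token.
TOTAL_PARAMS = 744e9   # full MoE weights: this is what storage must hold
ACTIVE_PARAMS = 40e9   # per-token compute footprint, not storage

def weight_gb(params: float, bits: int) -> float:
    """Storage for `params` weights at `bits` bits each, in decimal GB."""
    return params * bits / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: total weights ~ {weight_gb(TOTAL_PARAMS, bits):,.0f} GB")
```

Even at 4-bit quantization the full expert set alone lands in the hundreds of gigabytes, which is why the thread’s dual-model recommendation is implicitly aimed at multi-GPU or heavy-offload rigs rather than single consumer cards.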

Strategic Significance

GLM-5.1 is the first open-weights model to take the top spot on a major real-world software engineering benchmark under a fully permissive license (MIT), trained on non-NVIDIA hardware. Three threads compound:

  1. Export-controls effectiveness question. A coding model at 94%+ of Opus 4.6’s score, trained on 100K Huawei Ascend 910B chips, is the cleanest data point to date on whether US export controls are slowing Chinese frontier AI. The answer appears to be “less than intended.”
  2. Cost structure. At $3/month for the GLM Coding Plan, GLM-5.1 establishes a price floor for near-frontier coding that is an order of magnitude below US frontier-lab API pricing. With the weights published under MIT, even that $3 floor effectively disappears for anyone able to self-host.
  3. Agentic-coding category positioning. By mid-April 2026 the community has stopped asking “which open model matches frontier” and started asking “which open model is the least-compromised local alternative for agentic coding specifically” — and the answer is GLM-5.1.

See Also