Chinese Open-Source AI Just Beat GPT-5.4 on Coding

On the same day Anthropic gated its best model, Zhipu AI open-sourced one that outperforms GPT-5.4 on coding benchmarks. The best open-weight AI in the world right now comes from Beijing. Here's why that matters.

4 leading Chinese labs · #1 open on coding benchmarks · GLM-5 open-source drop Apr 7 · MIT license

On the same day Anthropic announced it was gating its best model to 50 companies, Zhipu AI did the exact opposite. The Beijing-based lab open-sourced a frontier-class model that outperforms GPT-5.4 on coding benchmarks. Weights freely downloadable. Permissive license. No gated access, no waitlist, no enterprise contract. As a practitioner I have to be honest: the best open-weight model in the world right now comes from China, and it's not particularly close.

This is worth paying attention to whether you're building AI in Silicon Valley, Shenzhen, or anywhere else. The open-source AI center of gravity has shifted east, quietly, over the last eighteen months. The four labs leading it — Zhipu, DeepSeek, Alibaba's Qwen team, and Moonshot AI — each ship on a cadence and a quality level that most Western open-weight efforts can't match.

The 5-Second Version

01

What Zhipu Just Shipped

Zhipu AI released GLM-5 on April 7, 2026 under the MIT license. The headline number is that on coding evaluations — specifically HumanEval, SWE-bench Verified, and MBPP — the model scores higher than GPT-5.4. The margin is clear, and these are benchmarks the major labs themselves treat as definitive.

The release includes full weights, a reference inference setup, and an evaluation harness anyone can rerun. Zhipu has made it easy to verify the claims, which is the opposite of how most benchmark disputes go. You can download the model, run it on the same tasks, and see the same numbers.
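Rerunning a coding benchmark is less mysterious than it sounds. A HumanEval-style harness boils down to: execute the model's completion, run hidden unit tests against it, and count passes. Here's a minimal sketch of that core loop (the real harnesses sandbox execution; this one does not, so only feed it code you trust):

```python
def passes(completion: str, test_code: str) -> bool:
    """Exec a candidate solution, then its unit tests; any exception = fail.

    WARNING: exec() runs arbitrary code. Real benchmark harnesses isolate
    this step in a sandboxed subprocess with timeouts.
    """
    env: dict = {}
    try:
        exec(completion, env)   # define the candidate function
        exec(test_code, env)    # run assertions against it
        return True
    except Exception:
        return False

# A toy task: the "model output" and its hidden tests.
sample = "def add(a, b):\n    return a + b\n"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0\n"
print(passes(sample, tests))  # → True
```

Pass rate over a task set is just the mean of this predicate, which is why a published harness plus published weights makes the claims directly checkable.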

02

The Four Labs Shaping Open-Weight AI

GLM-5 is the newest entry, but it isn't isolated. Four Chinese labs have, between them, shipped most of the strongest open-weight releases of the last 18 months. Each has a distinct technical personality and each is worth knowing.

Zhipu AI
Beijing
GLM-5 · Apr 2026

Spun out of Tsinghua University. Known for coupling research output tightly to fast release cycles. GLM-5 is its strongest open model to date and now leads open coding benchmarks.

DeepSeek
Hangzhou
DeepSeek V4 · 2026

Jolted the entire industry in early 2025 with a reasoning model trained at a fraction of Western lab compute budgets. Continues to ship frontier-class weights under permissive licenses.

Qwen (Alibaba)
Hangzhou
Qwen3 series

Alibaba's research arm. Qwen is the most deployed open-weight model family globally in 2026: by Hugging Face download counts, it outpaces every Western open family combined.

Moonshot AI
Beijing
Kimi K2

Best known for long-context models. Kimi's 2M-token context window was the first at that scale to actually work end-to-end for document-heavy workflows.

These four labs are not a monolith. They compete with each other, publish conflicting results, and disagree on architecture. What they share is a commitment to open weights at a quality tier Western labs have largely walked away from.

03

The Great Open/Closed Split

Zoom out and the pattern is striking. In April 2026, the same week Anthropic gated Claude Mythos to 50 companies, Zhipu open-sourced a coding-frontier model and Google released Gemma 4 under Apache 2.0. Open and closed are pulling apart — and geographically, the open side is increasingly Chinese.

Gating & Closing

Western Frontier Labs

OpenAI, Anthropic, and xAI all push their most capable models behind APIs or gated programs. Rationale: dual-use risk, revenue, and strategic advantage. Public access to the absolute frontier is tightening.

Opening

Chinese Labs + Google DeepMind

Zhipu, DeepSeek, Qwen, Moonshot, and Google (Gemma 4) ship frontier-adjacent models with open weights and permissive licenses. Rationale: global developer adoption, ecosystem lock-in, technical leadership signaling.

Both strategies are rational. Both are working — for different definitions of winning. The closed strategy maximizes revenue per user and concentrates capability. The open strategy maximizes reach and builds an ecosystem the originating lab can keep extending. Which "wins" depends entirely on what you think matters in the long run.

04

How to Actually Use GLM-5

If you want to try it today, the fastest path is Hugging Face. Zhipu published the weights alongside the release announcement. For a quick test on a single GPU, pull the weights and run them through vLLM:

run_glm5.py
Python
from vllm import LLM, SamplingParams

# Load the weights once; tensor_parallel_size=1 targets a single GPU.
llm = LLM(
    model="zhipuai/glm-5-coding",
    dtype="bfloat16",
    tensor_parallel_size=1,
)

prompt = (
    "Write a Python function that takes a directory and "
    "returns the 10 largest files recursively, excluding "
    "dotfiles. Include docstring and type hints."
)

# Low temperature keeps coding output close to deterministic.
out = llm.generate(
    prompt,
    SamplingParams(temperature=0.2, max_tokens=600),
)
print(out[0].outputs[0].text)

That's the full setup. No API key, no credit card, no quota. If you have a GPU with enough memory for the weights, it runs entirely on your own hardware. For coding workflows specifically — IDE integrations, PR review agents, code search — this is genuinely competitive with what you'd get from a paid GPT-5.4 endpoint, at zero per-call cost.

05

What Builders Should Actually Think About

01

Evaluate on Your Tasks

Benchmark leadership on HumanEval and SWE-bench doesn't automatically transfer to your specific codebase. Before you bet production on GLM-5 or any open model, run it on a representative slice of your own work. This is the step teams skip most often.

Benchmarks are signals, your eval is truth
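A personal eval doesn't need a framework to start. The shape below is a minimal sketch: `generate` is any callable that maps a prompt to text (it could wrap the vLLM snippet above or a paid API), and each case pairs one of your real prompts with a pass/fail check you define. The task names and checks here are illustrative placeholders, not a real suite.

```python
from typing import Callable

def run_eval(generate: Callable[[str], str], cases: list[dict]) -> float:
    """Score a model on your own tasks.

    Each case has a `prompt` and a `check`: any predicate over the raw
    output -- a substring match, a compiled regex, or actually executing
    generated code against unit tests in a sandbox.
    """
    passed = 0
    for case in cases:
        output = generate(case["prompt"])
        if case["check"](output):
            passed += 1
    return passed / len(cases)

# Illustrative cases, with a stub standing in for a real model call:
cases = [
    {"prompt": "Say hello", "check": lambda out: "hello" in out.lower()},
    {"prompt": "Say world", "check": lambda out: "world" in out.lower()},
]
print(run_eval(lambda p: "Hello, world!", cases))  # → 1.0
```

Swap the stub for a real model call and the same harness compares GLM-5 against whatever you run today, on your codebase rather than someone else's benchmark.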
02

Consider the Full Stack

An open model isn't free in production. You pay for hosting, observability, fallback logic, and engineering time. For many teams, a paid API is still cheaper in total cost of ownership. Open wins when you have reasons — compliance, privacy, offline, volume — that tip the balance.

TCO, not just token price
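The TCO comparison is worth doing as actual arithmetic, not vibes. Here's a back-of-envelope sketch; every number in it is an illustrative placeholder, not a real price from any vendor, so plug in your own token volumes and rates.

```python
def monthly_cost_api(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """Pay-per-token API cost."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def monthly_cost_selfhost(gpu_hours: float, usd_per_gpu_hour: float,
                          eng_hours: float, usd_per_eng_hour: float) -> float:
    """Self-hosting cost: GPU time plus the engineering time people forget."""
    return gpu_hours * usd_per_gpu_hour + eng_hours * usd_per_eng_hour

# Placeholder inputs -- substitute your real volume and rates:
api = monthly_cost_api(tokens_per_month=2e9, usd_per_million_tokens=3.0)
hosted = monthly_cost_selfhost(gpu_hours=720, usd_per_gpu_hour=2.0,
                               eng_hours=20, usd_per_eng_hour=120.0)
print(f"API: ${api:,.0f}/mo  Self-host: ${hosted:,.0f}/mo")
# → API: $6,000/mo  Self-host: $3,840/mo
```

At low volume the API side of this calculation usually wins; the crossover point, and any compliance or privacy constraint, is what actually decides it.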
03

Hedge Your Model Choice

The capability gap between closed and open is shrinking faster than most roadmaps assume. Design your stack so swapping model providers is a config change, not a rewrite. The teams that locked into one vendor in 2024 are paying for it now.

Abstraction layer, not lock-in
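"Config change, not a rewrite" has a concrete shape: every call site depends on one `generate(prompt) -> str` function, and a config object picks which backend builds it. The sketch below uses stub adapters; in a real stack each adapter would wrap one SDK (the vLLM call above, a vendor client, etc.), and nothing else in the codebase would import those SDKs directly.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelConfig:
    provider: str   # e.g. "vllm" or "api"
    model: str      # backend-specific model id

def make_generate(cfg: ModelConfig) -> Callable[[str], str]:
    """Return a generate() function for whichever backend the config names.

    Adapters are stubs here; in practice each one wraps a single SDK so the
    rest of the codebase only ever sees generate(prompt) -> str.
    """
    def local_vllm(prompt: str) -> str:
        return f"[{cfg.model} via vLLM] ..."   # wrap the vLLM call here

    def hosted_api(prompt: str) -> str:
        return f"[{cfg.model} via API] ..."    # wrap the vendor SDK here

    adapters = {"vllm": local_vllm, "api": hosted_api}
    return adapters[cfg.provider]

# Swapping providers is now a one-line config change:
generate = make_generate(ModelConfig(provider="vllm", model="zhipuai/glm-5-coding"))
print(generate("Write a haiku about weights.")[:35])
```

With this seam in place, moving a workload from a closed API to GLM-5 (or back) is an edit to `ModelConfig`, plus rerunning your eval suite to confirm nothing regressed.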
04

Read the Chinese AI Press

Most of what's interesting in open-weight AI is being published in Chinese-language technical venues first and translated later. If you only read English AI news, you're reading the story six weeks late. Follow the researchers directly on Hugging Face and X.

Follow the labs, not the pundits

The Bottom Line

The Verdict
The best open-weight AI is being built in China right now, and it's not a close call. If you're a working builder, GLM-5 is worth evaluating this week — not because of geopolitics, but because it's the best open coding model you can run on your own hardware.

Models don't have nationalities in any way that matters to the code you ship. They have weights, licenses, and benchmarks. GLM-5 has the best of all three for coding workflows right now, and it's free. Go pull it, run it on your stack, and see what it can do. That's the practitioner's job, regardless of which city the weights came out of.

Learn to Build With Open Models, Not Just Read About Them

The 2-day in-person Precision AI Academy bootcamp covers open models, self-hosting, RAG, and agent patterns hands-on. 5 cities. $1,490. June–October 2026 (Thu–Fri).

Reserve Your Seat

Published By

Precision AI Academy

Practitioner-focused AI education · 2-day in-person bootcamp in 5 U.S. cities

Precision AI Academy publishes deep-dives on applied AI engineering for working professionals. Founded by Bo Peng (Kaggle Top 200) who leads the in-person bootcamp in Denver, NYC, Dallas, LA, and Chicago.
