Was openPangu 2.0 really trained without NVIDIA GPUs?

Yes. Training ran entirely on Huawei Ascend 910B NPUs — the first frontier-scale open LLM trained without NVIDIA hardware.

How does openPangu 2.0 compare to DeepSeek V4 Pro?

DeepSeek wins raw coding and reasoning (~200B active params). openPangu wins 512K context, sovereign/Ascend deployment, and planned full training-code release.

How do I run openPangu 2.0 locally?

Fastest: Huawei Cloud ModelArts API. Self-host: download Flash weights from GitCode Ascend Tribe and run inference.py on Ascend 910B or ~96GB unified memory.

openPangu 2.0 Open Source | 505B MoE 512K

01

Timeline: HDC 2026 to GitCode Release

Date	Event
2026-06-12	HDC 2026 — Richard Yu keynotes openPangu 2.0 launch
2026-06-30	Flash weights, inference code, training operators on GitCode
July 2026 (planned)	Pro weights and inference code
H2 2026 (planned)	Pre-training code, post-training code, more operators

Why this release matters

01
Export controls: US restrictions on A100/H100 made “no NVIDIA, no frontier model” a common assumption — 505B MoE on Ascend challenges it.
02
Open depth: most labs ship weights + inference only; Huawei plans pre/post-training code and Ascend kernels.
03
News window: Flash went live June 30 — peak interest for developers evaluating sovereign stacks.
04
HarmonyOS Agent: native engine for HarmonyOS 7 agents; 30B edge model runs offline on Kirin phones.

02

Specs and Seven Open Components

Variant	Total	Active	Sparsity	Context	Status
Pro	505B	18B	~28:1	512K	July 2026
Flash	92B	6B	~15:1	512K	Live June 30

Cite: 512K tokens ≈ eight full novels in one prompt; Flash activates only 6B params per token while drawing on 92B knowledge.

01
Model architecture — released
02
Weights (Flash live; Pro July) — Flash released
03
Technical report — released
04
Inference + training operators — released
05
Pre-training code — H2 2026
06
Post-training (SFT/RLHF) — H2 2026
07
Ascend training kernels — H2 2026

03

Architecture and Training Breakthroughs

mHC routing: Multi-Head Combinatorial expert routing, less load imbalance
Muon optimizer: second-order momentum for large-scale stability
ModAttn: modular attention for 512K windows
DSA+SWA (Flash): ultra-sparse attention for inference efficiency

Metric	Value
Hypernode training efficiency	+30%
512K sequence throughput	+50%
Train/inference consistency (MoE)	>99%
Ascend single-card vs mainstream OSS	2× throughput
Flash-Int8 (W4A8)	-40% memory, <10% quality loss

04

Ascend Stack and Developer Ecosystem

Training ran on Ascend 910B NPUs only — no A100/H100. Stack: CANN (CUDA-class runtime) + torch_npu; standard PyTorch with import torch_npu switches backend. Deploy via ModelArts API, GitCode self-host, or HarmonyOS native integration. Edge: 30B embedded model — 50% faster inference, 20% less memory on Kirin silicon.

05

vs DeepSeek, Qwen, Kimi — Honest Tradeoffs

Model	Total	Active	Context	Hardware	Open depth
openPangu 2.0 Pro	505B	18B	512K	Ascend	7 components
DeepSeek V4 Pro	1.6T	~200B	128K	NVIDIA	weights + infer
Qwen 3.7 Max	~400B+	varies	128K	NVIDIA	partial training
Kimi K2.7	1T	32B	256K	NVIDIA	weights + infer

DeepSeek wins coding and hard reasoning today. openPangu wins 512K context (4× most rivals), sovereign deployment without NVIDIA, 2× Ascend throughput, and planned full training pipeline. Kimi wins MCP-heavy agent tooling. Pick Flash for local cost (~96GB); Pro for long-document RAG when weights ship in July.

06

How to Access: ModelArts API and GitCode

01
Sign up for Huawei Cloud
02
ModelArts → AI Gallery → search openPangu 2.0
03
Subscribe and copy API endpoint + token
04
Call Chat Completions (curl below)
05
Set per-model billing caps and audit logs

ModelArts API

curl -X POST "https://modelarts.${REGION}.myhuaweicloud.com/v1/infers/openpangu-2-flash/chat/completions" \
  -H "Content-Type: application/json" \
  -H "X-Auth-Token: ${TOKEN}" \
  -d '{"model":"openpangu-2.0-flash","messages":[{"role":"user","content":"Explain MoE in simple terms"}],"max_tokens":1024}'

Flash on single Ascend 910B

python inference.py --model_path ./openPangu-Flash --device npu:0 --context_length 512000 --precision bf16

Variant	Recommended	Minimum
Flash	1× Ascend 910B	~96GB unified memory
Flash-Int8	Atlas A2	~48GB VRAM
Pro	4+ Ascend 910B	multi-card cluster

07

Sovereign AI, License, HarmonyOS Agents

Under openPangu License: commercial use permitted, royalty-free, non-exclusive (see GitCode for terms). Strategically, openPangu backs HarmonyOS 7 agents (>90% complex-task success on framework 2.0). When pre-training code ships H2 2026, researchers can reproduce a frontier MoE pipeline on Ascend — rare at this scale.

Links: GitCode Ascend Tribe · ModelArts · HDC 2026

FAQ

Yes — Ascend 910B only, no A100/H100 in the training pipeline.

DeepSeek for coding/reasoning; openPangu for 512K docs, sovereign/Ascend deploy, and future full training code.

Closing

openPangu 2.0 is not today’s benchmark king — DeepSeek still leads many coding tasks. It is something else: a NVIDIA-independent, full-stack frontier MoE with 512K context and a credible open roadmap. Flash weights are live now.

Routing openPangu beside Claude or DeepSeek in OpenClaw on macOS often needs GUI OAuth, Keychain, and a host that stays awake. Before you buy hardware, validate primary/fallback pairs on a Mac with real screens. VNCMac rents physical Mac mini nodes monthly for multi-model Agent routing — pricing, homepage.

Huawei's openPangu 2.0 Is Open-SourceTrained Without a Single NVIDIA GPU