DeepSeek V4 One Month In: Cache Tool Hits 99.82%, Prices Drop, V3.2-Exp Appears on Huawei Cloud

DeepSeek’s V4 series has been out for a month. The company has made its temporary discounts permanent. The open-source community has built a tool that makes the API cheaper still.

Reasonix is a terminal coding harness built specifically for DeepSeek’s API. It exploits DeepSeek’s prefix-cache mechanism with an append-only run loop. Old context stays in place and new messages are only ever appended, so each request begins with exactly the text the previous request sent. The result is a cache hit rate that reached 99.82% in tests.
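To make the append-only idea concrete, here is a minimal sketch. It is not Reasonix’s code. The class and function names are hypothetical, and it compares whole messages as a stand-in for the token-level prefix matching a real cache would do.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyHistory:
    """Chat history that can only grow at the end (hypothetical helper)."""
    _messages: list[dict] = field(default_factory=list)

    def append(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def snapshot(self) -> list[dict]:
        # Return copies so callers cannot mutate earlier turns in place.
        return [dict(m) for m in self._messages]


def shared_prefix_len(previous: list[dict], current: list[dict]) -> int:
    """Count the leading messages that are identical in both requests."""
    n = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        n += 1
    return n


history = AppendOnlyHistory()
history.append("system", "You are a coding assistant.")
history.append("user", "List the files in src/.")
first_request = history.snapshot()

history.append("assistant", "main.py, utils.py")
history.append("user", "Open main.py.")
second_request = history.snapshot()

# Append-only: the whole first request is reused as a prefix of the second.
reused = shared_prefix_len(first_request, second_request)

# Counter-example: rewriting an early message breaks the prefix right there.
rewritten = [dict(m) for m in second_request]
rewritten[0]["content"] = "You are a coding assistant. Time: 12:00"
reused_after_rewrite = shared_prefix_len(first_request, rewritten)
```

Here `reused` covers both messages of the first request, while `reused_after_rewrite` is zero because the very first message changed. That is why a client that edits or reorders old context, for example by stamping the current time into the system prompt, gives up most of its cache benefit.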

That number matters because of how DeepSeek charges: input tokens served from the cache are billed at a lower rate than cache misses. A session that burns through 400 million tokens and would cost 61 USD without caching drops to about 12 USD at a 99.82% cache hit rate. That is roughly 20 cents on the dollar. Reasonix also switches between V4 Flash (cheap) and V4 Pro (expensive) depending on task difficulty, and escalates automatically after repeated failures.
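The escalation behavior can be sketched in a few lines. This is my own illustration, not Reasonix’s implementation. The tier labels, the threshold of two failures, and the reset-on-success rule are all assumptions, and the sketch ignores the task-difficulty routing entirely.

```python
from dataclasses import dataclass


@dataclass
class TierRouter:
    """Use the cheap tier until failures pile up, then the expensive one."""
    cheap: str = "flash"      # placeholder label, not a real API model id
    expensive: str = "pro"    # placeholder label, not a real API model id
    max_failures: int = 2     # assumed threshold
    failures: int = 0         # current streak of consecutive failures

    def current_tier(self) -> str:
        if self.failures >= self.max_failures:
            return self.expensive
        return self.cheap

    def record(self, success: bool) -> None:
        # Assumption: a success resets the streak; a failure extends it.
        self.failures = 0 if success else self.failures + 1


router = TierRouter()
tiers_used = []
for outcome in [False, False, True, False]:
    tiers_used.append(router.current_tier())
    router.record(outcome)
```

With these outcomes the router stays on the cheap tier for the first two attempts, moves to the expensive tier for the third, and drops back once that attempt succeeds. A real harness would need some signal for what counts as a failure, such as a failing test run or a rejected edit.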

The project is on GitHub under esengine/DeepSeek-Reasonix. The authors say it is built entirely around DeepSeek’s features and will not support general use. It has a desktop version for people who do not like the terminal.

Some users in the community pointed out you can get similar cache benefits by writing a thin bridge between Codex and DeepSeek’s API. Others reported that using DeepSeek V4 Pro through Claude Code costs less than through OpenCode. Your mileage will vary.

Separately, Huawei Cloud now offers DeepSeek-V3.2-Exp through its Experience Space. I could not find detailed benchmark numbers or parameter counts for this model. The page is live but sparse on specs. To try it, go through Huawei Cloud’s website.

The V4 line itself launched in late April 2026. DeepSeek has not published detailed architecture papers for V4 yet, but the series includes at least Flash and Pro tiers. The permanent price cuts apply to both.

Also worth noting: the 2026 BAAI Conference (智源大会) is scheduled for June 12-13 in Beijing. The lineup includes multiple Turing Award winners, researchers from Google, Meta, Nvidia, Stanford, MIT, and Harvard, plus Chinese labs like Zhipu AI, Stepfun, MiniMax, and Baidu. The agenda covers world models, agent systems, embodied AI, and AI safety. First-time tracks include AI-native education and token economics. The conference may produce announcements worth watching.

Ant’s Lingbo Robotics lab also had its LingBot-VA paper accepted at RSS 2026. The model is a causal world modeling framework for robot control that predicts environmental changes before issuing action commands. It achieved 92% on RoboTwin 2.0 (easy), 91.1% (hard), and 98.5% on the LIBERO benchmark. The code is open source.

If you want to save money on DeepSeek V4: use Reasonix or rig your own prefix-aware client. If you want to try V3.2-Exp, go to Huawei Cloud. If you want predictions about where Chinese AI is going, wait for the BAAI conference in June.