In its official release announcement, DeepSeek described the launch as the start of “the era of cost-effective 1M context length” and released preview versions of two new models. V4 Flash is the smaller, faster option, with 284 billion total parameters. V4 Pro is the flagship, with 1.6 trillion parameters, which DeepSeek says delivers “performance rivaling the world’s top closed-source models.” Both support a 1 million-token context window: enough to take in an entire codebase or a long legal document in a single session.
The pricing is where things get interesting. V4 Pro costs $3.48 per million output tokens and V4 Flash just $0.28; for an equivalent workload, Anthropic charges $30 and OpenAI $25. By Monday, DeepSeek had cut V4 Pro prices by a further 75% and reduced input cache-hit fees to a tenth of their previous rate.
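To see what those per-token rates mean at agent scale, here is a minimal cost sketch. The per-million-output-token prices are the ones quoted above; the monthly token volume is a made-up example, not a figure from any vendor.

```python
# Per-million-output-token prices quoted in the article (USD).
# The 500M-token monthly volume below is a hypothetical example.
PRICES_PER_M_OUTPUT = {
    "DeepSeek V4 Pro": 3.48,
    "DeepSeek V4 Flash": 0.28,
    "Anthropic (equivalent workload)": 30.00,
    "OpenAI (equivalent workload)": 25.00,
}

def monthly_cost(output_tokens: int, price_per_million: float) -> float:
    """Cost in USD for a given number of output tokens at a given rate."""
    return output_tokens / 1_000_000 * price_per_million

# Example: an agent fleet emitting 500M output tokens per month.
tokens = 500_000_000
for model, price in PRICES_PER_M_OUTPUT.items():
    print(f"{model}: ${monthly_cost(tokens, price):,.2f}")
```

At that hypothetical volume, the gap is the difference between a four-figure and a five-figure monthly bill, which is why the subsequent 75% cut matters.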
On benchmarks, DeepSeek’s technical report on Hugging Face puts V4 roughly three to six months behind GPT-5.4 and Gemini 3.1 Pro on knowledge tests. On coding tasks, it claims performance on par with GPT-5.4. The 2026 Stanford AI Index describes Chinese AI labs as having effectively closed the performance gap with US frontier models overall. “DeepSeek’s V4 preview is a serious flex,” Neil Shah, vice president of research at Counterpoint Research, told CNBC.
Why the architecture matters for automation teams
Two design decisions in V4 are worth understanding if you run AI-powered workflows. The first is what DeepSeek calls its Hybrid Attention Architecture. Most large models lose accuracy across very long sessions: context degrades, answers drift. V4 uses token-wise compression combined with DeepSeek Sparse Attention to fix this. According to DeepSeek’s Hugging Face model card, V4 Pro needs only 27% of the inference compute and 10% of the memory of its predecessor at 1 million tokens. For agents running multi-step workflows, that is a direct cut in cost per task.
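A rough way to see why the memory figure translates into cost per task: the KV-cache footprint of a long-context session caps how many concurrent agents fit on one serving node. In the sketch below, only the 10% memory and 27% compute ratios come from DeepSeek's model card; the absolute node and session sizes are hypothetical placeholders.

```python
# Hypothetical serving-node sizing. Only the 0.10 (memory) and 0.27
# (compute) ratios are from DeepSeek's claims; the rest is illustrative.
NODE_MEMORY_GB = 1000.0        # hypothetical usable memory per node
BASELINE_SESSION_GB = 200.0    # hypothetical 1M-token session footprint

old_sessions = NODE_MEMORY_GB // BASELINE_SESSION_GB           # concurrent sessions before
new_sessions = NODE_MEMORY_GB // (BASELINE_SESSION_GB * 0.10)  # at 10% of the memory

print(f"concurrent 1M-token sessions per node: {int(old_sessions)} -> {int(new_sessions)}")
# On top of the concurrency gain, per-call compute reportedly drops to 27%.
```

Tenfold concurrency from the same hardware, multiplied by cheaper per-call compute, is the mechanism behind the "direct cut in cost per task" for multi-step agent workflows.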
The second is a deliberate focus on agentic capability. DeepSeek’s announcement states that V4 Pro is “open-source SOTA in Agentic Coding benchmarks” and notes that V4 has been integrated with several leading AI agent tools, including Anthropic’s Claude Code. DeepSeek also says it has deployed V4 Pro internally as its own coding agent. Counterpoint’s Wei Sun described its benchmark profile as suggesting “excellent agent capability at significantly lower cost.”
This connects to a question we have covered at length: whether AI copilots actually remove work from teams, or just add another review step. The answer usually comes down to whether a model reasons across a full workflow or only responds to individual prompts. V4 is built for the former. As we reported on 2026 automation trends, enterprise buyers want to know whether AI features cut admin and improve execution. A capable model at open-source cost changes what is worth automating: tasks too expensive to run continuously through an AI agent start to look viable. Model cost is one of the main reasons agentic deployments stall before they scale.
The open-source case, and what to watch for
The MIT licence means any organisation can download the weights, run the model on its own servers, and pay nothing per query. For regulated industries with hard data residency requirements, a 1 million-token model that processes contracts, patient records, or case files on-premises is a different proposition from anything available at this price point six months ago. IT teams reviewing their AI readiness across the UC stack should factor that in now, even if deployment is months away.
The limitations are real. V4 handles text only — no audio, images, or video. Running 1.6 trillion parameters on-premises demands serious hardware. DeepSeek trained V4 in partnership with Huawei, which confirmed in a statement that its Ascend 950-based Supernode clusters supported the work. That means V4 runs on Chinese domestic chips, not Nvidia — relevant for organisations with US export compliance rules or geopolitical procurement policies. The US government has accused Chinese AI labs of large-scale IP distillation from American models. Some buying teams will need to work through that before they can proceed.
Rishav Ganguli, founder of New Dawn AI, spoke to The National about the wider shift this week:
“For the last two years, a lot of strategy has been built on the assumption that a small number of US labs would sit at the top of a steep capability curve, and everyone else would pay to rent from them. That assumption is being repriced in real time.”
What productivity teams should do now
Gartner forecasts that 40% of enterprise applications will include task-specific AI agents by the end of 2026, up from under 5% in 2025. Every one runs on a model. When that model’s cost drops sharply and its weights are freely available, the business case for automation gets easier to write.
V4 is not a finished product. It is a preview, text-only, and trails the top closed-source models on some tasks. DeepSeek has not given a timeline for a final release. But for teams wanting to test agentic automation without committing to a proprietary API contract, the combination of capability, context length, and price is hard to ignore. Run a parallel workflow test against your current setup and see whether the cost difference holds at your volume.
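For teams doing that parallel test, a back-of-envelope break-even check is a reasonable first step: at what monthly volume does a self-hosted deployment with a fixed hardware cost beat paying per token? The $3.48/M rate comes from the article; the $8,000/month hardware figure is a hypothetical placeholder to replace with your own amortised cost.

```python
# Break-even between a fixed-cost self-hosted deployment and a
# pay-per-token API. Only the $3.48/M price is from the article;
# the fixed monthly cost is a hypothetical stand-in.
def break_even_tokens(fixed_monthly_cost: float, price_per_m: float) -> float:
    """Output tokens/month at which self-hosting (fixed cost) matches the API."""
    return fixed_monthly_cost / price_per_m * 1_000_000

# Hypothetical: $8,000/month of amortised GPU hardware vs $3.48/M tokens.
print(f"{break_even_tokens(8_000, 3.48):,.0f} tokens/month")  # ~2.3B tokens/month
```

Below the break-even volume, the API is cheaper; above it, downloading the MIT-licensed weights and running them yourself starts to pay off, which is exactly the calculation the open-weights release makes possible.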