Chinese e-commerce giant Alibaba's "Qwen Team" has done it again.
Mere days after releasing — for free and with open source licensing — what is now the top-performing non-reasoning large language model (LLM) in the world, full stop, even compared to proprietary AI models from well-funded U.S. labs such as Google and OpenAI, in the form of the lengthily named Qwen3-235B-A22B-2507, this group of AI researchers has come out with yet another blockbuster model.
That's Qwen3-Coder-480B-A35B-Instruct, a new open source LLM focused on assisting with software development. It is designed to handle complex, multi-step coding workflows and can create full-fledged, functional applications in seconds or minutes.
The model is positioned to compete with proprietary offerings like Claude Sonnet 4 on agentic coding tasks, and it sets new benchmark scores among open models.
It's available on Hugging Face, GitHub, and Qwen Chat, via Alibaba's Qwen API, and through a growing list of third-party vibe coding and AI tool platforms.
Open source licensing means low cost and high optionality for enterprises
But unlike Claude and other proprietary models, Qwen3-Coder — which we'll call it for short — is available now under an open source Apache 2.0 license, meaning it's free for any enterprise to download, modify, deploy, and use in its commercial applications for employees or end customers without paying Alibaba or anyone else a dime.
It's also so highly performant on third-party benchmarks and in anecdotal usage among AI power users for "vibe coding" — coding with natural language and without formal development processes and steps — that at least one, LLM researcher Sebastian Raschka, wrote on X that: "This might be the best coding model yet. General-purpose is cool, but if you want the best at coding, specialization wins. No free lunch."
Developers and enterprises interested in downloading it can find the code on the AI code sharing repository Hugging Face.
Enterprises that don't wish to host the model themselves or through various third-party cloud inference providers, or that lack the capacity to do so, can also use it directly through the Alibaba Cloud Qwen API, where per-million-token (MTok) prices start at $1/$5 for input/output on contexts up to 32,000 tokens, then rise to $1.80/$9 for up to 128,000, $3/$15 for up to 256,000, and $6/$60 for the full million.
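Under tiered pricing like this, what a given call costs depends on which context bracket it lands in. Here is a rough sketch of that arithmetic using the rates quoted above; the tier-selection rule (total tokens per request) and rounding are assumptions, so treat Alibaba Cloud's billing documentation as authoritative:

```python
# Sketch: estimate Qwen API cost from the tiered rates quoted above.
# Assumption: the tier is chosen by the request's total token count.

# (max_context_tokens, input_usd_per_mtok, output_usd_per_mtok)
TIERS = [
    (32_000, 1.0, 5.0),
    (128_000, 1.8, 9.0),
    (256_000, 3.0, 15.0),
    (1_000_000, 6.0, 60.0),
]

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    total = input_tokens + output_tokens
    for limit, in_rate, out_rate in TIERS:
        if total <= limit:
            return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    raise ValueError("request exceeds the 1M-token context window")

# A 10K-token prompt with a 2K-token completion falls in the 32K tier:
print(estimate_cost(10_000, 2_000))  # → 0.02
```

The jump between brackets is steep: the same prompt pushed past 256K total tokens is billed at 6x/12x the base input/output rates.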
Model architecture and capabilities
According to the documentation released online by the Qwen Team, Qwen3-Coder is a Mixture-of-Experts (MoE) model with 480 billion total parameters, 35 billion active per query, and 8 active experts out of 160.
It natively supports a 256K-token context length, with extrapolation up to 1 million tokens using YaRN (Yet another RoPE extrapolatioN) — a technique for extending a language model's context length beyond its original training limit by modifying the Rotary Positional Embeddings (RoPE) used during attention computation. This capability enables the model to understand and manipulate entire repositories or lengthy documents in a single pass.
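In the Hugging Face transformers ecosystem, YaRN is typically switched on through a rope_scaling entry in the model's config.json. A sketch of what that might look like for stretching a 256K native window toward 1M follows; the field names mirror transformers' YaRN convention and the factor is simply 1M divided by the native length, so check Qwen's model card before relying on them:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```

Because static YaRN applies the same scaling to every request, Qwen's general guidance for its models has been to enable it only when prompts actually exceed the native window.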
Designed as a causal language model, it features 62 layers, 96 attention heads for queries, and 8 for key-value pairs. It is optimized for token-efficient, instruction-following tasks and omits support for <think> blocks by default, streamlining its outputs.
High performance
Qwen3-Coder has achieved leading performance among open models on several agentic evaluation suites:
SWE-bench Verified: 67.0% (standard), 69.6% (500-turn)
GPT-4.1: 54.6%
Gemini 2.5 Pro Preview: 49.0%
Claude Sonnet 4: 70.4%
The model also scores competitively on tasks such as agentic browser use, multi-language programming, and tool use. Visual benchmarks show progressive improvement across training iterations in categories like code generation, SQL programming, code editing, and instruction following.
Alongside the model, Qwen has open-sourced Qwen Code, a CLI tool forked from Gemini Code. The interface supports function calling and structured prompting, making it easier to integrate Qwen3-Coder into coding workflows. Qwen Code supports Node.js environments and can be installed via npm or from source.
Qwen3-Coder also integrates with developer platforms such as:
Claude Code (via DashScope proxy or router customization)
Cline (as an OpenAI-compatible backend)
Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers
Developers can run Qwen3-Coder locally or connect via OpenAI-compatible APIs using endpoints hosted on Alibaba Cloud.
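Because the hosted endpoint is OpenAI-compatible, calling it amounts to posting a standard chat-completions payload. A minimal, stdlib-only sketch is below; the base URL, model identifier, and environment-variable name are illustrative assumptions, so confirm them against Alibaba Cloud's documentation:

```python
import json
import os
import urllib.request

# Assumed values -- verify against Alibaba Cloud's docs before use.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
MODEL = "qwen3-coder-480b-a35b-instruct"  # hypothetical API model id

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completions request (not yet sent)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,  # Qwen's recommended sampling settings
        "top_p": 0.8,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}",
        },
    )

req = build_chat_request("Write a Python function that reverses a string.")
# urllib.request.urlopen(req) would send it; omitted here to stay offline.
print(req.full_url)
```

Any OpenAI-compatible SDK can replace the hand-rolled request; only the base URL and API key change.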
Post-training techniques: code RL and long-horizon planning
In addition to pretraining on 7.5 trillion tokens (70% code), Qwen3-Coder benefits from advanced post-training techniques:
Code RL (reinforcement learning): Emphasizes high-quality, execution-driven learning on diverse, verifiable code tasks
Long-horizon agent RL: Trains the model to plan, use tools, and adapt over multi-turn interactions
This phase simulates real-world software engineering challenges. To enable it, Qwen built a system of 20,000 parallel environments on Alibaba Cloud, providing the scale necessary for evaluating and training models on complex workflows like those found in SWE-bench.
Enterprise implications: AI for engineering and DevOps workflows
For enterprises, Qwen3-Coder offers an open, highly capable alternative to closed-source proprietary models. With strong results in coding execution and long-context reasoning, it is especially relevant for:
Codebase-level understanding: Ideal for AI systems that must comprehend large repositories, technical documentation, or architectural patterns
Automated pull request workflows: Its ability to plan and adapt across turns makes it suitable for auto-generating or reviewing pull requests
Tool integration and orchestration: Through its native tool-calling APIs and function interface, the model can be embedded in internal tooling and CI/CD systems. This makes it especially viable for agentic workflows and products, i.e., those where the user triggers one or several tasks that they want the AI model to go off and complete autonomously, checking in only when finished or when questions arise.
Data residency and cost control: As an open model, enterprises can deploy Qwen3-Coder on their own infrastructure — whether cloud-native or on-prem — avoiding vendor lock-in and managing compute usage more directly
Support for long contexts and modular deployment options across various dev environments makes Qwen3-Coder a candidate for production-grade AI pipelines at both large tech companies and smaller engineering teams.
Developer access and best practices
To use Qwen3-Coder optimally, Qwen recommends:
Sampling settings: temperature=0.7, top_p=0.8, top_k=20, repetition_penalty=1.05
Output length: Up to 65,536 tokens
Transformers version: 4.51.0 or later (older versions may throw errors due to qwen3_moe incompatibility)
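Collected into the keyword arguments a transformers-style model.generate(...) call would take, those recommendations look roughly like this (do_sample=True is an inferred extra: in transformers, temperature and top_p only apply when sampling is enabled):

```python
# Qwen's recommended decoding settings, as transformers-style
# generate() keyword arguments.
GENERATION_KWARGS = {
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "repetition_penalty": 1.05,
    "max_new_tokens": 65_536,  # the documented output-length ceiling
    "do_sample": True,         # required for temperature/top_p to take effect
}

# Usage (sketch): outputs = model.generate(**inputs, **GENERATION_KWARGS)
print(sorted(GENERATION_KWARGS))
```

The same four sampling values can be passed through the API's request body instead when using the hosted endpoint.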
API and SDK examples are provided using OpenAI-compatible Python clients.
Developers can define custom tools and let Qwen3-Coder dynamically invoke them during conversation or code generation tasks.
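With an OpenAI-compatible client, that follows the usual tool-use loop: declare a JSON-schema tool, let the model emit a tool call, run the named function locally, and feed the result back. An offline sketch of the local half is below; the tool name, schema, and repository string are made up for illustration:

```python
import json

# A hypothetical tool the model may choose to invoke.
def get_repo_stats(repo: str) -> dict:
    """Stand-in implementation; a real one would query an API."""
    return {"repo": repo, "stars": 0}

# JSON-schema declaration sent to the model in the `tools` field.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_repo_stats",
        "description": "Fetch basic statistics for a code repository.",
        "parameters": {
            "type": "object",
            "properties": {"repo": {"type": "string"}},
            "required": ["repo"],
        },
    },
}]

LOCAL_FUNCTIONS = {"get_repo_stats": get_repo_stats}

def dispatch_tool_call(tool_call: dict) -> str:
    """Run the function a model tool call names; return its JSON result."""
    fn = LOCAL_FUNCTIONS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return json.dumps(fn(**args))

# Simulated tool call, shaped like an OpenAI-style response fragment:
fake_call = {"function": {"name": "get_repo_stats",
                          "arguments": '{"repo": "QwenLM/Qwen3-Coder"}'}}
print(dispatch_tool_call(fake_call))
```

In a live loop, the JSON string returned by dispatch_tool_call would be appended to the conversation as a tool-role message before the next model turn.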
Warm early reception from AI power users
Initial responses to Qwen3-Coder-480B-A35B-Instruct have been notably positive among AI researchers, engineers, and developers who have tested the model in real-world coding workflows.
In addition to Raschka's lofty praise above, Wolfram Ravenwolf, an AI engineer and evaluator at EllamindAI, shared his experience integrating the model with Claude Code on X, stating, "This is definitely the best one currently."
After testing several integration proxies, Ravenwolf said he ultimately built his own using LiteLLM to ensure optimal performance, demonstrating the model's appeal to hands-on practitioners focused on toolchain customization.
Educator and AI tinkerer Kevin Nelson also weighed in on X after using the model for simulation tasks.
"Qwen 3 Coder is on another level," he posted, noting that the model not only executed on provided scaffolds but even embedded a message within the simulation's output — an unexpected but welcome sign of the model's awareness of task context.
Even Twitter co-founder and Square (now known as "Block") founder Jack Dorsey posted an X message in praise of the model, writing: "Goose + qwen3-coder = wow," in reference to Block's open source AI agent framework Goose, which VentureBeat covered back in January 2025.
These responses suggest Qwen3-Coder is resonating with a technically savvy user base seeking performance, adaptability, and deeper integration with existing development stacks.
Looking ahead: more sizes, more use cases
While this release focuses on the most powerful variant, Qwen3-Coder-480B-A35B-Instruct, the Qwen team indicates that additional model sizes are in development.
These will aim to offer similar capabilities at lower deployment costs, broadening accessibility.
Future work also includes exploring self-improvement, as the team investigates whether agentic models can iteratively refine their own performance through real-world use.