
DeepSeek-V3.2 scores big in maths and coding, beats GPT-5 in key tests

DeepSeek-V3.2 has shown strong gains in reasoning and coding tasks, including top scores in global competitions like AIME and Codeforces. The Speciale version even surpasses GPT-5-High in several key benchmarks. With lower compute needs, the model could attract fast adoption in India.

DeepSeek-V3.2 beats GPT-5-High in global reasoning tests and coding challenges
| Updated on: Dec 01, 2025 | 06:38 PM

New Delhi: China's DeepSeek has rolled out DeepSeek-V3.2, and the company says the model combines strong reasoning, advanced agent abilities, and much higher compute efficiency.

DeepSeek says the model delivers "high computational efficiency with superior reasoning and agent performance." The charts shared by the company show DeepSeek-V3.2 outperforming GPT-5-High and Claude-4.5-Sonnet in several reasoning tests and agent tasks.


DeepSeek-V3.2 aims to solve reasoning and tool-use together

The model is built around three big design changes. The first is what DeepSeek calls DeepSeek Sparse Attention, a system that reduces the compute load for long content while keeping accuracy high. This could help with large legal texts, long code files, or enterprise chat sessions.
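The article does not detail the mechanism, but the general idea behind sparse attention is that each query attends to a small, selected subset of keys rather than the full sequence. A minimal, hypothetical PyTorch sketch of that idea follows; the top-k selection rule is an illustrative assumption, not DeepSeek's actual method, and a real implementation would avoid building the full score matrix in the first place:

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_keep=64):
    """Toy sparse attention: each query attends only to its k_keep
    highest-scoring keys. q, k, v have shape (seq_len, d_model).
    Illustrative only; real sparse attention skips computing the
    full score matrix, which is where the compute savings come from."""
    d = q.size(-1)
    scores = q @ k.T / d ** 0.5                 # (seq, seq) scores
    k_keep = min(k_keep, scores.size(-1))
    topk = scores.topk(k_keep, dim=-1).indices  # keep top-k per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, topk, 0.0)                # 0 where kept, -inf elsewhere
    weights = F.softmax(scores + mask, dim=-1)
    return weights @ v                          # (seq, d_model)

q = k = v = torch.randn(1024, 128)
print(topk_sparse_attention(q, k, v).shape)     # torch.Size([1024, 128])
```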

Second, DeepSeek has scaled up its reinforcement learning approach, which helps the model solve maths, logic and competition-style problems better. DeepSeek said its high-compute variant Speciale "surpasses GPT-5" and performs close to Gemini-3.0-Pro in reasoning benchmarks.

The company highlighted one milestone in particular. It said the model achieved "Gold-medal performance in the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI)."

The third change is a new pipeline for training agent behaviour, which generates large volumes of synthetic data for tool-based tasks such as automated coding, browser actions, and file search.
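DeepSeek has not published the pipeline itself, so the following is a purely hypothetical Python sketch of what one synthetic tool-use training example might look like; the field names, the trajectory layout, and the read_file tool are all illustrative assumptions, not DeepSeek's schema:

```python
import json

# Hypothetical shape of one synthetic agent-training example.
# Every field name and the read_file tool are illustrative guesses,
# not DeepSeek's actual data format.
example = {
    "task": "Find the version string defined in setup.py",
    "tools": [{"name": "read_file", "args": {"path": "str"}}],
    "trajectory": [
        {"role": "assistant",
         "tool_call": {"name": "read_file", "args": {"path": "setup.py"}}},
        {"role": "tool", "content": "version='1.4.2'"},
        {"role": "assistant", "content": "The version is 1.4.2."},
    ],
}

print(json.dumps(example, indent=2))
```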

DeepSeek-V3.2 benchmarks

Charts released by the team show the Speciale version leading in multiple tests:

• 96 percent Pass@1 in AIME 2025 (see the note on Pass@1 after this list)

• 99.2 percent in HMMT 2025

• Codeforces score of 2701

• 46.4 percent on Terminal Bench 2.0

• 80.3 percent on τ²-Bench
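Pass@1 is the share of problems a model solves on its first attempt. When evaluations sample several completions per problem, they typically use the unbiased Pass@k estimator from Chen et al. (2021); here is a minimal implementation of that standard formula, our illustration rather than DeepSeek's evaluation code:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator (Chen et al., 2021).
    n = completions sampled per problem, c = how many were correct,
    k = attempts scored. Returns P(at least one of k draws is correct)."""
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 16 samples per problem, 15 correct -> Pass@1 ≈ 0.94
print(round(pass_at_k(16, 15, 1), 2))
```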

On Codeforces, it rates higher than GPT-5-High and comes very close to Gemini-3.0-Pro. In hands-on coding tasks like SWE-bench Verified, it is again in the top cluster.

Codeforces and software engineering tasks reflect real industry jobs like debugging code, shipping fixes, or verifying production systems.

Changes for developers using the model

DeepSeek has also updated how the model formats conversations. It introduced a new "thinking with tools" flow and a separate role named developer just for search-agent scenarios. The company shared Python code samples but warned that the output parser "is not suitable for production use without robust error handling."
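The article does not reproduce DeepSeek's samples. As a rough, hypothetical sketch of how the new developer role might be passed through DeepSeek's OpenAI-compatible chat API; the model name and the developer-role usage here are assumptions drawn from the description above, not confirmed sample code:

```python
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint at api.deepseek.com.
# The 'developer' role usage and model name below are assumptions
# based on the article, not official DeepSeek sample code.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-chat",  # illustrative model name
    messages=[
        # Hypothetical: the new 'developer' role for search-agent scenarios.
        {"role": "developer", "content": "You may call the search tool."},
        {"role": "user", "content": "Summarise the latest IOI results."},
    ],
)
print(response.choices[0].message.content)
```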

There is no Jinja template provided in this release.

Focus on decentralised verification and trust

The company has released the full answer submissions for IMO 2025, IOI, ICPC World Finals and CMO 2025 for the community. This allows independent checking of exactly what the model produced, and suggests DeepSeek wants credibility in reasoning rather than only leaderboard claims.

What it means for the AI race

DeepSeek-V3.2 looks designed to challenge US firms on two fronts at once: better reasoning and lower compute cost. That could appeal to India, where startups and institutions want stronger models without expensive GPUs.
