Google DeepMind announced Monday that an advanced version of its Gemini artificial intelligence model has officially achieved gold medal-level performance at the International Mathematical Olympiad, solving five of six exceptionally difficult problems and earning recognition as the first AI system to receive official gold-level grading from competition organizers.
The victory advances the field of AI reasoning and puts Google ahead in the intensifying battle between tech giants building next-generation artificial intelligence. More importantly, it demonstrates that AI can now tackle complex mathematical problems using natural language understanding rather than requiring specialized programming languages.
“Official results are in — Gemini achieved gold-medal level in the International Mathematical Olympiad!” Demis Hassabis, CEO of Google DeepMind, wrote on social media platform X Monday morning. “An advanced version was able to solve 5 out of 6 problems. Incredible progress.”
The International Mathematical Olympiad, held annually since 1959, is widely considered the world’s most prestigious mathematics competition for pre-university students. Each participating country sends six elite young mathematicians to compete in solving six exceptionally challenging problems spanning algebra, combinatorics, geometry, and number theory. Only about 8% of human participants typically earn gold medals.
How Google DeepMind’s Gemini Deep Think cracked math’s hardest problems
Google’s latest success far exceeds its 2024 performance, when the company’s combined AlphaProof and AlphaGeometry systems earned silver medal standing by solving four of six problems. That earlier system required human experts to first translate natural language problems into domain-specific programming languages and then interpret the AI’s mathematical output.
This year’s breakthrough came through Gemini Deep Think, an enhanced reasoning system that employs what researchers call “parallel thinking.” Unlike traditional AI models that follow a single chain of reasoning, Deep Think simultaneously explores multiple possible solutions before arriving at a final answer.
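DeepMind has not published Deep Think’s internals, but the “parallel thinking” idea it describes resembles best-of-N sampling: generate several independent reasoning paths, then keep the one a verifier rates highest. The sketch below is purely illustrative; `generate_candidate` and `score_candidate` are hypothetical stand-ins for a language model and a verifier, not real Gemini APIs.

```python
import random

def generate_candidate(problem: str, seed: int) -> str:
    """Stand-in for sampling one chain of reasoning from a model."""
    rng = random.Random(seed)
    steps = rng.randint(3, 8)  # pretend each path has a different depth
    return f"{steps}-step candidate proof for: {problem}"

def score_candidate(candidate: str, rng: random.Random) -> float:
    """Stand-in for a verifier that rates how promising a candidate looks."""
    return rng.random()

def parallel_think(problem: str, n: int = 4) -> str:
    """Explore n reasoning paths independently and keep the best-scoring one."""
    rng = random.Random(0)  # fixed seed so the toy example is reproducible
    candidates = [generate_candidate(problem, seed) for seed in range(n)]
    return max(candidates, key=lambda c: score_candidate(c, rng))

print(parallel_think("IMO 2025, Problem 1"))
```

In a real system the candidates would be sampled concurrently and the selection step would use a learned verifier or self-consistency check, but the structure, many paths in and one answer out, is the same.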
“Our model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions,” Hassabis explained in a follow-up post on X, emphasizing that the system completed its work within the competition’s standard 4.5-hour time limit.
The model achieved 35 out of a possible 42 points, comfortably exceeding the gold medal threshold. According to IMO President Prof. Dr. Gregor Dolinar, the solutions were “astonishing in many respects” and found to be “clear, precise and most of them easy to follow” by competition graders.
OpenAI faces backlash for bypassing official competition rules
The announcement comes amid rising tension in the AI industry over competitive practices and transparency. Google DeepMind’s measured approach to releasing its results has drawn praise from the AI community, particularly in contrast to rival OpenAI’s handling of similar achievements.
“We didn’t announce on Friday because we respected the IMO Board’s original request that all AI labs share their results only after the official results had been verified by independent experts & the students had rightly received the acclamation they deserved,” Hassabis wrote, appearing to reference OpenAI’s earlier announcement of its own olympiad performance.
Social media users were quick to note the distinction. “You see? OpenAI ignored the IMO request. Shame. No class. Straight up disrespect,” wrote one user. “Google DeepMind acted with integrity, aligned with humanity.”
The criticism stems from OpenAI’s decision to announce its own mathematical olympiad results without participating in the official IMO evaluation process. Instead, OpenAI had a panel of former IMO participants grade its AI’s performance, a methodology that some in the community view as lacking credibility.
“OpenAI is quite possibly the worst company on the planet right now,” wrote one critic, while others urged the company to “take things seriously” and “be more credible.”
Inside the training methods that powered Gemini’s mathematical mastery
Google DeepMind’s success appears to stem from novel training techniques that go beyond traditional approaches. The team used advanced reinforcement learning methods designed to leverage multi-step reasoning, problem-solving, and theorem-proving data. The model was also provided access to a curated collection of high-quality mathematical solutions and received specific guidance on approaching IMO-style problems.
The technical achievement impressed AI researchers who noted its broader implications. “Not just solving math… but understanding language-described problems and applying abstract logic to novel cases,” wrote AI observer Elyss Wren. “This isn’t rote memory — this is emergent cognition in motion.”
Ethan Mollick, a professor at the Wharton School who studies AI, emphasized the significance of using a general-purpose model rather than specialized tools. “Increasing evidence of the ability of LLMs to generalize to novel problem solving,” he wrote, highlighting how this differs from earlier approaches that required specialized mathematical software.
The model demonstrated particularly impressive reasoning on one problem where many human competitors applied graduate-level mathematical concepts. According to DeepMind researcher Junehyuk Jung, Gemini “made a brilliant observation and used only elementary number theory to create a self-contained proof,” finding a more elegant solution than many human participants.
What Google DeepMind’s victory means for the $200 billion AI race
The breakthrough comes at a critical moment in the AI industry, where companies are racing to demonstrate advanced reasoning capabilities. The success has immediate practical implications: Google plans to make a version of this Deep Think model available to mathematicians for testing before rolling it out to Google AI Ultra subscribers, who pay $250 monthly for access to the company’s most advanced AI models.
The timing also highlights the intensifying competition between leading AI laboratories. While Google celebrated its methodical, officially verified approach, the controversy surrounding OpenAI’s announcement reflects broader tensions about transparency and credibility in AI development.
This competitive dynamic extends beyond mathematical reasoning. Recent weeks have seen various AI companies announce breakthrough capabilities, though not all have been received positively. Elon Musk’s xAI recently launched Grok 4, which the company claimed was the “smartest AI in the world,” though leaderboard scores showed it trailing behind models from Google and OpenAI. Additionally, Grok has faced criticism for controversial features, including sexualized AI companions and episodes of generating antisemitic content.
The dawn of AI that thinks like humans, with real-world consequences
The mathematical olympiad victory goes beyond competitive bragging rights. Gemini’s performance demonstrates that AI systems can now match human-level reasoning on complex tasks requiring creativity, abstract thinking, and the ability to synthesize insights across multiple domains.
“This is a significant advance over last year’s breakthrough result,” the DeepMind team noted in their technical announcement. The progression from requiring specialized formal languages to working entirely in natural language suggests that AI systems are becoming more intuitive and accessible.
For businesses, this development signals that AI may soon tackle complex analytical problems across various industries without requiring specialized programming or domain expertise. The ability to reason through intricate challenges using everyday language could democratize sophisticated analytical capabilities across organizations.
However, questions persist about whether these reasoning capabilities will translate effectively to messier real-world challenges. The mathematical olympiad provides well-defined problems with clear success criteria, a far cry from the ambiguous, multifaceted decisions that define most business and scientific endeavors.
Google DeepMind plans to return to next year’s competition in pursuit of a perfect score. The company believes AI systems combining natural language fluency with rigorous reasoning “will become invaluable tools for mathematicians, scientists, engineers, and researchers, helping us advance human knowledge on the path to AGI.”
But perhaps the most telling detail emerged from the competition itself: when faced with the contest’s most difficult problem, Gemini started from an incorrect hypothesis and never recovered. Only five human students solved that problem correctly. In the end, it seems, even gold medal-winning AI still has something to learn from teenage mathematicians.