Google rolls out Gemini 2.5 Deep Think, outperforms Grok-4 and OpenAI o3 on 2 key benchmarks

August 01, 2025, 22:43 IST / 3 min read

Summary

Gemini 2.5 Deep Think, Google's advanced AI model, excels in reasoning and creativity, outperforming Grok-4 and OpenAI o3. Available to AI Ultra subscribers, it uses 'parallel thinking' for complex tasks.

Tested by mathematicians, it shows promise in math and coding. Google plans wider testing to ensure its safety and effectiveness in various use cases.

Google DeepMind has rolled out Gemini 2.5 Deep Think, which it describes as a major upgrade in advanced AI reasoning. Available to Google's AI Ultra subscribers, the feature gives the model more time to work through complex tasks and uses "parallel thinking" to explore several ideas at once, which Google says increases the chances of more creative and accurate answers.

Google says Gemini 2.5 Deep Think incorporates feedback from early trusted testers and researchers. It is a variation of the model that achieved the gold-medal standard at this year’s International Mathematical Olympiad (IMO); the released version is faster and more usable day-to-day, while still reaching Bronze-level performance on the IMO benchmark. Google says a small group of mathematicians and academics is also reviewing the Gemini 2.5 Deep Think model.

How does Deep Think work?

Google says Deep Think pushes the frontier of thinking capabilities by brainstorming with parallel thinking techniques. This lets Gemini generate many ideas at once, consider them, and even revise them before sharing the final response. The model is given extended "thinking time" to do this, similar in principle to the "Tree of Thoughts" approach proposed by researchers at Princeton University and Google DeepMind. Google says it used novel reinforcement learning techniques so that the model can solve problems more intuitively.
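To make the idea of parallel thinking concrete, here is a toy Python sketch. It is not Google's implementation, whose details have not been published; it only illustrates the general pattern of proposing several candidates at once, revising each, and keeping the best. The functions `propose`, `score`, `revise` and `parallel_think` are hypothetical names invented for this example, and the "problem" (finding an integer whose square is close to a target) is a stand-in chosen purely for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor


def propose(seed: int) -> int:
    """Hypothetical idea generator: each seed yields one candidate answer."""
    return seed * 3


def score(candidate: int, target: int) -> int:
    """Lower is better: how far the candidate's square is from the target."""
    return abs(candidate * candidate - target)


def revise(candidate: int, target: int) -> int:
    """One revision step: move to a neighbouring value if it scores better."""
    neighbours = (candidate - 1, candidate, candidate + 1)
    return min(neighbours, key=lambda c: score(c, target))


def parallel_think(target: int, seeds: list[int], revisions: int = 2) -> int:
    # Generate several candidate ideas concurrently (toy stand-in for
    # exploring many lines of reasoning at once).
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(propose, seeds))
    # Revise every candidate a fixed number of times before choosing.
    for _ in range(revisions):
        candidates = [revise(c, target) for c in candidates]
    # Share only the best-scoring candidate as the final answer.
    return min(candidates, key=lambda c: score(c, target))


if __name__ == "__main__":
    print(parallel_think(target=50, seeds=[1, 2, 3, 4]))
```

In this sketch, giving the function more seeds or more revision rounds plays the role of extra "thinking time": more candidates are explored and refined before a single answer is returned.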

How does Deep Think stack up?

Gemini 2.5 Deep Think has three core skills: creativity, strategic planning and making improvements step by step. It is suited to building complex things piece by piece, and can improve both the aesthetics and the functionality of web development tasks. Because of its reasoning capability, Deep Think could have major use cases in math and coding. When tested on challenging benchmarks covering coding, science, knowledge and reasoning, it achieved the top score on LiveCodeBench V6 and Humanity’s Last Exam, without tool use, ahead of OpenAI o3, Gemini 2.5 Pro and Grok 4.

How can you use Deep Think in the Gemini app?

If you’re a Google AI Ultra subscriber, you can use Deep Think in the Gemini app for a fixed number of prompts a day. To turn it on, select 2.5 Pro in the model drop-down and toggle “Deep Think” in the prompt bar. Deep Think automatically works with tools such as code execution and Google Search, and Google says it can produce much longer responses. Google also plans to release Deep Think, with and without tools, to a set of trusted testers via the Gemini API in the coming weeks. Google says this will help it better understand the model's usability for developer and enterprise use cases.

On safety and responsibility, Google says that in testing, Gemini 2.5 Deep Think showed improved content safety and tone-objectivity compared to Gemini 2.5 Pro, but had a higher tendency to refuse benign requests. Google says that as Gemini's problem-solving abilities advance, it will take a deeper look at the risks that come with increased complexity.
