Google’s Gemini 3 is living up to the hype and creating games in one shot


Google’s Gemini 3 is finally here, and we’re impressed with the results, especially when it comes to building simple games.

Gemini 3 Pro is an impressive model, and early benchmarks confirm it.

For example, it tops the LMArena Leaderboard with a score of 1501 Elo. It also offers PhD-level reasoning, with top scores on Humanity’s Last Exam (37.5% without using any tools) and GPQA Diamond (91.9%).

Real-world results also back these numbers.

Pietro Schirano, who created MagicPath, a vibe coding tool for designers, says we’re entering a new era with Gemini 3.

In his tests, Gemini 3 Pro successfully created a 3D LEGO editor in one shot. This suggests a single prompt can be enough to create simple games in Gemini 3, which is a big deal if you ask me.

LLMs have traditionally been bad at building games, but Gemini 3 shows some improvement in that direction.

This aligns with Google’s claims that Gemini 3 Pro redefines multimodal reasoning with 81% on MMMU-Pro and 87.6% on Video-MMMU benchmarks.

“It also scores a state-of-the-art 72.1% on SimpleQA Verified, showing great progress on factual accuracy,” Google noted in a blog post.

“This means Gemini 3 Pro is highly capable of solving complex problems across a vast array of topics like science and mathematics with a high degree of reliability.”

Gemini 3 is impressive in my early tests, but adherence remains an issue

I’ve been using Claude Code for a year now, and it’s been a great help with my Flutter/Dart projects.

Gemini 3 is a better model than Claude Sonnet 4.5, but there are some areas where Claude shines.

One of those areas is instruction adherence.

So far, no model has come close to Claude Code on adherence, and Gemini 3 is no exception.

I personally found Claude Code better at following instructions. Claude Code also offers a better CLI experience than Gemini 3 Pro, which gives it an edge over competitors.

For everything else, Gemini 3 is a better choice, especially if you’ve been using Gemini 2.5 Pro.

If you use LLMs, I’d recommend sticking to Sonnet 4.5 for regular tasks and Gemini 3 Pro for complex queries.
