In This Article
- What Google actually announced
- Gemini 3.5 Flash: the new default everywhere
- Gemini took over the Search bar
- Spark: an agent that runs while your phone is off
- Deep Think and the three-tier ladder
- Omni: the video and world model
- What builders should actually do
- What learners should actually do
- Common questions
Key Takeaways
- Gemini 3.5 Flash is the new default model in both the Gemini app and Google Search: fast (~280 tokens/sec), frontier-grade, and priced at $1.50 per million input tokens and $9.00 per million output tokens.
- Gemini Spark is a cloud agent that keeps working even when your device is off — monitoring listings, prices, and your inbox, and connecting to tools through MCP.
- Deep Think is the heavy-reasoning tier for hard math, science, and logic, reserved for Google's top-paid subscribers.
- The lesson for you is not to chase every feature. Pick one assistant, learn it deeply, and design anything you build to be model-portable.
Google held its annual I/O developer conference on May 19, 2026, and announced what it counted as roughly one hundred things. That number is itself the message: Google is no longer launching a product; it is launching a posture. The posture is that Gemini, its family of AI models, should be everywhere you already are: in Search, in the Gemini app, on your phone, and quietly working in the cloud while you sleep.
I follow these releases closely, both as someone who teaches AI and as someone who builds with it in federal work. Let me walk through what actually shipped, translate the model names into plain language, and then tell you what a learner or a small-business owner should genuinely do about it — and what you can safely ignore.
What Google actually announced
Four things matter out of the hundred. First, a new fast model, Gemini 3.5 Flash, which is now the default almost everywhere. Second, the decision to power the Google Search bar with that model. Third, a cloud-based personal agent called Spark. Fourth, a video and "world" model called Omni, alongside an upgraded Deep Think reasoning mode. Everything else at I/O was a variation, an integration, or a developer detail layered on top of these.
The throughline connecting all of them is a single idea: stop making people go to AI, and instead put AI inside the things they already use. That is a strategy aimed less at impressing engineers and more at reaching billions of ordinary people who will never read a tech headline.
Gemini 3.5 Flash: the new default everywhere
Gemini 3.5 Flash is Google's newest fast, low-cost model, and it is the engine behind most of what was announced. "Flash" models are tuned for speed and volume rather than the very hardest reasoning, which is exactly what you want for requests that arrive at the scale of Search.
The numbers are real and worth knowing. Google reports Flash running at roughly 280 tokens per second — a "token" being a chunk of a word — which is several times faster than comparable frontier models. It carries a one-million-token context window, meaning it can hold an enormous amount of text in mind at once, and it accepts text, images, video, and audio as input. Google priced it at $1.50 per million input tokens and $9.00 per million output tokens, and reported it outscoring the previous Gemini 3.1 Pro on coding and agentic benchmarks. In plain terms: faster, cheaper, and smarter than the model it replaces.
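To make those figures concrete, here is a minimal back-of-the-envelope sketch. It assumes the prices and throughput quoted above hold for your workload; the function names and the example token counts are mine, chosen only for illustration.

```python
# Figures quoted in this article; treat them as assumptions, not guarantees.
INPUT_USD_PER_MILLION = 1.50    # dollars per million input tokens
OUTPUT_USD_PER_MILLION = 9.00   # dollars per million output tokens
TOKENS_PER_SECOND = 280         # reported output speed


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request from its token counts."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate how long the reply takes to stream at the quoted speed."""
    return output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical request: a 20,000-token document summarized in 1,000 tokens.
    cost = estimate_cost_usd(input_tokens=20_000, output_tokens=1_000)
    seconds = estimate_generation_seconds(1_000)
    print(f"cost: ${cost:.4f}, generation time: {seconds:.1f}s")
```

Under these assumptions the example request costs about four cents and streams in under four seconds. Notice that output tokens cost six times as much as input tokens, so long answers drive the bill far more than long prompts do.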
It is now the default in the free Gemini app and in AI Mode in Search worldwide. So even if you never choose it, you are almost certainly already using it.
Gemini took over the Search bar
Here is the change most people will feel without ever knowing its name. Google said its Search bar is now powered by Gemini 3.5 Flash, generating custom, AI-summarized pages in response to a question rather than the familiar list of ten blue links.
Search is the front door to the internet for billions of people. When that door changes from "here are ten links, go read them" to "here is a written answer assembled for you," the habits of the entire web shift behind it. For most of us, as readers, this is convenient. For anyone who publishes — a business, a portfolio, a blog — it is the single most important trend to understand this year.
When the front door of the internet changes how it answers, every room behind that door has to rethink how it is found.
Spark: an agent that runs while your phone is off
Gemini Spark is Google's new personal agent, and it has one genuinely novel property: it runs in the cloud, so it keeps working on your behalf even when your phone or laptop is completely powered off. That is the difference between an assistant and an agent. An assistant answers when you ask. An agent keeps working when you are not there.
Concretely, Spark can monitor things continuously — an apartment listing, a product restock, a flight price — and alert you the moment something changes. Its "Daily Brief," rolling out to U.S. Google AI subscribers over 18, works overnight to read your inbox, calendar, and deadlines and hand you a short, actionable digest when you wake up. It connects to Google Workspace directly and to outside tools through the Model Context Protocol, the shared standard that lets agents plug into things they were not custom-built for.
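The monitoring idea is simpler than it sounds. The sketch below is not how Spark is implemented (Google has not published that); it is a generic change-detection loop, with hypothetical names `watch`, `fetch`, and `notify`, showing the basic pattern any such agent needs: check a value repeatedly, remember the last one, and alert only when it differs.

```python
from typing import Callable, Optional


def watch(
    fetch: Callable[[], str],
    notify: Callable[[str, str], None],
    polls: int,
    last_seen: Optional[str] = None,
) -> Optional[str]:
    """Poll `fetch` a fixed number of times and call `notify` on each change.

    A real agent would sleep between polls and run on a server, which is
    why it can keep going while your own device is off.
    """
    for _ in range(polls):
        current = fetch()
        # The first reading only sets the baseline; later ones are compared.
        if last_seen is not None and current != last_seen:
            notify(last_seen, current)
        last_seen = current
    return last_seen


if __name__ == "__main__":
    # Made-up flight prices standing in for a live source.
    prices = iter(["$420", "$420", "$389"])
    watch(
        fetch=lambda: next(prices),
        notify=lambda old, new: print(f"Price changed: {old} -> {new}"),
        polls=3,
    )
```

Passing `fetch` and `notify` in as functions keeps the loop independent of what is being watched, so the same pattern covers a listing, a restock, or a fare.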
Gemini 2026 model tiers at a glance
| Tier | Best for | Where it runs | Notes |
|---|---|---|---|
| Gemini 3.5 Flash | Everyday questions, speed | Search, free Gemini app | ~280 tok/s, 1M context, $1.50 input / $9.00 output per M tokens |
| Gemini (full) | Real work, coding, writing | Gemini app, API | The stronger workhorse model |
| Gemini 3 Deep Think | Hard math, science, logic | Google AI Ultra only | Top reasoning mode, slowest |
| Gemini Omni | Video generation & understanding | Gemini app, Flow | The new video / world model |
Deep Think and the three-tier ladder
Google's naming can feel like alphabet soup, so here is the simple map that cuts through it. There are three rough tiers, and you only need to know what each is for.
Flash is the everyday model. Fast, cheap, good enough for the vast majority of questions. It runs in Search and in the free experiences.
The full Gemini model is the workhorse. Stronger reasoning, better writing and coding, the one you reach for when a task has real weight.
Deep Think is the heavy lifter. Google's top reasoning mode, aimed at hard math, science, logic, and long multi-step problems, and reserved for its highest-paid Ultra tier. You would not use it to draft an email; you would use it to work through something genuinely difficult.
Every major AI company now offers some version of this same three-step ladder. Once you see the pattern, the model names stop being intimidating — they are just rungs.
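If you build with these models, the ladder can be written down as a lookup. This is a minimal sketch under my own assumptions: the task categories and the tier labels (`flash`, `full`, `deep_think`) are illustrative names, not real API model identifiers.

```python
# Hypothetical mapping from kind of task to rung on the ladder.
TIER_FOR_TASK = {
    "quick_question": "flash",
    "lookup": "flash",
    "writing": "full",
    "coding": "full",
    "hard_math": "deep_think",
    "multi_step_proof": "deep_think",
}


def pick_tier(task_kind: str) -> str:
    """Return the tier for a task, defaulting to the cheapest rung."""
    return TIER_FOR_TASK.get(task_kind, "flash")


if __name__ == "__main__":
    for kind in ("quick_question", "coding", "hard_math", "something_else"):
        print(kind, "->", pick_tier(kind))
```

Defaulting to the fast tier reflects the point above: most tasks do not need the heavier rungs, so you climb only when a task has real weight.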
Omni: the video and world model
Google also introduced Gemini Omni, a model for generating and understanding video, and folded it into the Gemini app and Google's creative tool, Flow. Video generation is the frontier where the most compute is being spent right now, because moving images are vastly harder to produce than text or still pictures.
For most learners this is the least urgent announcement — fun, impressive, and not yet central to daily work. But it signals where the next year of investment is heading, and it is worth filing away. The companies that master video generation will shape a large slice of how the next phase of the internet looks.
Why “everywhere” is the whole strategy
Notice that none of these announcements asks you to visit a new website or download a new app. Flash is inside Search. Spark is inside the cloud. Omni is inside Flow. Google is betting that the winner of the AI era is not whoever has the cleverest chatbot, but whoever puts capable AI inside the tools billions of people already open every day. That is a distribution strategy, and distribution usually beats raw capability.
What builders should actually do
If you build software, three practical moves follow from I/O 2026.
Learn what MCP is. Spark connecting to outside tools through the Model Context Protocol is not a footnote — it is the same standard Anthropic and others have adopted. When the giants converge on one protocol, that protocol becomes the ground you build on. You do not need deep implementation knowledge; you need enough to know how a tool exposes itself and how an agent uses it.
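Here is roughly what "a tool exposes itself and an agent uses it" looks like on the wire. This is a simplified sketch modeled on the JSON-RPC 2.0 message shapes in the public MCP specification (`tools/list` and `tools/call`). It skips the initialization handshake and the transport entirely, and the `get_price` tool and its data are hypothetical.

```python
# Simplified, in-process illustration of MCP-style JSON-RPC 2.0 messages.
# The tool "get_price" and the PRICES data are made up for this example.

TOOLS = [
    {
        "name": "get_price",
        "description": "Return the current price for a product SKU.",
        # The tool describes its expected input with a JSON Schema.
        "inputSchema": {
            "type": "object",
            "properties": {"sku": {"type": "string"}},
            "required": ["sku"],
        },
    }
]

PRICES = {"A1": "19.99"}


def _error(request_id, code: int, message: str) -> dict:
    """Build a JSON-RPC error response."""
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Answer one request the way a tool server would."""
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        # Discovery: the agent asks what tools exist.
        result = {"tools": TOOLS}
    elif method == "tools/call":
        # Invocation: the agent names a tool and passes arguments.
        params = request.get("params", {})
        if params.get("name") != "get_price":
            return _error(request_id, -32602, "Unknown tool")
        sku = params.get("arguments", {}).get("sku")
        if sku in PRICES:
            result = {"content": [{"type": "text", "text": PRICES[sku]}],
                      "isError": False}
        else:
            result = {"content": [{"type": "text", "text": f"No price for {sku}"}],
                      "isError": True}
    else:
        return _error(request_id, -32601, "Method not found")

    return {"jsonrpc": "2.0", "id": request_id, "result": result}


if __name__ == "__main__":
    print(handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}))
    print(handle({"jsonrpc": "2.0", "id": 2, "method": "tools/call",
                  "params": {"name": "get_price", "arguments": {"sku": "A1"}}}))
```

The part worth remembering is the two-step rhythm: the agent first discovers tools and their input schemas, then calls one by name. That is why an agent can use a tool it was never custom-built for.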
Design for AI-summarized Search. If you publish anything online, the shift from "rank for clicks" to "be the source an AI quotes" is real. Clear, accurate, genuinely useful writing — the kind a model can safely summarize — is rewarded more than ever. Thin, keyword-stuffed pages are rewarded less.
Stay model-portable. Flash is excellent and cheap today, but the default model changed at one conference. Build so you can swap the model underneath your product with a small change, not a rebuild. Treat any single model as a tenant, not the foundation.
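One common way to stay portable is to have your product talk to a small interface instead of a vendor SDK. The sketch below assumes a single-method interface I am calling `TextModel`; the two backends are stand-ins that need no network access. In a real project each would wrap one provider's client library.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only surface the rest of the product is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend; a real one would call a provider's API here."""

    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class UppercaseModel:
    """Second stand-in backend, to show that swapping is a one-line change."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# Hypothetical registry: the model name would normally come from config.
MODELS = {"echo": EchoModel, "upper": UppercaseModel}


def build_model(name: str) -> TextModel:
    """Construct the backend selected by name."""
    return MODELS[name]()


def summarize(model: TextModel, text: str) -> str:
    """Product logic: knows about TextModel, not about any vendor."""
    return model.complete(f"Summarize: {text}")


if __name__ == "__main__":
    for name in ("echo", "upper"):
        print(summarize(build_model(name), "hello"))
```

Swapping the model then means changing one configuration value and, at most, adding one small adapter class. The product logic in `summarize` never changes.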
What learners should actually do
I teach working professionals and students, and the most common feeling after a week like this is quiet panic: another conference, another hundred features, how could anyone keep up. So let me say the reassuring and true thing. You do not need to keep up with all of it. You need to keep up with the part that touches your work.
Pick one assistant and go deep. Whether it is Gemini, Claude, or ChatGPT matters far less than learning one of them well enough to trust it with real tasks. Depth beats breadth every time. Then add a single new habit: when you face a chore that repeats — summarizing, drafting, looking something up the same way each week — ask whether an agent like Spark could carry it. Start a short list. The tools to clear that list are arriving faster than most people realize, and the people who thrive will be the ones who already know which chores to hand over.
Sources: Google, “100 things we announced at Google I/O 2026” and “Gemini 3.5: frontier intelligence with action” (blog.google); TechCrunch, “Google updates its Gemini app to take on ChatGPT and Claude at I/O 2026” (May 19, 2026); TechRadar and Latent Space coverage of Gemini 3.5 Flash, Spark, and Omni. Figures (280 tok/s, $1.50/$9.00 pricing, 1M-token context) reflect Google's public I/O 2026 statements.
Common questions
Is Gemini 3.5 Flash free to use? Yes, in large part. It is now the default model in the free Gemini app and in AI Mode in Search, so most people will use it without paying. Deep Think is reserved for the top Ultra tier, and some Spark features require a paid Google AI plan (Plus, Pro, or Ultra).
What is the difference between Gemini, Gemini Flash, and Deep Think? Think of three rungs on a ladder. Flash is fast and cheap for everyday questions. The full Gemini model is the stronger workhorse for real work. Deep Think is the slow, powerful mode for genuinely hard problems. You match the rung to the task — most of the time, Flash is plenty.
Is Gemini Spark safe to give access to my email and calendar? Spark connects to Google Workspace and to outside tools through the Model Context Protocol. Convenience and access always trade off against privacy. Start by granting it narrow, low-stakes permissions — watching a price, drafting a summary — before you let any agent take consequential actions on your behalf.
Will AI-powered Search hurt websites? It changes the game. When Search answers in a written summary instead of a list of links, fewer people click through. The durable response for anyone who publishes is to be the clear, accurate, trustworthy source an AI wants to quote — and to build an audience that comes to you directly, not only through Search.