8 min read
Google Just Put 200+ Models, Including Claude, Inside One Agent Platform. Single-Model Lock-In Is Over.

Category
The AI World
Share the article
Two years ago, buying an "enterprise AI platform" meant picking one vendor's model and building behind its API. The bet you made on day one was the bet you were stuck with. That definition just collapsed. Google, the world's third-largest cloud provider, now ships an agent platform with more than 200 models available out of the box, and a competitor's frontier model sits right there in the catalog next to its own. Model-agnostic is no longer the contrarian position. It's the product.
The shift showed up at Google Cloud Next 2026, held April 22-24 at Mandalay Bay in Las Vegas, where Google made 260 product announcements across three keynotes to roughly 32,000 attendees. The headline for enterprise buyers wasn't a bigger model. It was a bigger argument about architecture.
What Google actually consolidated
Google folded Vertex AI into a unified offering called the Gemini Enterprise Agent Platform. Per Google's own announcement, "all Vertex AI services and roadmap evolutions will be delivered exclusively through the Agent Platform, rather than as a standalone service." The standalone model API is gone. Everything routes through an agent layer now.
The platform gives developers first-class access to more than 200 models through Model Garden. That list spans Google's own Gemini and Gemma families, open models like Llama, and third-party frontier models. Google states plainly that customers have "support for third-party models like Anthropic's Claude Opus, Sonnet and Haiku." Anthropic's Claude family runs natively inside the same catalog as Gemini, billed and managed as a serverless API with no separate infrastructure to provision.
Here's the before-and-after, in plain terms.
Enterprise AI platform, ~2024 | Gemini Enterprise Agent Platform, 2026 | |
|---|---|---|
Core unit | A model behind an API | An agent layer that routes across models |
Model choice | One vendor, one family | 200+ models, Google + Llama + Anthropic's Claude |
Build surface | Code against one API | Agent Studio (low-code visual builder) |
Cross-tool reach | Custom integrations | Managed remote MCP servers (now GA) |
Cross-agent comms | Mostly proprietary | Agent2Agent (A2A) protocol, production-grade |
Lock-in posture | Single-vendor by default | Multi-model by default |
The pieces around the models matter as much as the count. There's a managed remote MCP server, now generally available, so agents can reach external tools and data without bespoke plumbing. There's a web-browsing agent, Project Mariner, that operates in cloud sandboxes. And there's the Agent2Agent protocol, which Google and partners describe as running in production rather than pilot, with 150 organizations routing real tasks between agents built on different platforms. (We've covered A2A and MCP in depth separately; the point here is narrower.)
The point here is what the catalog says about strategy.
Why putting Claude in Google's catalog is the real signal
A cloud vendor shipping a third-party frontier model inside its flagship agent platform is not a footnote. It's a concession about how enterprises actually buy.
Google would prefer you run Gemini end to end. Vertical integration is the whole pitch, from the eighth-generation TPUs up through the agent runtime. Yet the same launch makes Claude a first-class citizen, because the buyers Google is chasing have already decided they want options at the model layer. When the company with the strongest incentive to push one model instead ships 200, the single-model thesis is finished as a default.
This tracks what enterprises were already doing. As we argued in why one model can't fit every business, the honest answer to "which model do you run?" inside most large organizations is "all of them." Different tasks reward different models. A reasoning-heavy underwriting step and a high-volume classification step have no business sharing one engine, and the cost and latency math rarely lines up either.
It's worth being precise about why this is a concession and not just generosity. Google charges for Claude usage the same way it charges for Gemini, so there's a revenue logic to hosting a rival. But the deeper reason is retention. An enterprise that has to leave Google Cloud to run Claude is an enterprise that might keep going and leave for everything else too. Keeping the rival model inside the platform keeps the customer inside the platform. That only makes sense if the customer was going to demand the rival model anyway, which is the whole point.
What this means if you're already on Vertex AI
If your team built against Vertex AI, the first thing to know is that it isn't being shut off, it's being absorbed. Google's language is that Vertex services continue, delivered through the Agent Platform rather than as a separate product. Your existing model calls don't stop working overnight.
What changes is the unit you provision and reason about. The thing you deploy is now an agent with a routing policy, not a bare model endpoint. For most teams that's a smaller code change than it sounds and a bigger mental shift than it sounds. The inference call looks familiar. The surrounding assumptions, that "the model" is the project, that you pick it once, that the deployment is a model behind an endpoint, are the parts that age out.
The practical move is to treat the consolidation as a prompt to separate two decisions you may have been making as one: which model runs a given step, and what governs the agent that calls it. Google just made the first decision cheap and reversible. The second decision, governance and orchestration, is where the work moves.
The numbers behind the move
The market context explains why Google reorganized its entire AI stack around agents rather than chat. Gartner predicts that 40% of enterprise applications will feature task-specific AI agents by the end of 2026, up from less than 5% in 2025. That's an eightfold jump in roughly a year, and platforms get built around where demand is heading, not where it sits.
The value at stake is the other half of the story. McKinsey estimates generative AI could deliver $2.6 trillion to $4.4 trillion in economic benefits annually across 63 use cases. Capturing a slice of that doesn't come from one model getting marginally better. It comes from getting many models to do useful work together, reliably, inside real business processes. A platform reorganized around agents rather than a single model is a bet on exactly that, and it's a bet the rest of the market is making at the same time.
(If you'd rather get this kind of breakdown in your inbox than dig through keynote recaps, we send a weekly write-up on where enterprise AI is actually heading.)
What this changes for how you buy
The lock-in calculus has flipped. A year ago, standardizing on one model looked like discipline. Now it looks like risk, because the frontier moves every few months and the vendor you standardized on may not lead the next benchmark. Anthropic, OpenAI, and Google trade the top spot often enough that betting the whole stack on one of them is a bet against your own optionality.
The unit of architecture has changed too. The thing you design around is no longer a model, it's an orchestration layer that decides which model handles which step. Google's own restructuring says this out loud: the model API became a sub-component of an agent platform. As we put it in the case for multi-model orchestration, the model is now a routing decision, not a foundational commitment.
A routing layer, in one concrete example
Take an insurance claims workflow. The step that reads a messy adjuster note and decides whether it signals fraud is reasoning-heavy and low-volume; it rewards a frontier reasoning model, and paying premium rates for it is worth it. The step that classifies fifty thousand inbound documents a day by type is high-volume and latency-sensitive; running it on a frontier model is wasted money and wasted seconds. One workflow, two steps, two different models. The only way that works is if something above both models routes each step to the right one. That something is the orchestration layer, and it's the reason the model becomes a routing decision rather than a platform commitment.
And interoperability is becoming table stakes. A2A in production and managed MCP servers mean agents are expected to talk across platforms and vendors. An agent that can only talk to other agents from the same vendor is going to feel as dated as software that couldn't read a competitor's file format.
Three takeaways for the people signing the contracts:
Don't standardize on a model. Standardize on a routing layer. The model you pick today is unlikely to be the one you want in twelve months. Architecture that assumes swaps survives. Architecture that hard-codes one model doesn't.
Treat third-party model support as a baseline requirement. When Google ships Claude inside its own platform, "we support multiple providers" stops being a differentiator and starts being the floor.
Weight interoperability in the evaluation. Production-grade cross-agent protocols are now shipping from the biggest vendors. Buy for a world where your agents talk to systems you don't control.
This is the consensus now, not a contrarian take
For a while, model-agnostic, orchestration-first architecture was a position you had to defend. It's the one we built Beam's ModelMesh around: route each task to the model that handles it best, keep the orchestration layer above any single provider, and treat models as interchangeable parts rather than the foundation. The platform itself is model-agnostic on purpose, with Claude, Gemini, and others available behind one routing layer.
When we started making that argument, it read as a hedge against vendor risk. After Google Cloud Next 2026, it reads as the direction the entire market is moving. One of the biggest cloud vendors just made multi-model and cross-agent interoperability the default posture of its flagship agent platform. The debate about whether enterprises should lock into one model is largely settled. The interesting questions now are about how you route, govern, and orchestrate across many.
That's a better problem to have. It means the leverage moved from picking the right model to building the right system around all of them.





