Gemini API Now Has Spend Caps: Per-Project Budget Controls Land in AI Studio

Google shipped spend caps for the Gemini API today. Logan Kilpatrick, Google’s Lead Product Manager for AI Studio and the Gemini API, announced it on X:

Spend caps in the Gemini API, available starting today!!

This is another step forward of many to give developers more control and peace of mind when building with Gemini. Please go set a cap and send us any feedback as you use them!
— Logan Kilpatrick (@OfficialLoganK) March 12, 2026

Developers have been asking for this for a long time. The Google AI Developers Forum has threads going back months with developers asking for a way to cap their monthly bills. Until today, the only options were Cloud Console budget alerts (which notify but don’t stop spending) or manually monitoring usage. Now there’s a cap that actually blocks requests.

How It Works

Spend caps are set per project in Google AI Studio. Here’s the setup:

1. Go to the Spend page in AI Studio
2. Find Monthly spend cap
3. Click Edit spend cap
4. Set your dollar amount

Once your project hits the cap, API requests stop going through until the next billing cycle. You need project editor, owner, or admin access to set or change the cap.

This is useful if you’re running multiple projects under one billing account and want to make sure a runaway experiment on one project doesn’t eat the budget for everything else.

What Gets Counted

The Gemini API bills based on four things:

Metric	Description
Input tokens	Tokens sent in your prompts
Output tokens	Tokens generated in responses
Cached tokens	Tokens read from context caching
Cache storage	Duration cached content is stored

All of these count toward your spend cap. The pricing page has the per-model rates. Free tier usage doesn’t count since it’s not billed.
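
To get a feel for how the four metrics add up against a cap, here is a minimal sketch of a cost estimator. Everything in it is an assumption for illustration: the `Rates` and `Usage` names are made up, the prices are placeholders rather than real Gemini rates, the four counts are treated as separate buckets, and token-hours is just one plausible unit for cache storage. Check the pricing page for how your model is actually billed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices; replace with values from the pricing page."""
    input_per_million: float
    output_per_million: float
    cached_per_million: float
    storage_per_million_token_hours: float


@dataclass(frozen=True)
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cached_tokens: int = 0
    # Cache storage is billed by duration; token-hours is an assumed unit.
    cache_token_hours: float = 0.0


def estimate_cost(usage: Usage, rates: Rates) -> float:
    """Sum the four billed components into one USD figure."""
    million = 1_000_000
    return (
        usage.input_tokens / million * rates.input_per_million
        + usage.output_tokens / million * rates.output_per_million
        + usage.cached_tokens / million * rates.cached_per_million
        + usage.cache_token_hours / million * rates.storage_per_million_token_hours
    )


# Placeholder numbers, not real Gemini prices.
example_rates = Rates(1.00, 4.00, 0.25, 1.00)
example_usage = Usage(
    input_tokens=2_000_000,
    output_tokens=500_000,
    cached_tokens=4_000_000,
    cache_token_hours=1_000_000,
)
print(f"${estimate_cost(example_usage, example_rates):.2f}")  # $6.00
```

Comparing a figure like this with your cap gives a rough idea of how many days of traffic the cap covers.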

The Catch: Billing Delay

There’s one important caveat from the docs: billing data can be delayed by up to 10 minutes in AI Studio. If your project is near its cap and you send a burst of requests, some of those requests may process before the system realizes you’ve crossed the limit.

The docs also flag Batch API completions as a potential source of overages, since large batch jobs can accumulate costs faster than the billing system tracks them.

In practice, this means your actual spend might exceed your cap by a small amount. It’s not a zero-tolerance hard stop. Think of it as a cap with a ~10 minute reaction time. For most projects that’s fine, but if you’re operating close to the line on a high-volume endpoint, budget some headroom.
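
How much headroom is enough depends on how fast you spend. A rough way to size it, assuming the worst case is your peak spend rate sustained for the whole delay window, is sketched below. The function name and the idea of a known peak rate in dollars per minute are assumptions for illustration, not something from the docs.

```python
def cap_with_headroom(
    budget_usd: float,
    peak_usd_per_minute: float,
    delay_minutes: float = 10.0,
) -> float:
    """Return a cap that leaves room for spend during the billing delay."""
    if budget_usd < 0 or peak_usd_per_minute < 0 or delay_minutes < 0:
        raise ValueError("inputs must be non-negative")
    # Worst case assumed here: peak spend continues for the full delay.
    worst_case_overage = peak_usd_per_minute * delay_minutes
    return max(0.0, budget_usd - worst_case_overage)


print(cap_with_headroom(100.0, 1.0))   # 90.0
print(cap_with_headroom(100.0, 0.25))  # 97.5
```

A project that can burn $1 per minute at peak needs about $10 of slack on a $100 budget, while a quieter one needs far less. Batch jobs can blow past this estimate, so treat it as a floor rather than a guarantee.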

Context: The Tier System

For reference, the Gemini API has four access tiers that determine your rate limits:

Tier	Requirement	Key Benefit
Free	Eligible country	Limited models, free tokens, lower rate limits
Tier 1	Active billing account	Production rate limits, all models
Tier 2	$250+ cumulative spend, 30+ days	Higher RPM/TPM limits
Tier 3	$1,000+ cumulative spend, 30+ days	Highest rate limits

Spend caps are orthogonal to this. Your tier determines how fast you can call the API; your spend cap determines how much you’re willing to pay per month. You can be on Tier 3 with generous rate limits and still set a $50/month cap if that’s what your project needs.
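
To make the separation concrete, here is a small sketch that maps the thresholds from the table to a tier and keeps the cap as an unrelated setting. The function and parameter names are hypothetical, the table does not say what the 30 days are counted from (so `days` is whatever the rate limits docs define), and the sketch ignores the eligible-country requirement and however upgrades are actually granted.

```python
def highest_tier(billing_active: bool, cumulative_spend_usd: float, days: int) -> str:
    """Return the highest tier whose thresholds from the table are met."""
    if not billing_active:
        return "Free"
    if cumulative_spend_usd >= 1000 and days >= 30:
        return "Tier 3"
    if cumulative_spend_usd >= 250 and days >= 30:
        return "Tier 2"
    return "Tier 1"


# The cap is a separate knob: a Tier 3 project can still run on a small cap.
project = {
    "tier": highest_tier(True, cumulative_spend_usd=1500.0, days=45),
    "monthly_cap_usd": 50.0,
}
print(project)  # {'tier': 'Tier 3', 'monthly_cap_usd': 50.0}
```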

Why This Matters

The Gemini API’s free tier is generous enough that many developers start building without thinking much about billing. Then they flip to the paid tier for production, and suddenly costs are unpredictable. This has been especially fraught recently: Google tightened free tier access and adjusted rate limits across tiers over the past few months, leaving some developers unclear on what they’d actually pay.

Spend caps give a simple answer: set a number, and Google stops serving paid requests once the project reaches it, give or take the billing delay described above.

For comparison, OpenAI has had usage limits (monthly caps on API spending) for a while. Anthropic offers monthly spend limits configurable through the Console. Google was one of the last major API providers to offer this kind of control natively. It’s a table-stakes feature, and it’s good to see it land.

What to Do

If you’re using the Gemini API on a paid tier:

- Set a cap now. Even if it’s generous, having a cap prevents accidental runaway costs from bugs, retry loops, or testing scripts left running.
- Leave headroom for the billing delay. If your real budget is $100/month, set the cap somewhat lower, for example $90. The right margin is roughly what you could spend at peak traffic during the ~10 minute processing lag.
- Check per-project if you have multiple. Caps are per project, not per billing account. If you have three projects, you need three caps.
- Monitor batch jobs separately. Batch API usage can exceed caps before they’re enforced, so keep an eye on large batch submissions.
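
Because the server-side cap reacts with a delay, a local guard in your own scripts can complement it. The sketch below is one possible shape, not part of any Gemini SDK: `BudgetGuard`, `LocalBudgetExceeded`, and `run_jobs` are made-up names, the per-job costs are your own estimates, and the loop stands in for wherever your script would call the API. It is not thread-safe as written.

```python
class LocalBudgetExceeded(RuntimeError):
    """Raised by the local guard; not an error the Gemini API itself returns."""


class BudgetGuard:
    """Tracks estimated spend in-process and refuses work past a limit."""

    def __init__(self, limit_usd: float) -> None:
        if limit_usd < 0:
            raise ValueError("limit_usd must be non-negative")
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def reserve(self, estimated_cost_usd: float) -> None:
        """Record an estimated cost before sending, or refuse the request."""
        if estimated_cost_usd < 0:
            raise ValueError("estimated_cost_usd must be non-negative")
        if self.spent_usd + estimated_cost_usd > self.limit_usd:
            raise LocalBudgetExceeded(
                f"would exceed local limit of ${self.limit_usd:.2f}"
            )
        self.spent_usd += estimated_cost_usd


def run_jobs(guard: BudgetGuard, job_costs_usd: list[float]) -> int:
    """Stand-in for a script loop; returns how many jobs were allowed."""
    sent = 0
    for cost in job_costs_usd:
        try:
            guard.reserve(cost)
        except LocalBudgetExceeded:
            break  # Stop instead of retrying, so a bug cannot loop forever.
        # A real script would make its API call here.
        sent += 1
    return sent


guard = BudgetGuard(limit_usd=1.00)
print(run_jobs(guard, [0.25] * 10))  # 4
```

Reserving before sending means the guard errs on the side of stopping early, which is the behavior you want from a safety net for test scripts and large batch submissions.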

Sources: Logan Kilpatrick on X (March 12, 2026), Gemini API Billing documentation, Gemini API Pricing, Gemini API Rate Limits, Google AI Developers Forum: Budget control requests