What Anthropic gating Opus behind "extra usage" actually means for your Claude Code spend

Anthropic quietly flipped a switch on Claude Pro: Opus is no longer the default model in Claude Code. If you're running Claude Code at volume for client work, that change has real billing implications you need to understand before your next sprint starts.

What changed — and what the docs actually say

Previously, Claude Code on a Pro subscription routed to the best available model. In practice that meant Opus when it was available. The new behavior: Sonnet is the default. Opus only activates when you manually enable "extra usage" in your Claude Pro settings.

Anthropic's model-configuration documentation is matter-of-fact about this. Extra usage is an explicit toggle, not something that carries over from prior sessions. If your developers haven't touched that setting, they've been running on Sonnet for their last several sessions — without a notification, without a warning, without a visible indicator in the Claude Code interface.

That's not catastrophic. Sonnet 4.6 handles the majority of agency development work without issue: UI components, bug fixes, migrations, code review, documentation rewrites. But for workflows that were built on the assumption of Opus-level reasoning — complex codebase refactors, multi-step architecture analysis, anything where the model needs to hold significant context and reason carefully across a long task — the silent downgrade matters. The output quality difference between Sonnet and Opus on hard problems is real, and it shows up in the places you least want it to: the reasoning your developer trusted in an architecture decision, the refactor that looked clean in a 20-file diff.

Why this hits agency teams harder than solo developers

A solo developer using Claude Code for personal projects has one billing dimension: their own Pro subscription. An agency team running Claude Code across multiple client repos has three separate concerns that compound:

Seat cost — which developers are on Pro, which are on Free, and whether that maps to the work they're doing
Model cost — Sonnet tokens versus Opus tokens in API pricing, which diverges significantly at volume
Extra-usage add-on — Anthropic's new gate for Opus access, which is per-account, not per-project

The billing math changes when you're quoting client work. If you've been pricing Claude Code hours at a rate that assumed Sonnet-level API cost, but your developers have extra-usage enabled and are hitting Opus by default, your margin assumptions are wrong. If you've been selling Opus-quality output as a differentiator and your developers are now on Sonnet-by-default without realizing it, your quality promises may not be landing.

Both are problems. Neither shows up obviously in invoices or sprint retrospectives without explicit tracking. The fix is to make the model explicit at the repo level, not assumed from account settings.
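To make the margin point concrete, here is a minimal arithmetic sketch. The per-token rates, token counts, and billed amount are placeholders chosen for illustration, not Anthropic's published prices; substitute the current figures from the pricing page before relying on the output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """Per-model token pricing in USD per million tokens."""
    input_per_mtok: float
    output_per_mtok: float


def session_cost(rate: ModelRate, input_tokens: int, output_tokens: int) -> float:
    """Token cost of one session at the given rate."""
    return (input_tokens / 1_000_000 * rate.input_per_mtok
            + output_tokens / 1_000_000 * rate.output_per_mtok)


def margin(billed: float, cost: float) -> float:
    """Gross margin as a fraction of the billed amount."""
    if billed <= 0:
        raise ValueError("billed must be positive")
    return (billed - cost) / billed


# Placeholder rates for illustration only, NOT Anthropic's published prices.
CHEAPER_MODEL = ModelRate(input_per_mtok=2.0, output_per_mtok=10.0)
PRICIER_MODEL = ModelRate(input_per_mtok=10.0, output_per_mtok=50.0)

if __name__ == "__main__":
    billed = 150.0  # hypothetical amount billed to the client for the session
    for name, rate in [("cheaper", CHEAPER_MODEL), ("pricier", PRICIER_MODEL)]:
        cost = session_cost(rate, input_tokens=3_000_000, output_tokens=400_000)
        print(f"{name}: cost ${cost:.2f}, margin {margin(billed, cost):.1%}")
```

Running the same session through both rates shows how far a quote built on one model's cost drifts when the work actually runs on the other.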

The model-config pin you should add today

Claude Code reads CLAUDE.md at the project root as instructions for the model, not as configuration, so a model line there will not pin anything. The pin belongs in the project settings file, .claude/settings.json, which supports a model key that sets the default for every session in that repo. This is the right place to lock your assumption:

{ "model": "claude-sonnet-4-6" }

Commit that file to each client repo. Project settings take precedence over a developer's user-level settings, so every session in the repo defaults to Sonnet whether or not extra usage is enabled on the account. Opus becomes a deliberate CLI override (claude --model claude-opus-4-5), not an ambient default that costs you without a decision being made.
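Rolling the pin out across many client repos by hand is error-prone, so here is a minimal sketch of a helper. It assumes the pin is stored as the model key in each repo's .claude/settings.json (Claude Code's shared project settings file); the function names and repo paths are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterable, List, Optional

# Assumed location of the shared project settings file inside a repo.
SETTINGS_RELATIVE = Path(".claude") / "settings.json"


def read_pin(repo: Path) -> Optional[str]:
    """Return the pinned model for a repo, or None if no pin is set."""
    settings_file = repo / SETTINGS_RELATIVE
    if not settings_file.is_file():
        return None
    settings = json.loads(settings_file.read_text(encoding="utf-8"))
    return settings.get("model")


def pin_model(repo: Path, model: str) -> None:
    """Set the "model" key, preserving any other settings already present."""
    settings_file = repo / SETTINGS_RELATIVE
    settings = {}
    if settings_file.is_file():
        settings = json.loads(settings_file.read_text(encoding="utf-8"))
    settings["model"] = model
    settings_file.parent.mkdir(parents=True, exist_ok=True)
    settings_file.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")


def unpinned(repos: Iterable[Path]) -> List[Path]:
    """List the repos that have no model pin yet."""
    return [Path(r) for r in repos if read_pin(Path(r)) is None]
```

A typical use would be to call unpinned() over your checked-out client repos, review the list, then call pin_model() on each and commit the result.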

This is the same principle behind Version Sentinel's approach to guardrails inside Claude Code: lock assumptions at the toolchain level so they can't drift silently in production. A model pin is a one-line version of that idea, applied to spend rather than dependencies.

Most client work doesn't need Opus pinned. The work that does — architectural decisions, large-scale refactors, reasoning-heavy analysis — should be surfaced explicitly anyway, which is another reason the deliberate override pattern is better than an ambient default.

A billing framework for Claude Code client work

Here's what works for structuring this cleanly across client engagements:

Default to Sonnet across all sprint work. Pin it at the repo level so there's no room for account-level drift. Sonnet covers the bulk of billable development work without issue.
Reserve Opus for architecture-tier sessions. When a client is paying for a technical strategy engagement, a major refactor, or a greenfield architecture, Opus cost is built into that engagement tier. For a 2-hour bug fix, it isn't.
Log Opus overrides separately. If a developer invokes Opus explicitly during a sprint, that session gets logged against the client project at the higher model rate. This is a practice habit, not a system — but it prevents margin surprises at month-end.
Audit extra-usage settings across your team. Because extra-usage is per account, not per project, one developer with it enabled can hit Opus in any project that doesn't have a model pin. The pin is your fence.
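The override log can be as light as a shared CSV file. The sketch below is one possible shape, not a description of any existing tooling: the column names, file name, and helper functions are all hypothetical, and the developer records each Opus session by hand.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Dict, Optional

# Hypothetical column layout for the override log.
FIELDS = ["date", "client", "developer", "model", "minutes"]


def log_override(log_file, client: str, developer: str, model: str,
                 minutes: int, day: Optional[date] = None) -> None:
    """Append one explicit model override to the CSV log."""
    log_file = Path(log_file)
    is_new = not log_file.exists()
    with log_file.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # write the header only once
        writer.writerow({
            "date": (day or date.today()).isoformat(),
            "client": client,
            "developer": developer,
            "model": model,
            "minutes": minutes,
        })


def minutes_by_client(log_file) -> Dict[str, int]:
    """Total logged override minutes per client, for month-end review."""
    totals: Dict[str, int] = {}
    with Path(log_file).open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            totals[row["client"]] = totals.get(row["client"], 0) + int(row["minutes"])
    return totals
```

At month-end, minutes_by_client() gives a per-client total to reconcile against the higher model rate.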

We run this pattern across Get Indiana, Roots Realty, and ISFE work. Sonnet handles routine production code without issue. Opus gets explicitly invoked for architectural decisions — and that's treated as a cost decision, not a convenience.

If you're still calibrating what Opus actually delivers on real tasks versus what Sonnet can handle, the earlier piece on what Claude Opus 4.6 means for your business breaks down the capability gap in concrete terms. The short version: Opus earns its cost on reasoning-heavy tasks, not routine code generation.

The takeaway

Anthropic's change isn't an emergency: Sonnet is a capable model and most Claude Code work doesn't require Opus. But the shift from "best available by default" to "Sonnet by default, Opus on explicit request" means you need a model policy for client work. Not a mental note. A line of config committed to each repo.

Two minutes of configuration now is cheaper than a billing audit at end of quarter. Add the pin, audit your team's extra-usage settings, and treat Opus as a deliberate cost decision rather than a background assumption. That's the whole change.

