What you’re describing sounds like a pattern a lot of users have reported recently with Anthropic’s Claude Code, and it usually falls into a few technical explanations rather than a single “bug,” though the experience can definitely feel like something is broken.
What users are noticing
Across GitHub and Reddit discussions, the common complaints are:
- Usage limits depleting much faster than expected
- "Long session" workflows ending within minutes
- High token consumption even for seemingly small tasks
- Rapid exhaustion during iterative coding or debugging
Why this can happen (most likely causes)
1. Token-heavy context buildup
Claude Code works by continuously feeding:
- your entire conversation history
- file contents
- tool outputs
- intermediate reasoning context
If the session grows large, every new request becomes significantly more expensive in tokens—even if the user isn’t aware of it.
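As a rough illustration (made-up numbers, not Anthropic's actual accounting): if every turn re-sends the full history so far, cumulative input-token cost grows quadratically with the number of turns, not linearly.

```python
# Rough illustration: cumulative input tokens when each new request
# re-sends the full conversation history. Sizes are invented, not
# real Claude Code pricing or metering.

def cumulative_input_tokens(turns: int, tokens_per_turn: int = 2_000) -> int:
    """Total input tokens across a session if turn i re-sends turns 1..i."""
    return sum(i * tokens_per_turn for i in range(1, turns + 1))

print(cumulative_input_tokens(10))  # -> 110000
print(cumulative_input_tokens(40))  # -> 1640000
```

Four times as many turns costs roughly sixteen times the input tokens, which is why a long session can burn through a quota far faster than a user's per-message intuition suggests.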
2. Hidden context repetition in coding workflows
Coding assistants often:
- re-send large files repeatedly
- re-analyze unchanged code
- regenerate similar outputs across steps
That can multiply usage faster than expected.
3. Tool-driven loops
If Claude Code is:

- running tests
- fixing errors
- re-reading logs
- iterating automatically

then each cycle may count as a full high-cost request, even if it feels like "one task."
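In agent terms, one "task" can be several billed requests. A toy model (the sizes and cycle count are assumptions, purely for illustration):

```python
# Toy model: one debugging "task" that internally runs N agent cycles,
# each a full request that re-sends context plus the latest tool output.
# All token sizes are made up for illustration.

CONTEXT_TOKENS = 30_000      # codebase + history re-sent each cycle
TOOL_OUTPUT_TOKENS = 5_000   # test results / logs fed back in

def task_cost(cycles: int) -> int:
    """Total input tokens for a task that iterates `cycles` times."""
    return cycles * (CONTEXT_TOKENS + TOOL_OUTPUT_TOKENS)

print(task_cost(1))  # -> 35000   what the task "feels" like
print(task_cost(6))  # -> 210000  what six automatic retries actually cost
```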
4. More aggressive safety/quality routing
Some users suspect (and this is not confirmed publicly) that:
- more requests are being routed through higher-reasoning models
- or expanded context windows are being used more frequently
That would also increase usage burn rate.
5. Plan limit visibility mismatch
Some users confuse rate limits (requests per time window) with token-based usage caps.
So a “minutes instead of hours” experience can happen if a few large requests spike token usage early.
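A hypothetical sketch of the difference (the numbers are invented, not Anthropic's real limits): a session can stay far under its request-rate limit yet still exhaust a token budget after just a few large agentic requests.

```python
# Hypothetical sketch: request-rate limit vs token-based usage cap.
# All values are illustrative assumptions, not real Claude Code limits.

REQUESTS_PER_HOUR = 60   # rate limit: how often you may ask
TOKEN_BUDGET = 500_000   # usage cap: how much you may consume

def first_exhausting_request(request_sizes, budget):
    """Return (request index, running total) when the budget is exceeded,
    or (None, total) if it never is."""
    used = 0
    for i, tokens in enumerate(request_sizes, start=1):
        used += tokens
        if used > budget:
            return i, used
    return None, used

# Three large agentic requests -- only 3 of 60 allowed per hour...
idx, used = first_exhausting_request([180_000, 200_000, 150_000], TOKEN_BUDGET)
print(idx, used)  # -> 3 530000  ...yet the token budget is gone.
```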
Important reality check
There’s no confirmed public statement indicating a systemic “bug” causing incorrect metering, but the experience spike is very real and usually explained by:
modern coding agents consuming far more hidden context than users expect
Bottom line
With tools like Claude Code from Anthropic, the main issue isn’t necessarily stricter limits—it’s that agent-style workflows are extremely token-expensive under the hood, especially when they repeatedly process large codebases or debug loops.
If you want, I can break down how to structure prompts or workflows to dramatically reduce token burn (there are a few simple habits that make a big difference).
This update actually clarifies what’s going on—and it’s less about random metering errors and more about a deliberate change in how usage is distributed.
For Anthropic’s Claude Code, the key detail is in the official statement you quoted: they are changing session-based rate limits during peak hours, not necessarily reducing total weekly usage.
What’s actually changing
Anthropic says:
- Total weekly limits stay the same
- 5-hour session limits now drain faster during peak hours
- Peak window: 5am–11am PT / 1pm–7pm GMT
- Roughly 7% of users are affected more noticeably
So what users perceive as “sudden exhaustion” is often:
the same usage quota being consumed more aggressively per session block during high-demand periods
Why users feel like something broke
Even if totals didn’t change, this feels like a downgrade because:
1. Session-based budgeting is very sensitive
If you rely on long continuous coding sessions:
- the system now penalizes heavy bursts more quickly
- you hit caps mid-workflow instead of gradually
2. Agentic workflows amplify consumption
Claude Code tasks like:
- multi-file edits
- iterative debugging loops
- tool-based execution

can spike token usage unpredictably.
So even “normal behavior” can suddenly cross new thresholds.
The confusing part: perception vs actual limit
What users are reporting (like “21% → 100% in one prompt”) can happen when:
- a single request triggers large context expansion
- multiple tool calls are bundled internally
- caching behavior changes between sessions
- peak-hour throttling increases per-request cost weighting
That creates the impression of sudden usage spikes, even if billing logic is consistent.
Why Anthropic changed it
Based on their statement, the goal is:
- handle growing demand
- prevent overload during peak hours
- shift heavy workloads to off-peak times
- maintain infrastructure stability without lowering total weekly caps
So it’s essentially load shaping, not a pure usage cut.
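One way to picture load shaping (purely a guess at the mechanism; Anthropic has not published a formula, and the 1.5x multiplier here is an assumption) is a session budget that drains faster when consumption happens inside the peak window:

```python
from datetime import time

# Purely illustrative: a session budget whose drain rate is weighted
# during a peak window. The multiplier and budget are assumptions,
# not published Anthropic parameters; only the window matches the
# stated 5am-11am PT peak period.

PEAK_START, PEAK_END = time(5, 0), time(11, 0)
PEAK_MULTIPLIER = 1.5  # hypothetical peak-hour weighting

def drain(budget: float, tokens_used: float, at: time) -> float:
    """Deduct usage from a session budget, weighted during peak hours."""
    weight = PEAK_MULTIPLIER if PEAK_START <= at < PEAK_END else 1.0
    return budget - tokens_used * weight

budget = 100_000.0
budget = drain(budget, 20_000, time(7, 30))   # peak: costs 30,000
budget = drain(budget, 20_000, time(14, 0))   # off-peak: costs 20,000
print(budget)  # -> 50000.0
```

Under a scheme like this, identical work drains the session budget 50% faster during the peak window, which matches the reported "same weekly total, faster session exhaustion" pattern.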
The real user impact
Even though weekly limits remain unchanged:
- long uninterrupted coding sessions are harder
- "set and forget" agent workflows get disrupted
- perceived value of Max tiers ($100/$200) feels worse during peak windows
That’s why the backlash is so strong in communities like r/ClaudeCode and GitHub issues.
Bottom line
What’s happening with Claude Code from Anthropic is not a silent reduction in overall usage—but a restructuring of how usage is consumed across time, which heavily impacts users doing long, continuous, agent-style coding sessions.
If you want, I can explain how to structure Claude Code workflows so they avoid hitting these session cliffs (there are a few effective patterns people are using now).
