GitHub Copilot's Billing Overhaul: From PRUs to AI Credits

Effective June 1, 2026 · Developer Tools / AI Analysis

On June 1, 2026, GitHub Copilot fully replaced its old "Premium Request Unit (PRU)" billing with a token-based "AI Credits" model. The crux: the fallback model is gone, and Copilot stops the moment your credits run out. Light users barely notice the change, but power users who lean on agentic coding every day have reported bills up to 60× higher. Here's what changed, why GitHub made the move, and how to adapt.

🔍 What Changed — How the Two Billing Models Work

To understand the backlash, start with the fact that the two schemes count usage differently. One meters requests; the other meters tokens processed. That single distinction is what broke cost predictability.

① The old model — Premium Request Units (PRUs)

Introduced in June 2025, this model used the request as its billing unit, but the number of PRUs a request consumed (its multiplier) depended on the model you picked. The decisive safety net was the fallback: once your PRUs were exhausted, Copilot automatically switched to a cheaper model rather than cutting you off. Because it "slowed down but never stopped," costs were easy to predict.

② The new model — AI Credits (token-based)

On June 1, 2026, PRUs were retired in favor of token-based billing. One AI Credit = $0.01, and every interaction's input, output, and cache tokens are summed and converted at the model's published rate. Because you pay for tokens actually processed rather than per request, the same task can cost up to 10× more depending on session length, context size, and model.
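
To make the conversion concrete, here is a minimal Python sketch of the sum-then-convert step described above. Only the $0.01-per-credit figure comes from the article. The `ModelRate` type, the `credits_for` helper, and the per-million-token prices are assumptions chosen for illustration, not GitHub's published rates.

```python
from dataclasses import dataclass

CREDIT_USD = 0.01  # from the article: one AI Credit = $0.01


@dataclass(frozen=True)
class ModelRate:
    """Hypothetical per-million-token prices in USD (not GitHub's real rates)."""
    input_per_mtok: float
    output_per_mtok: float
    cached_per_mtok: float


def credits_for(rate: ModelRate, input_tokens: int, output_tokens: int,
                cached_tokens: int = 0) -> float:
    """Sum the dollar cost of each token class, then convert to AI Credits."""
    usd = (
        input_tokens * rate.input_per_mtok
        + output_tokens * rate.output_per_mtok
        + cached_tokens * rate.cached_per_mtok
    ) / 1_000_000
    return usd / CREDIT_USD


# Illustrative rate only; check the published pricing for real figures.
example_rate = ModelRate(input_per_mtok=3.0, output_per_mtok=15.0, cached_per_mtok=0.3)

short_chat = credits_for(example_rate, input_tokens=3_000, output_tokens=1_000)
long_agent_run = credits_for(example_rate, input_tokens=400_000, output_tokens=60_000,
                             cached_tokens=1_000_000)
print(f"short chat: {short_chat:.2f} credits")
print(f"long agent run: {long_agent_run:.2f} credits")
```

Under these assumed rates, the long agent run costs about a hundred times the short chat on the same model, which is the kind of spread that per-request billing used to hide.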

The most consequential change is the removal of the fallback model. The behavior when you hit your limit is now the exact opposite of what it used to be.


flowchart TD
  A([Credits / limit exhausted]) --> B{Billing model?}
  B -->|Old PRU| C[Auto-fallback to<br/>a cheaper model<br/>work continues]
  B -->|New Credits| D[Feature stops instantly<br/>unusable for the month]
  style A fill:#fef9e7,stroke:#f39c12
  style B fill:#fef9e7,stroke:#f39c12
  style C fill:#eafaf1,stroke:#27ae60,color:#1e8449
  style D fill:#fdedec,stroke:#e74c3c,color:#c0392b

🔁 Diagram in brief: when the limit is reached, the old PRU model fell back to a cheaper model so work continued, whereas the new AI Credits model halts the feature the instant credits run dry — effectively locking you out for the rest of the month.

For reference, the per-model multipliers under the old PRU scheme looked like this. The same "single request" could cost up to 90× more depending on the model.

Model	PRU multiplier
GPT-4o, GPT-4.1	0× (free)
GPT-5.4 mini	0.33×
Claude Sonnet family	1×
Claude Opus 4.5/4.6	3×
Claude Opus 4.6 Fast mode	30×
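
As a rough illustration of how the multipliers in this table turned requests into PRUs, the Python sketch below totals a month of usage. The multipliers are copied from the table. The `prus_used` helper and the sample month are hypothetical.

```python
# Multipliers copied from the table above (old PRU scheme).
PRU_MULTIPLIER = {
    "GPT-4o": 0.0,
    "GPT-4.1": 0.0,
    "GPT-5.4 mini": 0.33,
    "Claude Sonnet family": 1.0,
    "Claude Opus 4.5/4.6": 3.0,
    "Claude Opus 4.6 Fast mode": 30.0,
}


def prus_used(requests_by_model: dict[str, int]) -> float:
    """Total PRUs for a month: each request costs its model's multiplier."""
    return sum(PRU_MULTIPLIER[model] * count
               for model, count in requests_by_model.items())


# Spread between the most expensive and the cheapest paid model.
paid = [m for m in PRU_MULTIPLIER.values() if m > 0]
spread = max(paid) / min(paid)
print(f"spread across paid models: {spread:.1f}x")

# Hypothetical month of usage, for illustration only.
month = {"GPT-4.1": 400, "Claude Sonnet family": 120, "Claude Opus 4.5/4.6": 20}
print(f"PRUs used: {prus_used(month):.0f}")
```

The spread between Fast mode and the cheapest paid model works out to roughly 90×, which is where the figure quoted above comes from.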

📊 The Numbers — Plans and Burn Rates Compared

Here are the credits included with each plan after the switch. Notably, at $0.01 per credit, the stated dollar value of the bundled credits exceeds the monthly fee on the individual plans (Pro and Pro+) and matches it on Business and Enterprise, so on paper it doesn't look like a loss.

Plan	Monthly fee	Included credits	Stated value
Copilot Pro	$10	1,500	$15
Copilot Pro+	$39	7,000	$70
Business	$19/user	1,900	$19
Enterprise	$39/user	3,900	$39
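
The "Stated value" column can be reproduced from the credit counts with a few lines of Python. This is a small sketch using only the table's figures and the $0.01-per-credit rate. The `stated_value` helper name is an assumption.

```python
CREDIT_USD = 0.01  # one AI Credit = $0.01, per the article

# (monthly fee in USD, included credits), copied from the table above.
PLANS = {
    "Copilot Pro": (10, 1_500),
    "Copilot Pro+": (39, 7_000),
    "Business": (19, 1_900),
    "Enterprise": (39, 3_900),
}


def stated_value(credits: int) -> float:
    """Dollar value of a credit allowance, rounded to cents."""
    return round(credits * CREDIT_USD, 2)


for name, (fee, credits) in PLANS.items():
    value = stated_value(credits)
    print(f"{name:13s} fee ${fee:>2}  credits worth ${value:.0f}  "
          f"value/fee {value / fee:.2f}x")
```

Only Pro and Pro+ come out above 1.00x. Business and Enterprise bundle exactly what the seat costs.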

The catch is the burn rate. Run the same small task (3K input / 1K output tokens) and credit consumption swings nearly 7× depending on which model you choose.

Model	AI Credits per small task
MAI-Code-1-Flash	~0.68
Claude Sonnet 4.6	~2.4
GPT-5.5 (top tier)	~4.5
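
To put these burn rates in terms of a monthly allowance, the sketch below estimates how many such small tasks fit into a plan before credits run out. It assumes every task costs exactly the approximate figure quoted above, which real sessions will not. `tasks_per_month` is a hypothetical helper.

```python
# Approximate credits per small task (3K input / 1K output), from the figures above.
CREDITS_PER_TASK = {
    "MAI-Code-1-Flash": 0.68,
    "Claude Sonnet 4.6": 2.4,
    "GPT-5.5": 4.5,
}

# Included monthly credits for the individual plans, from the plan table.
INCLUDED = {"Copilot Pro": 1_500, "Copilot Pro+": 7_000}


def tasks_per_month(included_credits: int, credits_per_task: float) -> int:
    """Whole small tasks that fit in the allowance before Copilot stops."""
    if credits_per_task <= 0:
        raise ValueError("credits_per_task must be positive")
    return int(included_credits // credits_per_task)


for plan, allowance in INCLUDED.items():
    for model, cost in CREDITS_PER_TASK.items():
        print(f"{plan:12s} {model:18s} ~{tasks_per_month(allowance, cost):,} tasks")
```

Agent sessions process far more tokens than this small task, so treat the counts as an upper bound.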

✓ One exception: inline code completions and Next Edit suggestions remain unlimited and free. In other words, "typing assistance" stays free, while heavier work — chat, agents, refactoring — all burns credits. That's the dividing line that separated the fates of light users and power users.

💡 Why It Happened — GitHub's Own Admission of Structural Losses

GitHub was unusually candid about its rationale in the official blog. The core problem: charging the same for "a single chat question" and "a multi-hour autonomous coding session" was no longer sustainable.

"A quick chat question and a multi-hour autonomous coding session could be billed to the user at the same cost. GitHub has absorbed a significant share of the rising inference costs, but the current model is no longer sustainable." — GitHub Blog, May 2026

The cause boils down to a structural problem driven by two pressures acting at once.

1. Surging inference costs for frontier models: per-token costs for the Claude Opus and GPT-5 families are far higher than in the GPT-4 era.

2. The spread of agent mode: since late 2025, multi-step autonomous coding sessions have become routine, and that pattern consumes dozens of times the tokens of a normal chat.

In short, a structure where "one chat" and "a three-hour agent session" consumed the same PRUs was a clear money-loser for GitHub. The shift to token billing passes that cost on to users and makes heavy users pay more than light ones.

🔥 Impact and Fallout — The Anger, by the Numbers

Right after the announcement, GitHub's official community discussion thread drew 435 comments alongside 904 downvotes vs. 22 upvotes. By ratio, that's effectively the entire community against it.

Community reaction: 97.6% of votes opposed (22 upvotes · 904 downvotes · 435 comments on the official GitHub thread)

The real-world cases shared on Reddit and X are striking. For heavy users, reported monthly bills jumped by a factor of 25 or more.

Bill before	Bill after	Increase
$29	$750	~26×
$50	$3,000	~60×

Estimated monthly cost increase, per user reports

One Pro+ user reported burning 8% of the monthly credits in a 2-hour session. Working backward (100% ÷ 8% = 12.5 such sessions), that's a ceiling of roughly 25 hours per month. Some users even alleged that when the automatic switch happened at midnight on June 1, their accounts started with 60% of credits already depleted, which they read as a retroactive charge for usage just before the cutover. Here's the timeline of the change.

Date	Event
Jun 2025	PRUs introduced
Jun 1, 2026	Credits switch; fallback removed
Sep 2026	Temporary promo ends
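
The 25-hour ceiling from the Pro+ report above is simple proportional arithmetic. The Python sketch below reproduces it, assuming the burn rate stays constant all month (a simplification). The function name is hypothetical.

```python
PRO_PLUS_CREDITS = 7_000  # included credits on Copilot Pro+, from the plan table


def monthly_hour_ceiling(share_burned: float, session_hours: float) -> float:
    """Hours until the allowance is gone, assuming a constant burn rate."""
    if not 0 < share_burned <= 1:
        raise ValueError("share_burned must be a fraction in (0, 1]")
    return session_hours / share_burned


share, hours = 0.08, 2.0  # the user's report: 8% of credits in a 2-hour session
ceiling = monthly_hour_ceiling(share, hours)
credits_per_hour = PRO_PLUS_CREDITS * share / hours

print(f"ceiling: ~{ceiling:.0f} hours per month")
print(f"burn rate: ~{credits_per_hour:.0f} credits per hour")
```

At that rate a full-time agent workflow would exhaust the Pro+ allowance within the first week of the month.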

The complaints distill into three threads.

🔴 The shock of losing fallback — Previously, hitting your limit just dropped you to a lower-performing model so work continued; now, the moment credits run out, Copilot effectively stops for the rest of the month.

🔴 Unpredictable budgets — Token billing makes costs swing with session, model, and context, prompting reactions like "now I have to do the math before every chat."

🔴 Loss of identity — The prevailing view is that Copilot's core appeal — "use multiple models for a predictable flat monthly fee" — is gone.

🧠 The key asymmetry: light users who only do simple autocomplete and the occasional chat barely see a difference. The hit lands on power users who lean on agentic coding daily — the very segment GitHub has marketed to hardest now loses the most. A paradoxical setup.

⚙️ Multi-Model Switching — Why It Was Loved, and Where It Stands Now

Setting this controversy aside, the multi-model selection Copilot has offered since late 2024 has been a standout competitive edge: from a single IDE plugin, you can pick the model that best fits the task. As of 2026, the recommended model mapping looks like this.

Purpose	Recommended models
Fast prototyping / lightweight	GPT-5 mini, o4-mini, MAI-Code-1-Flash
General coding / documentation	GPT-4.1, GPT-4o, Claude Sonnet 4.5/4.6
Complex refactoring / architecture	Claude Sonnet 4.7, GPT-5.5, o3
Advanced reasoning / system design	Claude Opus 4.6, GPT-5.5, o1
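
One way to apply this mapping in team tooling is a plain lookup that returns the first recommended model that is actually enabled. The Python sketch below is only an illustration: the short purpose keys, the `pick_model` helper, and the `enabled` set are assumptions, and nothing here calls a Copilot API.

```python
from typing import Optional

# Recommendation table from the article; the short keys are hypothetical labels.
RECOMMENDED = {
    "prototype": ["GPT-5 mini", "o4-mini", "MAI-Code-1-Flash"],
    "general": ["GPT-4.1", "GPT-4o", "Claude Sonnet 4.5", "Claude Sonnet 4.6"],
    "refactor": ["Claude Sonnet 4.7", "GPT-5.5", "o3"],
    "design": ["Claude Opus 4.6", "GPT-5.5", "o1"],
}


def pick_model(purpose: str, available: set[str]) -> Optional[str]:
    """Return the first recommended model for this purpose that is available."""
    for model in RECOMMENDED.get(purpose, []):
        if model in available:
            return model
    return None


# Hypothetical set of models enabled in a given workspace.
enabled = {"GPT-4.1", "GPT-5.5", "MAI-Code-1-Flash"}
print(pick_model("refactor", enabled))
print(pick_model("prototype", enabled))
```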

Compared with going straight to the providers' APIs (OpenAI, Anthropic), Copilot's advantages are still very much intact.

✓ Zero-config IDE integration — Access multiple models from one plugin with no key management (VS Code, JetBrains, Visual Studio, Neovim).

✓ Automatic code-context awareness — Open files, related files, and directory structure are folded into the prompt automatically. Direct APIs require manual copy-paste.

✓ "Best model" without vendor lock-in — If one model regresses, switch to another instantly.

✓ Unified data protection — A single "no training on your data" policy applies uniformly to every model under one agreement. Especially important for enterprises.

✓ Auto mode — Assigns the optimal model based on task complexity to conserve credits.

That said, the downsides became more visible under the new billing.

• Unpredictable cost — Token prices are abstracted one more layer into credits, hurting intuition.

• Limited model choice — Unsupported models like Mistral and Llama aren't available.

• Hard stop on exhaustion — Direct APIs run indefinitely as long as your wallet has money; Copilot can cut off abruptly.
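
Because the stop is abrupt, some teams may want an early warning of their own. The sketch below is a minimal local tally in Python. `CreditBudget` is a hypothetical class, the 80% warning threshold is an arbitrary assumption, and the credit amounts would have to come from whatever usage data you can actually obtain.

```python
class CreditBudget:
    """Local tally of AI Credits spent this month (not a Copilot API)."""

    def __init__(self, monthly_credits: float, warn_at: float = 0.8) -> None:
        if monthly_credits <= 0:
            raise ValueError("monthly_credits must be positive")
        self.monthly_credits = monthly_credits
        self.warn_at = warn_at  # fraction of the allowance that triggers a warning
        self.used = 0.0

    @property
    def remaining(self) -> float:
        return max(0.0, self.monthly_credits - self.used)

    def spend(self, credits: float) -> str:
        """Record usage and report 'ok', 'warn', or 'exhausted'."""
        self.used += credits
        if self.used >= self.monthly_credits:
            return "exhausted"
        if self.used >= self.warn_at * self.monthly_credits:
            return "warn"
        return "ok"


budget = CreditBudget(monthly_credits=1_500)  # Copilot Pro allowance
for session_credits in (600, 650, 300):       # hypothetical agent sessions
    status = budget.spend(session_credits)
    print(f"spent {session_credits}, status={status}, "
          f"remaining={budget.remaining:.0f}")
```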

🟡 Needs confirmation: As of May 2026, Gemini models were removed from the web Copilot Chat. Whether they remain selectable in IDE plugins such as VS Code and JetBrains requires further verification.

🎯 Takeaways and Implications

This change reads as a signal that GitHub is repositioning Copilot from an "AI subscription service" to an "AI infrastructure platform." Light users see almost no real change, while power users took a direct hit.

💼 Short-term cushion: Business and Enterprise plans get temporary promotional credits from June through August (worth $30 for Business, $70 for Enterprise). But further backlash is expected after the September expiry.

💼 Strategic implication: The core strength — multi-model switching — is still alive. But we've moved into an era where you must reframe its value not as "a cheap flat fee" but as "the right model for the right task," and pair it with cost management (favor lightweight models, reach for high-end ones only when needed).

💼 Churn risk: The most-shared community sentiment is, "Being able to hand off long tasks without worrying about cost was Copilot's strength — now OpenRouter or a direct API might be the better bet." In other words, this change is likely to be an inflection point that pushes heavy users to look for alternatives.

🧠 Ultimately, the real question isn't "did Copilot get more expensive?" but "is my workload light or heavy?" If you mostly use autocomplete and light chat, keep using it as you do. But if you run agent sessions for hours a day, a "model diet" — defaulting to lightweight models and pulling out high-end ones only when truly needed — becomes an essential habit for the new era.
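
To see what a "model diet" could save, the rough Python sketch below compares monthly credit use at different shares of top-tier usage. It reuses the approximate per-task figures quoted earlier for MAI-Code-1-Flash and GPT-5.5. The workload size and the `monthly_credits` helper are assumptions for illustration.

```python
# Approximate credits per small task, taken from the burn-rate figures in the article.
LIGHT_COST = 0.68  # MAI-Code-1-Flash
HEAVY_COST = 4.5   # GPT-5.5


def monthly_credits(tasks: int, heavy_share: float) -> float:
    """Credits for `tasks` small tasks when `heavy_share` of them use the top tier."""
    if not 0.0 <= heavy_share <= 1.0:
        raise ValueError("heavy_share must be between 0 and 1")
    heavy_tasks = round(tasks * heavy_share)
    light_tasks = tasks - heavy_tasks
    return light_tasks * LIGHT_COST + heavy_tasks * HEAVY_COST


tasks = 1_000  # hypothetical monthly workload of small tasks
for share in (1.0, 0.5, 0.1):
    print(f"{share:.0%} top-tier: {monthly_credits(tasks, share):,.0f} credits")
```

Under these assumptions, reserving the top-tier model for one task in ten cuts consumption to about a quarter of the all-top-tier figure.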

📚 References

▶ GitHub Blog — Official announcement of the move to usage-based billing

▶ TechTimes — Backlash over 10x–50x cost spikes

▶ gHacks — Developer backlash over rapid credit depletion

▶ Xebia — A technical comparison of PRUs vs. token billing

▶ GitHub Blog — Which AI model should I use?

▶ GitHub Docs — Models and pricing

▶ TechCrunch — Copilot goes multi-model

※ The pricing, plan, and model details summarized here are current as of June 2026, and service policies may change without prior notice. Always confirm the latest terms in the official documentation before you incur charges.

SW Develope

Notes on software development

I gather material from a software-development perspective, organize it myself, and give it one more check before publishing.

This article was written based on publicly available data and sources. Last updated: June 8, 2026
