Microsoft Ships Its First Copilot-Native Model: MAI-Code-1-Flash

GitHub Copilot has always been a platform for other companies’ models. OpenAI, Anthropic, and Google have all had models available through the Copilot model picker. MAI-Code-1-Flash, which started rolling out June 2, is different: it’s the first model Microsoft built specifically for the platform.

What it is

MAI-Code-1-Flash is positioned as a small, efficient coding model. GitHub describes it as delivering “best-in-class quality for its size,” which suggests the design goal was maximizing capability at low inference cost rather than pushing benchmark scores at the top end.

The name implies it’s part of a series. A “Flash” suffix typically marks a lighter, faster variant, which suggests a more capable counterpart may be in development.

The model is available in the VS Code model picker. You select it the same way you select any Copilot model.

Who gets it

The rollout covers Copilot Free, Pro, Pro+, Max, and Student plans. A June 5 update to the announcement added Student subscribers to the list, clarifying they’re included on the same timeline as paid tiers. GitHub is rolling it out gradually, so it may not appear in the model picker immediately for all users.

Why this matters for token billing

The timing is notable. MAI-Code-1-Flash began rolling out one day after GitHub switched Copilot to usage-based billing on June 1. Under the new billing scheme, every interaction with Copilot consumes AI credits, and more capable (or context-heavy) models cost more credits per query.

A purpose-built small model with efficient inference should translate directly into lower credit consumption. For completions, quick questions, and other high-frequency lightweight tasks, a model designed for that workload should cost fewer credits per query than GPT-5 or Claude Opus 4.8.

Whether MAI-Code-1-Flash performs well enough on those tasks to displace other models in practice is something developers will test as the rollout expands. The stated claim is best-in-class for its size, which is a reasonable bar, but the real test is whether its code completions are accurate enough that you don’t end up reaching for a heavier model anyway, negating the savings.
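
The trade-off can be made concrete with a rough back-of-the-envelope sketch. Assume every task is tried on the small model first, and some fraction of tasks is then retried on a heavier model. The credit figures and function names below are hypothetical placeholders chosen for illustration; they are not GitHub's actual rates, which the announcement does not give.

```python
# All credit figures are hypothetical placeholders, not GitHub's actual rates.

def expected_credits(small_cost: float, large_cost: float, fallback_rate: float) -> float:
    """Average credits per task when trying the small model first.

    Every task pays for the small model; a fraction `fallback_rate` of tasks
    is then retried on the large model and pays for that as well.
    """
    if not 0.0 <= fallback_rate <= 1.0:
        raise ValueError("fallback_rate must be between 0 and 1")
    return small_cost + fallback_rate * large_cost


def break_even_fallback_rate(small_cost: float, large_cost: float) -> float:
    """Fallback rate at which small-first costs the same as always using the large model."""
    return 1.0 - small_cost / large_cost


SMALL_COST = 1.0   # hypothetical credits per query for a small model
LARGE_COST = 10.0  # hypothetical credits per query for a large model

for rate in (0.1, 0.5, 0.95):
    cost = expected_credits(SMALL_COST, LARGE_COST, rate)
    verdict = "cheaper" if cost < LARGE_COST else "not cheaper"
    print(f"fallback {rate:.0%}: {cost:.1f} credits per task ({verdict} than {LARGE_COST:.1f})")

print(f"break-even fallback rate: {break_even_fallback_rate(SMALL_COST, LARGE_COST):.0%}")
```

Under these made-up numbers, the small-first approach stays cheaper until roughly nine in ten tasks need the heavier model. The real break-even point depends on the actual credit ratio between models, and this sketch ignores the time lost to retries.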

Source: GitHub Changelog, June 2, 2026