Tuesday Morning After Presidents' Day: The AI Coding News You Missed

Hope you enjoyed the long weekend. The AI coding world did not take a break. Here’s everything worth knowing from the past two weeks, sorted by what matters most if you write code for a living.

The Big Releases

OpenAI Shipped a Desktop Coding App and Three New Models

OpenAI had a busy February. On February 2 they launched the Codex desktop app for macOS, a standalone application for managing multiple AI coding agents at once. Each agent can work independently for up to 30 minutes before returning completed code. Available on Plus, Pro, Business, Enterprise, and Edu plans, with temporary free access for Free and Go users.

Three days later they released GPT-5.3-Codex, their strongest agentic coding model so far. 25% faster than GPT-5.2-Codex, new highs on SWE-Bench Pro and Terminal-Bench. One detail worth noting: the Codex team used early versions of the model to debug its own training pipeline. A model that helped build itself.

On February 12, GPT-5.3-Codex-Spark followed, a smaller variant optimized for real-time coding that pushes over 1,000 tokens per second. This is the first OpenAI model running on Cerebras hardware, part of a multi-year deal worth over $10 billion. Research preview only, ChatGPT Pro users.

And GPT-5.1-Codex-Max is built for long-running, project-scale work. Natively trained to operate across multiple context windows through “compaction,” it can handle millions of tokens in a single task. Internally, 95% of OpenAI engineers use Codex weekly, shipping roughly 70% more pull requests since adoption.

Anthropic Released Opus 4.6

On the exact same day OpenAI released GPT-5.3-Codex (February 5), Anthropic put out Claude Opus 4.6. That timing was deliberate.

What coders should care about:

1 million token context window (up from 200K), currently in beta
128K max output tokens with adaptive thinking
Agent Teams: parallel coding workflows where multiple Claude agents collaborate on the same codebase
Better planning, code review, debugging, and reliability in large codebases
Same pricing as Opus 4.5 ($5/$25 per million input/output tokens), no price increase despite the capability jump

The most talked-about result: during testing, Opus 4.6 independently discovered over 500 previously unknown zero-day vulnerabilities in open-source libraries including Ghostscript, OpenSC, and CGIF. It used standard tools like debuggers and fuzzers with no specialized instructions. All were validated by Anthropic staff or outside researchers.

The release also moved software stocks, which keeps happening with Anthropic launches.

Claude Code Keeps Shipping

The Claude Code CLI (now in the 2.1.x series) picked up a bunch of updates in February:

Auth subcommands: claude auth login, claude auth status, claude auth logout
Windows ARM64 native binary support
Agent Teams in research preview (multi-agent collaboration via environment flag)
/rename command that auto-generates session names from conversation context
/debug command for troubleshooting sessions
Fast mode for Opus 4.6
Skills from additional directories (--add-dir) now load automatically
PDF handling with page-range support for large documents
Bug fixes for running Claude Code inside another session, MCP image streaming, and Bedrock/Vertex model identifiers

The Retirements

OpenAI Killed GPT-4o

On February 13, OpenAI retired GPT-4o, GPT-4.1, GPT-4.1 mini, and o4-mini from ChatGPT. The models are still available through the API, but they’re gone from the consumer product. OpenAI says only 0.1% of users were still choosing GPT-4o daily.

The retirement generated real backlash, mostly from users who had grown attached to GPT-4o’s conversational style. Turns out model personality is a product feature people care about, even when benchmark scores say the newer model is better at everything.

IDE and Editor Updates

GitHub Copilot

Copilot for JetBrains got Agent Skills in public preview, letting agent mode support skills that tailor Copilot for specific workflows. New settings page with individual toggles for Agent mode, Coding Agent, and Custom Agent.

GitHub also introduced Agentic Workflows in technical preview: Markdown-based workflows that convert to GitHub Actions, available via the gh aw CLI with sandboxed execution across multiple coding agents.

The Copilot CLI (v0.0.408) added /streamer-mode to hide preview model names and quota details. Nice for anyone streaming their dev workflow.

Cursor

Cursor has been busy:

Cursor Blame: enterprise feature that shows whether code came from tab completions, agent runs (broken down by model), or human edits. Useful for teams who want to understand how much AI-generated code is actually in production.
Interactive Q&A for agents: agents can now ask clarifying questions mid-task while continuing to read files, make edits, or run commands in the background.
Subagent architecture: parallel processing of complex tasks via independent sub-agents.
CLI improvements: plan mode (design before code), ask mode (explore without changes), handoff to cloud agents, word-level inline diffs, and MCP authentication.

Salesforce adopted Cursor across their engineering org and reported 30%+ faster velocity and double-digit code quality improvements across 20,000 developers.

Windsurf

Windsurf added Opus 4.6 (standard and fast mode) and GPT-5.3-Codex-Spark support in the past two weeks. They also launched an Arena Mode Leaderboard for competitive model comparison. Bolt v2 is pushing to be the first platform that bundles databases, hosting, auth, and payments directly in the browser alongside vibe coding agents.

Google Gemini

Google released a major update to Gemini 3 Deep Think on February 12, aimed at science, research, and engineering problems. Gemini 3 Flash is now the default model in the Gemini app, with reasoning that holds up against the larger models.

The move that probably affects the most developers day to day: deep Gemini 3 integration into Chrome for macOS, Windows, and Chromebook Plus, with an AI side panel for multitasking across the web. Not coding-specific, but Gemini is now sitting right where developers spend half their time anyway.

The Money

Funding numbers in AI continue to be absurd:

Anthropic’s $30B Series G at a $380 billion valuation. Second-largest venture deal ever. Revenue run rate is $14 billion, with Claude Code alone at $2.5 billion annualized.
OpenAI signed hardware deals with Nvidia ($100B investment intent), AMD (6 GW of MI450 GPUs), and Cerebras ($10B+ multi-year agreement). They are clearly trying to reduce their dependence on any single chip supplier.
Snowflake and OpenAI announced a $200 million partnership bringing AI into enterprise data workflows.
Disney became Sora’s first major content partner with a three-year deal and a $1 billion equity investment in OpenAI.
Big Tech AI spending across Google, Microsoft, Meta, and Amazon is approaching $700 billion in 2026.
Other notable raises: ElevenLabs ($500M Series D at $11B), Runway ($315M Series E at $5.3B), Ricursive Intelligence ($300M Series A at $4B).

Industry Signals

Amazon wants 80% of its developers using AI coding tools at least once a week. Not a suggestion. A directive.

Fortune reports developers are saying they’ve “abandoned traditional programming” in favor of Codex and Claude-based workflows. MIT Technology Review named “Generative Coding” one of its 2026 Breakthrough Technologies.

The Pentagon is fighting with Anthropic over military use of Claude. They reportedly threatened to cancel a $200 million contract unless Anthropic allows “all lawful purposes.” Anthropic hasn’t budged.

Replit revamped its pricing. The new Pro plan at $100/month targets teams. Agent 3 has 10x more autonomy than v2 and includes a self-healing loop that tests apps in a live browser.

Snowflake unveiled Cortex Code, an AI coding agent built around enterprise data context. Your data warehouse now wants to be your IDE.

On the Horizon

Claude Sonnet 4.6 dropped hours after this roundup was published. Full coverage here. The short version: near-Opus performance at one-fifth the cost, now the default model for Free and Pro users.

Apple Xcode 26.3 now has the Claude Agent SDK integrated directly, so developers can use Claude’s autonomous coding capabilities inside Apple’s IDE without leaving Xcode.

So What

OpenAI and Anthropic released competing flagship coding models within minutes of each other on February 5. Both companies are shipping desktop apps, agent architectures, and multi-model strategies. Google is putting Gemini in Chrome. Cursor is logging which lines of code your AI wrote versus which ones you wrote. Amazon is telling developers to use AI tools or else.

Two weeks of news, and the pace is only picking up.

Happy Tuesday.