Why is OpenAI playing catch-up to Claude Code instead?
Original Article Title: Inside OpenAI's Race to Catch Up to Claude Code
Original Article Author: Maxwell Zeff, Wired
Translation: Peggy, BlockBeats
Editor's Note: In the rapidly evolving landscape of AI programming agents, OpenAI, which once led the generative AI wave with ChatGPT, has unexpectedly become a "chaser" in this key race. In stark contrast, Anthropic, founded by former OpenAI members, has quickly gained popularity in the developer community and enterprise market with Claude Code, emerging as one of the key leaders in the AI programming tool space.
This article, drawing on interviews with OpenAI executives, engineers, and several developers, reveals the real story behind this competition: from the early period when the OpenAI Codex project was disbanded and resources shifted to ChatGPT and multimodal models, to the reintegration of internal teams and the accelerated launch of AI programming products, OpenAI is undergoing a shift from strategic neglect to full-scale catch-up. In a sense, this is not a lag in technical capability but a misalignment of strategic timing: ChatGPT's breakthrough shifted the company's priorities, the partnership with Microsoft constrained the product roadmap, and Anthropic had bet on the AI programming track earlier.
Beyond this competition, deeper issues are gradually emerging: as AI agents begin to take on more cognitive work, software development processes and even white-collar labor itself may be redefined.
The following is the original article:
OpenAI CEO Sam Altman rests his legs on an office chair and tilts his head back toward the ceiling, as if contemplating an answer that has not yet taken shape. In a way, the setting is fitting.
OpenAI's new headquarters in Mission Bay, San Francisco, is a modern building of glass and light wood, almost a "tech temple." On the shelf behind the front desk sit booklets introducing the "Eras of AI," as if charting a path to technological enlightenment. The stairwell walls are covered with posters of AI milestones, one commemorating the moment thousands of viewers watched live as a machine defeated a top esports team at "Dota 2." In the corridors, researchers in hoodies printed with team slogans walk back and forth; one reads, "Good research takes time." Ideally, of course, not too long.
We are sitting in a large conference room. The question I posed to Altman was about the ongoing AI programming revolution and why OpenAI doesn't seem to be leading the charge in this wave.
Today, millions of software engineers have begun delegating some of their programming work to AI, causing many in Silicon Valley to confront a reality for the first time: automation may touch their own jobs. Coding agents have thus become one of the few use cases where companies are willing to pay a premium for AI. In theory, such a moment could and should be the next "triumph" on the OpenAI stairwell poster. But now, the top-billed name is not theirs.
That name belongs to a rival: Anthropic, an AI company founded by former OpenAI employees. With its coding-agent product Claude Code, Anthropic has seen explosive growth. The company revealed in February that the product contributed nearly a fifth of its business, corresponding to annualized revenue of over $2.5 billion. By contrast, as of the end of January, OpenAI's in-house programming product, OpenAI Codex, had annualized revenue just above $1 billion, according to a source familiar with the matter.
The question is: why is OpenAI lagging behind in this AI programming race?
"The value of being first is immense," Sam Altman said after a moment of contemplation. "We've experienced this with ChatGPT."
However, in his view, now is the time for OpenAI to fully embrace AI programming. He believes that the company's existing model capabilities are powerful enough to support highly complex coding agents. Of course, such capabilities are not coincidental; the company has invested billions of dollars in model training for this purpose.
"This will be a huge business," Altman said. "Not only because of the economic value it brings itself but also because of the general productivity that programming can unleash." He paused for a moment and added, "I rarely use this term lightly, but I think this is likely to be one of those markets that reaches a scale of trillions of dollars."
Furthermore, he believes that OpenAI Codex may be the "most probable path" to achieving Artificial General Intelligence (AGI). According to OpenAI's definition, AGI is an AI system that can surpass human performance in most economically valuable work.

Sam Altman, CEO of OpenAI. Photo: Mark Jayson Quines.
Yet despite Altman's relaxed confidence, the reality inside the company over the past few years has been far more complicated. To piece together the fuller internal story, I interviewed more than 30 sources, including current OpenAI executives and employees who spoke with the company's approval, as well as former employees who described the company's inner workings on condition of anonymity. Together, their accounts reveal an unfamiliar situation: OpenAI is racing to catch up.
Let's go back to 2021. At that time, Altman and other OpenAI executives invited WIRED journalist Steven Levy to their early office located in San Francisco's Mission District to watch a new technology demo. This was a project derived from GPT-3, trained using a large amount of open-source code from GitHub.
During the on-site demo, the executives showcased how this tool, called OpenAI Codex, could receive natural language instructions and generate simple pieces of code.
"It can actually perform actions in the computer world for you," explained OpenAI President and Co-founder Greg Brockman at the time. "What you have is a system that can truly execute commands." Even at that time, OpenAI researchers widely believed that Codex would become a key technology for building a "super assistant."
During that period, Altman and Brockman's schedules were almost entirely filled with meetings with Microsoft—the software giant is OpenAI's largest investor. Microsoft planned to leverage Codex to provide technical support for one of its first commercial AI products: a code completion tool called GitHub Copilot that could be directly embedded into developers' daily-used development environments.
An early OpenAI employee recalled that at that stage, Codex "basically could only do autocomplete." But Microsoft executives still saw it as a significant signal of the AI era's arrival.
In June 2022, when GitHub Copilot was officially released, it attracted hundreds of thousands of users within a few short months.

Greg Brockman, President of OpenAI. Photo: Mark Jayson Quines.
The original OpenAI team responsible for Codex was later reassigned to other projects. A former early employee recalled the company's rationale at the time: future models would inherently possess programming capabilities, so there was no need to maintain a standalone Codex team over the long term. Some engineers moved over to the development of DALL-E 2, while others pivoted to training GPT-4, which was seen as a key path to bringing OpenAI closer to AGI.
Subsequently, in November 2022, ChatGPT was launched and gained over 100 million users within two months. Virtually all other projects within the company were consequently put on hold. In the years that followed, OpenAI effectively did not have a dedicated team working on AI programming products. A former member who had been involved in the Codex project stated that after the success of ChatGPT, AI programming seemed to no longer fall within the company's new "consumer product-first" strategic focus. Meanwhile, the industry perceived this field to have been largely "covered" by GitHub Copilot, which was fundamentally Microsoft's turf. OpenAI primarily provided foundational model support.
Therefore, in 2023 and 2024, OpenAI's resources were more directed towards multimodal AI models and intelligent agents. These systems were designed to understand text, images, videos, and audio simultaneously, and interact with a cursor and keyboard like humans. This direction appeared more aligned with industry trends at the time: Midjourney's image generation models rapidly gained popularity on social media, and the industry widely believed that large language models must be able to "see" and "hear" the world to truly advance to higher levels of intelligence.
In contrast, Anthropic chose a different path. While the company was also developing chatbots and multimodal models, it seemed to have recognized the potential of programming ability earlier. In a recent podcast, Brockman also acknowledged that Anthropic had been "deeply focused on programming ability" from an early stage. He noted that Anthropic not only used complex programming questions from academic competitions when training models, but also integrated a significant amount of "messy" code issues from real code repositories.
"That was a lesson that we only later realized," Brockman said.
In early 2024, Anthropic began using this real code repository data to train Claude 3.5 Sonnet. When this model was released in June, many users were impressed by its programming abilities.
This trend was especially visible at a startup named Cursor. Founded by a group of engineers in their twenties, the company built an AI programming tool that lets developers describe requirements in natural language and have the AI modify the code directly. After Cursor integrated Anthropic's new model, its user base grew rapidly, according to a source close to the company.
Months later, Anthropic began internally testing its own programming agent product, Claude Code.
As Cursor's popularity continued to rise, OpenAI made a brief attempt to acquire the startup. However, according to multiple sources close to the company, Cursor's founding team turned down the proposal before negotiations could progress further. Believing in the significant potential of the AI programming industry, they wished to remain independent and continue their development.

Andrey Mishchenko, Head of OpenAI Codex Research. Photo: Mark Jayson Quines.
At that time, OpenAI was training its first so-called "reasoning model," OpenAI o1, part of a class of models that iteratively reason through a problem before producing an answer. At release, OpenAI said the model particularly excelled at "accurately generating and debugging complex code."
Mishchenko explained that a key reason why AI models have made significant progress in programming ability is that programming is a "verifiable task." Code either runs or it doesn't, providing very clear feedback to the model. Once an error occurs, the system can quickly identify the issue. OpenAI leveraged this feedback loop to continually train o1 on more complex programming problems.
"Without the ability to freely explore the codebase, make modifications, and test its own results—all part of the 'reasoning' capability—today's programming agents could not have reached their current level," he said.
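The "verifiable task" idea is easy to make concrete. A minimal harness (a hypothetical sketch for illustration, not OpenAI's actual training code; the function name and toy examples are invented here) runs a model's candidate solution against a test and reduces the outcome to a binary pass/fail signal:

```python
import os
import subprocess
import sys
import tempfile


def run_candidate(code: str, test: str, timeout: float = 10.0) -> bool:
    """Run a candidate solution together with its test in a subprocess.

    Returns True only if the script exits cleanly (all assertions pass),
    giving the unambiguous pass/fail signal the article describes.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + "\n" + test + "\n")
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=timeout
        )
        return result.returncode == 0
    finally:
        os.remove(path)


# Toy "model outputs" and the test that verifies them:
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
test = "assert add(2, 3) == 5"

print(run_candidate(good, test))  # True: the code ran and the test passed
print(run_candidate(bad, test))   # False: the assertion failed
```

A training loop built on this kind of signal can reward candidates that pass and penalize or retry those that fail, which is the feedback loop Mishchenko describes.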
By December 2024, multiple small teams within OpenAI had begun focusing on AI programming agents internally. One of these teams was co-led by Mishchenko and Thibault Sottiaux. Sottiaux, formerly of Google DeepMind, is now the Head of Codex at OpenAI.
Initially, their interest in programming agents stemmed mainly from internal R&D needs: using AI to automate large amounts of repetitive engineering work, such as managing model-training jobs and monitoring the health of GPU clusters.
Another parallel effort was led by Alexander Embiricos. He had previously been in charge of OpenAI's multimodal agent project and now serves as Codex's product lead. Embiricos had developed a demo project called Jam, which quickly spread throughout the company.

Thibault Sottiaux, Head of OpenAI Codex. Photo: Mark Jayson Quines.
Unlike agents that control a computer through a mouse and keyboard, Jam had direct access to the computer's command line. The 2021 Codex demo had only shown AI generating code for humans to run manually; Embiricos's version could execute the code itself. He recalled feeling something close to shock as he watched a live record of Jam's actions refresh and update on his laptop screen.
"For a while, I kept thinking that multimodal interaction might be the path to achieving our mission. For example, humans sharing screens with AI all day, working together," Embiricos said, "Then it suddenly became very clear: perhaps allowing models direct programmatic access to computers is the true way to achieve this goal."
These scattered projects took several months to gradually integrate into a unified direction. By early 2025, when OpenAI completed training on OpenAI o3, a model further optimized for programming tasks than OpenAI o1, the company finally had the technological foundation to build a true AI programming product. However, concurrently, Anthropic's Claude Code was ready for a public release.
Before the release of Claude Code (launched in February 2025 as a "limited research preview" and made generally available in May), the dominant paradigm in AI programming was still so-called "vibe coding": developers drove projects forward with AI-assisted tools, with humans steering the direction while AI filled in the specific implementations along the way. Such tools had already attracted billions of dollars in investment.
But Anthropic's new product changed this paradigm. Like the Jam demo, Claude Code could run directly through the computer's command line, meaning it could access all of developers' files and applications. Programming was no longer just "AI-assisted"; rather, developers could entrust the entire task directly to the AI agent.
Facing this change, OpenAI began accelerating the release of competing products. Sottiaux recalled that in March 2025, he assembled a "sprint team" whose task was to integrate multiple internal teams within the company in just a few weeks to quickly launch an AI programming product.
Meanwhile, Altman also attempted a "shortcut" through acquisition, offering $3 billion to acquire the AI programming startup Windsurf. OpenAI's leadership believed that this deal would bring to the company a mature AI programming product, an experienced team, and an existing enterprise customer base.
However, this acquisition later stalled. According to The Wall Street Journal, the issue arose with OpenAI's biggest partner, Microsoft. Microsoft sought access to Windsurf's intellectual property rights. Since 2021, Microsoft has been using OpenAI's models to support GitHub Copilot, a product that has become a highlight in Microsoft's earnings conference calls. But with the release of new AI programming agents like Cursor, Windsurf, and Claude Code, GitHub Copilot began to seem stuck in the previous generation of AI tools. If OpenAI were to launch another new programming product, it might not bode well for Microsoft.
This acquisition negotiation happened at a time when OpenAI's relationship with Microsoft was most strained. The two parties were renegotiating their cooperation agreement, with OpenAI seeking to reduce Microsoft's control over its AI products and compute resources. Eventually, the Windsurf acquisition became a casualty of this power play. By July, OpenAI had abandoned the deal. Subsequently, Google hired Windsurf's founding team, while the remaining employees were acquired by another AI programming company, Cognition.
"I certainly hoped at the time that the deal could go through," Altman said, "but not every deal is within our control." He mentioned that although he had hoped the acquisition of Windsurf "would, to some extent, accelerate our progress," he was equally impressed by the momentum of the Codex team. While negotiations were ongoing, Sottiaux and Embiricos continued to develop the product and roll out updates.
By August, Altman decided to ramp up efforts across the board.

Alexander Embiricos, OpenAI Codex Product Lead. Photo: Mark Jayson Quines.
Greg Brockman's favorite way to measure AI capability is a game of his own design, the "Reverse Turing Test." He wrote the game's code himself a few years ago; now he hands the task to an AI agent to re-implement from scratch.
The game rules are simple: two human players sit in front of separate computers, each seeing two chat windows on their screen. One window connects to another human player, while the other connects to an AI. Players must guess which window is AI and try to convince their opponent that they themselves are the AI.
Brockman said that for most of last year, OpenAI's most powerful models took hours to set up such a game, requiring a lot of explicit human instructions and assistance along the way. However, by December of last year, Codex was able to directly generate a fully playable version through a well-crafted prompt, powered by the new GPT-5.2 model.
Brockman is not the only one to notice the change. Developers around the world have begun to realize that AI programming agents' capabilities have suddenly taken a significant leap. Discussions of AI programming, initially centered on Claude Code, quickly broke into mainstream media attention beyond the Silicon Valley tech circle.
Even some ordinary users with no programming background have started using AI to build their own software projects.
This surge in usage is no coincidence. During this time, both Anthropic and OpenAI have invested heavily to onboard more AI programming agent users. Several developers told WIRED that their $200 per month Codex or Claude Code subscription plans actually provide over $1000 worth of usage credits. This rather "generous" limit is essentially a market strategy: first, get developers accustomed to using AI programming tools in their daily work, and then charge based on usage in enterprise settings.
According to multiple sources, as of September 2025, Codex's usage was only about 5% of Claude Code's. By January 2026, Codex's user base had risen to around 40% of Claude Code's.
Developer George Pickett, who has been working at a tech startup for 10 years, has recently even started organizing offline meetups with Codex as the theme.
"It seems pretty clear to me that we are replacing white-collar work with AI agents," Pickett said. "As for what this means for society, honestly, no one can say for sure. It will definitely have a huge impact, but I'm generally optimistic about the future."
Meanwhile, Simon Last, co-founder of the productivity-software company Notion, valued at around $11 billion, said that after the release of GPT-5.2, he and the company's core engineering team switched to using Codex primarily, citing its better stability.
"I found that Claude Code would often 'lie to me,'" Last said. "It would say a task is running when it really isn't."

Katy Shi, OpenAI researcher. Photo: Mark Jayson Quines.
Katy Shi, the OpenAI researcher responsible for studying Codex's model behavior, said that while some describe Codex's default style as "dry bread," more and more users are coming to appreciate its unapologetic communication style. "A lot of engineering work is fundamentally about being able to take critical feedback without treating it as an offense," she said.
Meanwhile, some large corporations have also begun adopting Codex. Fidji Simo, CEO of Applications at OpenAI, said, "ChatGPT has become synonymous with AI, giving us a huge advantage in the B2B market. Enterprises are more willing to deploy technology their employees already know." She added that OpenAI's core strategy for selling Codex is to bundle it with ChatGPT and other OpenAI products.
Cisco President and Chief Product Officer Jeetu Patel has made clear to employees that they need not worry about the cost of using Codex; the key is to get familiar with the tool as quickly as possible. When employees voice concerns like "Will using these tools cost me my job?" Patel's answer is: "No. But I can guarantee that if you don't use them, you will lose your job, because you will no longer be competitive."
Today, anxiety over AI programming agents extends far beyond the Silicon Valley tech circle. Last month, The Wall Street Journal attributed part of a trillion-dollar tech-stock sell-off to Claude Code, with investors fearing that software development could soon be replaced by AI at scale. Weeks later, after Anthropic announced that Claude Code could be used to modernize systems running legacy COBOL (still common on IBM machines), IBM had its worst trading day in 25 years.
Meanwhile, OpenAI is also working to bring AI programming agents into the public discussion spotlight. The company even spent millions of dollars to air an ad about OpenAI Codex during the Super Bowl, instead of promoting ChatGPT.
Inside OpenAI's headquarters in Mission Bay, almost no one needs to be convinced to use Codex. Many engineers I interviewed said they rarely write code themselves these days, spending most of their time just talking to Codex. Sometimes, they even "pair program."
At headquarters, I sat in on a Codex hackathon. Around 100 engineers crammed into a large room, each given four hours to showcase their best Codex-made projects. An OpenAI executive stood at the front, looking at their laptop, announcing team names through a microphone. Nervous team representatives took to the stage, presenting their AI projects with slightly shaky voices. The ultimate winner received a Patagonia backpack as a prize.
Many projects were both built with Codex and aimed to help engineers use Codex better. For example, one team developed a tool to automatically summarize Slack messages into weekly reports; another created an internal AI guide similar to Wikipedia to explain OpenAI's various internal services. In the past, such prototypes often took days or even weeks to complete, but now, an afternoon is enough.
As I left, I ran into Kevin Weil, the former Instagram executive who now leads OpenAI's "OpenAI for Science" division, at the door. He told me Codex was working overnight on project tasks for him, which he would review the next morning. That cadence has become the daily norm for him and hundreds of OpenAI staff. One of OpenAI's goals for 2026 is to build an "automated intern" for AI research itself.
Simo envisions that in the future, Codex will not only be used for programming but will also aim to become the task execution engine for ChatGPT and all OpenAI products, handling various real-world tasks for users. Altman also expressed a strong desire to launch a universal version of Codex but remains concerned about security risks.
He mentioned that in late January 2026, a friend with no technical background once asked him to help install a trending AI programming agent called OpenClaw. Altman declined the request because, in his view, "it obviously wasn't a good idea yet," as OpenClaw might accidentally delete important files.
Ironically, a few weeks later, OpenAI announced that they had hired the developer of OpenClaw.
Many developers have told me that the competition between Codex and Claude Code has never been so intense. But as these tools continue to improve and are increasingly integrated into corporate workflows, the societal questions at hand are no longer just about "which AI coding tool to use."

Amelia Glaese, OpenAI's VP of Research and Alignment Lead. Photo: Mark Jayson Quines.
Some watchdogs are concerned that in the race to catch up with Claude Code, OpenAI may be sidelining security issues. A non-profit organization called the Midas Project has accused OpenAI of watering down its security commitments when releasing GPT-5.3-Codex, failing to fully disclose the model's potential cybersecurity risks.
In response, Glaese countered that OpenAI did not sacrifice security in advancing Codex, with the company also stating that the Midas Project misunderstood its security commitments.
Even Greg Brockman, the OpenAI co-founder who last year donated $25 million to an AI-friendly super PAC and a pro-Donald Trump organization, and who still optimistically proclaims that "we're on track toward AGI," harbors mixed feelings about this new reality.
Within the Silicon Valley engineer circle, Brockman has always been known for his "deeply involved" management style: the type of boss who would still be digging through the codebase the night before a product launch. To some extent, this more "hands-off" approach now makes him feel at ease. "You realize that your brain was occupied by many unnecessary details in the past," he says.
At the same time, when you become the "CEO of hundreds of thousands of AI agents," directing these systems to execute your goals and vision, it becomes hard to dive into the specifics of any one problem.
"In a sense, it makes you feel like you're losing the 'pulse' of the problem itself," Brockman says.