The artificial intelligence industry reached a pivotal moment this week as two major players released products built around the same concept: turning users from people who chat with an AI into people who manage AI. Anthropic and OpenAI both shipped platforms designed to move beyond single-assistant conversations toward supervising teams of AI agents that divide work and operate in parallel.
This simultaneous shift reimagines human-AI interaction, moving from AI as conversation partner to AI as delegated workforce. The timing is particularly striking given that this very concept reportedly contributed to wiping $285 billion off software stock valuations during the same week.
Anthropic's Agent Teams Revolution
Anthropic's contribution centers on Claude Opus 4.6, a substantial upgrade to its flagship AI model, combined with a new "agent teams" feature in Claude Code. This system allows developers to spawn multiple AI agents that automatically split complex tasks into independent pieces, coordinate autonomously, and execute work concurrently.
The practical implementation resembles a sophisticated split-screen terminal environment where developers can navigate between subagents using keyboard shortcuts, take direct control of any agent, and monitor others as they continue working. Anthropic positions this as ideal for "tasks that split into independent, read-heavy work like codebase reviews," though the feature remains in research preview.
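The split-and-run-in-parallel pattern described above can be illustrated with a minimal Python sketch. This is not Anthropic's implementation or API: `review_module` and `run_team` are hypothetical names, and each "agent" here is just a plain function that scans a string of source code, standing in for an independent, read-heavy subtask.

```python
from concurrent.futures import ThreadPoolExecutor


def review_module(name: str, source: str) -> dict:
    """Hypothetical stand-in for one agent's independent, read-only subtask."""
    lines = source.splitlines()
    todos = sum(1 for line in lines if "TODO" in line)
    return {"module": name, "lines": len(lines), "todos": todos}


def run_team(modules: dict[str, str], max_agents: int = 4) -> list[dict]:
    """Fan out one subtask per module, run them concurrently, collect reports."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        futures = [pool.submit(review_module, name, src) for name, src in modules.items()]
        # Collect results in submission order so reports line up with the input.
        return [future.result() for future in futures]


# Toy "codebase": illustrative data only.
codebase = {
    "auth.py": "def login():\n    pass  # TODO: add rate limiting\n",
    "billing.py": "def charge():\n    return 0\n",
    "search.py": "# TODO: index\n# TODO: rank\ndef query():\n    return []\n",
}

reports = run_team(codebase)
for report in reports:
    print(report)
```

The subtasks share no state and only read their input, which is why they can run concurrently without coordination. That is the same property Anthropic highlights when it recommends the feature for independent, read-heavy work.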
The underlying Claude Opus 4.6 model represents significant technical advancement. For the first time in the Opus family, it supports up to 1 million tokens of context, enabling processing of much larger codebases or documents in single sessions. Performance improvements are dramatic: the model scored 68.8% on ARC AGI 2 compared to 37.6% for its predecessor, and achieved 76% accuracy on long-context retrieval tasks versus just 18.5% for Anthropic's Sonnet model.
OpenAI's Enterprise Agent Platform
OpenAI's Frontier platform takes a different but complementary approach, positioning itself as a way to "hire AI co-workers who take on many of the tasks people already do on a computer." Each AI agent receives individual identity, permissions, and memory systems, with integration capabilities spanning CRMs, ticketing tools, and data warehouses.
The company also released GPT-5.3-Codex, which powers their new macOS desktop app for the Codex coding tool. This model achieved 77.3% on Terminal-Bench 2.0, exceeding Anthropic's Opus 4.6 by approximately 12 percentage points on this agentic coding benchmark. OpenAI claims their own team used early versions of GPT-5.3-Codex to debug the model's training, manage deployment, and diagnose test results.
Market Disruption and Investor Concerns
The agent revolution has created significant market volatility. When Anthropic released 11 open-source plugins for their Cowork productivity tool, extending it into legal contract review, compliance workflows, financial analysis, and other professional domains, investors reacted dramatically. Software stocks experienced their steepest single-session decline since April, with Thomson Reuters leading an 18% drop that spread across global markets.
Investor fears center on AI companies packaging complete workflows that directly compete with established SaaS vendors. OpenAI's Frontier particularly concerns markets, as Fortune described it as a bid to become "the operating system of the enterprise" by enabling AI agents to log into applications and manage work with minimal human involvement.
The Reality Behind the Hype
Despite marketing language about "AI co-workers," practical experience suggests these tools function best as skill amplifiers rather than autonomous replacements. Current AI agents still require heavy human intervention to catch errors, and no independent evaluation has confirmed that multi-agent systems reliably outperform individual developers working alone.
The shift turns users into middle managers of AI: not writing code or performing analysis directly, but delegating tasks, reviewing output, and intervening when agents need direction. Anthropic's Scott White coined the term "vibe working" to describe this transition, suggesting people can now "do things with their ideas" through AI supervision.
Technical Benchmarks and Performance
Benchmark results show mixed leadership between the platforms. While GPT-5.3-Codex leads on Terminal-Bench 2.0, Claude Opus 4.6 demonstrates superior performance on ARC AGI 2 and other reasoning tasks. Both companies claim advantages over Google's Gemini 3 Pro across various evaluations, though benchmark interpretation remains challenging given the relatively new science of AI capability measurement.
The convergence around agent supervision represents a significant industry bet that the future of AI lies not in better chatbots, but in more sophisticated delegation and management systems. Whether this vision proves practical and beneficial remains an open question, but the simultaneous commitment from major players suggests this transformation is accelerating regardless of current limitations.
Note: This analysis was compiled by AI Power Rankings based on publicly available information. Metrics and insights are extracted to provide quantitative context for tracking AI tool developments.