Here is what happened, and the challenges that lie ahead in 2026

In artificial intelligence, 2025 marked a decisive change. Systems previously confined to research laboratories and prototypes began to appear as everyday tools. At the center of this transition was the rise of AI agents: AI systems that can use other software tools and act on their own.

Although researchers have studied AI for more than 60 years, and the term “agent” has long been part of the field’s vocabulary, 2025 was the year the concept became concrete for developers and consumers alike.

AI agents moved from theory to infrastructure, transforming the way people interact with large language models, the systems that power chatbots like ChatGPT.

In 2025, the working definition of an AI agent shifted from the academic one, systems that perceive, reason, and act, toward AI company Anthropic’s description: large language models capable of using software tools and taking autonomous actions. Large language models have long excelled at text-based responses; the recent change is their growing ability to act, using tools, calling APIs, coordinating with other systems, and completing tasks independently.
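
To make that shift concrete, here is a minimal sketch of the tool-use loop that underlies most agents. The `call_model` function, the reply format, and the toy weather tool are hypothetical stand-ins rather than any vendor’s actual API; the point is the control flow, in which the model either answers or requests a tool call.

```python
import json

def get_weather(city: str) -> str:
    """A toy tool the model is allowed to invoke."""
    return f"Sunny in {city}"  # a real tool would call a weather API

TOOLS = {"get_weather": get_weather}

def run_agent(user_message: str, call_model) -> str:
    """Loop until the model answers in plain text instead of calling a tool."""
    messages = [{"role": "user", "content": user_message}]
    while True:
        reply = call_model(messages, tools=list(TOOLS))  # assumed interface
        if reply["type"] == "tool_call":
            result = TOOLS[reply["name"]](**json.loads(reply["arguments"]))
            messages.append({"role": "tool", "content": result})
        else:
            return reply["content"]  # no tool requested: the agent is done
```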

This change did not happen overnight. A key turning point came in late 2024, when Anthropic launched the Model Context Protocol. The protocol allowed developers to connect large language models to external tools in a standardized way, effectively giving the models the ability to act beyond generating text. With that, the stage was set for 2025 to become the year of AI agents.
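
As an illustration, here is roughly what a tool server looks like under the Model Context Protocol, using the official `mcp` Python SDK (`pip install mcp`). Details vary across SDK versions, so treat this as a sketch: the key idea is that any MCP-aware model can discover and call `add` through the standardized protocol, with no model-specific glue code.

```python
from mcp.server.fastmcp import FastMCP

# Declare a named MCP server; MCP clients discover its tools automatically.
server = FastMCP("demo-tools")

@server.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

if __name__ == "__main__":
    server.run()  # serves the tool over stdio by default
```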


The milestones that defined 2025

The momentum accelerated rapidly. In January, the launch of China’s DeepSeek-R1 as an open-weight model shattered assumptions about who could build large, high-performance language models, briefly shaking up markets and intensifying global competition. An open-weight model is one whose trained parameters, the numerical values known as weights, are publicly released.

Throughout 2025, large US labs such as OpenAI, Anthropic, Google, and xAI brought ever larger, high-performing models to market, while Chinese technology companies such as Alibaba, Tencent, and DeepSeek expanded the open-model ecosystem to the point that Chinese open models have been downloaded more than American ones.

Another turning point came in April, when Google introduced its Agent2Agent protocol. While Anthropic’s Model Context Protocol focused on how agents use tools, Agent2Agent addressed how agents communicate with each other.
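
A loose sketch of the Agent2Agent idea: each agent publishes a JSON “Agent Card” describing its skills, which a peer can fetch before delegating a task. The field names and the well-known path follow early versions of the public A2A spec and should be read as illustrative, not authoritative; the host is hypothetical.

```python
import json
import urllib.request

# What this agent advertises about itself (an "Agent Card").
AGENT_CARD = {
    "name": "trip-planner",
    "description": "Plans multi-city itineraries",
    "url": "https://agents.example.com/trip-planner",  # hypothetical host
    "skills": [{"id": "plan_trip", "description": "Build an itinerary"}],
}

def discover(base_url: str) -> dict:
    """Fetch a peer agent's card from the spec's well-known path."""
    with urllib.request.urlopen(f"{base_url}/.well-known/agent.json") as resp:
        return json.load(resp)
```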

Crucially, the two protocols were designed to work together. Later that year, Anthropic and Google both donated their protocols to the Linux Foundation, the open-source software nonprofit, cementing them as open standards rather than proprietary experiments.

These developments were quickly integrated into consumer products. In mid-2025, “agent browsers” began to appear: tools like Perplexity’s Comet, the Browser Company’s Dia, OpenAI’s ChatGPT Atlas, and Microsoft’s Copilot in Edge embed agents directly in the browser. Instead of just helping you find vacation options, for example, an agent browser can take part in booking the trip.

At the same time, workflow builders like n8n and Google’s Antigravity lowered the technical barrier to creating custom agent systems, extending what coding agents like Cursor and GitHub Copilot had already made possible.

New power, new risks

As agents became more capable, their risks became harder to ignore. In November, Anthropic revealed how its agent Claude Code had been misused to automate parts of a cyberattack. The incident illustrated a broader concern: By automating technical and repetitive jobs, AI agents can also lower the barrier to malicious activity.

This tension defined much of 2025. AI agents expanded what individuals and organizations could do, but they also amplified existing vulnerabilities. Systems that were once isolated text generators became interconnected actors, using tools and operating with little human supervision.


What to watch in 2026

Looking ahead, several open questions will likely shape the next phase of AI agents.

One is benchmarks. Traditional benchmarks, which resemble a structured exam with a fixed set of questions and standardized scoring, work well for individual models. But agents are systems made up of models, tools, memory, and decision logic, and a growing number of researchers want to evaluate not only the results but also the process, much like asking students to show their work, not just their final answers.
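
One hypothetical way to “grade the work” is to score the agent’s recorded trajectory alongside its final answer. The trace format and metrics below are invented for illustration, not an established benchmark.

```python
from dataclasses import dataclass

@dataclass
class Step:
    tool: str   # which tool the agent called
    ok: bool    # whether the call succeeded

def score_trajectory(steps: list[Step], final_correct: bool) -> dict:
    """Score the process (how the agent worked), not just the outcome."""
    wasted = sum(1 for step in steps if not step.ok)
    return {
        "outcome": 1.0 if final_correct else 0.0,
        "efficiency": 1.0 - wasted / max(len(steps), 1),
        "steps_taken": len(steps),
    }
```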

Progress here will be critical to improving reliability and trust, and to ensuring that an AI agent actually performs the task it was given. One starting point is to establish clear definitions that distinguish AI agents from AI workflows. Organizations will need to decide exactly where agents fit into existing workflows, or whether to build new workflows around them.
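
One way to pin those definitions down in code: a workflow is a fixed sequence of model calls wired by the developer, while an agent lets the model choose its next step at run time. The `llm` completion function and the tool-selection convention here are hypothetical.

```python
def workflow(document: str, llm) -> str:
    """Workflow: a deterministic pipeline with the same steps every time."""
    summary = llm(f"Summarize: {document}")
    return llm(f"Translate to French: {summary}")

def agent(goal: str, llm, tools: dict, max_steps: int = 5) -> str:
    """Agent: the model decides which tool to run next, or stops."""
    notes = goal
    for _ in range(max_steps):
        choice = llm(f"Goal and notes so far: {notes}\n"
                     f"Tools: {list(tools)}\n"
                     "Reply with a tool name, or DONE <answer>.")
        if choice.startswith("DONE"):
            return choice.removeprefix("DONE").strip()
        notes += "\n" + tools[choice.strip()](notes)  # run the chosen tool
    return notes
```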

Another development to follow is governance. In late 2025, the Linux Foundation announced the creation of the Agentic AI Foundation, signaling an effort to establish shared standards and best practices. If successful, it could play a similar role to the World Wide Web Consortium in shaping an open and interoperable agent ecosystem.

There is also a growing debate about the size of models. While large, general-purpose models dominate the headlines, smaller, specialized models are often better suited for specific tasks. As agents become configurable tools for both consumers and businesses, whether through browsers or workflow management software, the power to choose the right model is increasingly shifting to users rather than labs or corporations.

The challenges ahead

Despite the optimism, significant sociotechnical challenges remain. The expansion of data center infrastructure strains energy grids and affects local communities. In the workplace, workers express concerns about automation, job displacement, and surveillance.

From a security perspective, connecting models to tools and stacking agents multiplies risks that remain unresolved in standalone large language models. In particular, AI professionals are grappling with indirect prompt injection, in which malicious instructions are hidden in web content that agents read, steering them toward harmful or unintended actions.
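
Defenses are still an open research problem, but one common mitigation is to quarantine fetched content so the agent treats it as data rather than instructions. The delimiter convention and the naive pattern filter below are illustrative only; a regex cannot reliably catch real-world injections.

```python
import re

# Naive screen for instruction-like phrases hidden in fetched content.
SUSPICIOUS = re.compile(
    r"ignore (all|previous) instructions|you are now|system prompt",
    re.IGNORECASE,
)

def quarantine(web_text: str) -> str:
    """Wrap untrusted web content so the agent reads it as data only."""
    if SUSPICIOUS.search(web_text):
        web_text = "[removed: possible injected instructions]"
    return f"<untrusted-data>\n{web_text}\n</untrusted-data>"
```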

Regulation is another unresolved issue. Compared to Europe and China, the United States has relatively limited oversight of algorithmic systems. As AI agents integrate into digital life, questions about access, accountability, and boundaries remain largely unanswered.

Meeting these challenges will require more than technical advances. It will require rigorous engineering practices, careful design, and clear documentation of how systems work and how they fail. Only by treating AI agents as sociotechnical systems, not simple software components, will we, I believe, be able to build an AI ecosystem that is both innovative and secure.

*Thomas Şerban von Davier is an affiliate faculty member at the Carnegie Mellon Institute for Strategy and Technology at Carnegie Mellon University.

This text was originally published in The Conversation
