After decades of promise and hype, AI has (finally) come of age. The 2010s saw increasingly powerful deep learning systems surpass human levels of performance in tasks like image and speech recognition, skin and breast cancer detection, and championship-level Go play. The 2020s saw the advent of generative AI systems, with their impressive ability to create high-quality original content, e.g., text, images, video, audio, or software code, in response to a user’s natural language prompts. And now, we’re seeing the emergence of agentic AI.
Wikipedia defines agentic AI as autonomous AI systems “that can make decisions and perform tasks without human intervention.” Agentic AI systems are designed to “autonomously make decisions and act, with the ability to pursue complex goals with limited supervision.” Agentic AI “brings together the flexible characteristics of large language models (LLMs) with the accuracy of traditional programming.”
A few weeks ago I posted a blog, Agentic AI: The Evolution of Application Development, mostly based on a McKinsey Digital report, “Why agents are the next frontier of generative AI.” Agentic AI systems are now taking AI to the next level, making it possible to automate processes that can plan and execute their own actions with limited human intervention. “In short, the technology is moving from thought to action.”
“Broadly speaking, agentic systems refer to digital systems that can independently interact in a dynamic world. While versions of these software systems have existed for years, the natural-language capabilities of gen AI unveil new possibilities, enabling systems that can plan their actions, use online tools to complete those tasks, collaborate with other agents and people, and learn to improve their performance.”
“Gen AI agents eventually could act as skilled virtual coworkers, working with humans in a seamless and natural manner,” the McKinsey report added. “A virtual assistant, for example, could plan and book a complex personalized travel itinerary, handling logistics across multiple travel platforms. Using everyday language, an engineer could describe a new software feature to a programmer agent, which would then code, test, iterate, and deploy the tool it helped create.”
We’ve been developing applications to automate tasks and processes with programming languages and tools since the very early days of computers. Over the years, application development platforms have become increasingly sophisticated. Significantly higher-level languages and tools have enabled us to automate more and more of the labor involved in developing highly complex applications.
Programmers are now becoming high-level application engineers, whose job is to coordinate the design and development of the overall system, including the creation of the AI agents that are integral components of that system. A major part of the application engineer’s job is to define the goals and actions that the AI agents should follow using LLMs and other tools, and to test that the overall application system will work as intended under a variety of conditions.
Some of the articles I’ve read about agentic AI seem to imply that more people can now become application developers, because you can just tell the system what you are after using an LLM and the system automatically does it, much like asking a chatbot a question. While that might work for very simple applications, let’s remember that the development of complex applications is a very sophisticated engineering discipline. Moreover, the knowledge and experience required to develop software applications goes up very rapidly with the complexity of the overall system being developed, much as it does with just about any engineering job. CAD/CAM tools, for example, have had a huge impact on the design and manufacturing of complex physical objects like bridges, cars, and skyscrapers, but no amount of AI will enable inexperienced engineers to design and build such objects.
“The way humans interact and collaborate with AI is taking a dramatic leap forward with agentic AI,” wrote economics and technology advisor Mark Purdy in a recent Harvard Business Review (HBR) article, “What Is Agentic AI, and How Will It Change Work?,” which nicely explains the potential value of agentic AI.
“While previous AI assistants were rules-based and had limited ability to act independently, agentic AI will be empowered to do more on our behalf,” wrote Purdy. “The agentic AI system understands what the goal or vision of the user is and the context to the problem they are trying to solve.” Some of the possibilities opened up by agentic AI systems include: “AI-powered agents that can plan your next trip overseas and make all the travel arrangements; humanlike bots that act as virtual caregivers for the elderly; or AI-powered supply-chain specialists that can optimize inventories on the fly in response to fluctuations in real-time demand.”
“To achieve this level of autonomous decision-making and action, agentic AI relies on a complex ensemble of different machine learning, natural language processing, and automation technologies,” Purdy explained. While agentic AI systems harness the creative abilities of generative AI models such as ChatGPT, they differ in several ways: they are focused on making decisions rather than on creating content; they don’t rely on human prompts, but rather on optimizing particular goals or objectives, such as maximizing sales, customer satisfaction scores, or efficiency in supply-chain processes; and, “unlike generative AI, they can also carry out complex sequences of activities, independently searching databases or triggering workflows to complete activities.”
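To make that difference concrete, here is a minimal sketch in Python of the kind of goal-driven loop Purdy describes: the system is given an objective and live data rather than a one-off prompt, decides what to do, and triggers actions on its own. Every name in the sketch (check_inventory, forecast_demand, trigger_reorder, call_llm) is a hypothetical placeholder, not any particular product’s API.

```python
# Minimal, hypothetical sketch of a goal-driven agent: it acts on an objective
# and live data, not on a one-off human prompt. check_inventory, forecast_demand,
# trigger_reorder, and call_llm are placeholder stubs, not a real API.
import json

OBJECTIVE = "Keep stock-outs near zero while minimizing excess inventory."

def call_llm(prompt: str) -> str:
    """Stand-in for a call to any LLM API; expected to return a JSON string."""
    raise NotImplementedError

def check_inventory() -> dict:
    """Stand-in for reading current stock levels, e.g., from an ERP system."""
    raise NotImplementedError

def forecast_demand() -> dict:
    """Stand-in for a demand forecast, e.g., from a planning service."""
    raise NotImplementedError

def trigger_reorder(sku: str, quantity: int) -> None:
    """Stand-in for kicking off a purchasing workflow."""
    raise NotImplementedError

def inventory_agent_step() -> None:
    """One pass of the loop: observe the world, decide against the objective, act."""
    state = {"inventory": check_inventory(), "forecast": forecast_demand()}
    decision = json.loads(call_llm(
        f"Objective: {OBJECTIVE}\n"
        f"Current state: {json.dumps(state)}\n"
        'Respond with JSON: {"reorders": [{"sku": "...", "quantity": 0}]}'
    ))
    for order in decision.get("reorders", []):
        trigger_reorder(order["sku"], order["quantity"])  # the agent acts; it does not just answer
```

The shape of the loop is the point: the human supplies the objective and the guardrails, while the system observes, decides, and acts.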
The HBR article cites three of the main benefits that we can expect from agentic AI systems:
- Greater workforce specialization. “Specialization brings greater efficiency, learning by doing, and innovation — but can be difficult to implement as businesses run up against workforce shortages and mismatches between roles and available human skills. Because agentic models are explicitly designed to carry out very granular tasks, they enable much greater specialization of roles compared with previous broad-brush automation systems. What’s more, multiple agentic roles can be created rapidly.”
- Greater information trustworthiness. “The greater cognitive reasoning of agentic AI systems means that they are less likely to suffer from the so-called hallucinations (or invented information) common to generative AI systems. Agentic AI systems also have significantly greater ability to sift and differentiate information sources for quality and reliability, increasing the degree of trust in their decisions.”
- Enhanced innovation. “With their enhanced judgement and powers of execution, agentic AI systems are ideal for experimentation and innovation. … Multi-agent AI models can also scan and analyse vast research spaces — such as scientific articles and databases — in a fraction of the time it would take teams of human scientists and researchers.”
Let me conclude by discussing a recent article, “Building effective agents,” by the AI research startup Anthropic, which offers sound advice based on their experience with dozens of teams building LLM-based agents across a variety of industries.
The article starts out by explaining the difference between AI agents and workflows. “Some customers define agents as fully autonomous systems that operate independently over extended periods, using various tools to accomplish complex tasks. Others use the term to describe more prescriptive implementations that follow predefined workflows.” While both are variations of agentic systems, there is an important architectural distinction between workflows and agents (a short code sketch contrasting the two follows the definitions below):
- Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
- Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
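The distinction is easiest to see in code. The sketch below, with a hypothetical call_llm() placeholder and toy tools, contrasts a workflow, where the code path is fixed in advance and the LLM only fills in the steps, with an agent, where the LLM decides which tool to call next and when it is finished.

```python
# Minimal sketch of the workflow/agent distinction; call_llm() is a hypothetical
# placeholder for any LLM API, and the tools are toys.
import json

def call_llm(prompt: str) -> str:
    """Stand-in for any LLM API call."""
    raise NotImplementedError

# Workflow: the code path is fixed in advance; the LLM only fills in each step.
def ticket_reply_workflow(ticket_text: str) -> str:
    category = call_llm(f"Classify this support ticket in one word: {ticket_text}")
    summary = call_llm(f"Summarize this {category} ticket in two sentences: {ticket_text}")
    return call_llm(f"Draft a polite reply based on this summary: {summary}")

# Agent: the LLM decides which tool to call next and when it is finished.
TOOLS = {
    "search_docs": lambda query: f"(top documentation hits for {query!r})",  # toy tool
    "run_tests": lambda _: "all tests passed",                               # toy tool
}

def run_agent(task: str, max_steps: int = 10) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        decision = json.loads(call_llm(
            "\n".join(history)
            + '\nChoose the next action as JSON: '
              '{"tool": "<tool name or finish>", "input": "...", "answer": "..."}'
        ))
        if decision["tool"] == "finish":
            return decision["answer"]
        result = TOOLS[decision["tool"]](decision["input"])
        history.append(f'{decision["tool"]}({decision["input"]!r}) -> {result}')
    return "Stopped: step limit reached."
```

In the workflow, the developer has already decided the sequence of steps; in the agent, the developer supplies the tools and a step budget, and the model supplies the control flow.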
The article offers advice as to when (and when not) to use agents:
- “Consistently, the most successful implementations weren’t using complex frameworks or specialized libraries. Instead, they were building with simple, composable patterns.”
- “When building applications with LLMs, we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all. Agentic systems often trade latency and cost for better task performance, and you should consider when this tradeoff makes sense.”
- “When more complexity is warranted, workflows offer predictability and consistency for well-defined tasks, whereas agents are the better option when flexibility and model-driven decision-making are needed at scale. For many applications, however, optimizing single LLM calls with retrieval and in-context examples is usually enough.” (A sketch of that simpler pattern follows this list.)
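For reference, here is roughly what that simpler starting point, a single LLM call augmented with retrieval and in-context examples, might look like; retrieve() and call_llm() are hypothetical placeholders rather than any specific library.

```python
# Minimal sketch of a single augmented LLM call: retrieval plus in-context examples,
# no agent loop. retrieve() and call_llm() are hypothetical placeholders.
def retrieve(query: str, k: int = 3) -> list[str]:
    """Stand-in for a retrieval step (vector or keyword search over your documents)."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Stand-in for any LLM API call."""
    raise NotImplementedError

# In-context examples showing the desired answer format (illustrative content only).
FEW_SHOT_EXAMPLES = """\
Q: How do I reset my password?
A: Go to Settings > Security > Reset password. (source: user-guide.md)
Q: Can I export my data?
A: Yes, via Settings > Account > Export. (source: admin-guide.md)
"""

def answer_question(question: str) -> str:
    """One LLM call: retrieved context and examples go into a single prompt."""
    context = "\n\n".join(retrieve(question))
    prompt = (
        "Answer using only the context below, citing the source file.\n\n"
        f"Context:\n{context}\n\n"
        f"Examples:\n{FEW_SHOT_EXAMPLES}\n"
        f"Q: {question}\nA:"
    )
    return call_llm(prompt)
```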
Success in the LLM space isn't about building the most sophisticated system. It's about building the right system for your needs. Start with simple prompts, optimize them with comprehensive evaluation, and add multi-step agentic systems only when simpler solutions fall short. When implementing agents, try to follow these three core principles:
- Maintain simplicity in your agent's design.
- Prioritize transparency by explicitly showing the agent’s planning steps.
- Carefully craft your agent-computer interface (ACI) through thorough tool documentation and testing.
“By following these principles, you can create agents that are not only powerful but also reliable, maintainable, and trusted by their users.”
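As one way of picturing the second and third of those principles, here is a small, purely illustrative sketch in which the one tool the model can use is documented as carefully as an API meant for human developers, and the agent’s plan is surfaced before any tool is invoked; none of the names are taken from Anthropic’s article.

```python
# Purely illustrative sketch of principles 2 and 3: a carefully documented tool
# spec (the agent-computer interface) and a visible planning step. call_llm() is
# a hypothetical placeholder; the tool and schema are invented for illustration.
import json

def call_llm(prompt: str) -> str:
    """Stand-in for any LLM API call."""
    raise NotImplementedError

# Principle 3: the tool description the model sees is written and tested as
# carefully as documentation aimed at a human developer.
SEARCH_ORDERS_SPEC = {
    "name": "search_orders",
    "description": "Search customer orders by order id or customer email. "
                   "Returns at most 10 matches as a JSON list; returns [] if nothing matches.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "Order id or customer email."}},
        "required": ["query"],
    },
}

def plan_then_act(task: str) -> dict:
    """Ask for an explicit plan first (principle 2), then a tool call that follows the spec."""
    step = json.loads(call_llm(
        f"Task: {task}\nTool available:\n{json.dumps(SEARCH_ORDERS_SPEC)}\n"
        'Reply as JSON: {"plan": "...", "tool_input": {"query": "..."}}'
    ))
    print("AGENT PLAN:", step["plan"])  # principle 2: show the planning step to the user
    return step["tool_input"]           # principle 1: keep the rest of the loop simple
```

Even in this toy form, the emphasis matches the article’s advice: the effort goes into clear interfaces and visible reasoning, not into framework complexity.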