An Analysis of Anthropic's Guide to Building Effective Agents
A critical analysis of Anthropic's Effective Agents Guide — highlighting ambiguities, great advice, and doubtful advice.
In December 2024 Anthropic released a guide to building effective AI agents in which it defines what it considers “agentic” systems and offers guidelines, patterns and recommendations on how to design them. This is catnip for AgentsDecoded, as it covers definitions - definitely a favourite subject over here - as well as more practical recommendations.
Before we jump in, let me preface this analysis by saying that I really like the guide and think it will be useful for lots of developers. If I sound overly critical or harsh, that is just a function of thinking the topic through in the hope of getting to a better understanding myself.
What are agents according to Anthropic
Anthropic uses three terms to describe these systems.
“Agentic” - this, as far as I can tell, represents anything anyone else is willing to call an agent as long as it uses an LLM. So any application that makes use of LLMs in some way could be called “agentic” except maybe single prompt API calls. These “agentic” systems are then divided into workflows and agents.
I think agents are first and foremost a design stance on how to approach a problem, where you are working with goal-directed components. LLMs are just a technology. It may seem unlikely today but we will eventually have agents that don’t use LLMs - or at least LLMs are not their only choice. Definitions based on the how and not the what will not stand the test of time. Anyway, back to Anthropic’s definitions.
“Workflows” are systems where prompts are orchestrated through a pre-determined code path. They are “agentic” but not quite full agents (apparently).
“Agents” are systems where LLMs direct their own processes and tool usage while “maintaining control” over how they accomplish tasks.
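To make the distinction concrete, here is a minimal sketch of one way to read the two definitions. It is not code from Anthropic's guide: `fake_llm`, `scripted_model`, `TOOLS`, `run_workflow` and `run_agent` are all hypothetical names, and the "models" are canned stand-ins so that the example runs without any provider.

```python
from typing import Callable, Dict, List


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call (hypothetical); returns a canned reply."""
    return f"<reply to: {prompt}>"


def run_workflow(text: str) -> str:
    """Workflow: our code fixes the order of the LLM calls in advance."""
    summary = fake_llm(f"Summarise: {text}")
    return fake_llm(f"Translate to French: {summary}")


# Hypothetical tool registry; a real one would wrap actual APIs.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query}",
}


def scripted_model(history: List[str]) -> str:
    """Stand-in for a model that chooses its own next step (hypothetical).

    It asks for one search and then finishes; a real model would decide freely.
    """
    if not any(entry.startswith("search ->") for entry in history):
        return "search:agent definitions"
    return "finish:done"


def run_agent(task: str, max_steps: int = 5) -> List[str]:
    """Agent: the model's output decides which step runs next."""
    history = [task]
    for _ in range(max_steps):
        action, _, argument = scripted_model(history).partition(":")
        if action == "finish":
            history.append(f"finish -> {argument}")
            break
        history.append(f"{action} -> {TOOLS[action](argument)}")
    return history
```

In `run_workflow` the sequence of calls is written down by the developer; in `run_agent` the loop only executes whatever the model asks for next, up to `max_steps`.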
Unfortunately, these definitions are not that clear, or even consistent with one another.
The workflows presented in the same post describe a number of reasonable patterns of decision-making using LLMs. Set side by side with something like AutoGen, they could be taken to represent what AutoGen actually defines as multi-agent systems! The only aspect missing is calling out to a tool. However, most such workflows in real-life implementations would then be connected to APIs to effect change. It’s just that those APIs are not embedded in the LLM. The result, however, can be identical to how an LLM plus tools would behave.
The LLM can do the reasoning, and on the back of that our system can decide to execute other code that performs some function. Relegating these systems to second-class citizens because they don’t use the tools that are embedded in the LLM does not seem to provide any actual design benefit.
Take the routing workflow that Anthropic defines as in the image below:
Now, let’s embed this within our system and add a function ourselves - not a tool in the LLM sense but some logic that based on the output of an LLM Call will call a function:
I don’t think this is conceptually different from agents that “direct their own processes” and maintain control. The key difference is that the function execution is decoupled from the LLM more explicitly. Instead of happening within the LLM infrastructure coming from Anthropic or another provider, it is happening within our own system. I am sure there are pros and cons to each approach, and I am not advocating against using tools; it’s just that using tools should not be a condition for something to be called an agent.
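The routing-plus-function setup described above can be sketched in a few lines. This is an illustration under stated assumptions, not Anthropic's implementation: `classify_with_llm` is a hypothetical stand-in for the routing LLM call (keyword matching keeps it runnable offline), and the handler names and route labels are invented for the example.

```python
from typing import Callable, Dict


def classify_with_llm(message: str) -> str:
    """Stand-in for the routing LLM call (hypothetical).

    A real system would send `message` to a model and ask it to reply
    with exactly one route label.
    """
    text = message.lower()
    if "refund" in text:
        return "refund"
    if "password" in text:
        return "reset_password"
    return "general"


# Plain functions that live in our own system, not tools registered with the LLM.
def issue_refund(message: str) -> str:
    return "refund ticket opened"


def reset_password(message: str) -> str:
    return "password reset email sent"


def answer_general(message: str) -> str:
    return "forwarded to general support"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "refund": issue_refund,
    "reset_password": reset_password,
    "general": answer_general,
}


def route(message: str) -> str:
    """Let the (stubbed) LLM pick the route, then run the function ourselves."""
    label = classify_with_llm(message)
    handler = HANDLERS.get(label, answer_general)  # fall back on unknown labels
    return handler(message)


print(route("I would like a refund for my last order"))
```

The model only produces a label; the decision to execute `issue_refund` or `reset_password` is taken by our own dispatch table, which is the decoupling the paragraph above describes.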
When to use agents according to Anthropic
The next bit of advice from Anthropic is gold though.
“When building applications with LLMs, we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all. Agentic systems often trade latency and cost for better task performance, and you should consider when this tradeoff makes sense.”
💯 - this is the biggest risk I see with all the excitement and hype around AI Agents.
Teams will jump on the bandwagon and want to do something “cool” and “correct”, so they will want a system that can be correctly described as an AI Agent from a technical perspective even though it has no need to be one from a functional perspective. This takes me back to why definitions are important!
When to use frameworks according to Anthropic
The next bit of advice from Anthropic I find a little strange. They suggest that developers should steer away from frameworks because they may not fully grasp the code of the framework. Instead, they should start directly with the API.
First of all, an LLM provider concerned with using systems that may not be fully understood is just funny. Jokes aside though, frameworks are a well-established way to speed up development. Frameworks capture the collective wisdom of the community on how to approach a problem and solve it. Starting from the APIs may well lead teams down the path of re-inventing the wheel. A more nuanced approach would be better suited here.
Conclusions
Anthropic's guide to building AI agents offers useful insights but has areas that could do with refinement. While practical, its definitions of “agentic,” “workflows,” and “agents” are not consistent. I will find myself repeating this over and over, but agent definitions should focus on the what and not the how. Agents should be seen as a design approach rather than tied to specific technologies like LLMs.
Anthropic’s advice to prioritize simplicity over complexity is spot-on. Developers should avoid over-engineering AI systems just to match trends, focusing instead on functional needs.
The caution against frameworks is debatable. While understanding underlying principles is important, frameworks save time and reflect community best practices. A balanced approach combining frameworks and APIs is more practical.
Overall, the guide provides valuable recommendations, but developers, as ever, should critically assess definitions, frameworks, and design choices.