Chances are someone on your team has mentioned “agents” in the last few months: autonomous systems powered by large language models (LLMs) that can carry out entire workflows on behalf of users, customers, or even internal teams. If you’re intrigued, you’re not alone. Everyone is either working on them or should be.
Agents are the logical evolution of chatbots and automation scripts. They don’t just answer questions or fill out forms; they reason, act, adapt, and interact. But here’s the catch: building them is complex, and that complexity is invisible to end users. As product leaders, we have a responsibility to understand how that complexity affects experience, trust, and ultimately the value of the products we help create.
What Makes Agents Different?
Most people have interacted with LLMs through tools like ChatGPT. These models can generate content or answer questions in a single turn, but they don’t own the outcome. Agents do.
A true agent is more than a prompt wrapped in an interface. It executes defined multi-step workflows, makes decisions based on evolving context, pulls in external data from APIs, and knows when to pause or escalate. For example, a customer support agent could process a refund request, check it for policy compliance, attempt an API call to the finance system, and, if needed, transfer the case to a human, all autonomously.
This blend of independence and task ownership is what makes agents a new class of product experience. But it’s also what makes them fragile if not designed carefully.
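To make that refund example concrete, here’s a minimal sketch of the flow in plain Python. Every function here (check_policy, call_finance_api, escalate_to_human) is a hypothetical stub standing in for a real integration, not any particular framework’s API:

```python
# Hypothetical stubs for the refund workflow described above.

def check_policy(request: dict) -> bool:
    """Stub: verify the request against refund policy."""
    return request["amount"] <= 100  # e.g., auto-approve small refunds only

def call_finance_api(request: dict) -> bool:
    """Stub: attempt the refund via the finance system."""
    return True  # pretend the API call succeeded

def escalate_to_human(request: dict, reason: str) -> None:
    print(f"Escalating request {request['id']} to a human: {reason}")

def handle_refund(request: dict) -> str:
    # The agent owns the outcome: act where it can, escalate where it can't.
    if not check_policy(request):
        escalate_to_human(request, "policy check failed")
        return "escalated"
    if not call_finance_api(request):
        escalate_to_human(request, "finance API error")
        return "escalated"
    return "refunded"

print(handle_refund({"id": "R-1", "amount": 40}))  # -> refunded
```

The logic itself is trivial; the point is the ownership. The agent decides which branch to take and when a human needs to step in.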
Agent Foundations: Model, Tools, Instructions
As OpenAI’s A Practical Guide to Building Agents lays out, every agent is built on three key components: the model, the tools it can use, and the instructions that guide its behavior.
The model powers the agent’s reasoning. Start with a capable model like GPT-4o to establish a baseline. Once the system works, consider optimizing with smaller, faster models for specific subtasks. For instance, you might use a lightweight model for document classification and reserve the heavyweight reasoning for complex policy checks.
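One lightweight way to express that split is a routing table from subtask to model. The task labels and the small-model identifier below are assumptions for illustration, not fixed API names:

```python
# Illustrative routing of subtasks to models of different capability/cost.
MODEL_BY_TASK = {
    "classify_document": "small-fast-model",  # cheap, low-latency subtask
    "policy_check": "gpt-4o",                 # heavyweight reasoning
}

def pick_model(task: str) -> str:
    # Default to the most capable model when the task is unknown.
    return MODEL_BY_TASK.get(task, "gpt-4o")

print(pick_model("classify_document"))  # -> small-fast-model
```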
Tools allow the agent to take action: sending emails, querying databases, or updating CRM records. These should be modular and well-documented, ideally exposed as APIs that PMs can inspect. Don’t underestimate how many workflows break when an agent calls the wrong function because of a vague tool definition. You will need to troubleshoot these with your development teams, so know where to look.
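Here’s what a well-specified tool definition can look like, in the JSON-schema style used by LLM function calling. The field values are illustrative, but notice how the description tells the agent when to use the tool and when not to:

```python
# A tool definition with a precise name, description, and typed parameters.
# Vague versions of these fields are a common cause of wrong tool calls.
refund_tool = {
    "type": "function",
    "function": {
        "name": "issue_refund",
        "description": "Issue a refund for a single order. Use only after "
                       "policy checks pass. Do not use for exchanges.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "The order to refund."},
                "amount": {"type": "number", "description": "Refund amount in USD."},
            },
            "required": ["order_id", "amount"],
        },
    },
}
```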
Instructions are the rules of the road. They tell the agent what to do and how to do it. Unlike traditional code, instructions can be generated or updated dynamically, i.e., swapped in on the fly at any point during a run. That said, a best practice is to base them on real documents like SOPs or knowledge base articles and convert them into step-by-step routines. This reduces ambiguity and increases consistency.
Orchestration: Scaling Intelligence Without Breaking UX
Orchestration is how agents actually execute workflows.
The simplest setup is a single-agent system. One agent, one set of tools, one loop. This works well for MVPs and narrow use cases. You can incrementally add tools or adapt prompt templates without introducing unnecessary complexity.
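Reduced to its essentials, a single-agent system really is just one loop. In this sketch, call_model and run_tool are hypothetical stand-ins for a real LLM call and a real tool dispatcher:

```python
# One agent, one tool set, one loop.

def call_model(messages: list, tools: dict) -> dict:
    """Stub: in practice, this is one LLM API call."""
    return {"done": True, "reply": "Your refund has been processed."}

def run_tool(name: str, args: dict) -> str:
    """Stub: dispatch to the real tool implementation."""
    return "ok"

def run_agent(user_message: str, tools: dict, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):  # cap turns so the agent can't loop forever
        result = call_model(messages, tools)
        if result["done"]:
            return result["reply"]
        tool_output = run_tool(result["tool"], result["args"])
        messages.append({"role": "tool", "content": tool_output})
    return "Escalating to a human after too many turns."

print(run_agent("Refund order 42, please", tools={}))
```

Adding a tool or tweaking a prompt template means editing one place, which is exactly why this setup is the right starting point for an MVP.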
But as workflows expand, so does the need to break things apart. That’s where multi-agent systems come in. These fall into two main patterns:
The manager pattern, where a central agent delegates tasks to specialized sub-agents (think translation manager handing off to language-specific bots).
The decentralized pattern, where agents pass control among themselves based on context, like triage, support, and sales agents routing a customer inquiry.
Both patterns have pros and cons. Manager-led systems maintain coherence and control. Decentralized systems offer flexibility and parallelism. The key is to design for maintainability and clarity, not just technical elegance.
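The manager pattern, in miniature, looks like the sketch below. The sub-agents here are trivial lambdas standing in for language-specific bots; the shape of the delegation is what matters:

```python
# A central manager delegates per-language work to specialized sub-agents
# and assembles the result. Sub-agent internals are illustrative stand-ins.
SUB_AGENTS = {
    "fr": lambda text: f"[French translation of: {text}]",
    "es": lambda text: f"[Spanish translation of: {text}]",
}

def manager(text: str, languages: list) -> dict:
    # The manager owns the overall task and hands off the pieces.
    return {lang: SUB_AGENTS[lang](text) for lang in languages}

print(manager("Hello", ["fr", "es"]))
```

In the decentralized pattern, there is no manager function; each agent would decide for itself when to hand the conversation to a peer.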
Instructions That Don’t Confuse Your Agent
Instruction design deserves more attention than it gets. Poorly written instructions lead to hallucinations, tool misuse, and circular logic. Good instructions follow a few simple rules:
Break dense policies into numbered steps
Explicitly define expected actions and outputs
Anticipate edge cases like missing inputs or ambiguous user requests
Need help generating clean instructions? Use the model itself. You can prompt an LLM to convert a help center article into executable routines with minimal manual editing.
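Here’s what that can look like with the OpenAI Python SDK. The prompt wording and the refund_policy.md file are assumptions, not a canonical recipe:

```python
# Sketch: ask the model to turn a help-center article into an agent routine.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

article = open("refund_policy.md").read()  # hypothetical source document

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": (
            "Convert the following help-center article into a numbered, "
            "step-by-step routine an agent can follow. Define the expected "
            "action and output for each step, and call out edge cases such "
            "as missing inputs.\n\n" + article
        ),
    }],
)
print(response.choices[0].message.content)
```

A human still needs to review the output, but this turns instruction-writing from a blank-page problem into an editing problem.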
Guardrails: Safety Isn’t Optional
Here’s where things get real. Agents can hallucinate. They can leak prompts. They can issue refunds when they shouldn’t. That’s why guardrails are essential.
You need layers of defense:
Relevance classifiers to keep conversations on-topic
Safety classifiers to catch jailbreaks or prompt injections
PII filters to prevent leaking personal data
Moderation filters to block toxic or harmful content
Tool safeguards to assign risk levels and pause high-impact actions
Rules-based filters for banned keywords or SQL patterns
Output validation to enforce brand tone and prevent embarrassing content
No single guardrail is perfect, but layered together, they reduce the chance of unexpected behavior. Just as importantly, agents should have human fallback logic. If the system fails repeatedly or hits a risky decision point, escalate.
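Layered, these guardrails amount to a simple pipeline where each check can block an input or pause an action for review. The checks below are deliberately crude stubs; in production they’d be classifiers, PII filters, and per-tool risk policies:

```python
# Layered guardrails as a pipeline, with human fallback on risky actions.
import re

BANNED_PATTERNS = [r"(?i)drop\s+table", r"(?i)ignore previous instructions"]

def rules_filter(text: str) -> bool:
    # Rules-based filter for banned keywords or SQL patterns.
    return not any(re.search(p, text) for p in BANNED_PATTERNS)

def relevance_check(text: str) -> bool:
    return True  # stub: a small relevance classifier in practice

def tool_is_high_risk(tool_name: str) -> bool:
    return tool_name in {"issue_refund", "delete_account"}

def guarded_call(user_text: str, tool_name: str) -> str:
    if not rules_filter(user_text) or not relevance_check(user_text):
        return "blocked"
    if tool_is_high_risk(tool_name):
        return "paused_for_human_review"  # escalate high-impact actions
    return "allowed"

print(guarded_call("please refund order 42", "issue_refund"))  # -> paused
```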
Architecture = Experience = Value
Let’s tie this back to product value.
A well-architected agent feels smart. It gets things done quickly, speaks clearly, and doesn’t fumble edge cases. A poorly built one feels flaky, like a demo that made it too far into production. And users can tell.
Done right, agent-powered workflows drive efficiency, reduce costs, and boost CSAT (the trifecta of success). But those benefits only emerge when product and engineering teams work together to understand how agent systems function under the hood.
TL;DR
Agents aren’t just another fancy AI feature. They’re complex, living systems: dynamic, evolving, and powerful. As product leaders, we owe it to our users (and everyone we work with) to engage with that complexity, not abstract it away.
Start with one agent. Observe it in the wild. Partner with your dev team. Add structure, safety, and orchestration as your use case matures. The results will speak for themselves, not just in automation metrics, but in user trust and real business outcomes.
Inspiration and Attribution
Image by rawpixel.com on Freepik
How ChatGPT Paves The Way For AI Agents, by Melissa Heikkilä, MIT Technology Review, November 2024
A Practical Guide to Building Agents, OpenAI, 2024