Modern software development already relies on AI coding assistants that react to user inputs. Autonomous AI agents are still in their infancy, but they have the potential to revolutionize the field even further. Imagine agents capable of:

  • handling tasks without strict, predefined rules,
  • detecting anomalies,
  • predicting and mitigating potential issues before they arise,
  • providing valuable insights to both novice and experienced developers.

These results are achievable with intelligent, adaptive agents that enhance system resilience and accelerate project timelines.

In this guide, we'll explore AI agents, provide examples, and show you how to create your own AI agent with n8n, a source-available AI-native workflow automation tool!


Let’s dive in!

What are AI agents?

An AI agent is an autonomous system that receives data, makes rational decisions, and acts within its environment to achieve specific goals.

💡
In this guide, we rely heavily on definitions from AIMA (Artificial Intelligence: A Modern Approach), a standard textbook on modern AI. These definitions may sound familiar to you if you’ve studied computer science. For everyone else, we offer a greatly simplified and condensed version, but you can always refer to the original book.

While a simple agent perceives its environment through sensors and acts on it through actuators, a true AI agent includes an "engine". This engine autonomously makes rational decisions based on the environment and its actions. According to AIMA: “For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has”.
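
To make the definition concrete, here is a minimal sketch of that percept→action loop in JavaScript. Everything here is a hypothetical placeholder for illustration, not a real API:

```javascript
// Minimal agent loop: map the percept sequence to a rational action.
// All objects and functions are hypothetical placeholders.
const perceptSequence = [];

function agentStep(environment) {
  perceptSequence.push(environment.sense()); // perceive via "sensors"
  const action = decideAction(perceptSequence); // pick the action expected to
                                                // maximize the performance measure
  environment.act(action); // act via "actuators"
}

function decideAction(percepts) {
  // In an LLM-powered agent, this is where the model would be prompted
  // with the percept history and asked to choose the next action.
  return { type: 'noop' };
}
```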

Large Language Models and multimodal LLMs are at the core of modern AI agents: they provide this layer of reasoning and can weigh which action best serves the agent’s goals. We’ll talk about this more in the next sections.

💡
Large Language Models can be proprietary or open source. Take a look at our overview of various open-source LLMs.

The most advanced AI agents can also learn and adapt their behavior over time. Not all agents need this ability, but for some it’s essential:

Video: this Atlas robot needs a little bit more practice. Source: https://www.youtube.com/watch?v=EezdinoG4mk

Such autonomous agents come in many shapes and forms: a robot, a self-driving car, a vacuum cleaner or a piece of software.

What are the types of agents in AI?

The AIMA textbook discusses several main types of agent programs based on their capabilities:

  1. Simple reflex agents: These agents are fairly straightforward. They make decisions based only on what they perceive at the moment, without considering the past. They do their job when the right decision can be made just by looking at the current situation (see the sketch after this list).
  2. Model-based reflex agents: These agents are a little more sophisticated. They keep track of the parts of the world they can't observe directly. They use a "transition model" to update their understanding of how the world changes, and a "sensor model" to relate what they perceive to the actual state of the world.
  3. Goal-based agents: These agents are all about achieving a specific goal. They think ahead and plan a sequence of actions to reach their desired outcome. It's as if they have a map and are trying to find the best route to their destination.
  4. Utility-based agents: These agents are even more advanced. They assign a "goodness" score to each possible state based on a utility function. They not only focus on a single goal but also take into account factors like uncertainty, conflicting goals and the relative importance of each goal. They choose actions that maximize their expected utility, much like a superhero trying to save the day while minimizing collateral damage.
  5. Learning agents: These agents are the ultimate adaptors. They start with a basic set of knowledge and skills, but constantly improve based on their experiences. They have a learning element that receives feedback from a "critic", which tells them how well they're doing. The learning element then tweaks the agent's behavior to do better next time. It's like having a built-in coach that helps the agent perform its task better and better over time.
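
Here is a minimal sketch of the simplest type, a reflex agent for a vacuum-cleaner world, in JavaScript. The rules and percept format are invented for illustration:

```javascript
// Simple reflex agent: condition-action rules over the CURRENT percept only.
// The rules and percept shape below are invented for illustration.
const rules = [
  { condition: (p) => p.status === 'dirty', action: 'clean' },
  { condition: (p) => p.status === 'clean', action: 'moveToNextCell' },
];

function simpleReflexAgent(percept) {
  const rule = rules.find((r) => r.condition(percept));
  return rule ? rule.action : 'doNothing'; // no percept history is kept
}

console.log(simpleReflexAgent({ status: 'dirty' })); // -> 'clean'
```

A model-based or goal-based agent would keep state between calls; a learning agent would also adjust the rules themselves based on feedback.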

These theoretical concepts are great for understanding the basics of AI agents, but modern software agents powered by large language models (LLMs) are like a mashup of all these types. LLMs can juggle multiple tasks, plan for the future, and even estimate how useful different actions might be.

Can an LLM act as an AI agent?

Although LLMs cannot yet act as standalone AI agents, they are becoming a key component of modern autonomous agents. Let’s see why and how.

Modern AI research revolves around neural networks. However, most networks have only been able to perform a single task or a bundle of closely related tasks: a great example is DeepMind's Agent57, which could play all 57 Atari games with a single model and achieved superhuman performance on most of them.

Very impressive, even though such models were still limited to a specific domain.

This all changed with the advent of modern transformer-based large language models.

Early GPT (generative pre-trained transformer) models could only serve as fancy chatbots with encyclopedic knowledge. However, as the models grew in size, they began to exhibit interesting behavior.

As the number of model parameters increased, modern LLMs grasped many concepts simply from the textual data. In other words: no one specifically trained the models to translate text or even fix code, yet these abilities emerged.

Video: what an LLM can do largely depends on the number of model parameters. Source: https://research.google/blog/pathways-language-model-palm-scaling-to-540-billion-parameters-for-breakthrough-performance/

The combination of huge amounts of training data and a large number of model parameters was enough to “bake” a lot of knowledge about the real world into the model.

Through further fine-tuning, these transformer-based models learned to follow instructions even better.

All this incredible progress allows developers to create AI agents with just a set of sophisticated instructions called prompts.

Is ChatGPT an AI agent?

While ChatGPT has impressive capabilities and some useful tools, including web browsing and data analytics, it still lacks some crucial components typical of agents. For example, it's not yet autonomous and requires human input at each iteration.

Choosing between ChatGPT and custom AI agents depends on what you need and the context you'll use it in. ChatGPT is great for general conversational AI, creating content and broad applications. For specialized tasks, real-time data processing, and integrated system solutions, custom AI agents might be a better fit.

💡
To maximize the potential of AI agents (or AI in general) in your projects, take a look at our recent articles on AI coding assistants and the best AI chatbots.

What are the key components of an AI agent?

In essence, an AI agent gathers data with sensors, makes rational decisions via a reasoning engine and control systems, performs actions with actuators and learns from mistakes through its learning system. But what does this process look like in detail?

Let's break down the steps of an LLM-powered software agent.

A simplified diagram of an AI agent, adapted from: Artificial Intelligence: A Modern Approach, 4th Global ed.

Sensors

Information about the environment usually arrives as text. This can be:

  • Plain text in some natural language, like a user request or a question;
  • Semi-structured information, such as simple Markdown or JSON;
  • Diagrams or graphs in a text format, e.g. Mermaid flowcharts;
  • More structured text: tabular data, log streams, time series;
  • Code snippets or even complete programs in many programming languages.

Multimodal LLMs can receive images or even audio data as input.

Actuators

Most language models can only produce textual output. However, this output can be in a structured format such as XML, JSON, short snippets of code or even complete API calls with all query and body parameters.

It’s then a developer’s job to feed the outputs from LLMs into other systems (e.g. make an actual API call or run an n8n workflow).

Action results can go back into the model to provide feedback and update the information about the environment.
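
For example, a developer might ask the model to reply with a structured JSON “action” and then execute it. Here's a rough JavaScript sketch; the action schema and the API endpoint are hypothetical:

```javascript
// Sketch: turn a structured LLM reply into a real action.
// The action schema and the endpoint below are hypothetical.
const llmOutput =
  '{"action": "create_ticket", "title": "Fix login bug", "priority": "high"}';

async function executeAction(rawOutput) {
  const parsed = JSON.parse(rawOutput); // the "actuator" boundary: text -> action
  if (parsed.action === 'create_ticket') {
    const response = await fetch('https://example.com/api/tickets', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ title: parsed.title, priority: parsed.priority }),
    });
    return response.json(); // feed the result back to the model as feedback
  }
  throw new Error(`Unknown action: ${parsed.action}`);
}
```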

Reasoning engine (aka the "brain")

The "brain" of an LLM-powered AI agent is, well, a large language model itself. Its main goal is to come up with rational decisions based on goals to maximize a certain performance. If necessary, the reasoning engine receives feedback from the environment, self-controls, and adapts its actions.

But how exactly does it work?

Giant pre-trained models such as GPT-4, Claude 3, Llama 3 and many others have a "baked-in" understanding of the world gained from piles of text during training. Multimodal large language models such as GPT-4o go beyond that and are also trained on images and audio data in addition to text. Further fine-tuning allows these models to improve at specific tasks.

Exactly which tasks is largely an area of ongoing research, but we already know that large models are able to:

  • follow instructions,
  • imitate human-like reasoning,
  • understand implied intent just from user commands (known as prompts).

All that remains is the final step: building a series (or chains) of prompts so that an LLM can simulate autonomous behavior.

And this is exactly where LangChain comes into play!

Why use LangChain for AI agents?

In the context of AI agents, LangChain is a framework that lets you leverage large language models (LLMs) to design and build these agents.

Traditionally, you’d hard-code the whole sequence of actions an agent takes.

LangChain simplifies this process by providing prompt templates and tools that an agent gets access to. An LLM acts as the reasoning engine behind your agent and decides what actions to take and in which order. LangChain hides the complexity of this decision making behind its own API. Note that this is not a REST API, but rather an internal API designed specifically for interacting with these models to streamline agent development.
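
To make this tangible, here's what a minimal agent looks like in plain LangChain.js. Import paths and helper names vary considerably between LangChain versions, so treat this as an illustrative sketch rather than copy-paste code:

```javascript
// Minimal LangChain.js agent sketch. Import paths and helper names
// differ between LangChain versions; treat this as illustrative.
import { ChatOpenAI } from 'langchain/chat_models/openai';
import { Calculator } from 'langchain/tools/calculator';
import { initializeAgentExecutorWithOptions } from 'langchain/agents';

const model = new ChatOpenAI({ temperature: 0 }); // the reasoning engine
const tools = [new Calculator()]; // the actions the agent may take

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: 'zero-shot-react-description', // a ReAct-style agent
});

const result = await executor.call({ input: 'What is 7 to the power of 0.5?' });
console.log(result.output); // the agent decided on its own to call the calculator
```

The key point: nowhere do we script the sequence of steps; the LLM decides whether and when to use the calculator tool.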

💡
n8n takes it a step further by providing a low-code interface to LangChain. In n8n, you can simply drag and drop LangChain nodes onto the canvas and configure them. Advanced users can even write JS code for some of the LangChain modules. n8n supports the JavaScript implementation of LangChain. Finally, LangChain supports several prompting techniques suitable for making AI agents.
An example of a conversation agent.

This simple conversation agent uses window buffer memory and a tool for making Google search requests. With n8n you can easily swap language models, provide different types of chat memory and add extra tools.

Several prompting techniques have been described that activate an LLM's abilities to reason, self-correct, pick available tools to perform actions and observe the results. The LangChain developers have implemented these techniques so that they are available without additional configuration:

  1. The ReAct (Reason, Act) agent is designed to reason about a given task, determine the necessary actions and then execute them. It follows a cycle of reasoning and acting until the task is completed (see the schematic trace after this list). The ReAct agent can break down complex tasks into smaller sub-tasks, prioritize them and execute them one after the other.
  2. The Plan and Execute agent is similar to the ReAct agent but with a focus on planning. It first creates a high-level plan to solve the given task and then executes the plan step by step. This agent is particularly useful for tasks that require a structured approach and careful planning.
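
For intuition, a ReAct-style run looks roughly like this schematic trace (illustrative only, not actual LangChain output; the search tool is assumed):

```
Question: What is the population of the capital of France?
Thought: I need to find the capital of France first.
Action: search("capital of France")
Observation: The capital of France is Paris.
Thought: Now I need the population of Paris.
Action: search("population of Paris")
Observation: About 2.1 million people live in Paris proper.
Thought: I have enough information to answer.
Final Answer: Paris, the capital of France, has about 2.1 million inhabitants.
```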
💡
In n8n, you can create both types of agents by combining LangChain nodes with tool nodes that perform specific actions, such as calling another n8n workflow or making a direct API request.

Additionally, LangChain offers various other agents:

  • The Conversational agent is designed to have human-like conversations. It can maintain context, understand user intent and provide relevant answers. This agent is typically used for building chatbots, virtual assistants and customer support systems.
  • The Tools agent uses external tools and APIs to perform actions and retrieve information. It can understand the capabilities of different tools and determine which tool to use depending on the task. This agent helps integrate LLMs with various external services and databases.
  • The SQL agent is specifically designed to interact with databases using SQL queries. It can understand natural language questions, convert them into SQL queries, execute the queries and present the results in a user-friendly format. This agent is valuable for building natural language interfaces to databases.

As you can see, the LangChain definitions of software agents differ from the theoretical framework. You may need to combine several LangChain nodes to build a truly autonomous agent.

💡
You are free to use any other n8n nodes or even write JS code for LangChain nodes.

AI agent examples for developers

Before we look at real-world examples, let's consider what AI agents developers can create for themselves. Here are just a few examples:

  • GitHub AI agent. In addition to classical GitHub Actions that automate the development process, an agent can monitor user activity in the repo. It can suggest code fixes for simple bug reports and make pull requests. It can also track user communication and prevent spam activity.
  • AI agent for dependency checks. Modern projects often have complex library dependencies. An agent that keeps an eye on new library versions can assess potential impact or even identify breaking changes before the dev team moves to a newer version.
  • DevSecOps AI agent. In large organizations, AI agents augment traditional security practices. For example, they could monitor logs and detect unusual patterns that are not caught by simple alert rules. AI agents can check for common vulnerabilities and exposures (CVE) and assess the impact of newly available exploits. Last but not least, agents can track container build commands for potential vulnerabilities.

You can do a lot more with AI agents if you know how to code.

To sum up, AI agents extend traditional algorithms and automations and shine in situations where the next step is often not known in advance. This ability makes AI agents invaluable for Sec/IT/DevOps professionals in large enterprises looking to enhance efficiency, security and operational excellence.

How to create an AI agent with n8n?

Let’s get to work and create a simple SQL Agent that can provide answers based on the database content.

To create your workflow, you first need to sign up for a cloud n8n account or self-host your own n8n instance.

Once you are in, browse the template page and click “Use workflow”. Alternatively, create the workflow from scratch.

A simple SQL Agent to “talk” to the local SQLite database.

Step 1. Download and save SQLite file

The upper part of the workflow begins with the Manual trigger.

  1. The HTTP Request node downloads the Chinook example database as a zip archive;
  2. The Compression node extracts the contents via the decompress operation;
  3. Finally, the Read/Write Files from Disk node saves the .db file locally.

Run this part manually just once.

Step 2. Receive a chat message and load the local SQLite file

The lower part of the workflow is a bit more complex, so let’s take a closer look at how it works.

  1. The first node is a Chat Trigger. This is where you send queries, such as "What is the revenue by genre?"
  2. Immediately afterwards, the local chinook.db file is loaded into memory.
  3. The next Set node combines the binary data with the Chat Trigger input. Select the JSON mode and provide the following expression: {{ $('Chat Trigger').item.json }}. Also, turn on the "Include Binary File" toggle (you can find it by clicking on Add Options).
Use the Set node to combine JSON and binary data from different sources.

Step 3. Add and configure the LangChain Agent node

Let’s take a look at the LangChain Agent node.

Select the SQL Agent type and SQLite database source. This allows you to work with the local SQLite file without connecting to remote sources.

Make sure that the Input Binary Field name matches the binary data name.

The LangChain SQL Agent makes several requests before providing the final answer.

Keep the other settings, close the config window and connect 2 extra nodes: the Window Buffer Memory node (to store past answers) and a model node, e.g. the OpenAI Chat Model node. Pick the model name (e.g. gpt-4-turbo) and adjust the temperature: for coding tasks, lower values such as 0.3 work better.

Now you can ask various questions about your data, even non-trivial ones! Compare the following user inputs:

  • "What are the names of the employees?" requires just 2 SQL queries VS
  • "What are the revenues by genre?", where the agent has to make several requests before arriving at a solution.

This agent can still be improved: for example, you could pass the database schema up front so that the agent doesn’t waste time figuring out the structure on every run.

Wrap up

In this guide, we have given a brief introduction to what an AI agent is, what elements it should have from a theoretical point of view, and what modern LLM-powered software agents look like.

We then discussed how AI agents can be helpful for developers, especially in less predictable situations.

Finally, we showed how to create a LangChain SQL Agent in n8n that can analyze a local SQLite file and provide answers based on its content.

What’s next?

Now that you have an overview and a practical example of how to create AI agents, it’s time to challenge the status quo and create an agent for your real tasks.

Thanks to n8n’s low-code capabilities, you can focus on designing, testing and upgrading the agent. All the details are hidden under the hood, but you can of course write your own JS code in LangChain nodes if needed.

Whether you're working alone, in a small team, or in an enterprise, n8n has you covered. Choose from our cloud plans and jump-start right away, or explore the powerful features of the Enterprise edition. If you are a small growing startup, there is a dedicated plan for you on the pricing page.

Join the community forum and share your success or seek support!