Build Self-Improving Agent Skills with cognee and n8n

This Verified Node Spotlight is a guest post written by Vasilije Marković, CEO and Founder, cognee.

Skill files for Claude Code and other agents are easy to write and easier to neglect. You create skills for code review, tests, migrations, and API conventions, and at first everything works. The agent follows the instructions, the reviews come back useful, so you keep extending the library.

Then the project moves on and the skill files don’t. A skill that was indispensable a few months ago is now missing a critical check — it never got updated, because manually auditing a growing pile of Markdown instruction files is exactly the kind of task nobody picks up under deadline pressure.

This tutorial builds a maintenance loop for those skills. When a review run scores below a threshold, the workflow records the feedback, asks cognee to propose a rewrite, routes that proposal through an approval gate in n8n, and, only after approval, writes the change back to the skill, with a before/after diff you can inspect, post to Slack, or attach to a pull request.

A weak run becomes an approved rewrite to SKILL.md, improving future agent runs.

There are a few ways to run this:

Pick the one that fits how you work.

Visual building in the n8n editor (start here): Every cognee step (ingest, review, propose, review the diff, apply) is an operation on the cognee verified node, and n8n handles the scoring, the approval gate, and the diff. No Python, no shell, no scripts to write: import the template, connect it to a cognee server, and run. Which server you use depends on how you run n8n:
- On n8n Cloud, use Cognee Cloud. It's the fastest setup, with nothing to host.
- If you self-host n8n, use your own cognee server to keep everything in-house.

Building with the cognee SDK: This build calls the cognee Python SDK directly (remember, search, improve_skill) from a small runner, wired into n8n through Execute Command nodes. You own and customize every step of the loop, all on open-source cognee.

Both produce the same outcome — a weak run becomes an approvable edit to a skill file — and both use the same example: a code-review skill and an authorization-boundary review task.

Most of this post walks through the visual build in the editor, starting from a template. If you'd rather drive the loop from code, jump to the section “Self-hosted with Python SDK build” near the end.

Want to run it fully local? Both paths can run fully on your own infrastructure using self-hosted n8n, a self-hosted cognee server, and a local LLM and embedding model. Nothing has to leave your machine.

What cognee and n8n each do

cognee is a memory layer for AI agents that turns data from multiple document formats into structured knowledge graphs. Instead of retrieving information based solely on conceptual similarity, it extracts entities, maps their relationships, and captures provenance so agents can understand how pieces of information are connected.

In this workflow the Cognee verified node handles six jobs, each a node operation:

Ingest Skill — store a SKILL.md as a dataset-scoped skill
Review Skill — run the review task with that skill loaded (agentic completion)
Propose Improvement — record the weak run and generate a proposed rewrite
Get Proposal — fetch the before/after procedure, rationale, and confidence
Apply Improvement — apply the approved proposal
Get Skill — read the updated skill back

n8n handles everything around those: orchestration, scoring, the threshold check, the approval gate, the diff, notifications, and branching.

The workflow never quietly rewrites your skill files. A proposal is always created first, and the proposed text is visible before anything is applied. You decide what happens next in your n8n workflow.

What you will build

The visual workflow takes an agent run that scores below the threshold and turns it into a reviewable, approvable update to a Markdown skill file.

The self-improving skill workflow in the n8n editor

It runs these steps:

Ingest Skill loads the local code-review skill into cognee
Review Skill runs the review task with the skill loaded
The review comes back with a score
Should Improve? checks the score against the threshold
Propose Improvement records feedback and creates a proposal (not applied)
Get Proposal fetches the before/after for review
A Code node builds the diff; the Approved? gate decides
Apply Improvement writes the approved change
Get Skill + Show Skill Delta confirm and output the result

The example is an authorization-boundary review: the code under review retrieves a dataset by ID, and the agent should flag that it never checks whether that dataset belongs to the authenticated user, never handles a missing record, and has no tests for the owner / non-owner / missing cases.

Before you start

An n8n instance with the Cognee node (v0.5.1 or higher) installed. On a local n8n instance, search for “cognee” in the editor's Node panel, then click install. See the verified-install docs for details.

Don't have n8n? Start with a free 14-day trial. No credit card needed.

A cognee server with an LLM and embedding provider configured, since the review and proposal steps call the model. The easiest option is Cognee Cloud, which is fully managed and serverless, so there is nothing to install and the model and embedding providers are handled for you. You can also connect to a self-hosted cognee instance, but you will need to configure those providers yourself. The no-code Skill operations use cognee’s /api/v1/skill endpoints, which are available for self-hosted instances and are being rolled out on Cognee Cloud.

Ready to add persistent memory to your agents? Try cognee with Cloud deployment (serverless or private infrastructure) or book a call with cognee to discuss on-prem solutions.

Cognee API credentials in n8n (Credentials → New → Cognee API): enter your server’s Base URL, without a trailing /api, and API key. For Cognee Cloud, copy both values directly from your tenant dashboard at platform.cognee.ai.

Everything in the workflow itself happens in the n8n UI (no terminal commands needed).

Import the template to your n8n instance.

Get the workflow

Step 1: Configure the run

The workflow starts with a Manual Trigger (or a Webhook Trigger in production) and a Demo Controls Set node. This is where you set the values you adjust most while testing:

JSON

{
  "skill_name": "code-review",
  "dataset_name": "n8n-skill-self-improvement",
  "skill_markdown": "<the SKILL.md body>",
  "task_text": "<the review task>",
  "score_instructions": "Grade your own review per dimension (coverage, correctness, ...) from 0.0 to 1.0.",
  "score_threshold": 0.9,
  "approved": "1"
}

skill_name / skill_markdown — the skill to run and improve, inline
task_text — the review task to run with the skill loaded
score_instructions — the rubric the agent uses to grade its own review per dimension; the node turns those grades into a single score
score_threshold — the minimum score required to skip the improvement path
approved — whether the proposal passes the approval gate ("1" in the local demo)

Step 2: Ingest the skill (verified node)

The Ingest Skill operation posts the inline SKILL.md to cognee and stores it as a dataset-scoped skill:

JSON

{
  "status": "completed",
  "dataset_name": "n8n-skill-self-improvement",
  "dataset_id": "3d8f047d-dc45-5fee-9b59-b82e4206711c",
  "items": [{ "name": "code-review", "kind": "skill", "declared_tools": ["memory_search"] }]
}

The returned dataset_id is reused by the later Get steps.

Step 3: Review the skill against a real task (verified node)

The Review Skill operation runs an agentic completion with the skill loaded. The query is the task plus the score_instructions rubric, so the agent reviews the change and grades its own review. The node parses that into a flat result:

JSON

{
  "score": 0.42,
  "missing_instruction": "No explicit owner / non-owner / missing-dataset test requirement",
  "result_summary": "Flagged the ownership gap but missed the test matrix",
  "dimensions": [{ "name": "coverage", "score": 0.4 }, { "name": "correctness", "score": 0.45 }]
}

That score is what the next step branches on.

The sample task is the authorization-boundary review. With the skill loaded, the agent returns a structured review that flags the missing ownership check, the missing 404 handling, and the missing owner/non-owner/missing-dataset tests.
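To make the "per-dimension grades become one score" step concrete, here is a minimal Python sketch. It assumes an unweighted mean over the dimensions array; the post does not say how the verified node actually aggregates, so treat overall_score as a hypothetical helper rather than the node's implementation.

```python
from statistics import fmean


def overall_score(dimensions):
    """Collapse per-dimension grades into one score.

    Assumption: an unweighted mean. The verified node may weight or
    round the grades differently.
    """
    scores = [float(d["score"]) for d in dimensions]
    if not scores:
        raise ValueError("review returned no dimension grades")
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"grade out of range 0.0-1.0: {s}")
    return fmean(scores)


# Dimensions copied from the sample Review Skill result above.
review = {
    "dimensions": [
        {"name": "coverage", "score": 0.4},
        {"name": "correctness", "score": 0.45},
    ]
}
print(overall_score(review["dimensions"]))
```

Rejecting out-of-range grades early keeps a malformed self-assessment from silently passing or failing the threshold check in the next step.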

Step 4: Let n8n decide whether to improve

The Should Improve? IF node compares the review’s score (from Review Skill) to the threshold:

score < score_threshold

If the review scored below the threshold, the workflow takes the improvement path. If the score meets or exceeds the threshold, it routes to No Improvement Needed and exits cleanly. The threshold lives in n8n, is visible and adjustable, and is easy to swap for a stricter evaluator (an LLM-as-judge node, a CI score) without touching the cognee steps on either side of it.

The split of responsibility: the agent grades its own review against the score_instructions rubric, the verified node parses those per-dimension grades into one score, and n8n owns the threshold decision.
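The branch logic is small enough to write out. The sketch below mirrors the comparison in the Should Improve? node in Python; the function name and the two route labels are illustrative, not names taken from the template.

```python
def route_review(score, score_threshold):
    """Pick the branch for a review score.

    A score strictly below the threshold takes the improvement path;
    a score that meets or exceeds it exits cleanly.
    """
    # float() tolerates values that arrive as strings from upstream nodes.
    if float(score) < float(score_threshold):
        return "improve"
    return "no_improvement_needed"


print(route_review(0.42, 0.9))
```

Because the comparison is strict, a review that scores exactly at the threshold skips the improvement path, which matches the "meets or exceeds" wording above.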

Step 5: Propose an improvement (verified node)

The Propose Improvement operation records the weak run and asks cognee for a rewrite. Crucially, it creates a proposal but does not apply it:

JSON

{
  "items": [
    { "kind": "skill_run", "run_id": "82a14c99…", "success_score": 0.42 },
    { "kind": "skill_improvement_proposal", "proposal_id": "48d6f3f3-430f-460c-bf5b-2987086a43dd", "status": "proposed" }
  ]
}

The proposed change stays in a holding state until n8n decides what to do with it. A single weak review can never silently rewrite your instructions.
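Downstream steps need the proposal_id from that response. A minimal Python sketch of pulling it out, assuming the response always has the items shape shown above (find_pending_proposal is a hypothetical helper, and the truncated run_id is copied as it appears in the sample):

```python
def find_pending_proposal(response):
    """Return the proposal_id of the first proposal still in 'proposed' state, or None."""
    for item in response.get("items", []):
        if (
            item.get("kind") == "skill_improvement_proposal"
            and item.get("status") == "proposed"
        ):
            return item.get("proposal_id")
    return None


# Sample response copied from the Propose Improvement output above.
propose_response = {
    "items": [
        {"kind": "skill_run", "run_id": "82a14c99…", "success_score": 0.42},
        {
            "kind": "skill_improvement_proposal",
            "proposal_id": "48d6f3f3-430f-460c-bf5b-2987086a43dd",
            "status": "proposed",
        },
    ]
}
print(find_pending_proposal(propose_response))
```

Filtering on status as well as kind means an already applied proposal is never picked up again and sent back through the approval gate.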

Step 6: Review the diff before approving (verified node)

This is the step that makes the loop safe. The “Get Proposal” operation returns the proposal’s full before/after, plus the model’s rationale and confidence:

JSON

{
  "proposal_id": "48d6f3f3-430f-460c-bf5b-2987086a43dd",
  "skill_id": "a49f5ceb-fe79-5094-bee6-ee1de4d45671",
  "confidence": 0.93,
  "rationale": "Clarifies and tightens existing rules so agent outputs strictly-structured, minimal issue reports focused on authorization, tests, and audit logging…",
  "old_procedure": "# code-review …",
  "proposed_procedure": "# code-review …"
}

A Code node turns old_procedure vs proposed_procedure into a unified diff (skill_delta_markdown), ready to drop into a Slack message or a PR. Here’s the real change from this run (excerpt):

Diff

  1. Read the diff file-by-file and hunk-by-hunk.
- Do not run or apply changes.
+ Do not run, apply, or modify code. Do not mutate state.

  3. Authorization and data-access checks:
-    Apply one of these two patterns and verify the diff shows it:
-      - Caller-side validation (preferred if get_dataset doesn't accept a requesting user):
+    Caller-side validation is REQUIRED unless the get_dataset-enforces-acl pattern is present:
         dataset = get_dataset(requested_dataset)
         if dataset is None: return 404
         if dataset.owner_id != user.id and not user_has_access(user, dataset): return 403
+    If the check is missing or weakened, mark the issue Critical and supply the exact pseudo-diff.

+ 12. Failure-evidence handling: turn a "return get_dataset(requested_dataset)" example into a
+     Critical issue with the caller-side patch, repro steps, and a short-term mitigation.
+ 13. Enforcement checklist: before finishing, confirm every issue has the required fields.

The ownership-check lines were already in the skill — what changed is the framing: a “preferred” pattern became required, a Critical severity now fires when the check is missing, and two enforcement sections were added.
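If you want to reproduce the diff step outside the template, the same idea fits in a few lines of Python with the standard library's difflib. This is a sketch of the technique, not the template's Code node: the function name, the file labels, and the "No changes proposed." fallback are assumptions.

```python
import difflib


def skill_delta_markdown(old_procedure, proposed_procedure, skill_name="code-review"):
    """Build a unified diff of two skill procedures, wrapped in a Markdown diff fence."""
    diff_lines = difflib.unified_diff(
        old_procedure.splitlines(),
        proposed_procedure.splitlines(),
        fromfile=f"{skill_name}/SKILL.md (current)",
        tofile=f"{skill_name}/SKILL.md (proposed)",
        lineterm="",  # inputs have no trailing newlines, so headers should not either
    )
    body = "\n".join(diff_lines)
    if not body:
        return "No changes proposed."
    fence = "`" * 3  # built at runtime so this example does not nest code fences
    return f"{fence}diff\n{body}\n{fence}"


old = "1. Read the diff file-by-file and hunk-by-hunk.\nDo not run or apply changes."
new = (
    "1. Read the diff file-by-file and hunk-by-hunk.\n"
    "Do not run, apply, or modify code. Do not mutate state."
)
print(skill_delta_markdown(old, new))
```

The fenced output renders as a colored diff in most Markdown surfaces, which is why it drops cleanly into a pull request description.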

Step 7: Apply the approved change (verified node)

The Approved? IF node checks the approval flag. In the demo that’s the approved control; in production you’d replace it with a Slack message plus a Wait-for-webhook approval (see the sticky note on the canvas).

On approval, Apply Improvement updates the stored skill in cognee (status flips to applied):

JSON

{ "items": [{ "kind": "skill_improvement_proposal", "proposal_id": "48d6f3f3…", "status": "applied" }] }
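The gate and the confirmation can be sketched the same way. The Python below assumes the approval flag arrives as the string "1" used in the demo; accepting "true" and "yes" as well is an assumption for convenience, and both helper names are hypothetical.

```python
def is_approved(flag):
    """Interpret the approval control. The demo uses "1"; other truthy spellings are an assumption."""
    return str(flag).strip().lower() in {"1", "true", "yes"}


def was_applied(response, proposal_id):
    """Check that the Apply Improvement response reports this proposal as applied."""
    return any(
        item.get("kind") == "skill_improvement_proposal"
        and item.get("proposal_id") == proposal_id
        and item.get("status") == "applied"
        for item in response.get("items", [])
    )


proposal_id = "48d6f3f3-430f-460c-bf5b-2987086a43dd"
apply_response = {
    "items": [
        {"kind": "skill_improvement_proposal", "proposal_id": proposal_id, "status": "applied"}
    ]
}
if is_approved("1"):
    print(was_applied(apply_response, proposal_id))
```

Checking the returned status rather than assuming success gives the workflow a clear place to alert if an approved proposal failed to apply.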

Step 8: Confirm the result

Get Skill reads the skill back so you can confirm the new procedure is live, and Show Skill Delta emits the final skill_delta / skill_delta_markdown. One missed authorization check → one concrete, approved edit to the instruction that was responsible for missing it.

Who does what

The whole point of the visual build: you run the entire loop without writing a line of code.

Ways to adapt the visual build

The demo keeps the approval flow simple so the loop is easy to inspect end to end. For production, swap Demo Controls for:

A Webhook Trigger fired by Claude Code, your CI pipeline, or another agent runner
A Slack node that posts the skill_delta_markdown packet — diff, rationale, confidence — for review
A Wait node or webhook callback that pauses until someone approves
A GitHub branch and pull request in place of the direct apply, so the skill change goes through normal code review
A datastore table for accepted and rejected proposals, for an audit trail
Different thresholds per skill type — tighter for security-adjacent reviews, lighter for formatting

The division of labor stays the same: n8n owns orchestration, scoring, approval, notifications, and the audit trail; the Verified Node owns memory, retrieval, proposal generation, proposal review, and application.
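Two of those adaptations, per-skill thresholds and an audit trail, are easy to prototype before wiring them into n8n. The Python sketch below uses an in-memory SQLite table. The table name, columns, the "formatting" skill, and the 0.7 value are all illustrative assumptions; only the 0.9 threshold and the code-review skill come from the demo.

```python
import sqlite3
from datetime import datetime, timezone

DEFAULT_THRESHOLD = 0.9  # the demo's score_threshold
THRESHOLDS_BY_SKILL = {
    "code-review": 0.9,  # security-adjacent: keep the bar tight
    "formatting": 0.7,   # hypothetical skill with a lighter bar
}


def threshold_for(skill_name):
    """Look up a skill's threshold, falling back to the demo default."""
    return THRESHOLDS_BY_SKILL.get(skill_name, DEFAULT_THRESHOLD)


def open_audit_log(path=":memory:"):
    """Create (if needed) a table of accepted and rejected proposals."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS proposal_audit (
            proposal_id TEXT PRIMARY KEY,
            skill_name  TEXT NOT NULL,
            score       REAL NOT NULL,
            decision    TEXT NOT NULL CHECK (decision IN ('accepted', 'rejected')),
            decided_at  TEXT NOT NULL
        )
        """
    )
    return conn


def record_decision(conn, proposal_id, skill_name, score, decision):
    """Append one approval decision with a UTC timestamp."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO proposal_audit VALUES (?, ?, ?, ?, ?)",
            (proposal_id, skill_name, score, decision,
             datetime.now(timezone.utc).isoformat()),
        )


conn = open_audit_log()
record_decision(conn, "48d6f3f3-430f-460c-bf5b-2987086a43dd", "code-review", 0.42, "accepted")
print(conn.execute("SELECT skill_name, decision FROM proposal_audit").fetchall())
```

Recording rejected proposals alongside accepted ones is the useful part: a skill that keeps generating rejected rewrites is a signal that the rubric or the threshold needs attention.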

Self-hosted with Python SDK build

If you are a terminal-comfortable developer who wants the whole loop running locally for free against open-source cognee, the advanced/ template is for you. It does the same six cognee jobs through the Python SDK (cognee.remember, cognee.search with AGENTIC_COMPLETION, and improve_skill) driven by a small runner script and n8n Execute Command nodes:

Bash

python run_self_improve_skill.py init-state
python run_self_improve_skill.py remember-skills
python run_self_improve_skill.py run-agent
python run_self_improve_skill.py record-feedback
python run_self_improve_skill.py review-packet
python run_self_improve_skill.py apply-proposal

n8n 2.x excludes executeCommand by default for security, so this build enables it explicitly. It’s the most hackable version and never leaves your machine — the trade-off is the terminal setup. See advanced/README.md for the environment variables and run instructions.
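Outside n8n, the same sequence can be driven by a small wrapper. The Python sketch below only orders the six subcommands listed above and stops at the first failure. It assumes each subcommand exits non-zero on error and that the run has already scored below the threshold; the approved argument stands in for the n8n approval gate. None of this is taken from advanced/README.md.

```python
import subprocess
import sys

PRE_APPROVAL_STEPS = [
    "init-state",
    "remember-skills",
    "run-agent",
    "record-feedback",
    "review-packet",
]
APPLY_STEP = "apply-proposal"


def run_step(step, script, run):
    """Run one runner subcommand and fail loudly on a non-zero exit code."""
    result = run([sys.executable, script, step], check=False)
    if result.returncode != 0:
        raise RuntimeError(f"step {step!r} exited with code {result.returncode}")


def run_loop(approved, script="run_self_improve_skill.py", run=subprocess.run):
    """Run the loop in order; apply the proposal only when approved is true."""
    completed = []
    for step in PRE_APPROVAL_STEPS:
        run_step(step, script, run)
        completed.append(step)
    if approved:
        run_step(APPLY_STEP, script, run)
        completed.append(APPLY_STEP)
    return completed
```

Calling run_loop(approved=False) from the directory that holds the runner script stops after review-packet, so you can read the proposal before a second call applies it. The run parameter exists so the sequence can be exercised with a stub instead of real subprocesses.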

A maintenance loop that keeps pace with the project

Agent skills are most valuable when they reflect how the project works today — not how it worked when someone first wrote the instruction file. This workflow gives that problem a structure: run, score, propose, review the diff, approve, apply. An authorization check that slipped through a review doesn’t have to slip through again — it becomes a specific, approvable edit to the skill, visible before it shapes the next run.

Download the workflow template, install the Cognee verified node, and start with the visual build and one skill: a review habit your team already trusts. Let the workflow show you exactly how the skill changes after feedback, then move on to the advanced template.
