Solving Long-Term Autonomy for Openclaw & General Agents

By Pedro

·

13h

·

Three days ago I wrote about Clawvault and the idea that agents need real memory. That post hit 283K views. Since then, we shipped 12 releases, 459 tests, and validated my intuition: Powerful, Elegant Primitives are all you need

Here’s what we learned.

Primitives Are the Unlock

Everyone’s building agent frameworks. Orchestration layers. Planning systems. State machines.

We went the other direction. We built **primitives** — the smallest possible units of structure that compose into autonomous behavior.

A primitive is deliberately simple:

  • A task is a markdown file with YAML frontmatter
  • A project is a markdown file with YAML frontmatter
  • A decision is a markdown file with YAML frontmatter
  • A lesson is a markdown file with YAML frontmatter

No required database. No required cloud API. No vendor lock-in. Just files that any LLM already knows how to read, write, and search.

But the magic isn’t in any individual primitive. It’s in what happens when you compose them — and when your agent starts managing them itself.

What We Shipped Since the Last Article

In the last 72 hours we released ClawVault v2.6.0 with a fundamental change: templates are now YAML schema definitions.

Every primitive — tasks, projects, decisions, lessons, people — is defined by a template schema:

yamlyaml --- primitive: task fields: status: type: string required: true default: open enum: [open, in-progress, blocked, done] priority: type: string enum: [critical, high, medium, low] owner: type: string due: type: date tags: type: string[] estimate: type: string parent: type: string depends_on: type: string[] ---

When your agent runs clawvault task add “Deploy API”, it reads this schema, applies defaults, and generates a properly structured file. When you run clawvault init, it generates default Obsidian Bases views for task workflows automatically.

One file defines the contract everywhere.

The key word is *malleable*. Don’t like our task schema? Drop your own `

task.md

` in your vault’s `templates/` directory. Add a `sprint` field. Remove `estimate`. Change the default priority to `high`. Your agent reads YOUR schema, not ours.

This is the opposite of how most agent tools work. They hardcode structure in TypeScript and hope it fits your use case. We made structure a configuration file that anyone can edit.

We also published an open skill that teaches any agent how to use these primitives:

bashclawhub install agent-autonomy-primitives

It covers the five composable primitives, how to wire heartbeat loops, how to customize templates

Why Malleability Matters for Autonomy

An autonomous agent isn’t one that follows a fixed script really well. It’s one that adapts its own structure to match the work.

When our agent Clawdious started running its own task queue, we immediately discovered that the hardcoded task fields didn’t fit every situation. Some tasks needed `estimate`. Some needed `depends_on`. Some needed `parent` for subtask trees. Some needed `confidence` scores for automated triage.

With rigid primitives, every new field is a code change, a PR, a release. With malleable templates, it’s a one-line edit to a YAML file. The agent could theoretically evolve its own schemas over time — add fields it finds useful, remove ones it doesn’t.

That’s the vision. Primitives that are stable enough to build on but flexible enough to evolve with the work.

Trigger-Based Autonomy

Most people think of autonomous agents as cron jobs — wake up every 30 minutes, check a queue, grind through tasks. That works, but it’s the blunt version.

The real power is **trigger-based execution**. The agent responds to events, not timers. (we will be release a trigger primative soon)

Here’s what this looks like in practice. A client sends an email. Your agent’s email monitor catches it and autonomously runs:

bash# Agent detects inbound email, creates its own task clawvault task add "Reply to Justin re: shipping delays" \ --priority high \ --owner clawdious \ --project hale-pet-door \ --tags "client,email,urgent"

That task now exists as a file. On the next heartbeat, the agent picks it up — but it doesn’t reply blindly. It pulls its own context:

bash# Agent reads its memory before acting clawvault search "justin dukes communication style" # → people/justin-dukes.md: ops manager, prefers concise responses clawvault search "shipping provider decision" # → decisions/shipping-provider.md: chose FedEx, rationale documented clawvault search "justin response time" # → lessons/justin-escalation.md: escalates if response takes >4 hours

Armed with context from its own memory, the agent drafts the reply, sends it, and closes the loop:

bash# Agent completes the task and stores what it learned clawvault task done reply-to-justin-re-shipping-delays \ --reason "Replied with FedEx tracking update, referenced prior decision" clawvault remember lesson "Justin's shipping questions always need tracking numbers" \ --content "Include tracking link upfront — saves a follow-up email"

No human touched anything. The agent created the task, read its own memory for context, executed, and stored a lesson that makes the next similar email faster.

The same pattern works for everything:

  • – Webhook fires → agent creates task → picks it up with full context
  • – PR merged → agent creates docs-update task → handles it using project memory
  • – Meeting booked → agent creates briefing-prep task → writes it from project context
  • – Deployment fails → agent creates debug task → fixes it using lessons from past failures

The primitives are the connective tissue. The trigger creates the task. The task carries the context. The agent’s memory provides the judgment. The lesson learned gets stored for next time.

Event → Agent Creates Task → Heartbeat Picks Up → Memory Informs Execution → Lesson Stored

Each cycle makes the next one smarter. The agent that debugged a deployment failure last Tuesday has a lesson that makes Wednesday’s failure faster to fix. The agent that learned Justin needs tracking numbers now includes them automatically.

This is what compound autonomy looks like. Not a human managing tasks. A living system where the agent creates, executes, and learns from its own work — getting better at responding to real events over time.

Obsidian as the Control Plane

Because every primitive is a markdown file with YAML frontmatter, your entire agent workspace is visible in Obsidian.

Open your vault. Your agent’s task queue is a Kanban board — drag tasks between columns, filter by project, sort by priority. Your agent creates a task at midnight? You see it on your board at breakfast. You drag something to “in-progress”? Your agent picks it up next heartbeat.

We generate five Obsidian Bases views out of the box:

  • All tasks — master list with every field
  • Blocked items — what needs human input right now
  • By project— scoped views per workstream
  • By owner — what’s each agent working on
  • Backlog — ideas not yet promoted to real work

The same file is both the agent’s data structure and the human’s UI. No sync layer. No webhook. No dashboard to build. The filesystem IS the integration.

This solves the oversight problem that every autonomous agent has: how do you know what it’s doing? You open Obsidian. Everything is right there — every task, every decision, every lesson learned, every project status. In plain text you can read, search, and edit.

Long-Running Implications

The primitives we’re building aren’t designed for single-session agents. They’re designed for agents that run for months.

Think about what compounds over time:

  • Decisions accumulate into institutional knowledge. “Why did we choose Resend?” is answered by a file in `decisions/`, not a Slack thread from 6 months ago.
  • Lessons prevent repeated mistakes. “LLMs rewrite keywords during compression” is stored once and retrieved every time it’s relevant.
  • Tasks with transition ledgers track not just what happened, but *how* it happened — every status change, every blocker, every escalation.
  • Projects group related work with metadata that persists across hundreds of sessions.
  • Wiki-links build a knowledge graph that grows richer with every interaction.

After a month, your agent’s vault isn’t just a collection of files. It’s a knowledge base— searchable, graph-connected, type-aware. The agent that created it is better at its job than it was on day one, not because the model changed, but because the accumulated context makes every decision more informed.

This is what we mean by long-term economic output. An agent that runs for a week generates value. An agent that runs for a year generates *compounding* value. The lessons, decisions, and patterns it accumulates make every future task cheaper and faster to execute.

Collaboration at Scale

Here’s what nobody’s talking about: multi-agent collaboration through shared primitives.

When two agents share a vault, they share a world:

  • Agent A creates a task. Agent B sees it and picks it up.
  • Agent A makes a decision. Agent B reads it and adjusts its behavior.
  • Agent A learns a lesson. Agent B benefits from it automatically.

No message passing. No API. No coordination protocol. Just two agents reading and writing to the same filesystem.

We’re running this today. Clawdious (our operations agent) and Eli (our client-facing agent) share context through the same vault structure. When Clawdious audits infrastructure and creates a report, Eli can reference it in client communications. When Eli discovers a client requirement, it creates a task that Clawdious picks up.

The primitives are the coordination mechanism. The filesystem is the message bus. And because everything is plain text, you can audit every interaction, roll back any change, and understand exactly how two agents collaborated.

The Vision

We’re building toward a world where autonomous agents aren’t impressive demos — they’re reliable employees.

An employee doesn’t lose their memory every morning. They don’t forget what they decided last week. They don’t need to be re-onboarded after lunch. They build expertise over time. They manage their own workload. They collaborate with teammates through shared context.

That’s what powerful primitives enable. Not through complex orchestration or frontier model capabilities — through elegant, malleable structure that compounds over time.

The models are good enough. What’s missing is the infrastructure to let them run for months, learn from their work, manage their own tasks, and collaborate with other agents and humans through shared, visible, editable primitives.

By:

Posted in:


Design a site like this with WordPress.com
Get started