Aetheel/docs/research/Openclaw deep dive.md
tanmay11k 82c2640481 feat: openclaw-style secrets (env.vars + \) and per-task model routing
- Replace python-dotenv with config.json env.vars block + \ substitution
- Add models section for per-task model routing (heartbeat, subagent, default)
- Heartbeat/subagent tasks can use different models/providers than main chat
- Remove python-dotenv from dependencies
- Update all docs to reflect new config approach
- Reorganize docs into project/ and research/ subdirectories
2026-02-20 23:49:05 -05:00

OpenClaw Architecture Deep Dive

What is OpenClaw?

OpenClaw is an open-source AI assistant created by Peter Steinberger (founder of PSPDFKit) that reportedly gained 100,000 GitHub stars in 3 days, making it one of the fastest-growing repositories in GitHub history.

Technical Definition: An agent runtime with a gateway in front of it.

Despite viral stories of agents calling owners at 3am, texting people's wives autonomously, and browsing Twitter overnight, OpenClaw isn't sentient. It's elegant event-driven engineering.

Core Architecture

The Gateway

  • Long-running process on your machine
  • Constantly accepts connections from messaging apps (WhatsApp, Telegram, Discord, iMessage, Slack)
  • Routes messages to AI agents
  • Doesn't think, reason, or decide - only accepts inputs and routes them
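
To make the "routes but doesn't reason" point concrete, here is a minimal sketch of a gateway as a lookup table in front of per-agent queues. The class, field, and channel names are hypothetical and not taken from OpenClaw's source.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class InboundMessage:
    channel: str   # e.g. "whatsapp" or "slack" (hypothetical labels)
    sender: str
    text: str


class Gateway:
    """Accepts inbound messages and routes them; makes no decisions itself."""

    def __init__(self) -> None:
        self.routes: dict[str, str] = {}     # channel -> agent name
        self.queues: dict[str, Queue] = {}   # agent name -> pending messages

    def register(self, channel: str, agent_name: str) -> None:
        self.routes[channel] = agent_name
        self.queues.setdefault(agent_name, Queue())

    def accept(self, msg: InboundMessage) -> str:
        # Pure lookup: hand the message to the right queue and return.
        agent_name = self.routes[msg.channel]
        self.queues[agent_name].put(msg)
        return agent_name


gateway = Gateway()
gateway.register("whatsapp", "main")
gateway.register("slack", "main")
routed_to = gateway.accept(InboundMessage("slack", "alice", "hello"))
```

All the "intelligence" lives on the consumer side of the queue; the gateway only needs to stay up and keep accepting connections.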

The Agent Runtime

  • Processes events from the queue
  • Executes actions using available tools
  • Has deep system access: shell commands, file operations, browser control

State Persistence

  • Memory stored as local markdown files
  • Includes preferences, conversation history, context from previous sessions
  • Agent "remembers" by reading these files on each wake-up
  • Not real-time learning - just file reading
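
A rough sketch of what "memory as markdown files" can look like, assuming one file per topic in a memory directory. The file layout and function names are illustrative assumptions, not OpenClaw's actual format.

```python
import tempfile
from pathlib import Path


def remember(memory_dir: Path, topic: str, note: str) -> None:
    """Append a note as a bullet to <topic>.md."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    with (memory_dir / f"{topic}.md").open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")


def load_memory(memory_dir: Path) -> str:
    """Concatenate every markdown file into one context string."""
    parts = []
    for path in sorted(memory_dir.glob("*.md")):
        parts.append(f"# {path.stem}\n{path.read_text(encoding='utf-8')}")
    return "\n".join(parts)


# Demo in a throwaway directory so nothing is written to the real workspace.
memory_dir = Path(tempfile.mkdtemp()) / "memory"
remember(memory_dir, "preferences", "Prefers short replies")
remember(memory_dir, "history", "Asked about calendar on Monday")
context = load_memory(memory_dir)  # would be prepended to the prompt on wake-up
```

Nothing here updates model weights; "remembering" is just reading text back into the prompt.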

The Event Loop

All events enter a queue → Queue gets processed → Agents execute → State persists → Loop continues
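
The loop above can be sketched in a few lines. This is a simplified illustration: `run_agent` is a stand-in for an LLM call, the `Event` fields are assumptions, and a real runtime would block waiting for new events rather than return when the queue is empty.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    source: str   # "message", "heartbeat", "cron", "hook", or "webhook"
    prompt: str


def run_agent(event: Event, state: list[str]) -> str:
    # Stand-in for the LLM call; a real agent would also read `state`.
    return f"handled {event.source}: {event.prompt}"


def drain(queue: deque, state: list[str]) -> list[str]:
    """Process queued events in arrival order and persist each result."""
    replies = []
    while queue:
        event = queue.popleft()           # queue gets processed (FIFO)
        reply = run_agent(event, state)   # agent executes
        state.append(reply)               # state persists
        replies.append(reply)
    return replies                        # caller loops and drains again


queue = deque([Event("message", "hi"), Event("heartbeat", "check inbox")])
state: list[str] = []
replies = drain(queue, state)
```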

The Five Input Types

1. Messages (Human Input)

How it works:

  • You send text via WhatsApp, iMessage, or Slack
  • Gateway receives and routes to agent
  • Agent responds

Key details:

  • Sessions are per-channel (WhatsApp and Slack are separate contexts)
  • Multiple requests queue up and process in order
  • Responses don't interleave: the agent finishes one request before starting the next
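
As an illustration of per-channel sessions with in-order processing, the sketch below keys conversation history by channel and handles pending messages one at a time. `SessionStore` and `handle_all` are hypothetical names, and the reply text is a placeholder for a model response.

```python
from collections import defaultdict


class SessionStore:
    """Keeps a separate conversation history for each channel."""

    def __init__(self) -> None:
        self._history = defaultdict(list)

    def append(self, channel: str, role: str, text: str) -> None:
        self._history[channel].append((role, text))

    def history(self, channel: str) -> list:
        return list(self._history[channel])


def handle_all(pending: list, store: SessionStore) -> None:
    # One message at a time, in arrival order: no interleaved replies.
    for channel, text in pending:
        store.append(channel, "user", text)
        turn = len(store.history(channel)) // 2 + 1
        store.append(channel, "agent", f"reply #{turn} on {channel}")


store = SessionStore()
handle_all([("whatsapp", "hi"), ("slack", "status?"), ("whatsapp", "thanks")], store)
```

After this runs, the WhatsApp history holds two exchanges and the Slack history holds one, and neither contains the other's messages.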

2. Heartbeats (Timer Events)

How it works:

  • Timer fires at regular intervals (default: every 30 minutes)
  • Gateway schedules an agent turn with a preconfigured prompt
  • Agent responds to instructions like "Check inbox for urgent items" or "Review calendar"

Key details:

  • Configurable interval, prompt, and active hours
  • If nothing urgent: agent returns the HEARTBEAT_OK token (suppressed from user)
  • If something urgent: you get a ping
  • This is the secret sauce - makes OpenClaw feel proactive
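
A minimal sketch of a single heartbeat tick, assuming an active-hours window and a sentinel reply that means "nothing to report". The sentinel string, default hours, and function names are assumptions for illustration; the timer that calls `heartbeat_tick` every interval is left out.

```python
from datetime import time

HEARTBEAT_OK = "HEARTBEAT_OK"   # assumed sentinel for "nothing urgent"


def in_active_hours(now: time, start: time, end: time) -> bool:
    if start <= end:
        return start <= now < end
    return now >= start or now < end   # window wraps past midnight


def heartbeat_tick(now, agent, notify, prompt, start=time(8, 0), end=time(22, 0)):
    """Run one heartbeat; return True only if the user was pinged."""
    if not in_active_hours(now, start, end):
        return False                   # outside active hours: skip the turn
    reply = agent(prompt)
    if reply.strip() == HEARTBEAT_OK:
        return False                   # suppressed: the user never sees it
    notify(reply)
    return True


sent = []
prompt = "Check my inbox for anything urgent"
quiet = heartbeat_tick(time(10, 0), lambda p: "HEARTBEAT_OK", sent.append, prompt)
urgent = heartbeat_tick(time(10, 0), lambda p: "Invoice due today", sent.append, prompt)
asleep = heartbeat_tick(time(3, 0), lambda p: "Invoice due today", sent.append, prompt)
```

The user only ever sees the middle case, which is why the system reads as proactive rather than chatty.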

Example prompts:

  • "Check my inbox for anything urgent"
  • "Review my calendar"
  • "Look for overdue tasks"

3. Cron Jobs (Scheduled Events)

How it works:

  • More control than heartbeats
  • Specify exact timing and custom instructions
  • When the scheduled time arrives, the event fires and the prompt is sent to the agent

Examples:

  • 9am daily: "Check email and flag anything urgent"
  • Every Monday 3pm: "Review calendar for the week and remind me of conflicts"
  • Midnight: "Browse my Twitter feed and save interesting posts"
  • 8am: "Text wife good morning"
  • 10pm: "Text wife good night"

Real example: The viral story of an agent texting someone's wife was just cron jobs firing at scheduled times. The agent wasn't deciding anything; it was responding to scheduled prompts.
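
The schedules above can be approximated with a small matcher checked once per minute. This sketch uses a simplified hour/minute/weekday model rather than full cron syntax; `CronJob` and `due_prompts` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CronJob:
    hour: int
    minute: int
    prompt: str
    weekday: Optional[int] = None   # 0 = Monday; None = every day

    def is_due(self, now: datetime) -> bool:
        if self.weekday is not None and now.weekday() != self.weekday:
            return False
        return (now.hour, now.minute) == (self.hour, self.minute)


JOBS = [
    CronJob(9, 0, "Check email and flag anything urgent"),
    CronJob(15, 0, "Review calendar for the week and remind me of conflicts", weekday=0),
    CronJob(0, 0, "Browse my Twitter feed and save interesting posts"),
]


def due_prompts(now: datetime) -> list[str]:
    """Return the prompts that should be queued for this minute."""
    return [job.prompt for job in JOBS if job.is_due(now)]


monday_3pm = due_prompts(datetime(2026, 2, 23, 15, 0))    # a Monday
tuesday_3pm = due_prompts(datetime(2026, 2, 24, 15, 0))
tuesday_9am = due_prompts(datetime(2026, 2, 24, 9, 0))
```

Each due prompt becomes an ordinary queue event, so the agent handles it exactly as it would handle a message typed by a human.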

4. Hooks (Internal State Changes)

How it works:

  • System itself triggers these events
  • Event-driven development pattern

Types:

  • Gateway startup → fires hook
  • Agent begins task → fires hook
  • Stop command issued → fires hook

Purpose:

  • Save memory on reset
  • Run setup instructions on startup
  • Modify context before agent runs
  • Self-management
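
One way to picture hooks is a registry of callbacks keyed by lifecycle event. The event names (`gateway:startup`, `agent:start`, `command:stop`) and the context dictionary are assumptions made for this sketch, not OpenClaw's actual hook identifiers.

```python
from collections import defaultdict
from typing import Callable

Hook = Callable[[dict], None]


class HookRegistry:
    def __init__(self) -> None:
        self._hooks: dict[str, list[Hook]] = defaultdict(list)

    def on(self, event_name: str, hook: Hook) -> None:
        self._hooks[event_name].append(hook)

    def fire(self, event_name: str, context: dict) -> dict:
        # Hooks run in registration order and may modify the context in place.
        for hook in self._hooks[event_name]:
            hook(context)
        return context


hooks = HookRegistry()
saved = []

# Modify context before the agent runs.
hooks.on("agent:start", lambda ctx: ctx.setdefault("notes", []).append("setup loaded"))
# Save memory when a stop command is issued.
hooks.on("command:stop", lambda ctx: saved.append(ctx.get("memory", "")))

start_ctx = hooks.fire("agent:start", {})
hooks.fire("command:stop", {"memory": "user prefers short replies"})
unhandled = hooks.fire("gateway:startup", {"ok": True})   # no hooks registered
```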

5. Webhooks (External System Events)

How it works:

  • External systems notify OpenClaw of events
  • Agent can respond to events from across your entire digital life

Examples:

  • Email hits inbox → webhook fires → agent processes
  • Slack reaction → webhook fires → agent responds
  • Jira ticket created → webhook fires → agent researches
  • GitHub event → webhook fires → agent acts
  • Calendar event approaches → webhook fires → agent reminds

Supported integrations: Slack, Discord, GitHub, and essentially any service that can send webhooks
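
The webhook path reduces to "translate an HTTP payload into a queue event". The sketch below omits the HTTP server and signature verification, and the source names, prompt templates, and payload fields are hypothetical.

```python
import json
from queue import Queue

# Hypothetical mapping from webhook source to the prompt the agent receives.
PROMPTS = {
    "email": "A new email arrived. Summarize it and flag it if urgent.",
    "jira": "A Jira ticket was created. Research it.",
    "github": "A GitHub event occurred. Decide whether action is needed.",
}


def handle_webhook(source: str, body: bytes, queue: Queue) -> bool:
    """Turn a raw webhook body into a queued event; return False if unknown."""
    template = PROMPTS.get(source)
    if template is None:
        return False                     # unknown source: drop it
    payload = json.loads(body)
    queue.put({"source": f"webhook:{source}", "prompt": template, "payload": payload})
    return True


events: Queue = Queue()
accepted = handle_webhook("jira", b'{"key": "PROJ-1", "summary": "Login fails"}', events)
rejected = handle_webhook("unknown", b"{}", events)
```

A production handler would also authenticate the sender before queuing anything, since the payload ends up in the agent's prompt.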

Bonus: Agent-to-Agent Messaging

How it works:

  • Multi-agent setups with isolated workspaces
  • Agents pass messages between each other
  • Each agent has different profile/specialization

Example:

  • Research Agent finishes gathering info
  • Queues up work for Writing Agent
  • Writing Agent processes and produces output

Reality: Looks like collaboration, but it's just messages entering queues
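
A small sketch of the Research Agent to Writing Agent handoff, assuming each agent owns an inbox and forwards its result to a downstream agent. The `Agent` class is hypothetical and the lambdas stand in for LLM calls.

```python
from collections import deque


class Agent:
    def __init__(self, name, work):
        self.name = name
        self.work = work          # function str -> str, stand-in for an LLM
        self.inbox = deque()      # this agent's own queue
        self.downstream = None    # optional agent that receives the result

    def step(self):
        """Process one queued item; forward the result if there is a downstream."""
        if not self.inbox:
            return None
        result = self.work(self.inbox.popleft())
        if self.downstream is not None:
            self.downstream.inbox.append(result)
        return result


researcher = Agent("research", lambda task: f"notes on {task}")
writer = Agent("writing", lambda notes: f"draft from {notes}")
researcher.downstream = writer

researcher.inbox.append("event loops")
researcher.step()          # research finishes and queues work for the writer
output = writer.step()     # writer processes what arrived in its inbox
```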

Why It Feels Alive

The combination creates an illusion of autonomy:

Time (heartbeats, crons) → Events → Queue → Agent Execution → State Persistence → Loop

The 3am Phone Call Example

What it looked like:

  • Agent autonomously decided to get phone number
  • Agent decided to call owner
  • Agent waited until 3am to execute

What actually happened:

  1. Some event fired (cron or heartbeat) - exact configuration unknown
  2. Event entered queue
  3. Agent processed with available tools and instructions
  4. Agent acquired Twilio phone number
  5. Agent made the call
  6. The owner didn't request the call in the moment, but the setup had enabled that behavior

Key insight: Nothing was thinking overnight. Nothing was deciding. Time produced event → Event kicked off agent → Agent followed instructions.

The Complete Event Flow

Event Sources:

  • Time creates events (heartbeats, crons)
  • Humans create events (messages)
  • External systems create events (webhooks)
  • Internal state creates events (hooks)
  • Agents create events for other agents

Processing: All events → Enter queue → Queue processed → Agents execute → State persists → Loop continues

Memory:

  • Stored in local markdown files
  • Agent reads on wake-up
  • Remembers previous conversations
  • Not learning - just reading files you could open in a text editor

Security Concerns

The Analysis

Cisco's security team analyzed the OpenClaw ecosystem:

  • 31,000 available skills examined
  • 26% contain at least one vulnerability
  • Called it "a security nightmare"

Why It's Risky

OpenClaw has deep system access:

  • Run shell commands
  • Read and write files
  • Execute scripts
  • Control browser

Specific Risks

  1. Prompt injection through emails or documents
  2. Malicious skills in marketplace
  3. Credential exposure
  4. Command misinterpretation that deletes unintended files

OpenClaw's Own Warning

Documentation states: "There's no perfectly secure setup"

Mitigation Strategies

  • Run on secondary machine
  • Use isolated accounts
  • Limit enabled skills
  • Monitor logs actively
  • Use Railway's one-click deployment (runs in isolated container)
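
As one illustration of limiting what an agent may run, the sketch below gates shell commands behind an allowlist before execution. The allowlist contents and the `is_allowed` helper are assumptions, and a check like this reduces risk but is not a substitute for running in an isolated environment.

```python
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "grep"}   # hypothetical allowlist
SHELL_METACHARACTERS = ";|&`$><"           # reject chaining and redirection


def is_allowed(command_line: str) -> bool:
    """Return True only for a single allowlisted command without shell tricks."""
    if any(ch in command_line for ch in SHELL_METACHARACTERS):
        return False
    try:
        tokens = shlex.split(command_line)
    except ValueError:                     # e.g. unbalanced quotes
        return False
    if not tokens:
        return False
    return tokens[0] in ALLOWED_COMMANDS


checks = {cmd: is_allowed(cmd) for cmd in ["ls -la", "rm -rf /", "ls; rm -rf /", ""]}
```

Here only `ls -la` passes; the destructive command, the chained command, and the empty string are all rejected.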

Key Architectural Takeaways

The Four Components

  1. Time that produces events
  2. Events that trigger agents
  3. State that persists across interactions
  4. Loop that keeps processing

Building Your Own

You don't need OpenClaw specifically. You need:

  • Event scheduling mechanism
  • Queue system
  • LLM for processing
  • State persistence layer

The Pattern

Expect this architecture to keep showing up. AI agent frameworks that "feel alive" tend to use some version of:

  • Heartbeats
  • Cron jobs
  • Webhooks
  • Event loops
  • Persistent state

Understanding vs Hype

Understanding this architecture means you can:

  • Evaluate agent tools intelligently
  • Build your own implementations
  • Avoid getting caught up in viral hype
  • Recognize the pattern in new frameworks

The Bottom Line

OpenClaw isn't magic. It's not sentient. It doesn't think or reason.

It's inputs, queues, and a loop.

The "alive" feeling comes from well-designed event-driven architecture that makes a reactive system appear proactive. Time becomes an input. External systems become inputs. Internal state changes become inputs. All are processed through the same queue with persistent memory.

Elegant engineering, not artificial consciousness.

Further Resources

  • OpenClaw documentation
  • Clairvo's original thread (inspiration for this breakdown)
  • Cisco security research on OpenClaw ecosystem