tanmay11k 82c2640481 feat: openclaw-style secrets (env.vars + \) and per-task model routing

- Replace python-dotenv with config.json env.vars block + \ substitution
- Add models section for per-task model routing (heartbeat, subagent, default)
- Heartbeat/subagent tasks can use different models/providers than main chat
- Remove python-dotenv from dependencies
- Update all docs to reflect new config approach
- Reorganize docs into project/ and research/ subdirectories

2026-02-20 23:49:05 -05:00


OpenClaw Architecture Deep Dive

What is OpenClaw?

OpenClaw is an open-source AI assistant created by Peter Steinberger (founder of PSPDFKit) that gained 100,000 GitHub stars in 3 days, making it one of the fastest-growing repositories in GitHub history.

Technical Definition: An agent runtime with a gateway in front of it.

Despite viral stories of agents calling owners at 3am, texting people's wives autonomously, and browsing Twitter overnight, OpenClaw isn't sentient. It's elegant event-driven engineering.

Core Architecture

The Gateway

Long-running process on your machine
Constantly accepts connections from messaging apps (WhatsApp, Telegram, Discord, iMessage, Slack)
Routes messages to AI agents
Doesn't think, reason, or decide - only accepts inputs and routes them
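
The gateway's role can be sketched in a few lines of Python. This is a minimal illustration, not OpenClaw's actual code: the `Event` and `Gateway` names and the `channel:sender` session key are assumptions made for the example. The point is that the gateway only normalizes inbound messages and enqueues them.

```python
import queue
from dataclasses import dataclass


@dataclass
class Event:
    source: str   # channel the message arrived on, e.g. "whatsapp"
    session: str  # hypothetical session key: one context per channel and sender
    text: str


class Gateway:
    """Accepts inbound messages and enqueues them; it makes no decisions."""

    def __init__(self):
        self.events = queue.Queue()

    def receive(self, channel, sender, text):
        # Normalize the message into a common Event shape and route it.
        self.events.put(Event(source=channel, session=f"{channel}:{sender}", text=text))


gateway = Gateway()
gateway.receive("whatsapp", "alice", "What's on my calendar?")
gateway.receive("slack", "alice", "Summarize the thread")
```

Everything that involves reasoning happens later, when the agent runtime takes events off the queue.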

The Agent Runtime

Processes events from the queue
Executes actions using available tools
Has deep system access: shell commands, file operations, browser control

State Persistence

Memory stored as local markdown files
Includes preferences, conversation history, context from previous sessions
Agent "remembers" by reading these files on each wake-up
Not real-time learning - just file reading
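
A sketch of file-based memory, assuming a directory of markdown files (the file names and the `load_memory` / `remember` helpers are hypothetical, not OpenClaw's API). "Remembering" is just reading the files back into the prompt context on the next wake-up.

```python
import tempfile
from pathlib import Path


def load_memory(memory_dir):
    """Concatenate every markdown file in the directory into one context string."""
    parts = []
    for path in sorted(memory_dir.glob("*.md")):
        parts.append(f"## {path.name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


def remember(memory_dir, filename, note):
    """Append a note to a memory file; the next wake-up reads it back."""
    with (memory_dir / filename).open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")


# Demo in a throwaway directory.
memory_dir = Path(tempfile.mkdtemp())
remember(memory_dir, "preferences.md", "Prefers short replies")
remember(memory_dir, "history.md", "Asked about the calendar on Monday")
context = load_memory(memory_dir)
```

Because the store is plain text, anything the agent "knows" can be inspected or corrected in a text editor.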

The Event Loop

All events enter a queue → Queue gets processed → Agents execute → State persists → Loop continues
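
The loop above can be sketched as follows. This is a simplified model under stated assumptions: `echo_agent` stands in for an LLM call, and appending to a list stands in for writing state to disk.

```python
import queue


def run_loop(events, agent, state):
    """Drain the queue: each event gets one agent turn, then state is persisted."""
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            break  # a long-running loop would block here until the next event
        reply = agent(event, state)
        state.append((event, reply))  # stand-in for persisting to disk


def echo_agent(event, state):
    # Placeholder for an LLM call; it only reports how much state it can see.
    return f"handled {event!r} with {len(state)} prior turns"


events = queue.Queue()
for e in ["message: hi", "heartbeat", "cron: 9am email check"]:
    events.put(e)

state = []
run_loop(events, echo_agent, state)
print(state[-1][1])  # handled 'cron: 9am email check' with 2 prior turns
```

Messages, timers, and scheduled jobs all take the same path through the loop; only the event source differs.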

The Five Input Types

1. Messages (Human Input)

How it works:

You send text via WhatsApp, iMessage, or Slack
Gateway receives and routes to agent
Agent responds

Key details:

Sessions are per-channel (WhatsApp and Slack are separate contexts)
Multiple requests queue up and process in order
Responses never interleave: the agent finishes one request before starting the next
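
A small sketch of these two properties, per-channel context and strict in-order processing. The `Sessions` class and its reply format are invented for illustration.

```python
from collections import defaultdict, deque


class Sessions:
    def __init__(self):
        self.pending = deque()            # one FIFO of (channel, text) requests
        self.history = defaultdict(list)  # separate context per channel

    def submit(self, channel, text):
        self.pending.append((channel, text))

    def process_next(self):
        # Take the oldest request and answer it using only its channel's context.
        channel, text = self.pending.popleft()
        context = self.history[channel]
        reply = f"[{channel}] reply #{len(context) + 1} to: {text}"
        context.append((text, reply))
        return reply


sessions = Sessions()
sessions.submit("whatsapp", "a")
sessions.submit("slack", "b")
sessions.submit("whatsapp", "c")
replies = [sessions.process_next() for _ in range(3)]
print(replies[2])  # [whatsapp] reply #2 to: c
```

The Slack request is reply #1 in its own context even though it arrived between the two WhatsApp requests.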

2. Heartbeats (Timer Events)

How it works:

Timer fires at regular intervals (default: every 30 minutes)
Gateway schedules an agent turn with a preconfigured prompt
Agent responds to instructions like "Check inbox for urgent items" or "Review calendar"

Key details:

Configurable interval, prompt, and active hours
If nothing urgent: agent returns heartbeat_okay token (suppressed from user)
If something urgent: you get a ping
This is the secret sauce - makes OpenClaw feel proactive
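
A heartbeat turn might look like the sketch below. The active-hours window (08:00 to 22:00), the `heartbeat_turn` function, and the stub agents are assumptions for illustration; the token string is taken from the description above.

```python
from datetime import datetime, time

OK_TOKEN = "heartbeat_okay"  # the "nothing urgent" token described above


def in_active_hours(now, start=time(8, 0), end=time(22, 0)):
    # Assumed window; the text only says active hours are configurable.
    return start <= now.time() < end


def heartbeat_turn(now, agent, prompt, notify):
    """Run one heartbeat; return True only if the user was pinged."""
    if not in_active_hours(now):
        return False
    reply = agent(prompt)
    if reply.strip() == OK_TOKEN:
        return False  # suppressed: the user never sees this turn
    notify(reply)
    return True


sent = []
quiet_agent = lambda prompt: OK_TOKEN
urgent_agent = lambda prompt: "Invoice from ACME is due today"
prompt = "Check inbox for urgent items"

heartbeat_turn(datetime(2026, 2, 20, 12, 0), quiet_agent, prompt, sent.append)
heartbeat_turn(datetime(2026, 2, 20, 12, 0), urgent_agent, prompt, sent.append)
heartbeat_turn(datetime(2026, 2, 20, 3, 0), urgent_agent, prompt, sent.append)
print(sent)  # ['Invoice from ACME is due today']
```

Only the second turn reaches the user: the first is suppressed by the token and the third falls outside active hours.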

Example prompts:

"Check my inbox for anything urgent"
"Review my calendar"
"Look for overdue tasks"

3. Cron Jobs (Scheduled Events)

How it works:

More control than heartbeats
Specify exact timing and custom instructions
When the scheduled time arrives, the event fires and the prompt is sent to the agent

Examples:

9am daily: "Check email and flag anything urgent"
Every Monday 3pm: "Review calendar for the week and remind me of conflicts"
Midnight: "Browse my Twitter feed and save interesting posts"
8am: "Text wife good morning"
10pm: "Text wife good night"

Real example: The viral story of an agent texting someone's wife was just cron jobs firing at scheduled times. The agent wasn't deciding anything; it was responding to scheduled prompts.
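
A cron-style schedule reduces to "does this job match the current minute?". The sketch below uses a hypothetical `CronJob` dataclass rather than real cron syntax, with prompts borrowed from the examples above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CronJob:
    hour: int
    minute: int
    prompt: str
    weekday: Optional[int] = None  # 0 = Monday; None means every day

    def is_due(self, now):
        if self.weekday is not None and now.weekday() != self.weekday:
            return False
        return (now.hour, now.minute) == (self.hour, self.minute)


JOBS = [
    CronJob(9, 0, "Check email and flag anything urgent"),
    CronJob(15, 0, "Review calendar for the week and remind me of conflicts", weekday=0),
    CronJob(0, 0, "Browse my Twitter feed and save interesting posts"),
]


def due_prompts(now):
    """Return the prompts that should be queued for the agent at this minute."""
    return [job.prompt for job in JOBS if job.is_due(now)]


monday_3pm = datetime(2026, 2, 16, 15, 0)  # 16 February 2026 is a Monday
print(due_prompts(monday_3pm))
```

A scheduler would call `due_prompts` once per minute and put each returned prompt on the event queue; the agent never chooses the timing.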

4. Hooks (Internal State Changes)

How it works:

System itself triggers these events
Event-driven development pattern

Types:

Gateway startup → fires hook
Agent begins task → fires hook
Stop command issued → fires hook

Purpose:

Save memory on reset
Run setup instructions on startup
Modify context before agent runs
Self-management
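
Hooks are callbacks keyed by lifecycle event. A minimal registry could look like this; the event names (`gateway_startup`, `agent_start`, `stop`) and the context dictionary are assumptions for the sketch, not OpenClaw's real hook names.

```python
from collections import defaultdict

_hooks = defaultdict(list)


def on(event_name):
    """Decorator that registers a function to run when event_name fires."""
    def register(fn):
        _hooks[event_name].append(fn)
        return fn
    return register


def fire(event_name, context):
    # Run every registered hook in registration order; hooks mutate the context.
    for fn in _hooks[event_name]:
        fn(context)
    return context


@on("gateway_startup")
def load_setup(context):
    context["instructions"] = "Run setup checklist"


@on("agent_start")
def inject_date(context):
    # Modify the context before the agent runs.
    context["prompt"] = "[today is Friday] " + context["prompt"]


@on("stop")
def save_memory(context):
    # Save memory on reset (copying a list stands in for writing a file).
    context["saved"] = list(context.get("notes", []))


ctx = fire("agent_start", {"prompt": "Review my calendar"})
print(ctx["prompt"])  # [today is Friday] Review my calendar
```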

5. Webhooks (External System Events)

How it works:

External systems notify OpenClaw of events
Agent can respond to events from across your entire digital life

Examples:

Email hits inbox → webhook fires → agent processes
Slack reaction → webhook fires → agent responds
Jira ticket created → webhook fires → agent researches
GitHub event → webhook fires → agent acts
Calendar event approaches → webhook fires → agent reminds

Supported integrations: Slack, Discord, GitHub, and basically anything with webhook support
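
On the receiving side, a webhook is a JSON payload turned into a prompt for the agent. The templates and payload fields below are hypothetical and much flatter than real GitHub or Jira payloads; the sketch only shows the translation step.

```python
import json

# Hypothetical prompt templates, one per webhook source.
TEMPLATES = {
    "github": "GitHub event on {repo}: {action}. Decide whether follow-up is needed.",
    "jira": "Jira ticket {key} was created: {summary}. Research it.",
}


def webhook_to_prompt(source, body):
    """Turn a raw webhook body into the prompt that gets queued for the agent."""
    template = TEMPLATES.get(source)
    if template is None:
        raise ValueError(f"no webhook template for {source!r}")
    payload = json.loads(body)
    return template.format(**payload)


prompt = webhook_to_prompt("jira", b'{"key": "OPS-12", "summary": "Login page times out"}')
print(prompt)  # Jira ticket OPS-12 was created: Login page times out. Research it.
```

From here the prompt enters the same queue as messages, heartbeats, and cron jobs.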

Bonus: Agent-to-Agent Messaging

How it works:

Multi-agent setups with isolated workspaces
Agents pass messages between each other
Each agent has different profile/specialization

Example:

Research Agent finishes gathering info
Queues up work for Writing Agent
Writing Agent processes and produces output

Reality: Looks like collaboration, but it's just messages entering queues
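
The research-to-writing handoff can be modeled as one agent appending to another agent's inbox. The `Agent` class, the `run` scheduler, and both handler functions are invented for this sketch.

```python
from collections import deque


class Agent:
    def __init__(self, name, handle):
        self.name = name
        self.inbox = deque()
        self.handle = handle  # function(message) -> list of (recipient, message)


def run(agents):
    """Deliver messages until every inbox is empty; return a processing log."""
    log = []
    progressed = True
    while progressed:
        progressed = False
        for agent in agents.values():
            while agent.inbox:
                message = agent.inbox.popleft()
                log.append((agent.name, message))
                for recipient, out in agent.handle(message):
                    agents[recipient].inbox.append(out)
                progressed = True
    return log


def research(message):
    # Stand-in for a research agent: hand notes to the writer.
    return [("writer", f"notes on {message}")]


def write(message):
    return []  # end of the pipeline: nothing further to queue


agents = {"researcher": Agent("researcher", research), "writer": Agent("writer", write)}
agents["researcher"].inbox.append("event-driven agents")
log = run(agents)
print(log[-1])  # ('writer', 'notes on event-driven agents')
```

No agent calls another directly; each only enqueues a message for the next turn.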

Why It Feels Alive

The combination creates an illusion of autonomy:

Time (heartbeats, crons) → Events → Queue → Agent Execution → State Persistence → Loop

The 3am Phone Call Example

What it looked like:

Agent autonomously decided to get phone number
Agent decided to call owner
Agent waited until 3am to execute

What actually happened:

Some event fired (cron or heartbeat) - exact configuration unknown
Event entered queue
Agent processed with available tools and instructions
Agent acquired Twilio phone number
Agent made the call
Owner didn't ask in the moment, but behavior was enabled in setup

Key insight: Nothing was thinking overnight. Nothing was deciding. Time produced event → Event kicked off agent → Agent followed instructions.

The Complete Event Flow

Event Sources:

Time creates events (heartbeats, crons)
Humans create events (messages)
External systems create events (webhooks)
Internal state creates events (hooks)
Agents create events for other agents

Processing: All events → Enter queue → Queue processed → Agents execute → State persists → Loop continues

Memory:

Stored in local markdown files
Agent reads on wake-up
Remembers previous conversations
Not learning - just reading files you could open in text editor

Security Concerns

The Analysis

Cisco's security team analyzed the OpenClaw ecosystem:

31,000 available skills examined
26% contain at least one vulnerability
Called it "a security nightmare"

Why It's Risky

OpenClaw has deep system access:

Run shell commands
Read and write files
Execute scripts
Control browser

Specific Risks

Prompt injection through emails or documents
Malicious skills in marketplace
Credential exposure
Misinterpreted commands that delete files unintentionally

OpenClaw's Own Warning

Documentation states: "There's no perfectly secure setup"

Mitigation Strategies

Run on secondary machine
Use isolated accounts
Limit enabled skills
Monitor logs actively
Use Railway's one-click deployment (runs in isolated container)
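
Two of these mitigations, limiting enabled skills and monitoring logs, can be combined in a thin wrapper around skill calls. This is a generic allowlist sketch, not OpenClaw's configuration mechanism; the skill names and registry are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("skills")

# Hypothetical allowlist: only these skills may run.
ENABLED_SKILLS = {"calendar.read", "email.read"}

# Hypothetical registry of installed skills, including a risky one.
REGISTRY = {
    "calendar.read": lambda: ["09:00 standup"],
    "email.read": lambda: ["Invoice due"],
    "shell.run": lambda cmd: f"ran {cmd}",
}


def call_skill(name, *args):
    """Run a skill only if it is on the allowlist, logging every attempt."""
    if name not in ENABLED_SKILLS:
        log.warning("blocked skill call: %s", name)
        raise PermissionError(f"skill {name!r} is not enabled")
    log.info("skill call: %s args=%r", name, args)
    return REGISTRY[name](*args)


print(call_skill("calendar.read"))  # ['09:00 standup']
```

A call to `call_skill("shell.run", "ls")` raises `PermissionError` and leaves a warning in the log, even though the skill is installed.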

Key Architectural Takeaways

The Four Components

Time that produces events
Events that trigger agents
State that persists across interactions
Loop that keeps processing

Building Your Own

You don't need OpenClaw specifically. You need:

Event scheduling mechanism
Queue system
LLM for processing
State persistence layer

The Pattern

Expect this architecture to keep reappearing. Agent frameworks that "feel alive" tend to use some version of:

Heartbeats
Cron jobs
Webhooks
Event loops
Persistent state

Understanding vs Hype

Understanding this architecture means you can:

Evaluate agent tools intelligently
Build your own implementations
Avoid getting caught up in viral hype
Recognize the pattern in new frameworks

The Bottom Line

OpenClaw isn't magic. It's not sentient. It doesn't think or reason.

It's inputs, queues, and a loop.

The "alive" feeling comes from well-designed event-driven architecture that makes a reactive system appear proactive. Time becomes an input. External systems become inputs. Internal state becomes an input. All of them are processed through the same queue, with persistent memory.

Elegant engineering, not artificial consciousness.

Further Resources

OpenClaw documentation
Clairvo's original thread (inspiration for this breakdown)
Cisco security research on OpenClaw ecosystem
