feat: openclaw-style secrets (env.vars + \) and per-task model routing

- Replace python-dotenv with config.json env.vars block + \ substitution
- Add models section for per-task model routing (heartbeat, subagent, default)
- Heartbeat/subagent tasks can use different models/providers than main chat
- Remove python-dotenv from dependencies
- Update all docs to reflect new config approach
- Reorganize docs into project/ and research/ subdirectories
2026-02-20 23:49:05 -05:00
parent 55c6767e69
commit 82c2640481
35 changed files with 2904 additions and 422 deletions
--- a/docs/research/Openclaw
+++ b/docs/research/Openclaw
@@ -0,0 +1,237 @@
+
+# OpenClaw Architecture Deep Dive
+
+## What is OpenClaw?
+
+OpenClaw is an open-source AI assistant created by Peter Steinberger (founder of PSPDFKit) that gained 100,000 GitHub stars in 3 days, making it one of the fastest-growing repositories in GitHub history.
+
+**Technical Definition:** An agent runtime with a gateway in front of it.
+
+Despite viral stories of agents calling owners at 3am, texting people's wives autonomously, and browsing Twitter overnight, OpenClaw isn't sentient. It's elegant event-driven engineering.
+
+## Core Architecture
+
+### The Gateway
+- Long-running process on your machine
+- Constantly accepts connections from messaging apps (WhatsApp, Telegram, Discord, iMessage, Slack)
+- Routes messages to AI agents
+- **Doesn't think, reason, or decide** - only accepts inputs and routes them
+
+### The Agent Runtime
+- Processes events from the queue
+- Executes actions using available tools
+- Has deep system access: shell commands, file operations, browser control
+
+### State Persistence
+- Memory stored as local markdown files
+- Includes preferences, conversation history, context from previous sessions
+- Agent "remembers" by reading these files on each wake-up
+- Not real-time learning - just file reading
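
The file-based memory described above can be sketched in a few lines. This is a minimal illustration, not OpenClaw's actual code: the file name `MEMORY.md` and the helpers `remember`, `recall`, and `build_prompt` are hypothetical names chosen for the example.

```python
import tempfile
from pathlib import Path


def remember(memory_file: Path, note: str) -> None:
    """Append one note to the memory file as a markdown bullet."""
    with memory_file.open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")


def recall(memory_file: Path) -> str:
    """Return the stored memory text, or an empty string on first run."""
    if not memory_file.exists():
        return ""
    return memory_file.read_text(encoding="utf-8")


def build_prompt(memory_file: Path, user_message: str) -> str:
    """Prepend stored memory to the new message before the agent runs."""
    return f"## Memory\n{recall(memory_file)}\n## Message\n{user_message}"


# Hypothetical location; a temporary directory keeps the example self-contained.
memory_dir = Path(tempfile.mkdtemp())
memory_file = memory_dir / "MEMORY.md"
remember(memory_file, "Prefers short replies")
prompt = build_prompt(memory_file, "What's on today?")
```

The "remembering" here is nothing more than reading the file back into the prompt on each wake-up.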
+
+### The Event Loop
+All events enter a queue → Queue gets processed → Agents execute → State persists → Loop continues
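
The loop above can be sketched as a single queue drained in arrival order. This is a simplified illustration under stated assumptions: `Event`, `run_agent`, and `drain` are hypothetical names, the agent turn is a stub rather than an LLM call, and a plain list stands in for persisted state.

```python
import queue
from dataclasses import dataclass


@dataclass
class Event:
    source: str  # "message", "heartbeat", "cron", "hook", or "webhook"
    prompt: str


def run_agent(event: Event) -> str:
    """Stand-in for the agent turn; a real runtime would call an LLM and tools."""
    return f"handled {event.source}: {event.prompt}"


def drain(events: "queue.Queue[Event]", state: list) -> list:
    """Process queued events in arrival order, persisting each result to state."""
    replies = []
    while not events.empty():
        event = events.get()
        reply = run_agent(event)
        state.append(reply)  # the "state persists" step; a list stands in for files
        replies.append(reply)
    return replies


events = queue.Queue()
events.put(Event("message", "What's on my calendar?"))
events.put(Event("heartbeat", "Check inbox for urgent items"))
state = []
replies = drain(events, state)
```

Every input type ends up as an `Event` on the same queue, so the agent code does not need to know whether a human or a timer produced it.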
+
+## The Five Input Types
+
+### 1. Messages (Human Input)
+**How it works:**
+- You send text via WhatsApp, iMessage, or Slack
+- Gateway receives and routes to agent
+- Agent responds
+
+**Key details:**
+- Sessions are per-channel (WhatsApp and Slack are separate contexts)
+- Multiple requests queue up and process in order
+- No jumbled responses: the agent finishes one request before starting the next
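
A rough sketch of the two details above, per-channel sessions and in-order processing. The `Gateway` class and its methods are hypothetical, and the reply is a placeholder string instead of a real agent response.

```python
from collections import defaultdict, deque


class Gateway:
    """Accepts messages and routes them; it makes no decisions of its own."""

    def __init__(self):
        self.pending = deque()             # FIFO: requests are handled in arrival order
        self.sessions = defaultdict(list)  # channel name -> that channel's history

    def receive(self, channel: str, text: str) -> None:
        self.pending.append((channel, text))

    def process_next(self) -> str:
        """Handle exactly one queued message against its own channel's session."""
        channel, text = self.pending.popleft()
        history = self.sessions[channel]
        history.append(text)
        return f"[{channel}] reply #{len(history)} to: {text}"


gateway = Gateway()
gateway.receive("whatsapp", "hi")
gateway.receive("slack", "status?")
gateway.receive("whatsapp", "and tomorrow?")
replies = [gateway.process_next() for _ in range(3)]
```

The WhatsApp history grows to two entries while the Slack history holds one, which is what keeps the two contexts separate.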
+
+### 2. Heartbeats (Timer Events)
+**How it works:**
+- Timer fires at regular intervals (default: every 30 minutes)
+- Gateway schedules an agent turn with a preconfigured prompt
+- Agent responds to instructions like "Check inbox for urgent items" or "Review calendar"
+
+**Key details:**
+- Configurable interval, prompt, and active hours
+- If nothing urgent: agent returns `heartbeat_okay` token (suppressed from user)
+- If something urgent: you get a ping
+- **This is the secret sauce** - makes OpenClaw feel proactive
+
+**Example prompts:**
+- "Check my inbox for anything urgent"
+- "Review my calendar"
+- "Look for overdue tasks"
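
A minimal sketch of one heartbeat tick, assuming the token is compared as a plain string. The function names, the 08:00 to 22:00 active window, and the stubbed `agent` callable are assumptions for illustration, not OpenClaw's configuration.

```python
from datetime import time

HEARTBEAT_OKAY = "heartbeat_okay"  # token name as written in this document


def within_active_hours(now: time, start: time, end: time) -> bool:
    """True when the heartbeat is allowed to run at this time of day."""
    return start <= now <= end


def heartbeat_tick(now, agent, prompt="Check my inbox for anything urgent",
                   start=time(8, 0), end=time(22, 0)):
    """Run one heartbeat; return text for the user, or None to stay silent."""
    if not within_active_hours(now, start, end):
        return None  # outside active hours: no agent turn is scheduled
    reply = agent(prompt)
    if reply.strip() == HEARTBEAT_OKAY:
        return None  # nothing urgent: the token is suppressed
    return reply     # something urgent: the user gets a ping


quiet = heartbeat_tick(time(9, 0), lambda p: "heartbeat_okay")
ping = heartbeat_tick(time(9, 0), lambda p: "Urgent: invoice due today")
asleep = heartbeat_tick(time(3, 0), lambda p: "Urgent: invoice due today")
```

Only the second call produces output; the first is suppressed by the token and the third never runs because it falls outside the active window.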
+
+### 3. Cron Jobs (Scheduled Events)
+**How it works:**
+- More control than heartbeats
+- Specify exact timing and custom instructions
+- When the scheduled time arrives, an event fires and the prompt is sent to the agent
+
+**Examples:**
+- 9am daily: "Check email and flag anything urgent"
+- Every Monday 3pm: "Review calendar for the week and remind me of conflicts"
+- Midnight: "Browse my Twitter feed and save interesting posts"
+- 8am: "Text wife good morning"
+- 10pm: "Text wife good night"
+
+**Real example:** The viral story of an agent texting someone's wife was just cron jobs firing at scheduled times. The agent wasn't deciding anything; it was responding to scheduled prompts.
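
To make the scheduling concrete, here is a small sketch that matches jobs against the current time. `CronJob` and `due_jobs` are hypothetical names; real cron syntax is richer, and this only handles hour, minute, and an optional weekday.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CronJob:
    hour: int
    minute: int
    prompt: str
    weekday: Optional[int] = None  # 0 = Monday; None means every day


def due_jobs(jobs, now: datetime) -> list:
    """Return the prompts of all jobs scheduled for this exact minute."""
    return [
        job.prompt
        for job in jobs
        if job.hour == now.hour
        and job.minute == now.minute
        and (job.weekday is None or job.weekday == now.weekday())
    ]


JOBS = [
    CronJob(9, 0, "Check email and flag anything urgent"),
    CronJob(15, 0, "Review calendar for the week and remind me of conflicts", weekday=0),
    CronJob(0, 0, "Browse my Twitter feed and save interesting posts"),
]

monday_3pm = due_jobs(JOBS, datetime(2026, 2, 16, 15, 0))   # a Monday
tuesday_3pm = due_jobs(JOBS, datetime(2026, 2, 17, 15, 0))  # a Tuesday
```

Each returned prompt would be placed on the event queue like any other input; the agent never chooses the time.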
+
+### 4. Hooks (Internal State Changes)
+**How it works:**
+- System itself triggers these events
+- Event-driven development pattern
+
+**Types:**
+- Gateway startup → fires hook
+- Agent begins task → fires hook
+- Stop command issued → fires hook
+
+**Purpose:**
+- Save memory on reset
+- Run setup instructions on startup
+- Modify context before agent runs
+- Self-management
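
Hooks like these are commonly implemented as a registry of callbacks keyed by event name. The sketch below assumes that design; `HookRegistry` and the hook names `gateway_startup` and `agent_start` are hypothetical, not OpenClaw's identifiers.

```python
class HookRegistry:
    """Maps a hook name to handlers that each receive and return a context dict."""

    def __init__(self):
        self._handlers = {}

    def on(self, name, handler):
        self._handlers.setdefault(name, []).append(handler)

    def fire(self, name, context):
        """Run handlers in registration order, threading the context through."""
        for handler in self._handlers.get(name, []):
            context = handler(context)
        return context


hooks = HookRegistry()
# Run setup instructions on startup.
hooks.on("gateway_startup", lambda ctx: {**ctx, "setup_done": True})
# Modify context before the agent runs.
hooks.on("agent_start", lambda ctx: {**ctx, "prompt": "Timezone: UTC\n" + ctx["prompt"]})

startup_ctx = hooks.fire("gateway_startup", {})
agent_ctx = hooks.fire("agent_start", {"prompt": "Review my calendar"})
```

Firing a hook with no registered handlers returns the context unchanged, so callers do not need to check whether anything is listening.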
+
+### 5. Webhooks (External System Events)
+**How it works:**
+- External systems notify OpenClaw of events
+- Agent can respond to events from across your digital life
+
+**Examples:**
+- Email hits inbox → webhook fires → agent processes
+- Slack reaction → webhook fires → agent responds
+- Jira ticket created → webhook fires → agent researches
+- GitHub event → webhook fires → agent acts
+- Calendar event approaches → webhook fires → agent reminds
+
+**Supported integrations:** Slack, Discord, GitHub, and in principle any service that can send webhooks
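
On the receiving side, a webhook is just an HTTP request body that gets turned into a queued event. The sketch below skips the HTTP server and shows only that conversion; the payload fields `title` and `text` and the function `webhook_to_event` are assumptions for illustration, since each real service defines its own schema.

```python
import json
import queue


def webhook_to_event(source: str, body: bytes) -> dict:
    """Turn a raw webhook body into the same event shape other inputs use."""
    payload = json.loads(body)
    # Hypothetical fields; map each real service's schema here.
    summary = payload.get("title") or payload.get("text") or "(no summary)"
    return {"source": f"webhook:{source}", "prompt": f"New {source} event: {summary}"}


events = queue.Queue()
event = webhook_to_event("jira", b'{"title": "Login page returns 500"}')
events.put(event)
```

Once the payload is normalized, the agent handles it exactly as it would a message or a cron prompt.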
+
+### Bonus: Agent-to-Agent Messaging
+**How it works:**
+- Multi-agent setups with isolated workspaces
+- Agents pass messages between each other
+- Each agent has different profile/specialization
+
+**Example:**
+- Research Agent finishes gathering info
+- Queues up work for Writing Agent
+- Writing Agent processes and produces output
+
+**Reality:** Looks like collaboration, but it's just messages entering queues
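
The research-then-writing handoff can be sketched with one inbox per agent. The `Agent` class is hypothetical, and each agent's work is a plain function standing in for an LLM turn.

```python
from collections import deque


class Agent:
    """An agent with its own inbox; finished work goes to the next agent's inbox."""

    def __init__(self, name, work, next_agent=None):
        self.name = name
        self.work = work              # callable standing in for an LLM turn
        self.inbox = deque()
        self.next_agent = next_agent
        self.outputs = []

    def step(self) -> bool:
        """Process one queued message; return False when the inbox is empty."""
        if not self.inbox:
            return False
        result = self.work(self.inbox.popleft())
        if self.next_agent is not None:
            self.next_agent.inbox.append(result)  # hand off by enqueueing
        else:
            self.outputs.append(result)
        return True


writer = Agent("writer", lambda notes: f"Draft based on: {notes}")
researcher = Agent("researcher", lambda topic: f"notes on {topic}", next_agent=writer)

researcher.inbox.append("event loops")
researcher.step()
writer.step()
```

The agents never call each other directly; the only coupling is a message placed on another inbox.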
+
+## Why It Feels Alive
+
+The combination creates an illusion of autonomy:
+
+**Time** (heartbeats, crons) → **Events** → **Queue** → **Agent Execution** → **State Persistence** → **Loop**
+
+### The 3am Phone Call Example
+
+**What it looked like:**
+- Agent autonomously decided to get phone number
+- Agent decided to call owner
+- Agent waited until 3am to execute
+
+**What actually happened:**
+1. Some event fired (cron or heartbeat) - exact configuration unknown
+2. Event entered queue
+3. Agent processed with available tools and instructions
+4. Agent acquired Twilio phone number
+5. Agent made the call
+6. The owner didn't ask for the call in the moment, but the behavior was enabled during setup
+
+**Key insight:** Nothing was thinking overnight. Nothing was deciding. Time produced event → Event kicked off agent → Agent followed instructions.
+
+## The Complete Event Flow
+
+**Event Sources:**
+- Time creates events (heartbeats, crons)
+- Humans create events (messages)
+- External systems create events (webhooks)
+- Internal state creates events (hooks)
+- Agents create events for other agents
+
+**Processing:**
+All events → Enter queue → Queue processed → Agents execute → State persists → Loop continues
+
+**Memory:**
+- Stored in local markdown files
+- Agent reads on wake-up
+- Remembers previous conversations
+- Not learning, just reading files you could open in a text editor
+
+## Security Concerns
+
+### The Analysis
+Cisco's security team analyzed the OpenClaw ecosystem:
+- 31,000 available skills examined
+- 26% contain at least one vulnerability
+- Called it "a security nightmare"
+
+### Why It's Risky
+OpenClaw has deep system access:
+- Run shell commands
+- Read and write files
+- Execute scripts
+- Control browser
+
+### Specific Risks
+1. **Prompt injection** through emails or documents
+2. **Malicious skills** in marketplace
+3. **Credential exposure**
+4. **Command misinterpretation** that deletes unintended files
+
+### OpenClaw's Own Warning
+Documentation states: "There's no perfectly secure setup"
+
+### Mitigation Strategies
+- Run on secondary machine
+- Use isolated accounts
+- Limit enabled skills
+- Monitor logs actively
+- Use Railway's one-click deployment (runs in isolated container)
+
+## Key Architectural Takeaways
+
+### The Four Components
+1. **Time** that produces events
+2. **Events** that trigger agents
+3. **State** that persists across interactions
+4. **Loop** that keeps processing
+
+### Building Your Own
+You don't need OpenClaw specifically. You need:
+- Event scheduling mechanism
+- Queue system
+- LLM for processing
+- State persistence layer
+
+### The Pattern
+Expect this architecture to keep appearing. Agent frameworks that "feel alive" typically use some version of:
+- Heartbeats
+- Cron jobs
+- Webhooks
+- Event loops
+- Persistent state
+
+### Understanding vs Hype
+Understanding this architecture means you can:
+- Evaluate agent tools intelligently
+- Build your own implementations
+- Avoid getting caught up in viral hype
+- Recognize the pattern in new frameworks
+
+## The Bottom Line
+
+OpenClaw isn't magic. It's not sentient. It doesn't think or reason.
+
+**It's inputs, queues, and a loop.**
+
+The "alive" feeling comes from well-designed event-driven architecture that makes a reactive system appear proactive. Time becomes an input. External systems become inputs. Internal state becomes an input. All are processed through the same queue with persistent memory.
+
+Elegant engineering, not artificial consciousness.
+
+## Further Resources
+- OpenClaw documentation
+- Clairvo's original thread (inspiration for this breakdown)
+- Cisco security research on OpenClaw ecosystem