# 🦞 OpenClaw — Architecture & How It Works

> **Full-Featured Personal AI Assistant** — Massive TypeScript codebase with 15+ channels, companion apps, and enterprise-grade features.

## Overview

OpenClaw is the most feature-complete personal AI assistant in this space. It's a TypeScript monorepo with a WebSocket-based Gateway as the control plane, supporting 15+ messaging channels, companion macOS/iOS/Android apps, browser control, live canvas, voice wake, and extensive automation.

| Attribute | Value |
|-----------|-------|
| **Language** | TypeScript (Node.js ≥22) |
| **Codebase Size** | 430k+ lines, 50+ source modules |
| **Config** | `~/.openclaw/openclaw.json` (JSON5) |
| **AI Runtime** | Pi Agent (custom RPC), multi-model |
| **Channels** | 15+ (WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Teams, Matrix, Zalo, WebChat, etc.) |
| **Package Mgr** | pnpm (monorepo) |

---

## Architecture Flowchart

```mermaid
graph TB
    subgraph Channels["📱 Messaging Channels (15+)"]
        WA["WhatsApp\n(Baileys)"]
        TG["Telegram\n(grammY)"]
        SL["Slack\n(Bolt)"]
        DC["Discord\n(discord.js)"]
        GC["Google Chat"]
        SIG["Signal\n(signal-cli)"]
        BB["BlueBubbles\n(iMessage)"]
        IM["iMessage\n(legacy)"]
        MST["MS Teams"]
        MTX["Matrix"]
        ZL["Zalo"]
        WC["WebChat"]
    end

    subgraph Gateway["🌐 Gateway (Control Plane)"]
        WS["WebSocket Server\nws://127.0.0.1:18789"]
        SES["Session Manager"]
        RTE["Channel Router"]
        PRES["Presence System"]
        Q["Message Queue"]
        CFG["Config Manager"]
        AUTH["Auth / Pairing"]
    end

    subgraph Agent["🧠 Pi Agent (RPC)"]
        AGENT["Agent Runtime"]
        TOOLS["Tool Registry"]
        STREAM["Block Streaming"]
        PROV["Provider Router\n(multi-model)"]
    end

    subgraph Apps["📲 Companion Apps"]
        MAC["macOS Menu Bar"]
        IOS["iOS Node"]
        ANDR["Android Node"]
    end

    subgraph ToolSet["🔧 Tools & Automation"]
        BROWSER["Browser Control\n(CDP/Chromium)"]
        CANVAS["Live Canvas\n(A2UI)"]
        CRON["Cron Jobs"]
        WEBHOOK["Webhooks"]
        GMAIL["Gmail Pub/Sub"]
        NODES["Nodes\n(camera, screen, location)"]
        SKILLS_T["Skills Platform"]
        SESS_T["Session Tools\n(agent-to-agent)"]
    end

    subgraph Workspace["💾 Workspace"]
        AGENTS_MD["AGENTS.md"]
        SOUL_MD["SOUL.md"]
        USER_MD["USER.md"]
        TOOLS_MD["TOOLS.md"]
        SKILLS_W["Skills/"]
    end

    Channels --> Gateway
    Apps --> Gateway
    Gateway --> Agent
    Agent --> ToolSet
    Agent --> Workspace
    Agent --> PROV
```

---

## Message Flow

```mermaid
sequenceDiagram
    participant User
    participant Channel as Channel (WA/TG/Slack/etc.)
    participant GW as Gateway (WS)
    participant Session as Session Manager
    participant Agent as Pi Agent (RPC)
    participant LLM as LLM Provider
    participant Tools as Tools

    User->>Channel: Send message
    Channel->>GW: Forward via channel adapter
    GW->>Session: Route to session (main/group)
    GW->>GW: Check auth (pairing/allowlist)
    Session->>Agent: Invoke agent (RPC)
    Agent->>Agent: Build prompt (AGENTS.md, SOUL.md, tools)
    Agent->>LLM: Stream request (with tool definitions)

    loop Tool Use Loop
        LLM-->>Agent: Tool call (block stream)
        Agent->>Tools: Execute tool
        Tools-->>Agent: Tool result
        Agent->>LLM: Continue with result
    end

    LLM-->>Agent: Final response (block stream)
    Agent-->>Session: Return response
    Session->>GW: Add to outbound queue
    GW->>GW: Chunk if needed (per-channel limits)
    GW->>Channel: Send chunked replies
    Channel->>User: Display response

    Note over GW: Typing indicators, presence updates
```

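
The tool-use loop in the diagram can be sketched in TypeScript roughly as follows. This is a minimal illustration under stated assumptions, not OpenClaw's actual implementation: `LlmClient`, `ToolFn`, `runAgentTurn`, and the `maxSteps` cap are hypothetical names, and block streaming is collapsed into a single awaited call per model turn.

```typescript
// Hypothetical types: none of these names come from the OpenClaw codebase.
type ToolCall = { name: string; args: Record<string, unknown> };
type LlmTurn =
  | { kind: "tool_call"; call: ToolCall }
  | { kind: "final"; text: string };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type LlmClient = (history: Message[]) => Promise<LlmTurn>;
type ToolFn = (args: Record<string, unknown>) => Promise<string>;

// Runs one user turn: call the model, execute any requested tool,
// feed the result back, and repeat until the model returns a final answer.
async function runAgentTurn(
  llm: LlmClient,
  tools: Map<string, ToolFn>,
  userText: string,
  maxSteps = 8, // assumed safety cap so a misbehaving model cannot loop forever
): Promise<string> {
  const history: Message[] = [{ role: "user", content: userText }];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await llm(history);
    if (turn.kind === "final") return turn.text;
    const tool = tools.get(turn.call.name);
    // Unknown tools are reported back to the model instead of aborting the turn.
    const result = tool
      ? await tool(turn.call.args)
      : `error: unknown tool "${turn.call.name}"`;
    history.push({ role: "assistant", content: JSON.stringify(turn.call) });
    history.push({ role: "tool", content: result });
  }
  throw new Error("tool loop exceeded maxSteps");
}
```
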
---

## Key Components

### 1. Gateway (`src/gateway/`)
The central control plane — everything connects through it:
- **WebSocket server** on `ws://127.0.0.1:18789`
- Session management (main, group, per-channel)
- Multi-agent routing (different agents for different channels)
- Presence tracking and typing indicators
- Config management and hot-reload
- Health checks, doctor diagnostics

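
One way to picture session management and multi-agent routing is a pair of small pure functions. The sketch below is an assumption for illustration only: the `Inbound` shape, the session key format, and the `routes` table are hypothetical and may not match how OpenClaw actually keys sessions or selects agents.

```typescript
// Hypothetical shapes for illustration; OpenClaw's real session keys may differ.
type Inbound = {
  channel: string; // e.g. "telegram"
  senderId: string;
  groupId?: string; // set only for group chats
};

// In this sketch, direct messages share one "main" session, while each
// group chat gets its own session namespaced by channel.
function sessionKeyFor(msg: Inbound): string {
  return msg.groupId ? `group:${msg.channel}:${msg.groupId}` : "main";
}

// Multi-agent routing: an assumed per-channel override table with a default agent.
function agentFor(
  msg: Inbound,
  routes: Map<string, string>,
  fallback = "default",
): string {
  return routes.get(msg.channel) ?? fallback;
}
```

With a route table such as `new Map([["discord", "coder"]])`, Discord traffic would go to the hypothetical `coder` agent and every other channel would fall back to `default`.
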
### 2. Pi Agent (`src/agents/`)
Custom RPC-based agent runtime:
- Tool streaming and block streaming
- Multi-model support with failover
- Session pruning for long conversations
- Usage tracking (tokens, cost)
- Thinking level control (off → xhigh)

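
Multi-model failover can be illustrated with a short loop over providers in priority order. This is a minimal sketch, assuming a hypothetical `Provider` interface with a single `complete` method; it does not reflect Pi Agent's real provider API or its retry policy.

```typescript
// Hypothetical provider interface; not taken from OpenClaw's source.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Try providers in priority order and return the first successful reply.
// Errors are collected so the final failure explains what was attempted.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await p.complete(prompt) };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${reason}`);
    }
  }
  throw new Error(`all providers failed (${errors.join("; ")})`);
}
```

Returning the provider name alongside the text makes it possible to attribute token usage and cost to whichever model actually answered.
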
### 3. Channel System (`src/channels/` + per-channel dirs)
15+ channel adapters, each with:
- Auth handling (pairing codes, allowlists, OAuth)
- Message format conversion
- Media pipeline (images, audio, video)
- Group routing with mention gating
- Per-channel chunking (character limits differ)

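
Per-channel chunking amounts to splitting a long reply so that no piece exceeds the channel's character limit. The function below is a sketch of one reasonable approach, not OpenClaw's actual chunker: `chunkText` is a hypothetical name, and the policy of preferring whitespace breaks is an assumption.

```typescript
// Split text into chunks of at most `limit` characters, preferring to break
// at the last newline or space that still fits, so words stay intact.
function chunkText(text: string, limit: number): string[] {
  if (limit <= 0) throw new Error("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Look one character past the limit so a break exactly at the boundary counts.
    const head = rest.slice(0, limit + 1);
    const breakAt = Math.max(head.lastIndexOf("\n"), head.lastIndexOf(" "));
    // Fall back to a hard cut when there is no usable whitespace.
    const cut = breakAt > 0 ? breakAt : limit;
    chunks.push(rest.slice(0, cut));
    // Drop the whitespace at the break so the next chunk starts cleanly.
    rest = rest.slice(cut).replace(/^\s+/, "");
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

A channel adapter would call this with its own limit, for example `chunkText(reply, 2000)`; the right number for each channel is an assumption to check against that platform's documentation.
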
### 4. Security System (`src/security/`)
Multi-layered security:
- **DM Pairing** — unknown senders get a pairing code, must be approved
- **Allowlists** — per-channel user whitelists
- **Docker Sandbox** — non-main sessions can run in per-session Docker containers
- **Tool denylist** — block dangerous tools in sandbox mode
- **Elevated bash** — per-session toggle for host-level access

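
The DM pairing and allowlist layers can be sketched as a small in-memory gate. Everything here is hypothetical: `createPairingGate`, the `GateResult` shape, and the injected `makeCode` generator are illustrative names, and OpenClaw's real storage, code format, and expiry rules may differ.

```typescript
// Hypothetical in-memory pairing gate for illustration only.
type GateResult =
  | { allowed: true }
  | { allowed: false; pairingCode: string };

function createPairingGate(makeCode: () => string) {
  const allowlist = new Set<string>();
  const codeBySender = new Map<string, string>();
  const senderByCode = new Map<string, string>();

  return {
    // Known senders pass; unknown senders get a pairing code instead of agent access.
    // A sender who asks again receives the same pending code.
    check(senderId: string): GateResult {
      if (allowlist.has(senderId)) return { allowed: true };
      let code = codeBySender.get(senderId);
      if (code === undefined) {
        code = makeCode();
        codeBySender.set(senderId, code);
        senderByCode.set(code, senderId);
      }
      return { allowed: false, pairingCode: code };
    },
    // Owner-side approval (compare `openclaw pairing approve`) admits the sender.
    approve(code: string): boolean {
      const senderId = senderByCode.get(code);
      if (senderId === undefined) return false;
      senderByCode.delete(code);
      codeBySender.delete(senderId);
      allowlist.add(senderId);
      return true;
    },
  };
}
```

Injecting `makeCode` keeps the sketch deterministic for testing; a real deployment would want codes from a cryptographically secure random source.
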
### 5. Browser Control (`src/browser/`)
- Dedicated OpenClaw-managed Chrome/Chromium instance
- CDP (Chrome DevTools Protocol) control
- Snapshots, actions, uploads, profiles
- Full web automation capabilities

### 6. Canvas & A2UI (`src/canvas-host/`)
- Agent-driven visual workspace
- A2UI (Agent-to-UI) — push HTML/JS to canvas
- Canvas eval, snapshot, reset
- Available on macOS, iOS, Android

### 7. Voice System
- **Voice Wake** — always-on speech detection
- **Talk Mode** — continuous conversation overlay
- ElevenLabs TTS integration
- Available on macOS, iOS, Android

### 8. Companion Apps
- **macOS app**: Menu bar, Voice Wake/PTT, WebChat, debug tools
- **iOS node**: Canvas, Voice Wake, Talk Mode, camera, Bonjour pairing
- **Android node**: Canvas, Talk Mode, camera, screen recording, SMS

### 9. Session Tools (Agent-to-Agent)
- `sessions_list` — discover active sessions
- `sessions_history` — fetch transcript logs
- `sessions_send` — message another session with reply-back

### 10. Skills Platform (`src/plugins/`, `skills/`)
- **Bundled skills** — pre-installed capabilities
- **Managed skills** — installed from ClawHub registry
- **Workspace skills** — user-created in workspace
- Install gating and UI
- ClawHub registry for community skills

### 11. Automation
- **Cron jobs** — scheduled recurring tasks
- **Webhooks** — external trigger surface
- **Gmail Pub/Sub** — email-triggered actions

### 12. Ops & Deployment
- Docker support with compose
- Tailscale Serve/Funnel for remote access
- SSH tunnels with token/password auth
- `openclaw doctor` for diagnostics
- Nix mode for declarative config

---

## Project Structure (Simplified)

```
openclaw/
├── src/
│   ├── agents/          # Pi agent runtime
│   ├── gateway/         # WebSocket gateway
│   ├── channels/        # Channel adapter base
│   ├── whatsapp/        # WhatsApp adapter
│   ├── telegram/        # Telegram adapter
│   ├── slack/           # Slack adapter
│   ├── discord/         # Discord adapter
│   ├── signal/          # Signal adapter
│   ├── imessage/        # iMessage adapters
│   ├── browser/         # Browser control (CDP)
│   ├── canvas-host/     # Canvas & A2UI
│   ├── sessions/        # Session management
│   ├── routing/         # Message routing
│   ├── security/        # Auth, pairing, sandbox
│   ├── cron/            # Scheduled jobs
│   ├── memory/          # Memory system
│   ├── providers/       # LLM providers
│   ├── plugins/         # Plugin/skill system
│   ├── media/           # Media pipeline
│   ├── tts/             # Text-to-speech
│   ├── web/             # Control UI + WebChat
│   ├── wizard/          # Onboarding wizard
│   └── cli/             # CLI commands
├── apps/                # Companion app sources
├── packages/            # Shared packages
├── extensions/          # Extension channels
├── skills/              # Bundled skills
├── ui/                  # Web UI source
└── Swabble/             # macOS/iOS Swift source
```

---

## CLI Commands

| Command | Description |
|---------|-------------|
| `openclaw onboard` | Guided setup wizard |
| `openclaw gateway` | Start the gateway |
| `openclaw agent --message "..."` | Chat with agent |
| `openclaw message send` | Send to any channel |
| `openclaw doctor` | Diagnostics & migration |
| `openclaw pairing approve` | Approve DM pairing |
| `openclaw update` | Update to latest version |
| `openclaw channels login` | Link WhatsApp |

---

## Chat Commands (In-Channel)

| Command | Description |
|---------|-------------|
| `/status` | Session status (model, tokens, cost) |
| `/new` / `/reset` | Reset session |
| `/compact` | Compact session context |
| `/think <level>` | Set thinking level |
| `/verbose on\|off` | Toggle verbose mode |
| `/usage off\|tokens\|full` | Usage footer |
| `/restart` | Restart gateway |
| `/activation mention\|always` | Group activation mode |

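
Parsing these in-channel commands can be sketched as a small validator driven by the table above. This is an illustrative assumption rather than OpenClaw's parser: `parseChatCommand` and the `Parsed` type are hypothetical, and `/think` accepts any single word here because the full set of thinking levels is not listed in this document.

```typescript
// Allowed arguments per command, copied from the table above.
const ARG_CHOICES = new Map<string, readonly string[]>([
  ["verbose", ["on", "off"]],
  ["usage", ["off", "tokens", "full"]],
  ["activation", ["mention", "always"]],
]);
const NO_ARG_COMMANDS = ["status", "new", "reset", "compact", "restart"];
// `/think <level>`: levels are not enumerated here, so any single word is accepted.
const FREE_ARG_COMMANDS = ["think"];

type Parsed =
  | { ok: true; name: string; arg?: string }
  | { ok: false; error: string };

function parseChatCommand(text: string): Parsed {
  const match = /^\/([a-z]+)(?:\s+(\S+))?\s*$/.exec(text.trim());
  if (!match) return { ok: false, error: "not a command" };
  const name = match[1];
  const arg: string | undefined = match[2];
  // Extra arguments on no-argument commands are ignored in this sketch.
  if (NO_ARG_COMMANDS.indexOf(name) !== -1) return { ok: true, name };
  if (FREE_ARG_COMMANDS.indexOf(name) !== -1) {
    return arg === undefined
      ? { ok: false, error: `usage: /${name} <level>` }
      : { ok: true, name, arg };
  }
  const choices = ARG_CHOICES.get(name);
  if (!choices) return { ok: false, error: `unknown command /${name}` };
  if (arg === undefined || choices.indexOf(arg) === -1) {
    return { ok: false, error: `usage: /${name} ${choices.join("|")}` };
  }
  return { ok: true, name, arg };
}
```
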
---

## Key Design Decisions

1. **Gateway as control plane** — Single WebSocket server everything connects to
2. **Multi-agent routing** — Different agents for different channels/groups
3. **Pairing-based security** — Unknown DMs get pairing codes by default
4. **Docker sandboxing** — Non-main sessions can be isolated
5. **Block streaming** — Responses streamed as structured blocks
6. **Extension-based channels** — MS Teams, Matrix, Zalo are extensions
7. **Companion apps** — Native macOS/iOS/Android for device-level features
8. **ClawHub** — Community skill registry