🦞 OpenClaw — Architecture & How It Works
Full-Featured Personal AI Assistant — Massive TypeScript codebase with 15+ channels, companion apps, and enterprise-grade features.
Overview
OpenClaw is the most feature-complete personal AI assistant in this space. It is a TypeScript monorepo whose WebSocket-based Gateway acts as the control plane, and it supports 15+ messaging channels, companion macOS/iOS/Android apps, browser control, live canvas, voice wake, and extensive automation.
| Attribute | Value |
|---|---|
| Language | TypeScript (Node.js ≥22) |
| Codebase Size | 430k+ lines, 50+ source modules |
| Config | ~/.openclaw/openclaw.json (JSON5) |
| AI Runtime | Pi Agent (custom RPC), multi-model |
| Channels | 15+ (WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Teams, Matrix, Zalo, WebChat, etc.) |
| Package Mgr | pnpm (monorepo) |
Architecture Flowchart
graph TB
subgraph Channels["📱 Messaging Channels (15+)"]
WA["WhatsApp\n(Baileys)"]
TG["Telegram\n(grammY)"]
SL["Slack\n(Bolt)"]
DC["Discord\n(discord.js)"]
GC["Google Chat"]
SIG["Signal\n(signal-cli)"]
BB["BlueBubbles\n(iMessage)"]
IM["iMessage\n(legacy)"]
MST["MS Teams"]
MTX["Matrix"]
ZL["Zalo"]
WC["WebChat"]
end
subgraph Gateway["🌐 Gateway (Control Plane)"]
WS["WebSocket Server\nws://127.0.0.1:18789"]
SES["Session Manager"]
RTE["Channel Router"]
PRES["Presence System"]
Q["Message Queue"]
CFG["Config Manager"]
AUTH["Auth / Pairing"]
end
subgraph Agent["🧠 Pi Agent (RPC)"]
AGENT["Agent Runtime"]
TOOLS["Tool Registry"]
STREAM["Block Streaming"]
PROV["Provider Router\n(multi-model)"]
end
subgraph Apps["📲 Companion Apps"]
MAC["macOS Menu Bar"]
IOS["iOS Node"]
ANDR["Android Node"]
end
subgraph ToolSet["🔧 Tools & Automation"]
BROWSER["Browser Control\n(CDP/Chromium)"]
CANVAS["Live Canvas\n(A2UI)"]
CRON["Cron Jobs"]
WEBHOOK["Webhooks"]
GMAIL["Gmail Pub/Sub"]
NODES["Nodes\n(camera, screen, location)"]
SKILLS_T["Skills Platform"]
SESS_T["Session Tools\n(agent-to-agent)"]
end
subgraph Workspace["💾 Workspace"]
AGENTS_MD["AGENTS.md"]
SOUL_MD["SOUL.md"]
USER_MD["USER.md"]
TOOLS_MD["TOOLS.md"]
SKILLS_W["Skills/"]
end
Channels --> Gateway
Apps --> Gateway
Gateway --> Agent
Agent --> ToolSet
Agent --> Workspace
Agent --> PROV
Message Flow
sequenceDiagram
participant User
participant Channel as Channel (WA/TG/Slack/etc.)
participant GW as Gateway (WS)
participant Session as Session Manager
participant Agent as Pi Agent (RPC)
participant LLM as LLM Provider
participant Tools as Tools
User->>Channel: Send message
Channel->>GW: Forward via channel adapter
GW->>Session: Route to session (main/group)
GW->>GW: Check auth (pairing/allowlist)
Session->>Agent: Invoke agent (RPC)
Agent->>Agent: Build prompt (AGENTS.md, SOUL.md, tools)
Agent->>LLM: Stream request (with tool definitions)
loop Tool Use Loop
LLM-->>Agent: Tool call (block stream)
Agent->>Tools: Execute tool
Tools-->>Agent: Tool result
Agent->>LLM: Continue with result
end
LLM-->>Agent: Final response (block stream)
Agent-->>Session: Return response
Session->>GW: Add to outbound queue
GW->>GW: Chunk if needed (per-channel limits)
GW->>Channel: Send chunked replies
Channel->>User: Display response
Note over GW: Typing indicators, presence updates
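The tool use loop in the diagram above can be sketched roughly as follows. This is a minimal illustration, not OpenClaw's actual agent code: every type and function name here (`Model`, `Tool`, `runToolLoop`, the `maxSteps` cap) is hypothetical, and the model is reduced to a function that returns either a tool call or a final answer.

```typescript
// Hypothetical types for illustration; none of these names come from OpenClaw.
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn =
  | { kind: "tool_call"; call: ToolCall }
  | { kind: "final"; text: string };
type Message = { role: "user" | "assistant" | "tool"; content: string };

type Model = (history: Message[]) => Promise<ModelTurn>;
type Tool = (args: Record<string, unknown>) => Promise<string>;

// Ask the model, execute any tool it requests, feed the result back,
// and repeat until the model produces a final response.
async function runToolLoop(
  model: Model,
  tools: Map<string, Tool>,
  userText: string,
  maxSteps = 8, // assumed safety cap so a misbehaving model cannot loop forever
): Promise<string> {
  const history: Message[] = [{ role: "user", content: userText }];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await model(history);
    if (turn.kind === "final") return turn.text;

    const tool = tools.get(turn.call.name);
    // Unknown tools are reported back to the model instead of crashing the loop.
    const result = tool
      ? await tool(turn.call.args)
      : `error: unknown tool ${turn.call.name}`;

    history.push({ role: "assistant", content: JSON.stringify(turn.call) });
    history.push({ role: "tool", content: result });
  }
  throw new Error(`tool loop exceeded ${maxSteps} steps`);
}
```

Keeping the model and the tools behind plain function types makes the loop easy to exercise with fakes, independent of any particular provider.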
Key Components
1. Gateway (src/gateway/)
The central control plane — everything connects through it:
- WebSocket server on `ws://127.0.0.1:18789`
- Session management (main, group, per-channel)
- Multi-agent routing (different agents for different channels)
- Presence tracking and typing indicators
- Config management and hot-reload
- Health checks, doctor diagnostics
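As a rough illustration of multi-agent routing and session selection, the sketch below maps an inbound message to an agent and a session key. The route table shape, the `resolveAgent` and `sessionKey` names, and the key format are all assumptions made for this example; the document does not show the real configuration schema. It also assumes that direct chats share one main session per agent while each group gets its own.

```typescript
// Hypothetical shapes; the real OpenClaw config schema is not shown in this document.
type ChatKind = "main" | "group";

interface Inbound {
  channel: string; // e.g. "slack", "telegram"
  kind: ChatKind;
  peerId: string; // sender or group identifier on that channel
}

interface Route {
  channel: string;
  kind?: ChatKind; // omitted means "any kind on this channel"
  agentId: string;
}

// First matching route wins; otherwise fall back to a default agent.
function resolveAgent(msg: Inbound, routes: Route[], fallback = "default"): string {
  const hit = routes.find(
    (r) => r.channel === msg.channel && (r.kind === undefined || r.kind === msg.kind),
  );
  return hit ? hit.agentId : fallback;
}

// Assumption: direct chats share one "main" session per agent,
// while each group conversation gets an isolated session.
function sessionKey(msg: Inbound, agentId: string): string {
  return msg.kind === "main"
    ? `${agentId}:main`
    : `${agentId}:${msg.channel}:group:${msg.peerId}`;
}
```

For example, with a route `{ channel: "slack", kind: "group", agentId: "work" }`, a Slack group message resolves to the `work` agent, while a Slack direct message falls through to the default agent.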
2. Pi Agent (src/agents/)
Custom RPC-based agent runtime:
- Tool streaming and block streaming
- Multi-model support with failover
- Session pruning for long conversations
- Usage tracking (tokens, cost)
- Thinking level control (off → xhigh)
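Multi-model failover can be pictured as trying providers in priority order until one succeeds. The sketch below is only an illustration under that assumption: the `Provider` type and `completeWithFailover` function are hypothetical names, and OpenClaw's real failover policy (retries, cooldowns, per-error handling) is not described in this document.

```typescript
// Hypothetical provider interface for illustration only.
type Provider = {
  name: string;
  complete: (prompt: string) => Promise<string>;
};

// Try each provider in order; return the first successful completion.
// If every provider fails, surface all collected errors in one message.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ provider: string; text: string }> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return { provider: p.name, text: await p.complete(prompt) };
    } catch (err) {
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

Returning the provider name alongside the text lets the caller attribute token usage and cost to the model that actually answered.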
3. Channel System (src/channels/ + per-channel dirs)
15+ channel adapters, each with:
- Auth handling (pairing codes, allowlists, OAuth)
- Message format conversion
- Media pipeline (images, audio, video)
- Group routing with mention gating
- Per-channel chunking (character limits differ)
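Per-channel chunking amounts to splitting a long reply into pieces that fit a channel's character limit. The function below is a minimal sketch of one way to do that, preferring to break at whitespace; `chunkText` is a hypothetical name, the limit is passed in by the caller, and the actual limits and splitting rules OpenClaw applies per channel are not given in this document.

```typescript
// Split text into chunks of at most `limit` characters, breaking at the
// last newline or space that fits, and hard-cutting only when a single
// run of non-whitespace is longer than the limit.
function chunkText(text: string, limit: number): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Look one character past the limit so a break exactly at the boundary is found.
    const window = rest.slice(0, limit + 1);
    const breakAt = Math.max(window.lastIndexOf("\n"), window.lastIndexOf(" "));
    const cut = breakAt > 0 ? breakAt : limit;
    const piece = rest.slice(0, cut).trimEnd();
    if (piece.length > 0) chunks.push(piece);
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

A production version would likely also need to avoid splitting inside code fences or other markup, which this sketch ignores.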
4. Security System (src/security/)
Multi-layered security:
- DM Pairing — unknown senders get a pairing code, must be approved
- Allowlists — per-channel user whitelists
- Docker Sandbox — non-main sessions can run in per-session Docker containers
- Tool denylist — block dangerous tools in sandbox mode
- Elevated bash — per-session toggle for host-level access
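The DM pairing and allowlist layers can be sketched as a small gate that every inbound sender passes through. Everything below is hypothetical: the `PairingGate` class, its method names, and the injected code generator are illustration only, and a real implementation would persist state, expire pending codes, and generate codes with a cryptographically secure source.

```typescript
type Decision = { action: "allow" } | { action: "pair"; code: string };

// Hypothetical gate: allowlisted senders pass, unknown senders receive a
// pairing code that an operator must approve before messages are processed.
class PairingGate {
  private allowed = new Set<string>();
  private codeBySender = new Map<string, string>();
  private senderByCode = new Map<string, string>();
  private makeCode: () => string;

  constructor(allowlist: string[], makeCode: () => string) {
    for (const id of allowlist) this.allowed.add(id);
    this.makeCode = makeCode; // injected so tests can use a fixed code
  }

  check(senderId: string): Decision {
    if (this.allowed.has(senderId)) return { action: "allow" };
    // Reuse a pending code so repeated messages do not mint new ones.
    const pending = this.codeBySender.get(senderId);
    if (pending !== undefined) return { action: "pair", code: pending };
    const code = this.makeCode();
    this.codeBySender.set(senderId, code);
    this.senderByCode.set(code, senderId);
    return { action: "pair", code };
  }

  // Approving a code moves its sender onto the allowlist.
  approve(code: string): boolean {
    const senderId = this.senderByCode.get(code);
    if (senderId === undefined) return false;
    this.senderByCode.delete(code);
    this.codeBySender.delete(senderId);
    this.allowed.add(senderId);
    return true;
  }
}
```

In this model the approval step is the only path from unknown to allowed, which is the property the pairing flow is meant to provide.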
5. Browser Control (src/browser/)
- Dedicated OpenClaw-managed Chrome/Chromium instance
- CDP (Chrome DevTools Protocol) control
- Snapshots, actions, uploads, profiles
- Full web automation capabilities
6. Canvas & A2UI (src/canvas-host/)
- Agent-driven visual workspace
- A2UI (Agent-to-UI) — push HTML/JS to canvas
- Canvas eval, snapshot, reset
- Available on macOS, iOS, Android
7. Voice System
- Voice Wake — always-on speech detection
- Talk Mode — continuous conversation overlay
- ElevenLabs TTS integration
- Available on macOS, iOS, Android
8. Companion Apps
- macOS app: Menu bar, Voice Wake/PTT, WebChat, debug tools
- iOS node: Canvas, Voice Wake, Talk Mode, camera, Bonjour pairing
- Android node: Canvas, Talk Mode, camera, screen recording, SMS
9. Session Tools (Agent-to-Agent)
- `sessions_list` — discover active sessions
- `sessions_history` — fetch transcript logs
- `sessions_send` — message another session with reply-back
10. Skills Platform (src/plugins/, skills/)
- Bundled skills — pre-installed capabilities
- Managed skills — installed from ClawHub registry
- Workspace skills — user-created in workspace
- Install gating and UI
- ClawHub registry for community skills
11. Automation
- Cron jobs — scheduled recurring tasks
- Webhooks — external trigger surface
- Gmail Pub/Sub — email-triggered actions
12. Ops & Deployment
- Docker support with compose
- Tailscale Serve/Funnel for remote access
- SSH tunnels with token/password auth
- `openclaw doctor` for diagnostics
- Nix mode for declarative config
Project Structure (Simplified)
openclaw/
├── src/
│ ├── agents/ # Pi agent runtime
│ ├── gateway/ # WebSocket gateway
│ ├── channels/ # Channel adapter base
│ ├── whatsapp/ # WhatsApp adapter
│ ├── telegram/ # Telegram adapter
│ ├── slack/ # Slack adapter
│ ├── discord/ # Discord adapter
│ ├── signal/ # Signal adapter
│ ├── imessage/ # iMessage adapters
│ ├── browser/ # Browser control (CDP)
│ ├── canvas-host/ # Canvas & A2UI
│ ├── sessions/ # Session management
│ ├── routing/ # Message routing
│ ├── security/ # Auth, pairing, sandbox
│ ├── cron/ # Scheduled jobs
│ ├── memory/ # Memory system
│ ├── providers/ # LLM providers
│ ├── plugins/ # Plugin/skill system
│ ├── media/ # Media pipeline
│ ├── tts/ # Text-to-speech
│ ├── web/ # Control UI + WebChat
│ ├── wizard/ # Onboarding wizard
│ └── cli/ # CLI commands
├── apps/ # Companion app sources
├── packages/ # Shared packages
├── extensions/ # Extension channels
├── skills/ # Bundled skills
├── ui/ # Web UI source
└── Swabble/ # macOS/iOS Swift source
CLI Commands
| Command | Description |
|---|---|
| `openclaw onboard` | Guided setup wizard |
| `openclaw gateway` | Start the gateway |
| `openclaw agent --message "..."` | Chat with agent |
| `openclaw message send` | Send to any channel |
| `openclaw doctor` | Diagnostics & migration |
| `openclaw pairing approve` | Approve DM pairing |
| `openclaw update` | Update to latest version |
| `openclaw channels login` | Link WhatsApp |
Chat Commands (In-Channel)
| Command | Description |
|---|---|
| `/status` | Session status (model, tokens, cost) |
| `/new` / `/reset` | Reset session |
| `/compact` | Compact session context |
| `/think <level>` | Set thinking level |
| `/verbose on\|off` | Toggle verbose mode |
| `/usage off\|tokens\|full` | Usage footer |
| `/restart` | Restart gateway |
| `/activation mention\|always` | Group activation mode |
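To make the command grammar in the table concrete, here is a minimal parser sketch for these in-channel commands. The `ChatCommand` type and `parseChatCommand` function are hypothetical names, and the sketch assumes that unrecognized or malformed commands are simply ignored; how OpenClaw actually parses and validates them is not described here. Thinking levels are passed through as plain strings because the document lists only the range (off → xhigh), not every value.

```typescript
// Hypothetical parsed-command shapes for the commands listed above.
type ChatCommand =
  | { name: "status" | "new" | "reset" | "compact" | "restart" }
  | { name: "think"; level: string }
  | { name: "verbose"; on: boolean }
  | { name: "usage"; mode: "off" | "tokens" | "full" }
  | { name: "activation"; mode: "mention" | "always" };

// Returns a parsed command, or null for ordinary text and invalid arguments.
function parseChatCommand(text: string): ChatCommand | null {
  const [head, arg] = text.trim().split(/\s+/, 2);
  if (!head || !head.startsWith("/")) return null;
  switch (head.slice(1).toLowerCase()) {
    case "status": return { name: "status" };
    case "new": return { name: "new" };
    case "reset": return { name: "reset" };
    case "compact": return { name: "compact" };
    case "restart": return { name: "restart" };
    case "think":
      return arg ? { name: "think", level: arg } : null;
    case "verbose":
      return arg === "on" || arg === "off" ? { name: "verbose", on: arg === "on" } : null;
    case "usage":
      return arg === "off" || arg === "tokens" || arg === "full"
        ? { name: "usage", mode: arg }
        : null;
    case "activation":
      return arg === "mention" || arg === "always"
        ? { name: "activation", mode: arg }
        : null;
    default:
      return null;
  }
}
```

Modelling the result as a discriminated union means downstream handlers can switch on `name` and get the right argument type for each command.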
Key Design Decisions
- Gateway as control plane — Single WebSocket server everything connects to
- Multi-agent routing — Different agents for different channels/groups
- Pairing-based security — Unknown DMs get pairing codes by default
- Docker sandboxing — Non-main sessions can be isolated
- Block streaming — Responses streamed as structured blocks
- Extension-based channels — MS Teams, Matrix, Zalo are extensions
- Companion apps — Native macOS/iOS/Android for device-level features
- ClawHub — Community skill registry