feat: openclaw-style secrets (env.vars + ${VAR}) and per-task model routing

- Replace python-dotenv with config.json env.vars block + ${VAR} substitution
- Add models section for per-task model routing (heartbeat, subagent, default)
- Heartbeat/subagent tasks can use different models/providers than main chat
- Remove python-dotenv from dependencies
- Update all docs to reflect new config approach
- Reorganize docs into project/ and research/ subdirectories
2026-02-20 23:49:05 -05:00
parent 55c6767e69
commit 82c2640481
35 changed files with 2904 additions and 422 deletions
--- a/docs/research/nanoclaw-comparison.md
+++ b/docs/research/nanoclaw-comparison.md
@@ -0,0 +1,315 @@
+# Aetheel vs NanoClaw — Feature Gap Analysis
+
+Deep comparison of Aetheel (Python, multi-channel AI assistant) and NanoClaw (TypeScript, container-isolated personal AI assistant). Focus: what NanoClaw has that Aetheel is missing.
+
+---
+
+## Architecture Differences
+
+| Aspect | Aetheel | NanoClaw |
+|--------|---------|----------|
+| Language | Python | TypeScript |
+| Agent execution | In-process (shared memory) | Container-isolated (Apple Container / Docker) |
+| Identity model | Shared across all channels (SOUL.md, USER.md, MEMORY.md) | Per-group (each group has its own CLAUDE.md) |
+| Security model | Application-level checks | OS-level container isolation |
+| Config approach | Config-driven (`config.json` with `env.vars` + `${VAR}`) | Code-first (Claude modifies your fork) |
+| Philosophy | Feature-rich framework | Minimal, understandable in 8 minutes |
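
The `env.vars` + `${VAR}` approach in the table can be sketched as a recursive substitution pass over the loaded config. This is a minimal illustration only: the `env.vars` nesting, the `resolve_vars` and `load_config` helpers, and the fallback to the process environment are assumptions, not Aetheel's actual loader.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment-variable identifier.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_vars(node, env_vars):
    """Recursively replace ${NAME} in strings, checking env_vars then os.environ."""
    if isinstance(node, str):
        def lookup(match):
            name = match.group(1)
            if name in env_vars:
                return env_vars[name]
            if name in os.environ:
                return os.environ[name]
            raise KeyError(f"undefined config variable: {name}")
        return _VAR.sub(lookup, node)
    if isinstance(node, dict):
        return {key: resolve_vars(value, env_vars) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_vars(value, env_vars) for value in node]
    return node  # numbers, booleans, None pass through unchanged


def load_config(raw):
    """Resolve every section except the (assumed) env block itself."""
    env_vars = raw.get("env", {}).get("vars", {})
    return {k: resolve_vars(v, env_vars) for k, v in raw.items() if k != "env"}
```

Failing loudly on an undefined variable is a deliberate choice in this sketch: a silently empty token tends to surface later as a confusing authentication error.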
+
+---
+
+## Features Aetheel Is Missing
+
+### 1. Container Isolation (Critical)
+
+NanoClaw runs every agent invocation inside a Linux container (Apple Container on macOS, Docker on Linux). Each container:
+- Gets only explicitly mounted directories
+- Runs as non-root (uid 1000)
+- Is ephemeral (`--rm` flag, fresh per invocation)
+- Cannot access other groups' files or sessions
+- Cannot access host filesystem beyond mounts
+
+Aetheel runs everything in-process with no sandboxing. The security audit already flagged path traversal, arbitrary code execution via hooks, and unvalidated action tags as critical issues.
+
+**What to build:**
+- Docker-based agent execution (spawn a container per AI request)
+- Mount only the relevant group's workspace directory
+- Pass secrets via stdin, not mounted files
+- Add a `/convert-to-docker` skill or built-in Docker mode
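
A rough sketch of what per-request container execution might look like from Python. The image name, the `/workspace` mount point, and the JSON-over-stdin payload shape are hypothetical; only the `docker run` flags (`--rm`, `-i`, `--user`, `--volume`) are standard Docker CLI options.

```python
import json
import subprocess
from pathlib import Path


def build_docker_command(image, workspace, uid=1000):
    """Build an argv list for one ephemeral, non-root agent container."""
    return [
        "docker", "run",
        "--rm",                    # ephemeral: container is removed on exit
        "-i",                      # keep stdin open so secrets can be piped in
        "--user", f"{uid}:{uid}",  # run as non-root
        "--volume", f"{Path(workspace).resolve()}:/workspace",  # only this group's dir
        image,
    ]


def run_agent(image, workspace, prompt, secrets):
    """Run one agent invocation, passing secrets via stdin rather than mounted files."""
    payload = json.dumps({"prompt": prompt, "secrets": secrets})
    return subprocess.run(
        build_docker_command(image, workspace),
        input=payload,
        text=True,
        capture_output=True,
        check=False,
    )
```

Building the command as a list, rather than a shell string, avoids quoting problems if a workspace path ever contains spaces or shell metacharacters.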
+
+---
+
+### 2. Per-Group Isolation
+
+NanoClaw gives each chat group its own:
+- Filesystem folder (`groups/{name}/`)
+- Memory file (`CLAUDE.md` per group)
+- Session history (isolated `.claude/` directory)
+- IPC namespace (prevents cross-group privilege escalation)
+- Container mounts (only own folder + read-only global)
+
+Aetheel shares SOUL.md, USER.md, and MEMORY.md across all channels and conversations. A Slack channel, Discord server, and Telegram group all see the same memory and identity.
+
+**What to build:**
+- Per-channel or per-group workspace directories
+- Isolated session storage per group
+- A `global/` shared memory that all groups can read but only the main channel can write
+- Group registration system (like NanoClaw's `registerGroup()`)
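
One possible shape for per-group workspaces with a shared `global/` memory, sketched below. The directory layout, the `group_workspace` and `memory_paths` helpers, and the slug rule are illustrative assumptions rather than an existing Aetheel API.

```python
import re
from pathlib import Path

# Anything outside this set is collapsed to "_" so IDs cannot escape the root.
_UNSAFE = re.compile(r"[^A-Za-z0-9_-]+")


def group_workspace(root, channel, group_id):
    """Create (if needed) and return an isolated directory for one group."""
    slug = _UNSAFE.sub("_", f"{channel}-{group_id}").strip("_")
    if not slug:
        raise ValueError("channel and group_id produced an empty workspace name")
    path = Path(root) / "groups" / slug
    (path / "sessions").mkdir(parents=True, exist_ok=True)
    return path


def memory_paths(root, channel, group_id, is_main=False):
    """Describe which memory files a group may use and whether global is writable."""
    return {
        "group_memory": group_workspace(root, channel, group_id) / "MEMORY.md",
        "global_memory": Path(root) / "global" / "MEMORY.md",
        "global_writable": is_main,  # only the main channel may write shared memory
    }
```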
+
+---
+
+### 3. Working Agent Teams / Swarms
+
+NanoClaw has working agent teams today via Claude Code's experimental `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` setting:
+- Lead agent creates teammates using Claude's native `TeamCreate` / `SendMessage` tools
+- Each teammate runs in its own container
+- On Telegram, each agent gets a dedicated bot identity (pool of pre-created bots renamed dynamically via `setMyName`)
+- The lead agent coordinates but doesn't relay every message — users see teammate messages directly
+- `<internal>` tags let agents communicate without spamming the user
+
+Aetheel has the tools in the allowed list (`TeamCreate`, `TeamDelete`, `SendMessage`) but no actual orchestration, no per-agent identity, and no way for teammates to appear as separate entities in chat.
+
+**What to build:**
+- Enable `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` when using Claude runtime
+- Bot pool for Telegram/Discord (multiple bot tokens, one per agent role)
+- IPC routing that respects `sender` field to route messages through the right bot
+- Per-agent CLAUDE.md / SOUL.md files
+- `<internal>` tag stripping in outbound messages
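
The `<internal>` tag stripping item is small enough to sketch directly. The helper name `strip_internal` is hypothetical, and the sketch assumes tags are not nested.

```python
import re

# Non-greedy match so several <internal> blocks in one message are removed separately.
_INTERNAL = re.compile(r"<internal>.*?</internal>", re.DOTALL)


def strip_internal(text):
    """Remove agent-to-agent notes before a message is sent to the user."""
    cleaned = _INTERNAL.sub("", text)
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned)  # collapse gaps left behind
    return cleaned.strip()
```

A message that consists only of internal notes becomes an empty string, which the caller can treat as "send nothing".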
+
+---
+
+### 4. Mount Security / Allowlist
+
+NanoClaw has a tamper-proof mount allowlist at `~/.config/nanoclaw/mount-allowlist.json` (outside the project root, never mounted into containers):
+- Defines which host directories can be mounted
+- Default blocked patterns: `.ssh`, `.gnupg`, `.aws`, `.env`, `private_key`, etc.
+- Symlink resolution before validation (prevents traversal)
+- `nonMainReadOnly` forces read-only for non-main groups
+- Per-root `allowReadWrite` control
+
+Aetheel has no filesystem access control. The AI can read/write anywhere the process has permissions.
+
+**What to build:**
+- External allowlist config (outside workspace, not modifiable by the AI)
+- Blocked path patterns for sensitive directories
+- Symlink resolution and path validation
+- Read-only enforcement for non-primary channels
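
A minimal sketch of the validation steps listed above: resolve symlinks first, reject blocked path components, require the result to sit under an allowed root, and force read-only for non-primary channels. The function name, return shape, and the shortened blocked list are assumptions for illustration.

```python
import os
from pathlib import Path

# Subset of the blocked patterns mentioned above; a real list would be longer.
BLOCKED_PARTS = frozenset({".ssh", ".gnupg", ".aws", ".env"})


def validate_mount(requested, allowed_roots, is_main=False):
    """Return (resolved_path, read_only) or raise PermissionError."""
    # Resolve symlinks before any check so a link cannot smuggle in a blocked target.
    real = Path(os.path.realpath(requested))
    blocked = BLOCKED_PARTS.intersection(real.parts)
    if blocked:
        raise PermissionError(f"blocked path component: {sorted(blocked)}")
    for root in allowed_roots:
        real_root = Path(os.path.realpath(root))
        if real == real_root or real_root in real.parents:
            return real, not is_main  # non-primary channels get read-only mounts
    raise PermissionError(f"{real} is outside every allowed root")
```

In this sketch the allowlist itself is passed in by the caller; keeping that list in a file outside the workspace is what prevents the AI from editing it.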
+
+---
+
+### 5. IPC-Based Communication
+
+NanoClaw uses file-based IPC for all agent-to-host communication:
+- Agents write JSON files to `data/ipc/{group}/messages/` and `data/ipc/{group}/tasks/`
+- Host polls IPC directories and processes files
+- Per-group IPC namespaces prevent cross-group message injection
+- Authorization checks: non-main groups can only send to their own chat and schedule tasks for themselves
+- Error files moved to `data/ipc/errors/` for debugging
+
+Aetheel uses in-memory action tags parsed from AI response text (`[ACTION:remind|...]`, `[ACTION:cron|...]`). No authorization, no isolation, no audit trail.
+
+**What to build:**
+- File-based or queue-based IPC for agent communication
+- Per-group namespaces with authorization
+- Audit trail for all IPC operations
+- Error handling with failed message preservation
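
The file-based variant could be sketched as follows. The directory layout loosely mirrors the NanoClaw paths described above, but the message schema (a `chat` field), the `main` group name, and all helper names are assumptions.

```python
import json
import os
import time
import uuid
from pathlib import Path


def write_ipc_message(ipc_root, group, payload):
    """Agent side: write one JSON message atomically into the group's namespace."""
    outbox = Path(ipc_root) / group / "messages"
    outbox.mkdir(parents=True, exist_ok=True)
    final = outbox / f"{time.time_ns()}-{uuid.uuid4().hex}.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps(payload))
    os.replace(tmp, final)  # rename, so the host never reads a half-written file
    return final


def is_authorized(source_group, target_chat, main_group="main"):
    """Non-main groups may only address their own chat."""
    return source_group == main_group or source_group == target_chat


def drain(ipc_root, group, main_group="main"):
    """Host side: process pending messages; preserve failures under errors/."""
    root = Path(ipc_root)
    accepted = []
    for path in sorted((root / group / "messages").glob("*.json")):
        try:
            message = json.loads(path.read_text())
            if not is_authorized(group, message["chat"], main_group):
                raise PermissionError(f"{group} may not send to {message['chat']}")
            accepted.append(message)
            path.unlink()
        except (ValueError, KeyError, PermissionError):
            errors = root / "errors"
            errors.mkdir(exist_ok=True)
            os.replace(path, errors / path.name)  # keep the failed message for debugging
    return accepted
```

Because the authorization check keys off the directory a file was found in, not a field inside the file, an agent cannot claim to be another group by editing its payload.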
+
+---
+
+### 6. Group Queue with Concurrency Control
+
+NanoClaw has a `GroupQueue` class that manages container execution:
+- Max concurrent containers limit (`MAX_CONCURRENT_CONTAINERS`, default 5)
+- Per-group queuing (messages and tasks queue while container is active)
+- Follow-up messages sent to active containers via IPC input files
+- Idle timeout with `_close` sentinel to wind down containers
+- Exponential backoff retry (5s base, max 5 retries)
+- Graceful shutdown (detaches containers, doesn't kill them)
+- Task priority over messages in drain order
+
+Aetheel has a simple concurrency limit of 3 subagents but no queuing, no retry logic, no follow-up message support, and no graceful shutdown.
+
+**What to build:**
+- Proper execution queue with configurable concurrency
+- Per-channel message queuing when agent is busy
+- Follow-up message injection into active sessions
+- Exponential backoff retry on failures
+- Graceful shutdown that lets active agents finish
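
A compact asyncio sketch of the queue, covering the concurrency cap, per-channel serialization, and exponential backoff. The class name and parameters are hypothetical; follow-up message injection and graceful shutdown are left out to keep the example short.

```python
import asyncio


def backoff_delay(attempt, base=5.0):
    """Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ..."""
    return base * (2 ** attempt)


class ExecutionQueue:
    def __init__(self, max_concurrent=5, max_retries=5, base_delay=5.0):
        self._slots = asyncio.Semaphore(max_concurrent)  # global concurrency cap
        self._group_locks = {}                           # one active job per group
        self.max_retries = max_retries
        self.base_delay = base_delay

    async def submit(self, group, job):
        """Run `job` (an async callable) once the group and a global slot are free."""
        lock = self._group_locks.setdefault(group, asyncio.Lock())
        async with lock:  # later jobs for the same group wait here in order
            for attempt in range(self.max_retries + 1):
                try:
                    async with self._slots:  # slot is released while backing off
                        return await job()
                except Exception:
                    if attempt == self.max_retries:
                        raise
                    await asyncio.sleep(backoff_delay(attempt, self.base_delay))
```

With the default 5 second base, retries wait 5, 10, 20, 40, and 80 seconds before the final failure is raised to the caller.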
+
+---
+
+### 7. Task Context Modes
+
+NanoClaw scheduled tasks support two context modes:
+- `group` — uses the group's existing session (shared conversation history)
+- `isolated` — fresh session per task run (no prior context)
+
+Aetheel scheduled tasks always run in a fresh context with no option to share the group's conversation history.
+
+**What to build:**
+- `context_mode` field on scheduled jobs (`group` vs `isolated`)
+- Session ID passthrough for `group` mode tasks
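
The two modes reduce to a small decision about which session ID a scheduled run receives. The `ScheduledJob` fields and the `session_for` helper below are illustrative, not Aetheel's current scheduler model.

```python
import uuid
from dataclasses import dataclass


@dataclass
class ScheduledJob:
    job_id: str
    group: str
    prompt: str
    context_mode: str = "isolated"  # "group" or "isolated"


def session_for(job, group_sessions):
    """Pick the session ID a task run should use.

    group_sessions maps group name -> that group's current session ID.
    """
    if job.context_mode == "group":
        # Reuse the group's conversation; create one if the group has none yet.
        if job.group not in group_sessions:
            group_sessions[job.group] = uuid.uuid4().hex
        return group_sessions[job.group]
    if job.context_mode == "isolated":
        return uuid.uuid4().hex  # fresh session, no prior context
    raise ValueError(f"unknown context_mode: {job.context_mode!r}")
```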
+
+---
+
+### 8. Task Run Logging
+
+NanoClaw logs every task execution:
+- `task_run_logs` table with: task_id, run_at, duration_ms, status, result, error
+- `last_result` summary stored on the task itself
+- Tasks auto-complete after `once` schedule runs
+
+Aetheel's scheduler persists jobs but doesn't log execution history or results.
+
+**What to build:**
+- Task run log table (when it ran, how long, success/error, result summary)
+- Queryable task history (`task history <id>`)
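
A run log along these lines could be a single SQLite table. The column names follow the NanoClaw fields listed above; the `run_and_log` and `task_history` helpers and the exact schema are a sketch, assuming Aetheel's scheduler can wrap each job in a callable.

```python
import sqlite3
import time
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS task_run_logs (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id     TEXT NOT NULL,
    run_at      TEXT NOT NULL,
    duration_ms INTEGER NOT NULL,
    status      TEXT NOT NULL CHECK (status IN ('success', 'error')),
    result      TEXT,
    error       TEXT
)
"""


def run_and_log(db, task_id, fn):
    """Execute fn() and record when it ran, how long it took, and the outcome."""
    run_at = datetime.now(timezone.utc).isoformat()
    started = time.monotonic()
    status, result, error = "success", None, None
    try:
        result = str(fn())
    except Exception as exc:  # a failed task is logged, not propagated
        status, error = "error", repr(exc)
    duration_ms = int((time.monotonic() - started) * 1000)
    db.execute(
        "INSERT INTO task_run_logs (task_id, run_at, duration_ms, status, result, error)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (task_id, run_at, duration_ms, status, result, error),
    )
    db.commit()
    return status


def task_history(db, task_id, limit=10):
    """Most recent runs first, as backing data for a `task history <id>` command."""
    return db.execute(
        "SELECT run_at, duration_ms, status, result, error FROM task_run_logs"
        " WHERE task_id = ? ORDER BY id DESC LIMIT ?",
        (task_id, limit),
    ).fetchall()
```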
+
+---
+
+### 9. Streaming Output with Idle Timeout
+
+NanoClaw streams agent output in real-time:
+- Container output is parsed as it arrives (sentinel markers for robust parsing)
+- Results are forwarded to the user immediately via `sendMessage`
+- Idle timeout (default 30 min) closes the container if no output arrives within that window
+- Prevents hanging containers from blocking the queue
+
+Aetheel waits for the full AI response before sending anything back.
+
+**What to build:**
+- Streaming response support (send partial results as they arrive)
+- Idle timeout for long-running agent sessions
+- Typing indicators while agent is processing
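
Streaming with an idle timeout can be sketched as a loop that forwards each chunk as it arrives and gives up when the gap between chunks grows too long. The `chunks` async iterator and the `send` callback stand in for a runtime's output stream and an adapter's send method; both are assumptions.

```python
import asyncio


async def stream_with_idle_timeout(chunks, send, idle_timeout):
    """Forward chunks as they arrive; stop if none arrives within idle_timeout seconds.

    Returns "completed" when the stream ends normally, "idle_timeout" otherwise.
    """
    iterator = chunks.__aiter__()
    while True:
        try:
            # The timer restarts for every chunk, so it measures idleness,
            # not total run time.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_timeout)
        except StopAsyncIteration:
            return "completed"
        except asyncio.TimeoutError:
            # wait_for has already cancelled the pending read at this point.
            return "idle_timeout"
        await send(chunk)
```

A typing indicator fits naturally around this loop: switch it on before the first read and off when the function returns.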
+
+---
+
+### 10. Skills as Code Transformations
+
+NanoClaw's skills are fundamentally different from Aetheel's:
+- Skills are SKILL.md files that teach Claude Code how to modify the codebase
+- A deterministic skills engine applies code changes (three-way merge, file additions)
+- Skills have state tracking (`.nanoclaw/state.yaml`), backups, and rollback
+- Examples: `/add-telegram`, `/add-discord`, `/add-gmail`, `/add-voice-transcription`, `/convert-to-docker`, `/add-parallel`
+- Each skill is a complete guide: pre-flight checks, code changes, setup, verification, troubleshooting
+
+Aetheel's skills are runtime context injections (markdown instructions added to the system prompt when trigger words match). They don't modify code.
+
+**What to build:**
+- Skills engine that can apply code transformations
+- State tracking for applied skills
+- Rollback support
+- Template skills for common integrations
+
+---
+
+### 11. Voice Message Transcription
+
+NanoClaw has a skill (`/add-voice-transcription`) that:
+- Detects WhatsApp voice notes (`audioMessage.ptt === true`)
+- Downloads audio via Baileys
+- Transcribes using OpenAI Whisper API
+- Stores transcribed content as `[Voice: <text>]` in the database
+- Configurable provider, fallback message, enable/disable
+
+Aetheel has no voice message handling.
+
+**What to build:**
+- Voice message detection per adapter (Telegram, Discord, Slack all support voice)
+- Whisper API integration for transcription
+- Transcribed content injection into the conversation
+
+---
+
+### 12. Gmail / Email Integration
+
+NanoClaw has a skill (`/add-gmail`) with two modes:
+- Tool mode: agent can read/send emails when triggered from chat
+- Channel mode: emails trigger the agent, agent replies via email
+- GCP OAuth setup guide
+- Email polling with deduplication
+- Per-thread or per-sender context isolation
+
+Aetheel has no email integration.
+
+**What to build:**
+- Gmail MCP integration (or direct API)
+- Email as a channel adapter
+- OAuth credential management
+
+---
+
+### 13. WhatsApp Support
+
+NanoClaw's primary channel is WhatsApp via the Baileys library:
+- QR code and pairing code authentication
+- Group metadata sync
+- Message history storage per registered group
+- Bot message filtering (prevents echo loops)
+
+Aetheel supports Slack, Discord, Telegram, and WebChat but not WhatsApp.
+
+**What to build:**
+- WhatsApp adapter using a library like Baileys or the WhatsApp Business API
+- QR code authentication flow
+- Group registration and metadata sync
+
+---
+
+### 14. Structured Message Routing
+
+NanoClaw has a clean channel abstraction:
+- `Channel` interface: `connect()`, `sendMessage()`, `isConnected()`, `ownsJid()`, `disconnect()`, `setTyping?()`
+- `findChannel()` routes outbound messages to the right channel by JID prefix (`tg:`, `dc:`, WhatsApp JIDs)
+- `formatOutbound()` strips `<internal>` tags before sending
+- XML-escaped message formatting for agent input
+
+Aetheel's adapters work but lack JID-based routing, `<internal>` tag support, and typing indicators across all adapters.
+
+**What to build:**
+- JID-based message routing (prefix per channel)
+- `<internal>` tag stripping for agent-to-agent communication
+- Typing indicators for all adapters
+- Unified channel interface with `ownsJid()` pattern
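
A Python analogue of the `ownsJid()` routing pattern might look like the sketch below. The `tg:` and `dc:` prefixes come from the NanoClaw description above; the `Channel` protocol, the `PrefixChannel` stand-in, and the method names are hypothetical.

```python
from typing import List, Optional, Protocol, Tuple


class Channel(Protocol):
    name: str

    def owns_jid(self, jid: str) -> bool: ...
    def send_message(self, jid: str, text: str) -> None: ...


class PrefixChannel:
    """Toy adapter that claims every JID starting with its prefix."""

    def __init__(self, name: str, prefix: str) -> None:
        self.name = name
        self.prefix = prefix
        self.sent: List[Tuple[str, str]] = []  # stands in for a real network send

    def owns_jid(self, jid: str) -> bool:
        return jid.startswith(self.prefix)

    def send_message(self, jid: str, text: str) -> None:
        self.sent.append((jid, text))


def find_channel(channels, jid: str) -> Optional[Channel]:
    """Return the first channel that claims this JID, or None."""
    for channel in channels:
        if channel.owns_jid(jid):
            return channel
    return None


def route_outbound(channels, jid: str, text: str) -> str:
    """Deliver text through whichever channel owns the JID; return its name."""
    channel = find_channel(channels, jid)
    if channel is None:
        raise LookupError(f"no channel owns {jid!r}")
    channel.send_message(jid, text)
    return channel.name
```

Letting each adapter decide ownership keeps the router free of per-platform knowledge, so adding a new adapter does not require touching `find_channel`.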
+
+---
+
+## Priority Recommendations
+
+### High Priority (Security + Core Gaps)
+1. Container isolation for agent execution
+2. Fix the 10 critical/high security issues from the security audit
+3. Per-group isolation (memory, sessions, filesystem)
+4. Mount security allowlist
+
+### Medium Priority (Feature Parity)
+5. Working agent teams with per-agent identity
+6. Group queue with concurrency control and retry
+7. Task context modes and run logging
+8. Streaming output with idle timeout
+9. IPC-based communication with authorization
+
+### Lower Priority (Nice to Have)
+10. Voice message transcription
+11. WhatsApp adapter
+12. Gmail/email integration
+13. Skills as code transformations
+14. Structured message routing with JID prefixes
+
+---
+
+## What Aetheel Has That NanoClaw Doesn't
+
+For reference, these are Aetheel strengths to preserve:
+
+- Dual runtime support (OpenCode + Claude Code) with live switching
+- Auto-failover on rate limits
+- Per-request cost tracking and usage stats
+- Local vector search (hybrid: 0.7 vector + 0.3 BM25) with fastembed
+- Built-in multi-channel (Slack, Discord, Telegram, WebChat, Webhooks)
+- WebChat browser UI
+- Heartbeat / proactive task system
+- Lifecycle hooks (gateway:startup, command:reload, agent:response, etc.)
+- Comprehensive CLI (`aetheel start/stop/restart/logs/doctor/config/cron/memory`)
+- Config-driven setup (no code changes needed for basic customization)
+- Self-modification (AI can edit its own config, skills, identity files)
+- Hot reload (`/reload` command)
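
For the hybrid search weighting listed above (0.7 vector + 0.3 BM25), the combination step can be illustrated as a weighted sum. How Aetheel actually normalizes BM25 scores is not stated here, so the min-max normalization and both helper names are assumptions.

```python
def min_max(scores):
    """Scale raw scores into [0, 1]; all-equal inputs map to 0.0."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc: 0.0 for doc in scores}
    return {doc: (value - low) / (high - low) for doc, value in scores.items()}


def hybrid_rank(vector_sim, bm25_raw, w_vec=0.7, w_bm25=0.3):
    """Combine vector similarity and normalized BM25 into one ranked list."""
    bm25 = min_max(bm25_raw)
    docs = set(vector_sim) | set(bm25_raw)
    combined = {
        doc: w_vec * vector_sim.get(doc, 0.0) + w_bm25 * bm25.get(doc, 0.0)
        for doc in docs
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```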