feat: openclaw-style secrets (env.vars + ${VAR}) and per-task model routing
- Replace python-dotenv with config.json env.vars block + ${VAR} substitution
- Add models section for per-task model routing (heartbeat, subagent, default)
- Heartbeat/subagent tasks can use different models/providers than main chat
- Remove python-dotenv from dependencies
- Update all docs to reflect new config approach
- Reorganize docs into project/ and research/ subdirectories
315
docs/research/nanoclaw-comparison.md
Normal file
@@ -0,0 +1,315 @@
# Aetheel vs NanoClaw: Feature Gap Analysis

Deep comparison of Aetheel (Python, multi-channel AI assistant) and NanoClaw (TypeScript, container-isolated personal AI assistant). Focus: what NanoClaw has that Aetheel is missing.

---

## Architecture Differences

| Aspect | Aetheel | NanoClaw |
|--------|---------|----------|
| Language | Python | TypeScript |
| Agent execution | In-process (shared memory) | Container-isolated (Apple Container / Docker) |
| Identity model | Shared across all channels (SOUL.md, USER.md, MEMORY.md) | Per-group (each group has its own CLAUDE.md) |
| Security model | Application-level checks | OS-level container isolation |
| Config approach | Config-driven (`config.json` with `env.vars` + `${VAR}`) | Code-first (Claude modifies your fork) |
| Philosophy | Feature-rich framework | Minimal, understandable in 8 minutes |

---

## Features Aetheel Is Missing

### 1. Container Isolation (Critical)

NanoClaw runs every agent invocation inside a Linux container (Apple Container on macOS, Docker on Linux). Each container:

- Gets only explicitly mounted directories
- Runs as non-root (uid 1000)
- Is ephemeral (`--rm` flag, fresh per invocation)
- Cannot access other groups' files or sessions
- Cannot access host filesystem beyond mounts

Aetheel runs everything in-process with no sandboxing. The security audit already flagged path traversal, arbitrary code execution via hooks, and unvalidated action tags as critical issues.

**What to build:**

- Docker-based agent execution (spawn a container per AI request)
- Mount only the relevant group's workspace directory
- Pass secrets via stdin, not mounted files
- Add a `/convert-to-docker` skill or built-in Docker mode
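
As a rough illustration of the container properties listed above, the sketch below builds a `docker run` command that is ephemeral, non-root, mounts a single workspace, and receives secrets over stdin. The image name `aetheel-agent:latest`, the `/workspace` mount point, and the JSON payload shape are assumptions made for this example; they are not taken from either project.

```python
import json
import subprocess
from pathlib import Path

AGENT_IMAGE = "aetheel-agent:latest"  # hypothetical image name


def build_docker_argv(workspace: Path, image: str = AGENT_IMAGE) -> list[str]:
    """Build the command line for one ephemeral agent container."""
    return [
        "docker", "run",
        "--rm",                 # remove the container when it exits
        "-i",                   # keep stdin open so secrets can be piped in
        "--user", "1000:1000",  # run as a non-root user
        "--volume", f"{workspace.resolve()}:/workspace",  # the only host mount
        "--workdir", "/workspace",
        image,
    ]


def run_agent(workspace: Path, prompt: str, secrets: dict[str, str]) -> str:
    """Run one request; secrets travel over stdin, never as mounted files."""
    payload = json.dumps({"prompt": prompt, "secrets": secrets})  # assumed shape
    proc = subprocess.run(
        build_docker_argv(workspace),
        input=payload,
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout
```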

---

### 2. Per-Group Isolation

NanoClaw gives each chat group its own:

- Filesystem folder (`groups/{name}/`)
- Memory file (`CLAUDE.md` per group)
- Session history (isolated `.claude/` directory)
- IPC namespace (prevents cross-group privilege escalation)
- Container mounts (only own folder + read-only global)

Aetheel shares SOUL.md, USER.md, and MEMORY.md across all channels and conversations. A Slack channel, Discord server, and Telegram group all see the same memory and identity.

**What to build:**

- Per-channel or per-group workspace directories
- Isolated session storage per group
- A `global/` shared memory that all groups can read but only the main channel can write
- Group registration system (like NanoClaw's `registerGroup()`)
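
One possible shape for the workspace and registration pieces is sketched below. The class names `GroupWorkspace` and `GroupRegistry`, the `register_group` method, and the write rule are all hypothetical; only the `groups/{name}/` and `global/` layout comes from the description above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class GroupWorkspace:
    """Hypothetical per-group workspace rooted at <root>/groups/<name>/."""
    name: str
    root: Path
    is_main: bool = False

    @property
    def folder(self) -> Path:
        return self.root / "groups" / self.name

    @property
    def global_folder(self) -> Path:
        return self.root / "global"

    def can_write(self, target: Path) -> bool:
        """Own folder is writable; global/ is writable only by the main group."""
        real = target.resolve()
        if real.is_relative_to(self.folder.resolve()):
            return True
        if real.is_relative_to(self.global_folder.resolve()):
            return self.is_main
        return False


class GroupRegistry:
    """Hypothetical registry that creates one workspace per group."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._groups: dict[str, GroupWorkspace] = {}

    def register_group(self, name: str, is_main: bool = False) -> GroupWorkspace:
        # Reject names that could escape the groups/ directory.
        if name in {"", ".", ".."} or Path(name).name != name:
            raise ValueError(f"invalid group name: {name!r}")
        workspace = GroupWorkspace(name, self._root, is_main)
        workspace.folder.mkdir(parents=True, exist_ok=True)
        workspace.global_folder.mkdir(parents=True, exist_ok=True)
        self._groups[name] = workspace
        return workspace
```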

---

### 3. Working Agent Teams / Swarms

NanoClaw has working agent teams today via Claude Code's experimental `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` setting:

- Lead agent creates teammates using Claude's native `TeamCreate` / `SendMessage` tools
- Each teammate runs in its own container
- On Telegram, each agent gets a dedicated bot identity (pool of pre-created bots renamed dynamically via `setMyName`)
- The lead agent coordinates but doesn't relay every message; users see teammate messages directly
- `<internal>` tags let agents communicate without spamming the user

Aetheel has the tools in the allowed list (`TeamCreate`, `TeamDelete`, `SendMessage`) but no actual orchestration, no per-agent identity, and no way for teammates to appear as separate entities in chat.

**What to build:**

- Enable `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` when using Claude runtime
- Bot pool for Telegram/Discord (multiple bot tokens, one per agent role)
- IPC routing that respects `sender` field to route messages through the right bot
- Per-agent CLAUDE.md / SOUL.md files
- `<internal>` tag stripping in outbound messages
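
Two of these items are small enough to sketch directly: stripping `<internal>` blocks from outbound text, and mapping a `sender` to a bot from a fixed pool. The sketch assumes tags are written as paired `<internal>...</internal>` blocks; `strip_internal` and `BotPool` are hypothetical names, not existing Aetheel or NanoClaw APIs.

```python
import re

# Assumes paired <internal>...</internal> blocks that may span several lines.
_INTERNAL_RE = re.compile(r"<internal>.*?</internal>", re.DOTALL)


def strip_internal(text: str) -> str:
    """Drop agent-to-agent notes before a message is sent to the user."""
    return _INTERNAL_RE.sub("", text).strip()


class BotPool:
    """Hypothetical pool that gives each sender a stable bot identity."""

    def __init__(self, bot_ids: list[str]) -> None:
        self._free = list(bot_ids)
        self._assigned: dict[str, str] = {}

    def bot_for(self, sender: str) -> str:
        if sender not in self._assigned:
            if not self._free:
                raise RuntimeError("bot pool exhausted")
            self._assigned[sender] = self._free.pop(0)
        return self._assigned[sender]
```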

---

### 4. Mount Security / Allowlist

NanoClaw has a tamper-proof mount allowlist at `~/.config/nanoclaw/mount-allowlist.json` (outside the project root, never mounted into containers):

- Defines which host directories can be mounted
- Default blocked patterns: `.ssh`, `.gnupg`, `.aws`, `.env`, `private_key`, etc.
- Symlink resolution before validation (prevents traversal)
- `nonMainReadOnly` forces read-only for non-main groups
- Per-root `allowReadWrite` control

Aetheel has no filesystem access control. The AI can read/write anywhere the process has permissions.

**What to build:**

- External allowlist config (outside workspace, not modifiable by the AI)
- Blocked path patterns for sensitive directories
- Symlink resolution and path validation
- Read-only enforcement for non-primary channels
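
A minimal sketch of the path check might look like the following. The blocked patterns are the ones listed above; the function name `validate_mount`, the substring matching rule, and the use of `PermissionError` are assumptions for illustration. Substring matching is deliberately conservative, so it would also reject a directory such as `.environment`.

```python
from pathlib import Path

# Patterns taken from the list above; matching by substring is an assumption.
BLOCKED_PATTERNS = (".ssh", ".gnupg", ".aws", ".env", "private_key")


def validate_mount(requested: Path, allowed_roots: list[Path]) -> Path:
    """Return the resolved path if it may be mounted, else raise PermissionError."""
    real = requested.resolve()  # follow symlinks before any checks
    for part in real.parts:
        if any(pattern in part for pattern in BLOCKED_PATTERNS):
            raise PermissionError(f"blocked path component: {part}")
    if not any(real.is_relative_to(root.resolve()) for root in allowed_roots):
        raise PermissionError(f"outside allowed roots: {real}")
    return real
```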

---

### 5. IPC-Based Communication

NanoClaw uses file-based IPC for all agent-to-host communication:

- Agents write JSON files to `data/ipc/{group}/messages/` and `data/ipc/{group}/tasks/`
- Host polls IPC directories and processes files
- Per-group IPC namespaces prevent cross-group message injection
- Authorization checks: non-main groups can only send to their own chat and schedule tasks for themselves
- Error files moved to `data/ipc/errors/` for debugging

Aetheel uses in-memory action tags parsed from AI response text (`[ACTION:remind|...]`, `[ACTION:cron|...]`). No authorization, no isolation, no audit trail.

**What to build:**

- File-based or queue-based IPC for agent communication
- Per-group namespaces with authorization
- Audit trail for all IPC operations
- Error handling with failed message preservation
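
The file-based variant could be sketched roughly as follows. The `{group}/messages/` and `errors/` layout mirrors the description above; the message fields (`chat_id`, `text`), the function names, and the rule that the main group may send anywhere are assumptions for this example.

```python
import json
import os
import time
import uuid
from pathlib import Path


def write_message(ipc_root: Path, group: str, chat_id: str, text: str) -> Path:
    """Agent side: write one message file, using a rename so it appears atomically."""
    out_dir = ipc_root / group / "messages"
    out_dir.mkdir(parents=True, exist_ok=True)
    final = out_dir / f"{time.time_ns()}-{uuid.uuid4().hex}.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps({"chat_id": chat_id, "text": text}))
    os.replace(tmp, final)
    return final


def poll_once(ipc_root: Path, chat_by_group: dict[str, str],
              main_group: str = "main") -> list[dict]:
    """Host side: deliver authorized messages, park everything else in errors/."""
    delivered: list[dict] = []
    errors = ipc_root / "errors"
    for group_dir in sorted(ipc_root.iterdir()):
        messages = group_dir / "messages"
        if group_dir.name == "errors" or not messages.is_dir():
            continue
        for path in sorted(messages.glob("*.json")):
            try:
                msg = json.loads(path.read_text())
                own_chat = msg["chat_id"] == chat_by_group.get(group_dir.name)
                if group_dir.name != main_group and not own_chat:
                    raise PermissionError("cross-group send")
            except (ValueError, KeyError, TypeError, PermissionError):
                errors.mkdir(exist_ok=True)
                os.replace(path, errors / path.name)  # keep for debugging
                continue
            delivered.append(msg)
            path.unlink()
    return delivered
```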

---

### 6. Group Queue with Concurrency Control

NanoClaw has a `GroupQueue` class that manages container execution:

- Max concurrent containers limit (`MAX_CONCURRENT_CONTAINERS`, default 5)
- Per-group queuing (messages and tasks queue while container is active)
- Follow-up messages sent to active containers via IPC input files
- Idle timeout with `_close` sentinel to wind down containers
- Exponential backoff retry (5s base, max 5 retries)
- Graceful shutdown (detaches containers, doesn't kill them)
- Task priority over messages in drain order

Aetheel has a simple concurrency limit of 3 subagents but no queuing, no retry logic, no follow-up message support, and no graceful shutdown.

**What to build:**

- Proper execution queue with configurable concurrency
- Per-channel message queuing when agent is busy
- Follow-up message injection into active sessions
- Exponential backoff retry on failures
- Graceful shutdown that lets active agents finish
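
The concurrency, per-group queuing, and retry items can be sketched with `asyncio` primitives. This is an illustration, not NanoClaw's `GroupQueue`: the class name `ExecutionQueue` is hypothetical, and the doubling schedule is an assumption, since the description above only gives a 5 s base and a maximum of 5 retries. Follow-up injection and graceful shutdown are left out.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

BASE_DELAY_S = 5.0  # from the description above
MAX_RETRIES = 5     # from the description above


def backoff_delay(attempt: int, base: float = BASE_DELAY_S) -> float:
    """Delay before retry number `attempt` (0-based); doubling is an assumption."""
    return base * (2 ** attempt)


class ExecutionQueue:
    """Global concurrency cap plus one-at-a-time execution per group."""

    def __init__(self, max_concurrent: int = 5, base_delay: float = BASE_DELAY_S,
                 max_retries: int = MAX_RETRIES) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self._group_locks: dict[str, asyncio.Lock] = {}
        self._base_delay = base_delay
        self._max_retries = max_retries

    async def run(self, group: str, job: Callable[[], Awaitable[T]]) -> T:
        lock = self._group_locks.setdefault(group, asyncio.Lock())
        async with lock:  # later jobs for the same group wait here
            for attempt in range(self._max_retries + 1):
                try:
                    async with self._slots:  # respect the global limit
                        return await job()
                except Exception:
                    if attempt == self._max_retries:
                        raise
                    await asyncio.sleep(backoff_delay(attempt, self._base_delay))
        raise AssertionError("unreachable")
```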

---

### 7. Task Context Modes

NanoClaw scheduled tasks support two context modes:

- `group`: uses the group's existing session (shared conversation history)
- `isolated`: fresh session per task run (no prior context)

Aetheel scheduled tasks always run in a fresh context with no option to share the group's conversation history.

**What to build:**

- `context_mode` field on scheduled jobs (`group` vs `isolated`)
- Session ID passthrough for `group` mode tasks
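
A small sketch of how the field and the session passthrough might fit together is shown below. `ScheduledJob`, its fields other than `context_mode`, and `session_for` are hypothetical names; the default of `isolated` is chosen here only because it matches Aetheel's current fresh-context behaviour.

```python
from dataclasses import dataclass
from typing import Literal, Optional

ContextMode = Literal["group", "isolated"]


@dataclass
class ScheduledJob:
    """Hypothetical scheduled job carrying a context mode."""
    job_id: str
    group: str
    prompt: str
    context_mode: ContextMode = "isolated"


def session_for(job: ScheduledJob, group_sessions: dict[str, str]) -> Optional[str]:
    """Return the session ID to resume, or None to start a fresh session."""
    if job.context_mode == "group":
        # Falls back to a fresh session if the group has none yet.
        return group_sessions.get(job.group)
    return None
```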

---

### 8. Task Run Logging

NanoClaw logs every task execution:

- `task_run_logs` table with: task_id, run_at, duration_ms, status, result, error
- `last_result` summary stored on the task itself
- Tasks auto-complete after `once` schedule runs

Aetheel's scheduler persists jobs but doesn't log execution history or results.

**What to build:**

- Task run log table (when it ran, how long, success/error, result summary)
- Queryable task history (`task history <id>`)
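
The log table and history query could be prototyped with the standard `sqlite3` module. The column names follow the list above; the column types, the `id` key, and the helper names `record_run` and `task_history` are assumptions for this sketch.

```python
import sqlite3
import time
from datetime import datetime, timezone
from typing import Callable

SCHEMA = """
CREATE TABLE IF NOT EXISTS task_run_logs (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id     TEXT    NOT NULL,
    run_at      TEXT    NOT NULL,
    duration_ms INTEGER NOT NULL,
    status      TEXT    NOT NULL,
    result      TEXT,
    error       TEXT
)
"""


def record_run(db: sqlite3.Connection, task_id: str, fn: Callable[[], object]) -> str:
    """Run `fn`, log the outcome as one row, and return the status."""
    run_at = datetime.now(timezone.utc).isoformat()
    started = time.monotonic()
    status, result, error = "success", None, None
    try:
        result = str(fn())
    except Exception as exc:  # a failed task is logged, not propagated
        status, error = "error", repr(exc)
    duration_ms = int((time.monotonic() - started) * 1000)
    db.execute(
        "INSERT INTO task_run_logs (task_id, run_at, duration_ms, status, result, error)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (task_id, run_at, duration_ms, status, result, error),
    )
    db.commit()
    return status


def task_history(db: sqlite3.Connection, task_id: str, limit: int = 20) -> list[tuple]:
    """Most recent runs first, as (run_at, duration_ms, status, result, error)."""
    return db.execute(
        "SELECT run_at, duration_ms, status, result, error FROM task_run_logs"
        " WHERE task_id = ? ORDER BY id DESC LIMIT ?",
        (task_id, limit),
    ).fetchall()
```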

---

### 9. Streaming Output with Idle Timeout

NanoClaw streams agent output in real-time:

- Container output is parsed as it arrives (sentinel markers for robust parsing)
- Results are forwarded to the user immediately via `sendMessage`
- Idle timeout (default 30 min) closes the container if it produces no output for that long
- Prevents hanging containers from blocking the queue

Aetheel waits for the full AI response before sending anything back.

**What to build:**

- Streaming response support (send partial results as they arrive)
- Idle timeout for long-running agent sessions
- Typing indicators while agent is processing
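
Streaming with an idle timeout can be sketched on top of an async iterator of output chunks. The function name `stream_with_idle_timeout` and the `send` callback are hypothetical; the 30 minute default comes from the NanoClaw description above. The timer restarts after every chunk, so only silence counts against the limit. The function returns `True` when the stream ends normally and `False` when it is abandoned for being idle.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

IDLE_TIMEOUT_S = 30 * 60  # default taken from the description above


async def stream_with_idle_timeout(
    chunks: AsyncIterator[str],
    send: Callable[[str], Awaitable[None]],
    idle_timeout_s: float = IDLE_TIMEOUT_S,
) -> bool:
    """Forward each chunk as it arrives; give up after a silent period."""
    iterator = chunks.__aiter__()
    while True:
        try:
            # The timeout applies to the wait for the next chunk only.
            chunk = await asyncio.wait_for(iterator.__anext__(), idle_timeout_s)
        except StopAsyncIteration:
            return True   # stream finished normally
        except asyncio.TimeoutError:
            return False  # no output for too long
        await send(chunk)
```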

---

### 10. Skills as Code Transformations

NanoClaw's skills are fundamentally different from Aetheel's:

- Skills are SKILL.md files that teach Claude Code how to modify the codebase
- A deterministic skills engine applies code changes (three-way merge, file additions)
- Skills have state tracking (`.nanoclaw/state.yaml`), backups, and rollback
- Examples: `/add-telegram`, `/add-discord`, `/add-gmail`, `/add-voice-transcription`, `/convert-to-docker`, `/add-parallel`
- Each skill is a complete guide: pre-flight checks, code changes, setup, verification, troubleshooting

Aetheel's skills are runtime context injections (markdown instructions added to the system prompt when trigger words match). They don't modify code.

**What to build:**

- Skills engine that can apply code transformations
- State tracking for applied skills
- Rollback support
- Template skills for common integrations

---

### 11. Voice Message Transcription

NanoClaw has a skill (`/add-voice-transcription`) that:

- Detects WhatsApp voice notes (`audioMessage.ptt === true`)
- Downloads audio via Baileys
- Transcribes using OpenAI Whisper API
- Stores transcribed content as `[Voice: <text>]` in the database
- Configurable provider, fallback message, enable/disable

Aetheel has no voice message handling.

**What to build:**

- Voice message detection per adapter (Telegram, Discord, Slack all support voice)
- Whisper API integration for transcription
- Transcribed content injection into the conversation

---

### 12. Gmail / Email Integration

NanoClaw has a skill (`/add-gmail`) with two modes:

- Tool mode: agent can read/send emails when triggered from chat
- Channel mode: emails trigger the agent, agent replies via email
- GCP OAuth setup guide
- Email polling with deduplication
- Per-thread or per-sender context isolation

Aetheel has no email integration.

**What to build:**

- Gmail MCP integration (or direct API)
- Email as a channel adapter
- OAuth credential management

---

### 13. WhatsApp Support

NanoClaw's primary channel is WhatsApp via the Baileys library:

- QR code and pairing code authentication
- Group metadata sync
- Message history storage per registered group
- Bot message filtering (prevents echo loops)

Aetheel supports Slack, Discord, Telegram, and WebChat but not WhatsApp.

**What to build:**

- WhatsApp adapter using a library like Baileys or the WhatsApp Business API
- QR code authentication flow
- Group registration and metadata sync

---

### 14. Structured Message Routing

NanoClaw has a clean channel abstraction:

- `Channel` interface: `connect()`, `sendMessage()`, `isConnected()`, `ownsJid()`, `disconnect()`, `setTyping?()`
- `findChannel()` routes outbound messages to the right channel by JID prefix (`tg:`, `dc:`, WhatsApp JIDs)
- `formatOutbound()` strips `<internal>` tags before sending
- XML-escaped message formatting for agent input

Aetheel's adapters work but lack JID-based routing, `<internal>` tag support, and typing indicators across all adapters.

**What to build:**

- JID-based message routing (prefix per channel)
- `<internal>` tag stripping for agent-to-agent communication
- Typing indicators for all adapters
- Unified channel interface with `ownsJid()` pattern
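
A Python counterpart to the `ownsJid()` pattern could look like the sketch below. The `tg:` and `dc:` prefixes come from the description above; the `Channel` protocol here is a reduced, hypothetical version of the interface (two methods only), and `PrefixChannel` is an in-memory stand-in used purely to exercise the routing.

```python
from typing import Optional, Protocol, Sequence


class Channel(Protocol):
    """Reduced channel interface; a real adapter would also connect, type, etc."""

    def owns_jid(self, jid: str) -> bool: ...

    async def send_message(self, jid: str, text: str) -> None: ...


class PrefixChannel:
    """In-memory stand-in that claims every JID starting with its prefix."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.sent: list[tuple[str, str]] = []

    def owns_jid(self, jid: str) -> bool:
        return jid.startswith(self.prefix)

    async def send_message(self, jid: str, text: str) -> None:
        self.sent.append((jid, text))


def find_channel(channels: Sequence[Channel], jid: str) -> Optional[Channel]:
    """Return the first channel that claims the JID, or None."""
    return next((c for c in channels if c.owns_jid(jid)), None)


async def route(channels: Sequence[Channel], jid: str, text: str) -> bool:
    """Send through the owning channel; report whether one was found."""
    channel = find_channel(channels, jid)
    if channel is None:
        return False
    await channel.send_message(jid, text)
    return True
```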

---

## Priority Recommendations

### High Priority (Security + Core Gaps)

1. Container isolation for agent execution
2. Fix the 10 critical/high security issues from the security audit
3. Per-group isolation (memory, sessions, filesystem)
4. Mount security allowlist

### Medium Priority (Feature Parity)

5. Working agent teams with per-agent identity
6. Group queue with concurrency control and retry
7. Task context modes and run logging
8. Streaming output with idle timeout
9. IPC-based communication with authorization

### Lower Priority (Nice to Have)

10. Voice message transcription
11. WhatsApp adapter
12. Gmail/email integration
13. Skills as code transformations
14. Structured message routing with JID prefixes

---

## What Aetheel Has That NanoClaw Doesn't

For reference, these are Aetheel strengths to preserve:

- Dual runtime support (OpenCode + Claude Code) with live switching
- Auto-failover on rate limits
- Per-request cost tracking and usage stats
- Local vector search (hybrid: 0.7 vector + 0.3 BM25) with fastembed
- Built-in multi-channel (Slack, Discord, Telegram, WebChat, Webhooks)
- WebChat browser UI
- Heartbeat / proactive task system
- Lifecycle hooks (gateway:startup, command:reload, agent:response, etc.)
- Comprehensive CLI (`aetheel start/stop/restart/logs/doctor/config/cron/memory`)
- Config-driven setup (no code changes needed for basic customization)
- Self-modification (AI can edit its own config, skills, identity files)
- Hot reload (`/reload` command)