Fix container execution and add debug tooling

Container fixes:
- Run as non-root 'node' user (required for --dangerously-skip-permissions)
- Add allowDangerouslySkipPermissions: true to SDK options
- Mount .env file to work around Apple Container -i env var bug
- Use --mount for readonly, -v for read-write (Apple Container quirk)
- Bump SDK to 0.2.29, zod to v4
- Install Claude Code CLI globally in container

Logging improvements:
- Write per-run logs to groups/{folder}/logs/container-*.log
- Add debug-level logging for mounts and container args

Documentation:
- Add /debug skill with comprehensive troubleshooting guide
- Update /setup skill with API key configuration step
- Update SPEC.md with container details, mount syntax, security notes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Gavriel
2026-02-01 10:35:08 +02:00
parent 0ccdaaac48
commit 67e0295d82
7 changed files with 436 additions and 27 deletions


@@ -0,0 +1,281 @@
---
name: debug
description: Debug container agent issues. Use when things aren't working, the container fails, authentication breaks, or you need to understand how the container system works. Covers logs, environment variables, mounts, and common issues.
---
# NanoClaw Container Debugging
This guide covers debugging the containerized agent execution system.
## Architecture Overview
```
Host (macOS) Container (Linux VM)
─────────────────────────────────────────────────────────────
src/container-runner.ts container/agent-runner/
│ │
│ spawns Apple Container │ runs Claude Agent SDK
│ with volume mounts │ with MCP servers
│ │
├── data/env/env ──────────────> /workspace/env-dir/env
├── groups/{folder} ───────────> /workspace/group
├── data/ipc ──────────────────> /workspace/ipc
└── (main only) project root ──> /workspace/project
```
## Log Locations
| Log | Location | Content |
|-----|----------|---------|
| **Main app logs** | `logs/nanoclaw.log` | Host-side WhatsApp, routing, container spawning |
| **Main app errors** | `logs/nanoclaw.error.log` | Host-side errors |
| **Container run logs** | `groups/{folder}/logs/container-*.log` | Per-run: input, mounts, stderr, stdout |
| **Claude sessions** | `~/.claude/projects/` | Claude Code session history |
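
The per-run log names are built in `container-runner.ts` from an ISO timestamp with `:` and `.` replaced by `-`, so a plain string sort of the file names is also a chronological sort. The following is a minimal sketch of that naming and of picking the newest logs. `containerLogName` and `newestLogs` are hypothetical helper names used for illustration, not functions from the codebase.

```typescript
// Hypothetical helpers for illustration; only the timestamp format
// mirrors what container-runner.ts writes.

/** Build a per-run log file name from a timestamp. */
function containerLogName(when: Date): string {
  // toISOString() is always UTC and fixed-width, e.g. 2026-02-01T08:35:08.000Z
  const timestamp = when.toISOString().replace(/[:.]/g, '-');
  return `container-${timestamp}.log`;
}

/** Return the `count` newest container log names, newest first. */
function newestLogs(names: string[], count: number): string[] {
  return names
    .filter((fileName) => /^container-.*\.log$/.test(fileName))
    .sort() // fixed-width UTC timestamps sort chronologically as strings
    .reverse()
    .slice(0, count);
}
```

Because the ordering lives in the file name, `ls groups/{folder}/logs/ | sort | tail -1` finds the latest run even if file modification times have been changed by a copy or sync.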
## Enabling Debug Logging
Set `LOG_LEVEL=debug` for verbose output:
```bash
# For development
LOG_LEVEL=debug npm run dev
# For launchd service, add to plist EnvironmentVariables:
<key>LOG_LEVEL</key>
<string>debug</string>
```
Debug level shows:
- Full mount configurations
- Container command arguments
- Real-time container stderr
## Common Issues
### 1. "Claude Code process exited with code 1"
**Check the container log file** in `groups/{folder}/logs/container-*.log`
Common causes:
#### Missing API Key
```
Invalid API key · Please run /login
```
**Fix:** Ensure `.env` file exists in project root with valid `ANTHROPIC_API_KEY`:
```bash
cat .env # Should show: ANTHROPIC_API_KEY=sk-ant-...
```
#### Root User Restriction
```
--dangerously-skip-permissions cannot be used with root/sudo privileges
```
**Fix:** Container must run as non-root user. Check Dockerfile has `USER node`.
### 2. Environment Variables Not Passing
**Apple Container Bug:** Environment variables passed via `-e` are lost when using `-i` (interactive/piped stdin).
**Workaround:** The system mounts `.env` as a file and sources it inside the container.
To verify env vars are reaching the container:
```bash
echo '{}' | container run -i \
--mount "type=bind,source=$(pwd)/data/env,target=/workspace/env-dir,readonly" \
--entrypoint /bin/bash nanoclaw-agent:latest \
-c 'export $(cat /workspace/env-dir/env | xargs); echo "API key length: ${#ANTHROPIC_API_KEY}"'
```
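The `export $(cat ... | xargs)` idiom treats the mounted file as whitespace-separated `KEY=VALUE` pairs. As a rough illustration of what ends up in the environment, here is a sketch of a parser for that format. `parseEnvFile` is a hypothetical helper, and it assumes one assignment per line with no spaces or quotes in the values; the shell idiom handles those cases differently.

```typescript
// Hypothetical helper: parse KEY=VALUE lines such as those in data/env/env.
// Assumption: one assignment per line, no quoting, no spaces in values.
function parseEnvFile(text: string): Record<string, string> {
  const env: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skipping blank lines and comments is a convenience of this sketch;
    // the shell idiom would pass comment words on to `export`.
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq <= 0) continue; // no key before '=' or no '=' at all
    // Everything after the first '=' is the value, even if it contains '='.
    env[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return env;
}

// Report only the length, never the key itself.
const env = parseEnvFile('ANTHROPIC_API_KEY=sk-ant-example\n');
console.log(`API key length: ${(env['ANTHROPIC_API_KEY'] ?? '').length}`);
```

A length of 0 here corresponds to the failure case the shell check is looking for: the file is mounted but the key line is missing or empty.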
### 3. Mount Issues
**Apple Container quirks:**
- Only mounts directories, not individual files
- `-v` syntax does NOT support `:ro` suffix - use `--mount` for readonly:
```bash
# Readonly: use --mount
--mount "type=bind,source=/path,target=/container/path,readonly"
# Read-write: use -v
-v /path:/container/path
```
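The rule above maps directly onto how the host builds the `container run` arguments. This sketch mirrors the loop in `buildContainerArgs` from this commit; the `VolumeMount` shape is inferred from the fields used there, and `mountArgs` is a hypothetical name for the extracted loop.

```typescript
// Shape inferred from the fields used in container-runner.ts.
interface VolumeMount {
  hostPath: string;
  containerPath: string;
  readonly: boolean;
}

// Hypothetical extraction of the mount loop: --mount for readonly, -v for read-write.
function mountArgs(mounts: VolumeMount[]): string[] {
  const args: string[] = [];
  for (const mount of mounts) {
    if (mount.readonly) {
      // The :ro suffix is not supported, so readonly needs the long form.
      args.push('--mount', `type=bind,source=${mount.hostPath},target=${mount.containerPath},readonly`);
    } else {
      args.push('-v', `${mount.hostPath}:${mount.containerPath}`);
    }
  }
  return args;
}

// Example paths are placeholders.
console.log(
  mountArgs([
    { hostPath: '/path/data/env', containerPath: '/workspace/env-dir', readonly: true },
    { hostPath: '/path/groups/test', containerPath: '/workspace/group', readonly: false },
  ]).join(' ')
);
```

Comparing this output with the `=== Container Args ===` section of a per-run log is a quick way to confirm that a mount was emitted in the expected form.
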
To inspect the `/workspace/` layout inside a container (add the same `--mount` and `-v` flags as a real run to see the mounted content):
```bash
container run --rm --entrypoint /bin/bash nanoclaw-agent:latest -c 'ls -la /workspace/'
```
Expected structure:
```
/workspace/
├── env-dir/env # Environment file (ANTHROPIC_API_KEY)
├── group/ # Current group folder (cwd)
├── project/ # Project root (main channel only)
├── global/ # Global CLAUDE.md (non-main only)
├── ipc/ # Inter-process communication
│ ├── messages/ # Outgoing WhatsApp messages
│ └── tasks/ # Scheduled task commands
└── extra/ # Additional custom mounts
```
### 4. Permission Issues
The container runs as user `node` (uid 1000). Check ownership:
```bash
container run --rm --entrypoint /bin/bash nanoclaw-agent:latest -c '
whoami
ls -la /workspace/
ls -la /app/
'
```
`/workspace/` should be owned by `node` (the Dockerfile runs `chown -R node:node /workspace`). `/app/` must at least be readable by `node`.
### 5. MCP Server Failures
If an MCP server fails to start, the agent may exit. Test MCP servers individually:
```bash
# Test Gmail MCP
container run --rm --entrypoint /bin/bash nanoclaw-agent:latest -c '
npx -y @gongrzhe/server-gmail-autoauth-mcp --help
'
```
## Manual Container Testing
### Test the full agent flow:
```bash
# Set up env file
mkdir -p data/env data/ipc groups/test
cp .env data/env/env
# Run test query
echo '{"prompt":"What is 2+2?","groupFolder":"test","chatJid":"test@g.us","isMain":false}' | \
container run -i \
--mount "type=bind,source=$(pwd)/data/env,target=/workspace/env-dir,readonly" \
-v "$(pwd)/groups/test:/workspace/group" \
-v "$(pwd)/data/ipc:/workspace/ipc" \
nanoclaw-agent:latest
```
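The JSON piped into the container can also be generated rather than typed by hand, which avoids quoting mistakes in longer prompts. A minimal sketch follows: the field names are taken from the test command above, while the `ContainerInput` interface name and the `serializeInput` helper are assumptions, and the real input type may carry additional fields.

```typescript
// Field names come from the test command; the interface name is an assumption
// and the real input type may have more (possibly optional) fields.
interface ContainerInput {
  prompt: string;
  groupFolder: string;
  chatJid: string;
  isMain: boolean;
}

// Hypothetical helper: one line of JSON, suitable for piping to stdin.
function serializeInput(input: ContainerInput): string {
  return JSON.stringify(input);
}

const testInput: ContainerInput = {
  prompt: 'What is 2+2?',
  groupFolder: 'test',
  chatJid: 'test@g.us',
  isMain: false,
};

console.log(serializeInput(testInput));
```

The printed line can be piped into `container run -i` in place of the `echo` in the command above.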
### Test Claude Code directly:
```bash
container run --rm --entrypoint /bin/bash \
--mount "type=bind,source=$(pwd)/data/env,target=/workspace/env-dir,readonly" \
nanoclaw-agent:latest -c '
export $(cat /workspace/env-dir/env | xargs)
claude -p "Say hello" --dangerously-skip-permissions --allowedTools ""
'
```
### Interactive shell in container:
```bash
container run --rm -it --entrypoint /bin/bash nanoclaw-agent:latest
```
## SDK Options Reference
The agent-runner uses these Claude Agent SDK options:
```typescript
query({
prompt: input.prompt,
options: {
cwd: '/workspace/group',
allowedTools: ['Bash', 'Read', 'Write', ...],
permissionMode: 'bypassPermissions',
allowDangerouslySkipPermissions: true, // Required with bypassPermissions
settingSources: ['project'],
mcpServers: { ... }
}
})
```
**Important:** `allowDangerouslySkipPermissions: true` is required when using `permissionMode: 'bypassPermissions'`. Without it, Claude Code exits with code 1.
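
Since a missing flag only surfaces as an exit code 1 at run time, it can help to check the pairing before calling the SDK. This is a sketch of such a guard under stated assumptions: `PermissionOptions` is a hypothetical subset of the options object (only the two field names come from the snippet above), and `checkPermissionOptions` is not part of the SDK.

```typescript
// Hypothetical subset of the options object; only these two field names
// come from the SDK options shown above.
interface PermissionOptions {
  permissionMode?: string;
  allowDangerouslySkipPermissions?: boolean;
}

// Returns an error message when the combination is known to fail, else null.
function checkPermissionOptions(options: PermissionOptions): string | null {
  if (
    options.permissionMode === 'bypassPermissions' &&
    options.allowDangerouslySkipPermissions !== true
  ) {
    return "permissionMode 'bypassPermissions' requires allowDangerouslySkipPermissions: true";
  }
  return null;
}

console.log(checkPermissionOptions({ permissionMode: 'bypassPermissions' }));
```

Logging the returned message on the container's stderr would make it appear in the per-run log next to the exit code.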
## Rebuilding After Changes
```bash
# Rebuild main app
npm run build
# Rebuild container (use --no-cache for clean rebuild)
./container/build.sh
# Or force full rebuild
container builder prune -af
./container/build.sh
```
## Checking Container Image
```bash
# List images
container images
# Check what's in the image
container run --rm --entrypoint /bin/bash nanoclaw-agent:latest -c '
echo "=== Node version ==="
node --version
echo "=== Claude Code version ==="
claude --version
echo "=== Installed packages ==="
ls /app/node_modules/
'
```
## Session Persistence
Claude sessions are stored in `~/.claude/` on the host, which is mounted into the container at `/home/node/.claude/`. To clear sessions:
```bash
# Clear all sessions
rm -rf ~/.claude/projects/
# Clear sessions for a specific group
rm -rf ~/.claude/projects/*workspace-group*/
```
## IPC Debugging
The container communicates back to the host via files in `/workspace/ipc/`:
```bash
# Check pending messages
ls -la data/ipc/messages/
# Check pending task operations
ls -la data/ipc/tasks/
# Read a specific IPC file
cat data/ipc/messages/*.json
```
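When an IPC file seems to be ignored by the host, a first step is confirming that it parses at all. The sketch below classifies one file's contents. `checkIpcFile` is a hypothetical helper, and the expectation that each file holds a single JSON object is an assumption, since the IPC schema is not described in this guide.

```typescript
type IpcCheck =
  | { file: string; ok: true; keys: string[] }
  | { file: string; ok: false; error: string };

// Hypothetical helper. Assumption: each IPC file holds one JSON object.
function checkIpcFile(file: string, text: string): IpcCheck {
  try {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
      return { file, ok: false, error: 'expected a JSON object' };
    }
    // Report only the top-level keys, which is usually enough to spot a typo.
    return { file, ok: true, keys: Object.keys(parsed) };
  } catch (err) {
    // JSON.parse throws SyntaxError on truncated or malformed content.
    return { file, ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Example with a truncated file, as might be seen if a write was interrupted.
console.log(checkIpcFile('example.json', '{"text": "hi"'));
```

A file that fails this check was most likely read mid-write or written by something other than the agent.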
## Quick Diagnostic Script
Run this to check common issues:
```bash
echo "=== Checking NanoClaw Container Setup ==="
echo -e "\n1. API Key configured?"
[ -f .env ] && grep -q "ANTHROPIC_API_KEY=sk-" .env && echo "OK" || echo "MISSING - create .env with ANTHROPIC_API_KEY"
echo -e "\n2. Env file copied for container?"
[ -f data/env/env ] && echo "OK" || echo "MISSING - will be created on first run"
echo -e "\n3. Container image exists?"
container images 2>/dev/null | grep -q nanoclaw-agent && echo "OK" || echo "MISSING - run ./container/build.sh"
echo -e "\n4. Apple Container running?"
container system info &>/dev/null && echo "OK" || echo "NOT RUNNING - run: container system start"
echo -e "\n5. Groups directory?"
ls -la groups/ 2>/dev/null || echo "MISSING - run setup"
echo -e "\n6. Recent container logs?"
ls -t groups/*/logs/container-*.log 2>/dev/null | head -3 | grep . || echo "No container logs yet"
```


@@ -37,7 +37,47 @@ container system start 2>/dev/null || true
container --version
```
## 3. Build Container Image
## 3. Configure API Key
Ask the user:
> Do you have an Anthropic API key configured elsewhere that I should copy, or should I create a `.env` file for you to fill in?
**If copying from another location:**
```bash
# Extract only the ANTHROPIC_API_KEY line from the source file
grep "^ANTHROPIC_API_KEY=" /path/to/other/.env > .env
```
Verify the key exists (only show first/last few chars for security):
```bash
KEY=$(grep "^ANTHROPIC_API_KEY=" .env | cut -d= -f2-)
if [ -n "$KEY" ]; then
echo "API key configured: ${KEY:0:10}...${KEY: -4}"
else
echo "API key missing or invalid"
fi
```
**If creating new:**
```bash
echo 'ANTHROPIC_API_KEY=' > .env
```
Tell the user:
> I've created `.env` in the project root. Please add your Anthropic API key after the `=` sign.
> You can get an API key from https://console.anthropic.com/
Wait for user confirmation, then verify (only show first/last few chars):
```bash
KEY=$(grep "^ANTHROPIC_API_KEY=" .env | cut -d= -f2-)
if [ -n "$KEY" ]; then
echo "API key configured: ${KEY:0:10}...${KEY: -4}"
else
echo "API key missing or invalid"
fi
```
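The same masking rule (first 10 and last 4 characters) can be expressed outside the shell. This is a sketch only: `maskKey` is a hypothetical helper, and the refusal to print short values is an added assumption so that a short or test key is never shown almost in full.

```typescript
// Hypothetical helper mirroring ${KEY:0:10}...${KEY: -4} from the shell check.
function maskKey(key: string): string {
  if (key.length === 0) return 'API key missing or invalid';
  // Assumption of this sketch: with 14 or fewer characters the two slices
  // would cover the whole key, so nothing is shown.
  if (key.length <= 14) return 'API key configured: (too short to mask)';
  return `API key configured: ${key.slice(0, 10)}...${key.slice(-4)}`;
}

console.log(maskKey('sk-ant-0123456789abcdefWXYZ'));
```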
## 4. Build Container Image
Build the NanoClaw agent container:
@@ -45,15 +85,15 @@ Build the NanoClaw agent container:
./container/build.sh
```
This creates the `nanoclaw-agent:latest` image with Node.js, Chromium, and agent-browser.
This creates the `nanoclaw-agent:latest` image with Node.js, Chromium, Claude Code CLI, and agent-browser.
Verify the image was created:
Verify the build succeeded (the `container images` command may not work due to a plugin issue, so we verify by running a simple test):
```bash
container images | grep nanoclaw-agent || echo "Image not found"
echo '{}' | container run -i --entrypoint /bin/echo nanoclaw-agent:latest "Container OK" || echo "Container build failed"
```
## 4. WhatsApp Authentication
## 5. WhatsApp Authentication
**USER ACTION REQUIRED**
@@ -73,7 +113,7 @@ Wait for the script to output "Successfully authenticated" then continue.
If it says "Already authenticated", skip to the next step.
## 5. Configure Assistant Name
## 6. Configure Assistant Name
Ask the user:
> What trigger word do you want to use? (default: `Andy`)
@@ -82,7 +122,7 @@ Ask the user:
Store their choice - you'll use it when creating the registered_groups.json and when telling them how to test.
## 6. Register Main Channel
## 7. Register Main Channel
Ask the user:
> Do you want to use your **personal chat** (message yourself) or a **WhatsApp group** as your main control channel?
@@ -126,7 +166,7 @@ Ensure the groups folder exists:
mkdir -p groups/main/logs
```
## 7. Gmail Authentication (Optional)
## 8. Gmail Authentication (Optional)
Ask the user:
> Do you want to enable Gmail integration for reading/sending emails?
@@ -153,7 +193,7 @@ npx -y @gongrzhe/server-gmail-autoauth-mcp
This will open a browser for OAuth consent. After authorization, credentials are cached.
## 8. Configure launchd Service
## 9. Configure launchd Service
Get the actual paths:
@@ -212,7 +252,7 @@ Verify it's running:
launchctl list | grep nanoclaw
```
## 9. Test
## 10. Test
Tell the user (using the assistant name they configured):
> Send `@ASSISTANT_NAME hello` in your registered chat.

SPEC.md

@@ -54,7 +54,7 @@ A personal Claude assistant accessible via WhatsApp, with persistent memory per
│ │ Volume mounts: │ │
│ │ • groups/{name}/ → /workspace/group │ │
│ │ • groups/CLAUDE.md → /workspace/global/CLAUDE.md │ │
│ │ • ~/.claude/ → /root/.claude/ (sessions) │ │
│ │ • ~/.claude/ → /home/node/.claude/ (sessions) │ │
│ │ • Additional dirs → /workspace/extra/* │ │
│ │ │ │
│ │ Tools (all groups): │ │
@@ -76,7 +76,7 @@ A personal Claude assistant accessible via WhatsApp, with persistent memory per
| WhatsApp Connection | Node.js (@whiskeysockets/baileys) | Connect to WhatsApp, send/receive messages |
| Message Storage | SQLite (better-sqlite3) | Store messages for polling |
| Container Runtime | Apple Container | Isolated Linux VMs for agent execution |
| Agent | @anthropic-ai/claude-agent-sdk | Run Claude with tools and MCP servers |
| Agent | @anthropic-ai/claude-agent-sdk (0.2.29) | Run Claude with tools and MCP servers |
| Browser Automation | agent-browser + Chromium | Web interaction and screenshots |
| Runtime | Node.js 22+ | Host process for routing and scheduling |
@@ -104,7 +104,7 @@ nanoclaw/
│ └── container-runner.ts # Spawns agents in Apple Containers
├── container/
│ ├── Dockerfile # Container image definition
│ ├── Dockerfile # Container image (runs as 'node' user, includes Claude Code CLI)
│ ├── build.sh # Build script for container image
│ ├── agent-runner/ # Code that runs inside the container
│ │ ├── package.json
@@ -121,8 +121,10 @@ nanoclaw/
│ └── skills/
│ ├── setup/
│ │ └── SKILL.md # /setup skill
│ └── customize/
│ └── SKILL.md # /customize skill
│ ├── customize/
│ │ └── SKILL.md # /customize skill
│ └── debug/
│ └── SKILL.md # /debug skill (container debugging)
├── groups/
│ ├── CLAUDE.md # Global memory (all groups read this)
@@ -142,11 +144,14 @@ nanoclaw/
│ ├── sessions.json # Active session IDs per group
│ ├── archived_sessions.json # Old sessions after /clear
│ ├── registered_groups.json # Group JID → folder mapping
│ └── router_state.json # Last processed timestamp + last agent timestamps
│ ├── router_state.json # Last processed timestamp + last agent timestamps
│ ├── env/env # Copy of .env for container mounting
│ └── ipc/ # Container IPC (messages/, tasks/)
├── logs/ # Runtime logs (gitignored)
│ ├── nanoclaw.log # stdout
│ └── nanoclaw.error.log # stderr
│ ├── nanoclaw.log # Host stdout
│ └── nanoclaw.error.log # Host stderr
│ # Note: Per-container logs are in groups/{folder}/logs/container-*.log
└── launchd/
└── com.nanoclaw.plist # macOS service configuration
@@ -202,6 +207,18 @@ Groups can have additional directories mounted via `containerConfig` in `data/re
Additional mounts appear at `/workspace/extra/{containerPath}` inside the container.
**Apple Container mount syntax note:** Read-write mounts use `-v host:container`, but readonly mounts require `--mount "type=bind,source=...,target=...,readonly"` (the `:ro` suffix doesn't work).
### API Key Configuration
The Anthropic API key must be in a `.env` file in the project root:
```bash
ANTHROPIC_API_KEY=sk-ant-...
```
This file is automatically mounted into the container at `/workspace/env-dir/env` and sourced by the entrypoint script. This workaround is needed because Apple Container loses `-e` environment variables when using `-i` (interactive mode with piped stdin).
### Changing the Assistant Name
Set the `ASSISTANT_NAME` environment variable:
@@ -540,6 +557,7 @@ All agents run inside Apple Container (lightweight Linux VMs), providing:
- **Safe Bash access**: Commands run inside the container, not on your Mac
- **Network isolation**: Can be configured per-container if needed
- **Process isolation**: Container processes can't affect the host
- **Non-root user**: Container runs as unprivileged `node` user (uid 1000)
### Prompt Injection Risk
@@ -563,7 +581,7 @@ WhatsApp messages could contain malicious instructions attempting to manipulate
| Credential | Storage Location | Notes |
|------------|------------------|-------|
| Claude CLI Auth | ~/.claude/ | Managed by Claude Code CLI |
| Claude CLI Auth | ~/.claude/ | Mounted to /home/node/.claude/ in container |
| WhatsApp Session | store/auth/ | Auto-created, persists ~20 days |
| Gmail OAuth Tokens | ~/.gmail-mcp/ | Created during setup (optional) |


@@ -29,8 +29,8 @@ RUN apt-get update && apt-get install -y \
ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium
# Install agent-browser globally
RUN npm install -g agent-browser
# Install agent-browser and claude-code globally
RUN npm install -g agent-browser @anthropic-ai/claude-code
# Create app directory
WORKDIR /app
@@ -50,8 +50,18 @@ RUN npm run build
# Create workspace directories
RUN mkdir -p /workspace/group /workspace/global /workspace/extra /workspace/ipc/messages /workspace/ipc/tasks
# Create entrypoint script
# Sources env from mounted /workspace/env-dir/env if it exists (workaround for Apple Container -i bug)
RUN printf '#!/bin/bash\nset -e\n[ -f /workspace/env-dir/env ] && export $(cat /workspace/env-dir/env | xargs)\ncat > /tmp/input.json\nnode /app/dist/index.js < /tmp/input.json\n' > /app/entrypoint.sh && chmod +x /app/entrypoint.sh
# Set ownership to node user (non-root) for writable directories
RUN chown -R node:node /workspace
# Switch to non-root user (required for --dangerously-skip-permissions)
USER node
# Set working directory to group workspace
WORKDIR /workspace/group
# Entry point reads JSON from stdin, outputs JSON to stdout
ENTRYPOINT ["node", "/app/dist/index.js"]
ENTRYPOINT ["/app/entrypoint.sh"]


@@ -9,8 +9,8 @@
"start": "node dist/index.js"
},
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.1.9",
"zod": "^3.24.2"
"@anthropic-ai/claude-agent-sdk": "0.2.29",
"zod": "^4.0.0"
},
"devDependencies": {
"@types/node": "^22.10.7",


@@ -83,6 +83,7 @@ async function main(): Promise<void> {
'mcp__gmail__*'
],
permissionMode: 'bypassPermissions',
allowDangerouslySkipPermissions: true,
settingSources: ['project'],
mcpServers: {
nanoclaw: ipcMcp,


@@ -109,6 +109,20 @@ function buildVolumeMounts(group: RegisteredGroup, isMain: boolean): VolumeMount
readonly: false
});
// Environment file directory (workaround for Apple Container -i env var bug)
const envDir = path.join(DATA_DIR, 'env');
fs.mkdirSync(envDir, { recursive: true });
const envFile = path.join(projectRoot, '.env');
if (fs.existsSync(envFile)) {
// Copy .env to the env directory as a plain file called 'env'
fs.copyFileSync(envFile, path.join(envDir, 'env'));
mounts.push({
hostPath: envDir,
containerPath: '/workspace/env-dir',
readonly: true
});
}
// Additional mounts from group config
if (group.containerConfig?.additionalMounts) {
for (const mount of group.containerConfig.additionalMounts) {
@@ -136,9 +150,13 @@ function buildContainerArgs(mounts: VolumeMount[]): string[] {
const args: string[] = ['run', '-i', '--rm'];
// Add volume mounts
// Apple Container: use --mount for readonly, -v for read-write
for (const mount of mounts) {
const mode = mount.readonly ? ':ro' : '';
args.push('-v', `${mount.hostPath}:${mount.containerPath}${mode}`);
if (mount.readonly) {
args.push('--mount', `type=bind,source=${mount.hostPath},target=${mount.containerPath},readonly`);
} else {
args.push('-v', `${mount.hostPath}:${mount.containerPath}`);
}
}
// Add the image name
@@ -161,12 +179,23 @@ export async function runContainerAgent(
const mounts = buildVolumeMounts(group, input.isMain);
const containerArgs = buildContainerArgs(mounts);
// Log detailed mount info at debug level
logger.debug({
group: group.name,
mounts: mounts.map(m => `${m.hostPath} -> ${m.containerPath}${m.readonly ? ' (ro)' : ''}`),
containerArgs: containerArgs.join(' ')
}, 'Container mount configuration');
logger.info({
group: group.name,
mountCount: mounts.length,
isMain: input.isMain
}, 'Spawning container agent');
// Create logs directory for this group
const logsDir = path.join(GROUPS_DIR, group.folder, 'logs');
fs.mkdirSync(logsDir, { recursive: true });
return new Promise((resolve) => {
const container = spawn('container', containerArgs, {
stdio: ['pipe', 'pipe', 'pipe']
@@ -207,12 +236,42 @@ export async function runContainerAgent(
clearTimeout(timeout);
const duration = Date.now() - startTime;
// Always write a per-run log file (input, args, mounts, stderr, stdout) for debugging
const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
const logFile = path.join(logsDir, `container-${timestamp}.log`);
const logContent = [
`=== Container Run Log ===`,
`Timestamp: ${new Date().toISOString()}`,
`Group: ${group.name}`,
`IsMain: ${input.isMain}`,
`Duration: ${duration}ms`,
`Exit Code: ${code}`,
``,
`=== Input ===`,
JSON.stringify(input, null, 2),
``,
`=== Container Args ===`,
containerArgs.join(' '),
``,
`=== Mounts ===`,
mounts.map(m => `${m.hostPath} -> ${m.containerPath}${m.readonly ? ' (ro)' : ''}`).join('\n'),
``,
`=== Stderr ===`,
stderr,
``,
`=== Stdout ===`,
stdout
].join('\n');
fs.writeFileSync(logFile, logContent);
logger.debug({ logFile }, 'Container log written');
if (code !== 0) {
logger.error({
group: group.name,
code,
duration,
stderr: stderr.slice(-500)
stderr: stderr.slice(-500),
logFile
}, 'Container exited with error');
resolve({