Add containerized agent execution with Apple Container

- Agents run in isolated Linux VMs via Apple Container
- All groups get Bash access (safe - sandboxed in container)
- Browser automation via agent-browser + Chromium
- Per-group configurable additional directory mounts
- File-based IPC for messages and scheduled tasks
- Container image with Node.js 22, Chromium, agent-browser

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
gavrielc
2026-01-31 22:55:57 +02:00
parent fa13b14dae
commit 09c0e8142e
14 changed files with 1252 additions and 114 deletions


@@ -10,10 +10,12 @@ A personal Claude assistant accessible via WhatsApp, with minimal custom code.
**Core components:**
- **Claude Agent SDK** as the core agent
- **Apple Container** for isolated agent execution (Linux VMs)
- **WhatsApp** as the primary I/O channel
- **Persistent memory** per conversation and globally
- **Scheduled tasks** that run Claude and can message back
- **Web access** for search and browsing
- **Browser automation** via agent-browser
**Design philosophy:**
- Leverage existing tools (WhatsApp connector, Claude Agent SDK, MCP servers)
@@ -41,10 +43,17 @@ A personal Claude assistant accessible via WhatsApp, with minimal custom code.
- `/clear` command resets the session but keeps memory files
- Old session IDs are archived to a file
### Container Isolation
- All agents run inside Apple Container (lightweight Linux VMs)
- Each agent invocation spawns a container with mounted directories
- Containers provide filesystem isolation - agents can only see mounted paths
- Bash access is safe because commands run inside the container, not on the host
- Browser automation via agent-browser with Chromium in the container
### Scheduled Tasks
- Users can ask Claude to schedule recurring or one-time tasks from any group
- Tasks run as full agents in the context of the group that created them
- Tasks have access to the same tools as regular messages (except Bash)
- Tasks have access to all tools including Bash (safe in container)
- Tasks can optionally send messages to their group via `send_message` tool, or complete silently
- Task runs are logged to the database with duration and result
- Schedule types: cron expressions, intervals (ms), or one-time (ISO timestamp)
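As a rough illustration of how the three schedule types could map to a next-run time, here is a minimal sketch. The `computeNextRun` helper, its `ScheduleType` union, and the `alreadyRan` flag are hypothetical names, not taken from the NanoClaw source. Cron handling is left out on purpose because it would need a cron-parsing library.

```typescript
// Hypothetical helper for illustration only; not part of the NanoClaw source.
type ScheduleType = 'cron' | 'interval' | 'once';

function computeNextRun(
  type: ScheduleType,
  value: string,
  now: Date,
  alreadyRan: boolean = false
): Date | null {
  switch (type) {
    case 'interval': {
      // Interval values are milliseconds stored as a string
      const ms = Number(value);
      if (!Number.isFinite(ms) || ms <= 0) {
        throw new Error(`Invalid interval: ${value}`);
      }
      return new Date(now.getTime() + ms);
    }
    case 'once': {
      // A one-time task has no next run after it has executed
      if (alreadyRan) return null;
      const at = new Date(value);
      if (Number.isNaN(at.getTime())) {
        throw new Error(`Invalid ISO timestamp: ${value}`);
      }
      return at;
    }
    case 'cron':
      // Assumption: a real implementation would delegate to a cron parser here
      throw new Error('cron expressions need a parser library; omitted in this sketch');
  }
}
```

Passing `now` in explicitly, rather than reading the clock inside the helper, keeps the calculation easy to test.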
@@ -53,17 +62,16 @@ A personal Claude assistant accessible via WhatsApp, with minimal custom code.
### Group Management
- New groups are added explicitly via the main channel
- Main channel agent has Bash access to query the database and find group JIDs
- Groups are registered by editing `data/registered_groups.json`
- Each group gets a dedicated folder under `groups/`
- Groups can have additional directories mounted via `containerConfig`
### Main Channel Privileges
- Main channel is the admin/control group (typically self-chat)
- Has Bash access for system commands and database queries
- Can write to global memory (`groups/CLAUDE.md`)
- Can schedule tasks for any group
- Can view and manage tasks from all groups
- Other groups do NOT have Bash access (security measure)
- Can configure additional directory mounts for any group
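The "schedule tasks for any group" privilege comes down to one rule: only the main channel may name a different target group. A minimal sketch of that rule follows, mirroring the check in the `schedule_task` tool; the `resolveTargetGroup` function name is an assumption made for illustration.

```typescript
// Hypothetical helper name; the rule itself mirrors the schedule_task tool.
function resolveTargetGroup(
  isMain: boolean,
  requestedGroup: string | undefined,
  currentGroup: string
): string {
  // Non-main groups can only act on themselves, whatever they request
  return isMain && requestedGroup ? requestedGroup : currentGroup;
}

console.log(resolveTargetGroup(true, 'dev-team', 'main'));    // dev-team
console.log(resolveTargetGroup(false, 'dev-team', 'family')); // family
```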
---
@@ -79,17 +87,23 @@ A personal Claude assistant accessible via WhatsApp, with minimal custom code.
- Optional, enabled during setup
### Scheduler
- Built-in scheduler (not external MCP) - runs in-process
- Custom `nanoclaw` MCP server provides scheduling tools
- Tools: `schedule_task`, `list_tasks`, `get_task`, `update_task`, `pause_task`, `resume_task`, `cancel_task`, `send_message`
- Built-in scheduler runs on the host, spawns containers for task execution
- Custom `nanoclaw` MCP server (inside container) provides scheduling tools
- Tools: `schedule_task`, `list_tasks`, `pause_task`, `resume_task`, `cancel_task`, `send_message`
- Tasks stored in SQLite with run history
- Scheduler loop checks for due tasks every minute
- Tasks execute Claude Agent SDK in group context with full tool access
- Tasks execute Claude Agent SDK in containerized group context
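One way the per-minute due-task check could look is sketched below. The `TaskRow` shape borrows the `status` and `next_run` field names used elsewhere in this commit, but the `'active'` status value and the `selectDueTasks` helper are assumptions, not confirmed by the source.

```typescript
// Assumed row shape; 'active' as a status value is a guess for illustration.
interface TaskRow {
  id: string;
  status: string;
  next_run: string | null; // ISO timestamp, or null when nothing is pending
}

// Hypothetical helper: return the tasks the scheduler loop should run now.
function selectDueTasks(tasks: TaskRow[], now: Date): TaskRow[] {
  return tasks.filter(
    (t) =>
      t.status === 'active' &&
      t.next_run !== null &&
      new Date(t.next_run).getTime() <= now.getTime()
  );
}
```

In the real host process, this filter would more likely be a SQLite query than an in-memory pass.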
### Web Access
- Built-in WebSearch and WebFetch tools
- Standard Claude Agent SDK capabilities
### Browser Automation
- agent-browser CLI with Chromium in container
- Snapshot-based interaction with element references (@e1, @e2, etc.)
- Screenshots, PDFs, video recording
- Authentication state persistence
---
## Setup & Customization

124
SPEC.md

@@ -24,8 +24,8 @@ A personal Claude assistant accessible via WhatsApp, with persistent memory per
```
┌─────────────────────────────────────────────────────────────────────┐
NanoClaw
(Single Node.js Process) │
HOST (macOS)
(Main Node.js Process)
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌────────────────────┐ │
@@ -33,32 +33,37 @@ A personal Claude assistant accessible via WhatsApp, with persistent memory per
│ │ (baileys) │◀────────────────────│ (messages.db) │ │
│ └──────────────┘ store/send └─────────┬──────────┘ │
│ │ │
┌────────────────────────────────────────┘
│ │ │
│ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ ┌───────────────┐ │
│ │ Message Loop │ │ Scheduler Loop │ │ IPC Watcher │ │
│ │ (polls SQLite) │ │ (checks tasks) │ │ (file-based) │ │
│ └────────┬─────────┘ └────────┬─────────┘ └───────────────┘ │
│ │ │ │
│ └───────────┬───────────┘ │
│ │ spawns container │
│ ▼ │
├─────────────────────────────────────────────────────────────────────┤
│ APPLE CONTAINER (Linux VM) │
├─────────────────────────────────────────────────────────────────────┤
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ MESSAGE LOOP │ │
│ │ • Polls SQLite for new messages every 2 seconds │ │
│ │ • Filters: only registered groups, only trigger word │ │
│ │ • Loads session ID for conversation continuity │ │
│ │ • Invokes Claude Agent SDK in the group's directory │ │
│ │ • Sends response back to WhatsApp │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ CLAUDE AGENT SDK │ │
│ │ AGENT RUNNER │ │
│ │ │ │
│ │ Working directory: groups/{group-name}/ │ │
│ │ Context loaded: │ │
│ │ • ../CLAUDE.md (global memory) │ │
│ │ • ./CLAUDE.md (group-specific memory) │ │
│ │ Working directory: /workspace/group (mounted from host) │ │
│ │ Volume mounts: │ │
│ │ • groups/{name}/ → /workspace/group │ │
│ │ • groups/CLAUDE.md → /workspace/global/CLAUDE.md │ │
│ │ • ~/.claude/ → /root/.claude/ (sessions) │ │
│ │ • Additional dirs → /workspace/extra/* │ │
│ │ │ │
│ │ Available MCP Servers: │ │
│ │ • gmail-mcp (read/send email) │ │
│ │ • schedule-task-mcp (create cron jobs) │ │
│ │ │ │
│ │ Built-in Tools: │ │
│ │ Tools (all groups): │ │
│ │ • Bash (safe - sandboxed in container!) │ │
│ │ • Read, Write, Edit, Glob, Grep (file operations) │ │
│ │ • WebSearch, WebFetch (internet access) │ │
│ │ • Read, Write, Edit (file operations in group folder) │ │
│ │ • agent-browser (browser automation) │ │
│ │ • mcp__nanoclaw__* (scheduler tools via IPC) │ │
│ │ • mcp__gmail__* (email) │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────┘
@@ -70,8 +75,10 @@ A personal Claude assistant accessible via WhatsApp, with persistent memory per
|-----------|------------|---------|
| WhatsApp Connection | Node.js (@whiskeysockets/baileys) | Connect to WhatsApp, send/receive messages |
| Message Storage | SQLite (better-sqlite3) | Store messages for polling |
| Container Runtime | Apple Container | Isolated Linux VMs for agent execution |
| Agent | @anthropic-ai/claude-agent-sdk | Run Claude with tools and MCP servers |
| Runtime | Node.js 18+ | Single unified process |
| Browser Automation | agent-browser + Chromium | Web interaction and screenshots |
| Runtime | Node.js 22+ | Host process for routing and scheduling |
---
@@ -88,13 +95,25 @@ nanoclaw/
├── .gitignore
├── src/
│ ├── index.ts # Main application (WhatsApp + routing + agent)
│ ├── index.ts # Main application (WhatsApp + routing)
│ ├── config.ts # Configuration constants
│ ├── types.ts # TypeScript interfaces
│ ├── db.ts # Database initialization and queries
│ ├── auth.ts # Standalone WhatsApp authentication
│ ├── scheduler.ts # Scheduler loop (runs due tasks)
│ └── scheduler-mcp.ts # In-process MCP server for scheduling tools
│ └── container-runner.ts # Spawns agents in Apple Containers
├── container/
│ ├── Dockerfile # Container image definition
│ ├── build.sh # Build script for container image
│ ├── agent-runner/ # Code that runs inside the container
│ │ ├── package.json
│ │ ├── tsconfig.json
│ │ └── src/
│ │ ├── index.ts # Entry point (reads JSON, runs agent)
│ │ └── ipc-mcp.ts # MCP server for host communication
│ └── skills/
│ └── agent-browser.md # Browser automation skill
├── dist/ # Compiled JavaScript (gitignored)
@@ -142,14 +161,47 @@ Configuration constants are in `src/config.ts`:
```typescript
export const ASSISTANT_NAME = process.env.ASSISTANT_NAME || 'Andy';
export const POLL_INTERVAL = 2000;
export const SCHEDULER_POLL_INTERVAL = 60000;
export const STORE_DIR = './store';
export const GROUPS_DIR = './groups';
export const DATA_DIR = './data';
// Container configuration
export const CONTAINER_IMAGE = process.env.CONTAINER_IMAGE || 'nanoclaw-agent:latest';
export const CONTAINER_TIMEOUT = parseInt(process.env.CONTAINER_TIMEOUT || '300000', 10);
export const IPC_POLL_INTERVAL = 1000;
export const TRIGGER_PATTERN = new RegExp(`^@${ASSISTANT_NAME}\\b`, 'i');
export const CLEAR_COMMAND = '/clear';
```
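To make the behavior of `TRIGGER_PATTERN` concrete, the following sketch builds the same regular expression for a given name and tries it against a few messages. The `buildTriggerPattern` wrapper is a hypothetical helper added only for this illustration.

```typescript
// Hypothetical wrapper; the expression itself matches TRIGGER_PATTERN above.
function buildTriggerPattern(assistantName: string): RegExp {
  // Anchored at the start, case-insensitive, and ending on a word boundary
  return new RegExp(`^@${assistantName}\\b`, 'i');
}

const triggerPattern = buildTriggerPattern('Andy');

console.log(triggerPattern.test('@Andy what is on my calendar?')); // true
console.log(triggerPattern.test('@andy, hello'));                  // true
console.log(triggerPattern.test('hey @Andy'));                     // false (not at the start)
console.log(triggerPattern.test('@Andyx hi'));                     // false (no word boundary)
```

Because the name is interpolated as-is, an assistant name containing regex metacharacters would need escaping first.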
### Container Configuration
Groups can have additional directories mounted via `containerConfig` in `data/registered_groups.json`:
```json
{
"1234567890@g.us": {
"name": "Dev Team",
"folder": "dev-team",
"trigger": "@Andy",
"added_at": "2026-01-31T12:00:00Z",
"containerConfig": {
"additionalMounts": [
{
"hostPath": "/Users/gavriel/projects/webapp",
"containerPath": "webapp",
"readonly": false
}
],
"timeout": 600000
}
}
}
```
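Additional mounts appear at `/workspace/extra/{containerPath}` inside the container. A mount is read-only unless `readonly` is explicitly set to `false`, and a `hostPath` that does not exist on the host is skipped with a warning.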
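The mapping from one `additionalMounts` entry to a `-v` argument can be sketched as follows. This is a simplified illustration that follows the logic in `src/container-runner.ts`; the `toVolumeArg` function name is hypothetical, and `path.posix` is used here only so the result is the same on any platform.

```typescript
import * as path from 'node:path';

interface AdditionalMount {
  hostPath: string;
  containerPath: string;
  readonly?: boolean;
}

// Hypothetical helper: turn one config entry into a "host:container[:ro]" string.
function toVolumeArg(mount: AdditionalMount, homeDir: string): string {
  // Expand a leading "~" to the home directory
  const hostPath = mount.hostPath.startsWith('~')
    ? path.posix.join(homeDir, mount.hostPath.slice(1))
    : mount.hostPath;
  // Read-only unless the config explicitly says readonly: false
  const readonly = mount.readonly !== false;
  const target = `/workspace/extra/${mount.containerPath}`;
  return `${hostPath}:${target}${readonly ? ':ro' : ''}`;
}

console.log(
  toVolumeArg(
    { hostPath: '/Users/gavriel/projects/webapp', containerPath: 'webapp', readonly: false },
    '/Users/gavriel'
  )
); // /Users/gavriel/projects/webapp:/workspace/extra/webapp
```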
### Changing the Assistant Name
Set the `ASSISTANT_NAME` environment variable:
@@ -198,9 +250,9 @@ NanoClaw uses a hierarchical memory system based on CLAUDE.md files.
3. **Main Channel Privileges**
- Only the "main" group (self-chat) can write to global memory
- Main has **Bash access** for admin tasks (querying DB, system commands)
- Main can manage registered groups and schedule tasks for any group
- Other groups do NOT have Bash access (security measure)
- Main can configure additional directory mounts for any group
- All groups have Bash access (safe because it runs inside container)
---
@@ -481,19 +533,29 @@ tail -f logs/nanoclaw.log
## Security Considerations
### Container Isolation
All agents run inside Apple Container (lightweight Linux VMs), providing:
- **Filesystem isolation**: Agents can only access mounted directories
- **Safe Bash access**: Commands run inside the container, not on your Mac
- **Network isolation**: Can be configured per-container if needed
- **Process isolation**: Container processes can't affect the host
### Prompt Injection Risk
WhatsApp messages could contain malicious instructions attempting to manipulate Claude's behavior.
**Mitigations:**
- Container isolation limits blast radius
- Only registered groups are processed
- Trigger word required (reduces accidental processing)
- Main channel has elevated privileges (isolated from other groups)
- Regular groups do NOT have Bash access (only main does)
- Agents can only access their group's mounted directories
- Main can configure additional directories per group
- Claude's built-in safety training
**Recommendations:**
- Only register trusted groups
- Review additional directory mounts carefully
- Review scheduled tasks periodically
- Monitor logs for unusual activity
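Part of reviewing additional mounts is checking that a `containerPath` cannot escape `/workspace/extra` through `..` segments. The sketch below shows one possible check. It is not part of this commit, and the `isSafeContainerPath` name is an assumption.

```typescript
import * as path from 'node:path';

const EXTRA_ROOT = '/workspace/extra';

// Hypothetical validator: accept only relative paths that stay under EXTRA_ROOT.
function isSafeContainerPath(containerPath: string): boolean {
  if (containerPath.length === 0 || path.posix.isAbsolute(containerPath)) {
    return false;
  }
  // normalize() collapses "." and ".." segments so traversal becomes visible
  const resolved = path.posix.normalize(`${EXTRA_ROOT}/${containerPath}`);
  return resolved.startsWith(`${EXTRA_ROOT}/`);
}

console.log(isSafeContainerPath('webapp'));             // true
console.log(isSafeContainerPath('../ipc'));             // false
console.log(isSafeContainerPath('../../root/.claude')); // false
```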

57
container/Dockerfile Normal file

@@ -0,0 +1,57 @@
# NanoClaw Agent Container
# Runs Claude Agent SDK in isolated Linux VM with browser automation
FROM node:22-slim
# Install system dependencies for Chromium
RUN apt-get update && apt-get install -y \
chromium \
fonts-liberation \
fonts-noto-color-emoji \
libgbm1 \
libnss3 \
libatk-bridge2.0-0 \
libgtk-3-0 \
libx11-xcb1 \
libxcomposite1 \
libxdamage1 \
libxrandr2 \
libasound2 \
libpangocairo-1.0-0 \
libcups2 \
libdrm2 \
libxshmfence1 \
curl \
git \
&& rm -rf /var/lib/apt/lists/*
# Set Chromium path for agent-browser
ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
ENV PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH=/usr/bin/chromium
# Install agent-browser globally
RUN npm install -g agent-browser
# Create app directory
WORKDIR /app
# Copy package files first for better caching
COPY agent-runner/package*.json ./
# Install dependencies
RUN npm install
# Copy source code
COPY agent-runner/ ./
# Build TypeScript
RUN npm run build
# Create workspace directories
RUN mkdir -p /workspace/group /workspace/global /workspace/extra /workspace/ipc/messages /workspace/ipc/tasks
# Set working directory to group workspace
WORKDIR /workspace/group
# Entry point reads JSON from stdin, outputs JSON to stdout
ENTRYPOINT ["node", "/app/dist/index.js"]


@@ -0,0 +1,19 @@
{
"name": "nanoclaw-agent-runner",
"version": "1.0.0",
"type": "module",
"description": "Container-side agent runner for NanoClaw",
"main": "dist/index.js",
"scripts": {
"build": "tsc",
"start": "node dist/index.js"
},
"dependencies": {
"@anthropic-ai/claude-agent-sdk": "^0.1.9",
"zod": "^3.24.2"
},
"devDependencies": {
"@types/node": "^22.10.7",
"typescript": "^5.7.3"
}
}


@@ -0,0 +1,124 @@
/**
* NanoClaw Agent Runner
* Runs inside a container, receives config via stdin, outputs result to stdout
*/
import { query } from '@anthropic-ai/claude-agent-sdk';
import { createIpcMcp } from './ipc-mcp.js';
interface ContainerInput {
prompt: string;
sessionId?: string;
groupFolder: string;
chatJid: string;
isMain: boolean;
}
interface ContainerOutput {
status: 'success' | 'error';
result: string | null;
newSessionId?: string;
error?: string;
}
async function readStdin(): Promise<string> {
return new Promise((resolve, reject) => {
let data = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', chunk => { data += chunk; });
process.stdin.on('end', () => resolve(data));
process.stdin.on('error', reject);
});
}
function writeOutput(output: ContainerOutput): void {
// Write to stdout as JSON (this is how the host process receives results)
console.log(JSON.stringify(output));
}
function log(message: string): void {
// Write logs to stderr so they don't interfere with JSON output
console.error(`[agent-runner] ${message}`);
}
async function main(): Promise<void> {
let input: ContainerInput;
try {
const stdinData = await readStdin();
input = JSON.parse(stdinData);
log(`Received input for group: ${input.groupFolder}`);
} catch (err) {
writeOutput({
status: 'error',
result: null,
error: `Failed to parse input: ${err instanceof Error ? err.message : String(err)}`
});
process.exit(1);
}
// Create IPC-based MCP for communicating back to host
const ipcMcp = createIpcMcp({
chatJid: input.chatJid,
groupFolder: input.groupFolder,
isMain: input.isMain
});
let result: string | null = null;
let newSessionId: string | undefined;
try {
log('Starting agent...');
for await (const message of query({
prompt: input.prompt,
options: {
cwd: '/workspace/group',
resume: input.sessionId,
allowedTools: [
'Bash', // Safe - sandboxed in container!
'Read', 'Write', 'Edit', 'Glob', 'Grep',
'WebSearch', 'WebFetch',
'mcp__nanoclaw__*',
'mcp__gmail__*'
],
permissionMode: 'bypassPermissions',
settingSources: ['project'],
mcpServers: {
nanoclaw: ipcMcp,
gmail: { command: 'npx', args: ['-y', '@gongrzhe/server-gmail-autoauth-mcp'] }
}
}
})) {
// Capture session ID from init message
if (message.type === 'system' && message.subtype === 'init') {
newSessionId = message.session_id;
log(`Session initialized: ${newSessionId}`);
}
// Capture final result
if ('result' in message && message.result) {
result = message.result as string;
}
}
log('Agent completed successfully');
writeOutput({
status: 'success',
result,
newSessionId
});
} catch (err) {
log(`Agent error: ${err instanceof Error ? err.message : String(err)}`);
writeOutput({
status: 'error',
result: null,
newSessionId,
error: err instanceof Error ? err.message : String(err)
});
process.exit(1);
}
}
main();


@@ -0,0 +1,245 @@
/**
* IPC-based MCP Server for NanoClaw
* Writes messages and tasks to files for the host process to pick up
*/
import { createSdkMcpServer, tool } from '@anthropic-ai/claude-agent-sdk';
import { z } from 'zod';
import fs from 'fs';
import path from 'path';
const IPC_DIR = '/workspace/ipc';
const MESSAGES_DIR = path.join(IPC_DIR, 'messages');
const TASKS_DIR = path.join(IPC_DIR, 'tasks');
export interface IpcMcpContext {
chatJid: string;
groupFolder: string;
isMain: boolean;
}
function writeIpcFile(dir: string, data: object): string {
// Ensure directory exists
fs.mkdirSync(dir, { recursive: true });
// Use timestamp + random suffix for unique filename
const filename = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}.json`;
const filepath = path.join(dir, filename);
// Write atomically: write to temp file, then rename
const tempPath = `${filepath}.tmp`;
fs.writeFileSync(tempPath, JSON.stringify(data, null, 2));
fs.renameSync(tempPath, filepath);
return filename;
}
export function createIpcMcp(ctx: IpcMcpContext) {
const { chatJid, groupFolder, isMain } = ctx;
return createSdkMcpServer({
name: 'nanoclaw',
version: '1.0.0',
tools: [
// Send a message to the WhatsApp group
tool(
'send_message',
'Send a message to the current WhatsApp group. Use this to proactively share information or updates.',
{
text: z.string().describe('The message text to send')
},
async (args) => {
const data = {
type: 'message',
chatJid,
text: args.text,
groupFolder,
timestamp: new Date().toISOString()
};
const filename = writeIpcFile(MESSAGES_DIR, data);
return {
content: [{
type: 'text',
text: `Message queued for delivery (${filename})`
}]
};
}
),
// Schedule a new task
tool(
'schedule_task',
'Schedule a recurring or one-time task. The task will run as a full agent with access to all tools.',
{
prompt: z.string().describe('What the agent should do when the task runs'),
schedule_type: z.enum(['cron', 'interval', 'once']).describe('Type of schedule'),
schedule_value: z.string().describe('Cron expression, interval in ms, or ISO timestamp'),
target_group: z.string().optional().describe('Target group folder (main only, defaults to current group)')
},
async (args) => {
// Non-main groups can only schedule for themselves
const targetGroup = isMain && args.target_group ? args.target_group : groupFolder;
const data = {
type: 'schedule_task',
prompt: args.prompt,
schedule_type: args.schedule_type,
schedule_value: args.schedule_value,
groupFolder: targetGroup,
chatJid,
createdBy: groupFolder,
timestamp: new Date().toISOString()
};
const filename = writeIpcFile(TASKS_DIR, data);
return {
content: [{
type: 'text',
text: `Task scheduled (${filename}): ${args.schedule_type} - ${args.schedule_value}`
}]
};
}
),
// List tasks (reads from a mounted file that host keeps updated)
tool(
'list_tasks',
'List all scheduled tasks. From main: shows all tasks. From other groups: shows only that group\'s tasks.',
{},
async () => {
// Host process writes current tasks to this file
const tasksFile = path.join(IPC_DIR, 'current_tasks.json');
try {
if (!fs.existsSync(tasksFile)) {
return {
content: [{
type: 'text',
text: 'No scheduled tasks found.'
}]
};
}
const allTasks = JSON.parse(fs.readFileSync(tasksFile, 'utf-8'));
// Filter to current group unless main
const tasks = isMain
? allTasks
: allTasks.filter((t: { groupFolder: string }) => t.groupFolder === groupFolder);
if (tasks.length === 0) {
return {
content: [{
type: 'text',
text: 'No scheduled tasks found.'
}]
};
}
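// next_run may be null (see writeTasksSnapshot on the host); only add "..." when the prompt is actually truncated
const formatted = tasks.map((t: { id: string; prompt: string; schedule_type: string; schedule_value: string; status: string; next_run: string | null }) =>
`- [${t.id}] ${t.prompt.length > 50 ? `${t.prompt.slice(0, 50)}...` : t.prompt} (${t.schedule_type}: ${t.schedule_value}) - ${t.status}, next: ${t.next_run || 'N/A'}`
).join('\n');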
return {
content: [{
type: 'text',
text: `Scheduled tasks:\n${formatted}`
}]
};
} catch (err) {
return {
content: [{
type: 'text',
text: `Error reading tasks: ${err instanceof Error ? err.message : String(err)}`
}]
};
}
}
),
// Pause a task
tool(
'pause_task',
'Pause a scheduled task. It will not run until resumed.',
{
task_id: z.string().describe('The task ID to pause')
},
async (args) => {
const data = {
type: 'pause_task',
taskId: args.task_id,
groupFolder,
isMain,
timestamp: new Date().toISOString()
};
writeIpcFile(TASKS_DIR, data);
return {
content: [{
type: 'text',
text: `Task ${args.task_id} pause requested.`
}]
};
}
),
// Resume a task
tool(
'resume_task',
'Resume a paused task.',
{
task_id: z.string().describe('The task ID to resume')
},
async (args) => {
const data = {
type: 'resume_task',
taskId: args.task_id,
groupFolder,
isMain,
timestamp: new Date().toISOString()
};
writeIpcFile(TASKS_DIR, data);
return {
content: [{
type: 'text',
text: `Task ${args.task_id} resume requested.`
}]
};
}
),
// Cancel a task
tool(
'cancel_task',
'Cancel and delete a scheduled task.',
{
task_id: z.string().describe('The task ID to cancel')
},
async (args) => {
const data = {
type: 'cancel_task',
taskId: args.task_id,
groupFolder,
isMain,
timestamp: new Date().toISOString()
};
writeIpcFile(TASKS_DIR, data);
return {
content: [{
type: 'text',
text: `Task ${args.task_id} cancellation requested.`
}]
};
}
)
]
});
}


@@ -0,0 +1,15 @@
{
"compilerOptions": {
"target": "ES2022",
"module": "NodeNext",
"moduleResolution": "NodeNext",
"outDir": "./dist",
"rootDir": "./src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"declaration": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist"]
}

23
container/build.sh Executable file

@@ -0,0 +1,23 @@
#!/bin/bash
# Build the NanoClaw agent container image
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"
IMAGE_NAME="nanoclaw-agent"
TAG="${1:-latest}"
echo "Building NanoClaw agent container image..."
echo "Image: ${IMAGE_NAME}:${TAG}"
# Build with Apple Container
container build -t "${IMAGE_NAME}:${TAG}" .
echo ""
echo "Build complete!"
echo "Image: ${IMAGE_NAME}:${TAG}"
echo ""
echo "Test with:"
echo " echo '{\"prompt\":\"What is 2+2?\",\"groupFolder\":\"test\",\"chatJid\":\"test@g.us\",\"isMain\":false}' | container run -i ${IMAGE_NAME}:${TAG}"


@@ -0,0 +1,159 @@
---
name: agent-browser
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
allowed-tools: Bash(agent-browser:*)
---
# Browser Automation with agent-browser
## Quick start
```bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
```
## Core workflow
1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
```bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
```
### Snapshot (page analysis)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
```
### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown option
agent-browser scroll down 500 # Scroll page
agent-browser upload @e1 file.pdf # Upload files
```
### Get information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
```
### Screenshots & PDF
```bash
agent-browser screenshot # Save to temp directory
agent-browser screenshot path.png # Save to specific path
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
### Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "**/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
```
### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find placeholder "Search" type "query"
```
### Authentication with saved state
```bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "**/dashboard"
agent-browser state save auth.json
# Later: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
### Cookies & Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get localStorage
agent-browser storage local set k v # Set value
```
### JavaScript
```bash
agent-browser eval "document.title" # Run JavaScript
```
## Example: Form submission
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
```
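When a script rather than the agent consumes the snapshot, the refs can be pulled out of the text. The sketch below assumes each element is printed as `role "name" [ref=eN]`, as in the comment in the example above. The real `agent-browser` output may differ, and `parseSnapshotRefs` is a hypothetical helper.

```typescript
interface SnapshotRef {
  role: string;
  name: string;
  ref: string; // in the "@e1" form that agent-browser commands accept
}

// Hypothetical parser; the line format is an assumption based on the example comment.
function parseSnapshotRefs(output: string): SnapshotRef[] {
  const pattern = /([A-Za-z]+) "([^"]*)" \[ref=(e\d+)\]/g;
  const refs: SnapshotRef[] = [];
  for (const match of output.matchAll(pattern)) {
    refs.push({ role: match[1], name: match[2], ref: `@${match[3]}` });
  }
  return refs;
}

const sample = 'textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]';
console.log(parseSnapshotRefs(sample).map((r) => `${r.role}:${r.ref}`).join(' '));
// textbox:@e1 textbox:@e2 button:@e3
```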
## Example: Data extraction
```bash
agent-browser open https://example.com/products
agent-browser snapshot -i
agent-browser get text @e1 # Get product title
agent-browser get attr @e2 href # Get link URL
agent-browser screenshot products.png
```


@@ -6,5 +6,10 @@ export const GROUPS_DIR = './groups';
export const DATA_DIR = './data';
export const MAIN_GROUP_FOLDER = 'main';
// Container configuration
export const CONTAINER_IMAGE = process.env.CONTAINER_IMAGE || 'nanoclaw-agent:latest';
export const CONTAINER_TIMEOUT = parseInt(process.env.CONTAINER_TIMEOUT || '300000', 10); // 5 minutes default
export const IPC_POLL_INTERVAL = 1000; // Check IPC directories every second
export const TRIGGER_PATTERN = new RegExp(`^@${ASSISTANT_NAME}\\b`, 'i');
export const CLEAR_COMMAND = '/clear';

265
src/container-runner.ts Normal file

@@ -0,0 +1,265 @@
/**
* Container Runner for NanoClaw
* Spawns agent execution in Apple Container and handles IPC
*/
import { spawn } from 'child_process';
import fs from 'fs';
import path from 'path';
import pino from 'pino';
import {
CONTAINER_IMAGE,
CONTAINER_TIMEOUT,
GROUPS_DIR,
DATA_DIR
} from './config.js';
import { RegisteredGroup } from './types.js';
const logger = pino({
level: process.env.LOG_LEVEL || 'info',
transport: { target: 'pino-pretty', options: { colorize: true } }
});
export interface ContainerInput {
prompt: string;
sessionId?: string;
groupFolder: string;
chatJid: string;
isMain: boolean;
}
export interface ContainerOutput {
status: 'success' | 'error';
result: string | null;
newSessionId?: string;
error?: string;
}
interface VolumeMount {
hostPath: string;
containerPath: string;
readonly?: boolean;
}
function buildVolumeMounts(group: RegisteredGroup, isMain: boolean): VolumeMount[] {
const mounts: VolumeMount[] = [];
const homeDir = process.env.HOME || '/Users/gavriel';
// Group's working directory (read-write)
mounts.push({
hostPath: path.join(GROUPS_DIR, group.folder),
containerPath: '/workspace/group',
readonly: false
});
// Global CLAUDE.md (read-only for non-main, read-write for main)
const globalClaudeMd = path.join(GROUPS_DIR, 'CLAUDE.md');
if (fs.existsSync(globalClaudeMd)) {
mounts.push({
hostPath: globalClaudeMd,
containerPath: '/workspace/global/CLAUDE.md',
readonly: !isMain
});
}
// Claude sessions directory (for session persistence)
const claudeDir = path.join(homeDir, '.claude');
if (fs.existsSync(claudeDir)) {
mounts.push({
hostPath: claudeDir,
containerPath: '/root/.claude',
readonly: false
});
}
// Gmail MCP credentials
const gmailDir = path.join(homeDir, '.gmail-mcp');
if (fs.existsSync(gmailDir)) {
mounts.push({
hostPath: gmailDir,
containerPath: '/root/.gmail-mcp',
readonly: false
});
}
// IPC directory for messages and tasks
const ipcDir = path.join(DATA_DIR, 'ipc');
fs.mkdirSync(path.join(ipcDir, 'messages'), { recursive: true });
fs.mkdirSync(path.join(ipcDir, 'tasks'), { recursive: true });
mounts.push({
hostPath: ipcDir,
containerPath: '/workspace/ipc',
readonly: false
});
// Additional mounts from group config
if (group.containerConfig?.additionalMounts) {
for (const mount of group.containerConfig.additionalMounts) {
// Resolve home directory in path
const hostPath = mount.hostPath.startsWith('~')
? path.join(homeDir, mount.hostPath.slice(1))
: mount.hostPath;
if (fs.existsSync(hostPath)) {
mounts.push({
hostPath,
containerPath: `/workspace/extra/${mount.containerPath}`,
readonly: mount.readonly !== false // Default to readonly for safety
});
} else {
logger.warn({ hostPath }, 'Additional mount path does not exist, skipping');
}
}
}
return mounts;
}
function buildContainerArgs(mounts: VolumeMount[]): string[] {
const args: string[] = ['run', '-i', '--rm'];
// Add volume mounts
for (const mount of mounts) {
const mode = mount.readonly ? ':ro' : '';
args.push('-v', `${mount.hostPath}:${mount.containerPath}${mode}`);
}
// Add the image name
args.push(CONTAINER_IMAGE);
return args;
}
export async function runContainerAgent(
group: RegisteredGroup,
input: ContainerInput
): Promise<ContainerOutput> {
const startTime = Date.now();
// Ensure group directory exists
const groupDir = path.join(GROUPS_DIR, group.folder);
fs.mkdirSync(groupDir, { recursive: true });
// Build volume mounts
const mounts = buildVolumeMounts(group, input.isMain);
const containerArgs = buildContainerArgs(mounts);
logger.info({
group: group.name,
mountCount: mounts.length,
isMain: input.isMain
}, 'Spawning container agent');
return new Promise((resolve) => {
const container = spawn('container', containerArgs, {
stdio: ['pipe', 'pipe', 'pipe']
});
let stdout = '';
let stderr = '';
// Send input JSON to container stdin
container.stdin.write(JSON.stringify(input));
container.stdin.end();
container.stdout.on('data', (data) => {
stdout += data.toString();
});
container.stderr.on('data', (data) => {
stderr += data.toString();
// Log container stderr in real-time
const lines = data.toString().trim().split('\n');
for (const line of lines) {
if (line) logger.debug({ container: group.folder }, line);
}
});
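// Timeout handler (a per-group timeout overrides the global default)
const timeoutMs = group.containerConfig?.timeout || CONTAINER_TIMEOUT;
const timeout = setTimeout(() => {
logger.error({ group: group.name, timeoutMs }, 'Container timeout, killing');
container.kill('SIGKILL');
resolve({
status: 'error',
result: null,
// Report the timeout that was actually applied, not the global default
error: `Container timed out after ${timeoutMs}ms`
});
}, timeoutMs);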
container.on('close', (code) => {
clearTimeout(timeout);
const duration = Date.now() - startTime;
if (code !== 0) {
logger.error({
group: group.name,
code,
duration,
stderr: stderr.slice(-500)
}, 'Container exited with error');
resolve({
status: 'error',
result: null,
error: `Container exited with code ${code}: ${stderr.slice(-200)}`
});
return;
}
// Parse JSON output from stdout
try {
// Find the JSON line (last non-empty line should be the output)
const lines = stdout.trim().split('\n');
const jsonLine = lines[lines.length - 1];
const output: ContainerOutput = JSON.parse(jsonLine);
logger.info({
group: group.name,
duration,
status: output.status,
hasResult: !!output.result
}, 'Container completed');
resolve(output);
} catch (err) {
logger.error({
group: group.name,
stdout: stdout.slice(-500),
error: err
}, 'Failed to parse container output');
resolve({
status: 'error',
result: null,
error: `Failed to parse container output: ${err instanceof Error ? err.message : String(err)}`
});
}
});
container.on('error', (err) => {
clearTimeout(timeout);
logger.error({ group: group.name, error: err }, 'Container spawn error');
resolve({
status: 'error',
result: null,
error: `Container spawn error: ${err.message}`
});
});
});
}
// Export task snapshot for container IPC
export function writeTasksSnapshot(tasks: Array<{
id: string;
groupFolder: string;
prompt: string;
schedule_type: string;
schedule_value: string;
status: string;
next_run: string | null;
}>): void {
const ipcDir = path.join(DATA_DIR, 'ipc');
fs.mkdirSync(ipcDir, { recursive: true });
const tasksFile = path.join(ipcDir, 'current_tasks.json');
fs.writeFileSync(tasksFile, JSON.stringify(tasks, null, 2));
}
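
The runner above treats the last line of the container's stdout as the JSON result and converts every failure into an error-shaped value instead of throwing. A minimal sketch of that parsing step as a standalone helper follows. It assumes a simplified `ContainerOutput` shape (the project's real type is not shown in this diff, and the `'success'` status value is a guess), and `parseLastJsonLine` is a hypothetical name, not part of the commit.

```typescript
// Hypothetical, simplified result shape; the project's real ContainerOutput may differ.
interface ContainerOutput {
  status: 'success' | 'error'; // 'success' is an assumed value; only 'error' appears in the diff
  result: string | null;
  newSessionId?: string;
  error?: string;
}

// Parse the final non-empty stdout line as JSON. Any parse failure becomes an
// error-shaped result rather than an exception, matching the runner's style.
function parseLastJsonLine(stdout: string): ContainerOutput {
  const lines = stdout.trim().split('\n');
  const lastLine = lines[lines.length - 1];
  try {
    return JSON.parse(lastLine) as ContainerOutput;
  } catch (err) {
    return {
      status: 'error',
      result: null,
      error: `Failed to parse container output: ${err instanceof Error ? err.message : String(err)}`
    };
  }
}
```

Keeping the parse in a pure function like this makes the "log lines first, JSON last" convention easy to unit test without spawning a container.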

@@ -4,7 +4,6 @@ import makeWASocket, {
makeCacheableSignalKeyStore,
WASocket
} from '@whiskeysockets/baileys';
import { query } from '@anthropic-ai/claude-agent-sdk';
import pino from 'pino';
import { exec } from 'child_process';
import fs from 'fs';
@@ -18,12 +17,13 @@ import {
DATA_DIR,
TRIGGER_PATTERN,
CLEAR_COMMAND,
MAIN_GROUP_FOLDER
MAIN_GROUP_FOLDER,
IPC_POLL_INTERVAL
} from './config.js';
import { RegisteredGroup, Session, NewMessage } from './types.js';
import { initDatabase, storeMessage, getNewMessages, getMessagesSince } from './db.js';
import { createSchedulerMcp } from './scheduler-mcp.js';
import { initDatabase, storeMessage, getNewMessages, getMessagesSince, getAllTasks } from './db.js';
import { startSchedulerLoop } from './scheduler.js';
import { runContainerAgent, writeTasksSnapshot } from './container-runner.js';
const logger = pino({
level: process.env.LOG_LEVEL || 'info',
@@ -118,59 +118,46 @@ async function processMessage(msg: NewMessage): Promise<void> {
}
async function runAgent(group: RegisteredGroup, prompt: string, chatJid: string): Promise<string | null> {
const groupDir = path.join(GROUPS_DIR, group.folder);
fs.mkdirSync(groupDir, { recursive: true });
const isMain = group.folder === MAIN_GROUP_FOLDER;
const sessionId = sessions[group.folder];
let newSessionId: string | undefined;
let result: string | null = null;
// Create scheduler MCP with current group context
const schedulerMcp = createSchedulerMcp({
groupFolder: group.folder,
chatJid,
isMain,
sendMessage
});
// Main channel gets Bash access for admin tasks (querying DB, etc.)
const baseTools = ['Read', 'Write', 'Edit', 'Glob', 'Grep', 'WebSearch', 'WebFetch', 'mcp__nanoclaw__*', 'mcp__gmail__*'];
const allowedTools = isMain ? [...baseTools, 'Bash'] : baseTools;
// Update tasks snapshot for container to read
const tasks = getAllTasks();
writeTasksSnapshot(tasks.map(t => ({
id: t.id,
groupFolder: t.group_folder,
prompt: t.prompt,
schedule_type: t.schedule_type,
schedule_value: t.schedule_value,
status: t.status,
next_run: t.next_run
})));
try {
for await (const message of query({
const output = await runContainerAgent(group, {
prompt,
options: {
cwd: groupDir,
resume: sessionId,
allowedTools,
permissionMode: 'bypassPermissions',
settingSources: ['project'],
mcpServers: {
nanoclaw: schedulerMcp,
gmail: { command: 'npx', args: ['-y', '@gongrzhe/server-gmail-autoauth-mcp'] }
}
}
})) {
if (message.type === 'system' && message.subtype === 'init') {
newSessionId = message.session_id;
}
if ('result' in message && message.result) {
result = message.result as string;
}
sessionId,
groupFolder: group.folder,
chatJid,
isMain
});
// Update session if changed
if (output.newSessionId) {
sessions[group.folder] = output.newSessionId;
saveJson(path.join(DATA_DIR, 'sessions.json'), sessions);
}
if (output.status === 'error') {
logger.error({ group: group.name, error: output.error }, 'Container agent error');
return null;
}
return output.result;
} catch (err) {
logger.error({ group: group.name, err }, 'Agent error');
return null;
}
if (newSessionId) {
sessions[group.folder] = newSessionId;
saveJson(path.join(DATA_DIR, 'sessions.json'), sessions);
}
return result;
}
async function sendMessage(jid: string, text: string): Promise<void> {
@@ -182,6 +169,139 @@ async function sendMessage(jid: string, text: string): Promise<void> {
}
}
// IPC watcher for container messages and tasks
function startIpcWatcher(): void {
const messagesDir = path.join(DATA_DIR, 'ipc', 'messages');
const tasksDir = path.join(DATA_DIR, 'ipc', 'tasks');
fs.mkdirSync(messagesDir, { recursive: true });
fs.mkdirSync(tasksDir, { recursive: true });
const processIpcFiles = async () => {
// Process pending messages
try {
const messageFiles = fs.readdirSync(messagesDir).filter(f => f.endsWith('.json'));
for (const file of messageFiles) {
const filePath = path.join(messagesDir, file);
try {
const data = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
if (data.type === 'message' && data.chatJid && data.text) {
await sendMessage(data.chatJid, `${ASSISTANT_NAME}: ${data.text}`);
logger.info({ chatJid: data.chatJid }, 'IPC message sent');
}
fs.unlinkSync(filePath);
} catch (err) {
logger.error({ file, err }, 'Error processing IPC message');
// Move to error directory instead of deleting
const errorDir = path.join(DATA_DIR, 'ipc', 'errors');
fs.mkdirSync(errorDir, { recursive: true });
fs.renameSync(filePath, path.join(errorDir, file));
}
}
} catch (err) {
logger.error({ err }, 'Error reading IPC messages directory');
}
// Process pending task operations
try {
const taskFiles = fs.readdirSync(tasksDir).filter(f => f.endsWith('.json'));
for (const file of taskFiles) {
const filePath = path.join(tasksDir, file);
try {
const data = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
await processTaskIpc(data);
fs.unlinkSync(filePath);
} catch (err) {
logger.error({ file, err }, 'Error processing IPC task');
const errorDir = path.join(DATA_DIR, 'ipc', 'errors');
fs.mkdirSync(errorDir, { recursive: true });
fs.renameSync(filePath, path.join(errorDir, file));
}
}
} catch (err) {
logger.error({ err }, 'Error reading IPC tasks directory');
}
setTimeout(processIpcFiles, IPC_POLL_INTERVAL);
};
processIpcFiles();
logger.info('IPC watcher started');
}
async function processTaskIpc(data: {
type: string;
taskId?: string;
prompt?: string;
schedule_type?: string;
schedule_value?: string;
groupFolder?: string;
chatJid?: string;
isMain?: boolean;
}): Promise<void> {
// Import db functions dynamically to avoid circular deps
const { createTask, updateTask, deleteTask } = await import('./db.js');
const { CronExpressionParser } = await import('cron-parser');
switch (data.type) {
case 'schedule_task':
if (data.prompt && data.schedule_type && data.schedule_value && data.groupFolder && data.chatJid) {
const scheduleType = data.schedule_type as 'cron' | 'interval' | 'once';
// Calculate next run time
let nextRun: string | null = null;
if (scheduleType === 'cron') {
const interval = CronExpressionParser.parse(data.schedule_value);
nextRun = interval.next().toISOString();
} else if (scheduleType === 'interval') {
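const ms = parseInt(data.schedule_value, 10);
// Reject non-numeric or non-positive intervals; the caller moves the IPC file to the errors directory
if (!Number.isFinite(ms) || ms <= 0) throw new Error(`Invalid interval: ${data.schedule_value}`);
nextRun = new Date(Date.now() + ms).toISOString();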
} else if (scheduleType === 'once') {
nextRun = data.schedule_value; // ISO timestamp
}
const taskId = `task-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
createTask({
id: taskId,
group_folder: data.groupFolder,
chat_jid: data.chatJid,
prompt: data.prompt,
schedule_type: scheduleType,
schedule_value: data.schedule_value,
next_run: nextRun,
status: 'active',
created_at: new Date().toISOString()
});
logger.info({ taskId, groupFolder: data.groupFolder }, 'Task created via IPC');
}
break;
case 'pause_task':
if (data.taskId) {
updateTask(data.taskId, { status: 'paused' });
logger.info({ taskId: data.taskId }, 'Task paused via IPC');
}
break;
case 'resume_task':
if (data.taskId) {
updateTask(data.taskId, { status: 'active' });
logger.info({ taskId: data.taskId }, 'Task resumed via IPC');
}
break;
case 'cancel_task':
if (data.taskId) {
deleteTask(data.taskId);
logger.info({ taskId: data.taskId }, 'Task cancelled via IPC');
}
break;
default:
logger.warn({ type: data.type }, 'Unknown IPC task type');
}
}
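
The `schedule_task` branch above derives `next_run` from the schedule type. A minimal sketch of the `interval` and `once` cases as a pure function with input validation follows. `computeNextRun` is a hypothetical helper, not part of the commit, and the `cron` case is left out so the example stays free of third-party dependencies.

```typescript
type ScheduleType = 'interval' | 'once';

// Hypothetical helper: compute the next run time as an ISO 8601 string.
// `now` is injectable so the function is deterministic in tests.
function computeNextRun(type: ScheduleType, value: string, now: Date = new Date()): string {
  if (type === 'interval') {
    // Interval values are interpreted as milliseconds, as in the diff above.
    const ms = Number.parseInt(value, 10);
    if (!Number.isFinite(ms) || ms <= 0) {
      throw new Error(`Invalid interval (expected positive milliseconds): ${value}`);
    }
    return new Date(now.getTime() + ms).toISOString();
  }
  // 'once': the value should be a parseable timestamp; normalise it to UTC ISO form.
  const at = new Date(value);
  if (Number.isNaN(at.getTime())) {
    throw new Error(`Invalid timestamp: ${value}`);
  }
  return at.toISOString();
}
```

Validating up front gives a readable error message when a malformed value arrives over IPC, rather than a generic date failure.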
async function connectWhatsApp(): Promise<void> {
const authDir = path.join(STORE_DIR, 'auth');
fs.mkdirSync(authDir, { recursive: true });
@@ -219,7 +339,8 @@ async function connectWhatsApp(): Promise<void> {
}
} else if (connection === 'open') {
logger.info('Connected to WhatsApp');
startSchedulerLoop({ sendMessage });
startSchedulerLoop({ sendMessage, registeredGroups: () => registeredGroups });
startIpcWatcher();
startMessageLoop();
}
});
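
The IPC watcher above polls a directory, picks up only files ending in `.json`, and deletes each file after processing it. The writer side is not shown in this diff. One way a writer could avoid exposing half-written files to the poller is to write under a temporary name and then rename. The sketch below assumes that approach; `writeIpcMessage` and the `.tmp` suffix are hypothetical and are not taken from the commit.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Fields the watcher checks before sending: type === 'message', chatJid, text.
interface IpcMessage {
  type: 'message';
  chatJid: string;
  text: string;
}

// Hypothetical writer: write "<name>.json.tmp", then rename to "<name>.json".
// A rename within one directory is atomic on POSIX filesystems, so a poller
// that filters on the ".json" suffix never reads a partially written file.
function writeIpcMessage(messagesDir: string, msg: IpcMessage): string {
  fs.mkdirSync(messagesDir, { recursive: true });
  const name = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}.json`;
  const finalPath = path.join(messagesDir, name);
  const tmpPath = `${finalPath}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(msg));
  fs.renameSync(tmpPath, finalPath);
  return finalPath;
}

// Usage: write one message into a throwaway directory (the JID is a made-up example).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ipc-demo-'));
const written = writeIpcMessage(demoDir, { type: 'message', chatJid: 'demo@g.us', text: 'hello' });
```

The timestamp-plus-random file name mirrors the task ID scheme used in `processTaskIpc`, which keeps collisions unlikely when several writes happen close together.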

@@ -1,12 +1,11 @@
import { query } from '@anthropic-ai/claude-agent-sdk';
import fs from 'fs';
import path from 'path';
import pino from 'pino';
import { CronExpressionParser } from 'cron-parser';
import { getDueTasks, updateTaskAfterRun, logTaskRun, getTaskById } from './db.js';
import { createSchedulerMcp } from './scheduler-mcp.js';
import { ScheduledTask } from './types.js';
import { GROUPS_DIR, SCHEDULER_POLL_INTERVAL } from './config.js';
import { getDueTasks, updateTaskAfterRun, logTaskRun, getTaskById, getAllTasks } from './db.js';
import { ScheduledTask, RegisteredGroup } from './types.js';
import { GROUPS_DIR, SCHEDULER_POLL_INTERVAL, DATA_DIR } from './config.js';
import { runContainerAgent, writeTasksSnapshot } from './container-runner.js';
const logger = pino({
level: process.env.LOG_LEVEL || 'info',
@@ -15,6 +14,7 @@ const logger = pino({
export interface SchedulerDependencies {
sendMessage: (jid: string, text: string) => Promise<void>;
registeredGroups: () => Record<string, RegisteredGroup>;
}
async function runTask(task: ScheduledTask, deps: SchedulerDependencies): Promise<void> {
@@ -24,37 +24,53 @@ async function runTask(task: ScheduledTask, deps: SchedulerDependencies): Promis
logger.info({ taskId: task.id, group: task.group_folder }, 'Running scheduled task');
// Create the scheduler MCP with task's group context
const schedulerMcp = createSchedulerMcp({
groupFolder: task.group_folder,
chatJid: task.chat_jid,
isMain: false, // Scheduled tasks run in their group's context, not as main
sendMessage: deps.sendMessage
});
// Find the group config for this task
const groups = deps.registeredGroups();
const group = Object.values(groups).find(g => g.folder === task.group_folder);
if (!group) {
logger.error({ taskId: task.id, groupFolder: task.group_folder }, 'Group not found for task');
logTaskRun({
task_id: task.id,
run_at: new Date().toISOString(),
duration_ms: Date.now() - startTime,
status: 'error',
result: null,
error: `Group not found: ${task.group_folder}`
});
return;
}
// Update tasks snapshot for container to read
const tasks = getAllTasks();
writeTasksSnapshot(tasks.map(t => ({
id: t.id,
groupFolder: t.group_folder,
prompt: t.prompt,
schedule_type: t.schedule_type,
schedule_value: t.schedule_value,
status: t.status,
next_run: t.next_run
})));
let result: string | null = null;
let error: string | null = null;
try {
for await (const message of query({
const output = await runContainerAgent(group, {
prompt: task.prompt,
options: {
cwd: groupDir,
allowedTools: ['Read', 'Write', 'Edit', 'Glob', 'Grep', 'WebSearch', 'WebFetch', 'mcp__nanoclaw__*', 'mcp__gmail__*'],
permissionMode: 'bypassPermissions',
settingSources: ['project'],
mcpServers: {
nanoclaw: schedulerMcp,
gmail: { command: 'npx', args: ['-y', '@gongrzhe/server-gmail-autoauth-mcp'] }
}
}
})) {
if ('result' in message && message.result) {
result = message.result as string;
}
groupFolder: task.group_folder,
chatJid: task.chat_jid,
isMain: false // Scheduled tasks run in their group's context
});
if (output.status === 'error') {
error = output.error || 'Unknown error';
} else {
result = output.result;
}
logger.info({ taskId: task.id, durationMs: Date.now() - startTime }, 'Task completed successfully');
logger.info({ taskId: task.id, durationMs: Date.now() - startTime }, 'Task completed');
} catch (err) {
error = err instanceof Error ? err.message : String(err);
logger.error({ taskId: task.id, error }, 'Task failed');

@@ -1,8 +1,21 @@
export interface AdditionalMount {
hostPath: string; // Absolute path on host (supports ~ for home)
containerPath: string; // Path inside container (under /workspace/extra/)
readonly?: boolean; // Default: true for safety
}
export interface ContainerConfig {
additionalMounts?: AdditionalMount[];
timeout?: number; // Default: 300000 (5 minutes)
env?: Record<string, string>;
}
export interface RegisteredGroup {
name: string;
folder: string;
trigger: string;
added_at: string;
containerConfig?: ContainerConfig;
}
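
To make the new configuration types concrete, here is a minimal sketch of a registered group with two extra mounts and a custom timeout. The interfaces are repeated so the example is self-contained. Every value (group name, trigger, paths, timeout) is made up for illustration, and `DEFAULT_TIMEOUT_MS` is a hypothetical stand-in for the project's default, using the 300000 figure from the comment above.

```typescript
interface AdditionalMount {
  hostPath: string;
  containerPath: string;
  readonly?: boolean;
}

interface ContainerConfig {
  additionalMounts?: AdditionalMount[];
  timeout?: number;
  env?: Record<string, string>;
}

interface RegisteredGroup {
  name: string;
  folder: string;
  trigger: string;
  added_at: string;
  containerConfig?: ContainerConfig;
}

// Hypothetical constant; mirrors the "Default: 300000 (5 minutes)" comment.
const DEFAULT_TIMEOUT_MS = 300000;

// Illustrative group: one mount left read-only by default, one explicitly writable.
const exampleGroup: RegisteredGroup = {
  name: 'Family',
  folder: 'family',
  trigger: '@Andy',
  added_at: '2026-01-31T00:00:00.000Z',
  containerConfig: {
    additionalMounts: [
      { hostPath: '~/Documents/recipes', containerPath: 'recipes' },
      { hostPath: '~/scratch', containerPath: 'scratch', readonly: false }
    ],
    timeout: 600000
  }
};

// Same defaulting rules as the runner: read-only unless explicitly false,
// and the global timeout unless the group overrides it.
const isReadonly = (m: AdditionalMount): boolean => m.readonly !== false;
const effectiveTimeout = exampleGroup.containerConfig?.timeout || DEFAULT_TIMEOUT_MS;
```

With this configuration the first mount would appear inside the container at `/workspace/extra/recipes` as read-only, and the second at `/workspace/extra/scratch` as writable, provided both host paths exist.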
export interface Session {