Adds Agent Swarms
* feat: streaming container mode, IPC messaging, agent teams support
Major architectural shift from single-shot container runs to long-lived
streaming containers with IPC-based message injection.
- Agent runner: query loop with AsyncIterable prompt to keep stdin open
for agent teams (fixes isSingleUserTurn premature shutdown)
- New standalone stdio MCP server (ipc-mcp-stdio.ts) inheritable by
subagents, with send_message and schedule_task tools
- Streaming output: parse OUTPUT_START/END markers in real-time, send
results to WhatsApp as they arrive
- IPC file-based messaging: host writes to ipc/{group}/input/, agent
polls for follow-up messages without respawning containers
- Per-group settings.json with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
- SDK bumped to 0.2.34 for TeamCreate tool support
- Container idle timeout (30min) with _close sentinel for shutdown
- Orphaned container cleanup on startup
- alwaysRespond flag for groups that skip trigger pattern check
- Uncaught exception/rejection handlers with timestamps in logger
- Combined SDK documentation into single deep dive reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: remove unused ipc-mcp.ts (replaced by ipc-mcp-stdio.ts)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: clarify agent communication model in docs and tool descriptions
- CLAUDE.md (main + global): split communication instructions into
"responding to messages" vs "scheduled tasks" sections
- send_message tool: note that scheduled task output is not sent to user
- Remove structured output (outputFormat) — not needed with current flow
- Regular output is sent to WhatsApp; scheduled task output is only logged
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: ignore dynamic group data while preserving base structure
Only track groups/main/CLAUDE.md and groups/global/CLAUDE.md. All other
group directories and files are ignored to prevent tracking user-specific
session data.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve critical bugs in streaming container mode
Bug 1 (scheduled task hang): Task scheduler now passes onOutput callback
with idle timer that writes _close sentinel after IDLE_TIMEOUT, so
containers exit cleanly instead of blocking queue slots for 30 minutes.
Scheduled tasks stay alive for interactive follow-up via IPC.
Bug 2 (timeout disabled): Remove resetTimeout() from stderr handler.
SDK writes debug logs continuously, resetting the timer on every line.
Timeout now only resets on actual output markers in stdout.
Bug 3 (trigger bypass): Piped messages in startMessageLoop now check
trigger pattern for non-main groups. Non-trigger messages accumulate in
DB and are pulled as context via getMessagesSince when a trigger arrives.
Bug 7 (non-atomic IPC writes): GroupQueue.sendMessage uses temp file +
rename for atomic writes, matching ipc-mcp-stdio.ts pattern.
Also: flip isVerbose back to false (debug leftover), add isScheduledTask
to host-side ContainerInput interface.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
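
The temp file + rename pattern named in Bug 7 can be sketched as follows. This is a minimal illustration, not the actual `GroupQueue.sendMessage` code: the helper name `writeIpcMessageAtomic`, the `.tmp` suffix, and the payload shape are assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write `payload` as JSON into `dir` so that a polling
// reader never observes a partially written file.
function writeIpcMessageAtomic(dir: string, name: string, payload: unknown): string {
  fs.mkdirSync(dir, { recursive: true });
  const finalPath = path.join(dir, name);
  // Keep the temp file in the same directory so the rename stays on one filesystem.
  const tempPath = `${finalPath}.tmp`;
  fs.writeFileSync(tempPath, JSON.stringify(payload));
  // The file appears under its final name only after it is fully written.
  fs.renameSync(tempPath, finalPath);
  return finalPath;
}

// Demo against a throwaway directory (the directory layout is illustrative only).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "ipc-demo-"));
const demoPath = writeIpcMessageAtomic(demoDir, "message-1.json", { type: "message", text: "hello" });
console.log(fs.readFileSync(demoPath, "utf8"));
```

A reader that polls the directory would also need to skip names ending in `.tmp`, otherwise it could still pick up a half-written temp file.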
* fix: idle timer not starting + scheduled task groupFolder missing
Two bugs that prevented the scheduled task idle timeout fix from working:
1. onOutput was only called when parsed.result !== null, but session
update markers have result: null. The idle timer never started for
"silent" query completions, leaving containers parked at
waitForIpcMessage until hard timeout.
2. Scheduler's onProcess callback didn't pass groupFolder to
queue.registerProcess, so closeStdin no-oped (groupFolder was null).
The _close sentinel was never written even when the idle timer fired.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: duplicate messages and timestamp rollback in piping path
Two bugs introduced by the trigger context accumulation change:
1. processGroupMessages didn't advance lastAgentTimestamp until after
the container finished. The piping path's
getMessagesSince(lastAgentTimestamp) re-fetched messages already sent
as the initial prompt, causing duplicates.
2. processGroupMessages overwrote lastAgentTimestamp with the original
batch timestamp on completion, rolling back any advancement made by
the piping path while the container was running.
Fix: advance lastAgentTimestamp immediately after building the prompt,
before starting the container. This matches the piping path behavior
and eliminates both the overlap and the rollback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: container idles 30 extra minutes after _close during query
When _close was detected during pollIpcDuringQuery, it was consumed
(deleted) and stream.end() was called. But after runQuery returned,
main() still emitted a session-update marker (resetting the host's idle
timer) and called waitForIpcMessage (which polled forever since _close
was already gone). The container had to wait for a second _close.
Fix: runQuery now returns closedDuringQuery. When true, main() skips
the session-update marker and waitForIpcMessage, exiting immediately.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resume branching, internal tags, and output forwarding
- Fix resume branching: pass resumeSessionAt with last assistant UUID
to anchor each query loop resume to the correct conversation tree
position. Prevents agent responses landing on invisible branches
when agent teams subagents create parallel JSONL entries.
- Add <internal> tag stripping: agent can wrap internal reasoning in
<internal> tags which are logged but not sent to WhatsApp. Prevents
duplicate messages and internal monologue reaching users.
- Forward scheduled task output: scheduled tasks now send result text
to WhatsApp (with <internal> stripping), matching regular message
behavior. No more special-case instructions.
- Update Communication guidance in CLAUDE.md: simplified to "your
output is sent to the user or group" with soft guidance on
<internal> tags and send_message usage.
- Add messaging behavior docs to schedule_task tool: prompts the
scheduling agent to include guidance on whether the task should
always/conditionally/never message the user.
- Mount security: containerPath now optional, defaults to basename
of hostPath.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
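
The `<internal>` tag stripping described above could look roughly like this. It is a sketch under assumptions: the function name `stripInternal` is hypothetical, and the real implementation may treat whitespace or unclosed tags differently.

```typescript
// Hypothetical helper: remove every <internal>...</internal> span (including
// multi-line ones) and trim the remainder before it is sent to the chat.
function stripInternal(text: string): string {
  // [\s\S]*? matches across newlines and stops at the first closing tag.
  return text.replace(/<internal>[\s\S]*?<\/internal>/g, "").trim();
}

console.log(stripInternal("<internal>checking the calendar</internal>Your meeting is at 3pm."));
```

An unclosed `<internal>` tag is left untouched by this version. A result that is empty after stripping would presumably mean there is nothing to send.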
* fix: cursor rollback on error, flush guard, verbose logging
- Roll back lastAgentTimestamp on container error so retries can
re-process the messages instead of silently losing them.
- Add guard flag to flushOutgoingQueue to prevent duplicate sends
from concurrent flushes during rapid WA reconnects.
- Revert isVerbose from hardcoded false back to env-based check
(LOG_LEVEL=debug|trace).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
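
Taken together, the cursor fixes amount to "advance before the run, roll back on error". A minimal sketch of that control flow, with the hypothetical name `runWithCursor` and a synchronous stand-in callback in place of the real container run:

```typescript
type Cursors = Record<string, string>;

// Hypothetical sketch: advance the per-chat cursor before starting the run,
// and restore the previous value if the run fails so a retry sees the messages again.
function runWithCursor(
  cursors: Cursors,
  chatJid: string,
  batchTimestamp: string,
  runContainer: () => boolean, // stand-in for the real (async) container run
): boolean {
  const previous: string | undefined = cursors[chatJid];
  cursors[chatJid] = batchTimestamp; // advance immediately after building the prompt

  let ok = false;
  try {
    ok = runContainer();
  } catch {
    ok = false;
  }

  if (!ok) {
    // Roll back so the batch is re-processed instead of silently lost.
    if (previous === undefined) delete cursors[chatJid];
    else cursors[chatJid] = previous;
  }
  return ok;
}
```

This sketch does not handle a piping path that moves the cursor further while the container is still running.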
* fix: orphan container cleanup was silently failing
The startup cleanup used `container ls --format {{.Names}}` which is
Docker Go-template syntax. Apple Container only supports `--format json`
or `--format table`. The command errored with exit code 64, but the
catch block silently swallowed it — orphan containers were never cleaned
up on restart.
Fixed to use `--format json` and parse `configuration.id` from the
JSON output. Also filters by `status: running` and logs a warning on
failure instead of silently catching.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
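
A rough sketch of the JSON-based filtering described above. The entry shape is inferred only from the fields this commit message names (`configuration.id` and `status`). The function name and the `nanoclaw-` prefix in the demo are assumptions.

```typescript
// Assumed shape of one entry in the `container ls --format json` output;
// only the two fields named in the commit message are modeled.
interface ContainerEntry {
  status?: string;
  configuration?: { id?: string };
}

// Hypothetical helper: return the IDs of running containers whose ID starts with `prefix`.
function runningContainerIds(lsJson: string, prefix: string): string[] {
  const parsed: unknown = JSON.parse(lsJson);
  if (!Array.isArray(parsed)) return [];
  return (parsed as ContainerEntry[])
    .filter((entry) => entry.status === "running")
    .map((entry) => entry.configuration?.id)
    .filter((id): id is string => typeof id === "string" && id.startsWith(prefix));
}

const sampleLs = JSON.stringify([
  { status: "running", configuration: { id: "nanoclaw-main-1" } },
  { status: "stopped", configuration: { id: "nanoclaw-old-2" } },
  { status: "running", configuration: { id: "unrelated" } },
]);
console.log(runningContainerIds(sampleLs, "nanoclaw-"));
```

A caller would wrap this in a try/catch that logs a warning, so a CLI failure becomes visible instead of being silently swallowed.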
* docs: add Discord badge and community section
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: idle timer reset on null results and flush queue message loss
- Only reset idle timer on actual results (non-null), not session-update
markers. Prevents containers staying alive 30 extra minutes after the
agent finishes work.
- flushOutgoingQueue now uses shift() instead of splice(0) so unattempted
messages stay in the queue if an unexpected error bails the loop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
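
Why `shift()` matters can be seen in a small sketch. The names are hypothetical and `send` is synchronous for brevity, whereas the real sender is asynchronous.

```typescript
interface Outgoing {
  jid: string;
  text: string;
}

// Hypothetical sketch: take one message at a time, so anything not yet
// attempted remains in `queue` if `send` throws unexpectedly.
function flushOutgoing(queue: Outgoing[], send: (msg: Outgoing) => void): number {
  let sent = 0;
  while (queue.length > 0) {
    const next = queue.shift()!; // removes only the message being attempted
    send(next); // an exception here leaves the rest of the queue intact
    sent++;
  }
  return sent;
}
```

With `splice(0)`, the whole queue is emptied into a local array up front, so an exception partway through the loop drops every message that had not been attempted yet.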
* docs: add Agent Swarms to README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: update Telegram skill for current architecture
Rewrite integration instructions to match the per-group queue/SQLite
architecture: remove onMessage callback pattern (store to DB, let
message loop pick up), fix startSchedulerLoop signature, add
TELEGRAM_ONLY service startup, SQLite registration, data/env/env sync,
@mention-to-trigger translation, and BotFather group privacy docs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
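
The @mention-to-trigger translation can be expressed as a pure function. The sketch below mirrors the logic in the skill's `src/telegram.ts` changes further down. The function name, the `MentionEntity` type, and the example pattern `/^@Andy\b/i` are assumptions for illustration.

```typescript
// Minimal subset of a Telegram message entity, as used here.
interface MentionEntity {
  type: string;
  offset: number;
  length: number;
}

// Hypothetical helper: if the bot's @username is mentioned and the text does not
// already match the trigger pattern, prepend the assistant trigger.
function translateMention(
  content: string,
  entities: MentionEntity[],
  botUsername: string,
  assistantName: string,
  triggerPattern: RegExp,
): string {
  const wanted = `@${botUsername.toLowerCase()}`;
  const isBotMentioned = entities.some(
    (e) =>
      e.type === "mention" &&
      content.substring(e.offset, e.offset + e.length).toLowerCase() === wanted,
  );
  if (isBotMentioned && !triggerPattern.test(content)) {
    return `@${assistantName} ${content}`;
  }
  return content;
}

console.log(
  translateMention(
    "hey @andy_ai_bot what's up",
    [{ type: "mention", offset: 4, length: 12 }],
    "andy_ai_bot",
    "Andy",
    /^@Andy\b/i,
  ),
);
```

The trigger pattern should not carry the `g` flag, since `test()` on a global regex keeps state between calls.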
* fix: Telegram skill message chunking, media placeholders, chat discovery
- Split long messages at Telegram's 4096 char limit to prevent silent
send failures
- Store placeholder text for non-text messages (photos, voice, stickers,
etc.) so the agent knows media was sent
- Update getAvailableGroups filter to include tg: chats so the agent can
discover and register Telegram chats via IPC
- Fix removal step numbering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
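
The chunking step can be isolated as a small helper. This is a sketch with a hypothetical name (`chunkMessage`). It splits on UTF-16 code units, as the skill's `text.slice` loop does, so it can cut through a word or a surrogate pair.

```typescript
// Hypothetical helper: split `text` into pieces of at most `maxLength` characters.
function chunkMessage(text: string, maxLength: number = 4096): string[] {
  if (!Number.isInteger(maxLength) || maxLength <= 0) {
    throw new RangeError("maxLength must be a positive integer");
  }
  if (text.length <= maxLength) return [text];
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxLength) {
    chunks.push(text.slice(i, i + maxLength));
  }
  return chunks;
}

console.log(chunkMessage("a".repeat(5000)).map((c) => c.length));
```

Each chunk would then be passed to a separate send call, in order.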
* docs: update REQUIREMENTS.md and SPEC.md for SQLite architecture
- Replace all registered_groups.json / sessions.json / router_state.json
references with SQLite equivalents
- Fix CONTAINER_TIMEOUT default (300000 → 1800000)
- Add missing config exports (IDLE_TIMEOUT, MAX_CONCURRENT_CONTAINERS)
- Update folder structure: add missing src files (logger, group-queue,
mount-security), remove non-existent utils.ts, list all skills
- Fix agent-runner entry (ipc-mcp.ts → ipc-mcp-stdio.ts)
- Update startup sequence to reflect per-group queue architecture
- Fix env mounting description (data/env/env, not extracted vars)
- Update troubleshooting to use sqlite3 commands
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: fix README architecture description, revert SPEC.md env error
- README: update architecture blurb to mention per-group queue, add
group-queue.ts to key files, update file descriptions
- SPEC.md: restore correct credential filtering description (only auth
vars are extracted from .env, not the full file)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@@ -52,6 +52,21 @@ Tell the user:
 > 2. Send any message
 > 3. Use the `/chatid` command in the group

+### 4. Disable Group Privacy (for group chats)
+
+Tell the user:
+
+> **Important for group chats**: By default, Telegram bots in groups only receive messages that @mention the bot or are commands. To let the bot see all messages (needed for `requiresTrigger: false` or trigger-word detection):
+>
+> 1. Open Telegram and search for `@BotFather`
+> 2. Send `/mybots` and select your bot
+> 3. Go to **Bot Settings** > **Group Privacy**
+> 4. Select **Turn off**
+>
+> Without this, the bot will only see messages that directly @mention it.
+
+This step is optional if the user only wants trigger-based responses via @mentioning the bot.
+
 ## Questions to Ask

 Before making changes, ask:
@@ -61,8 +76,8 @@ Before making changes, ask:
    - If alongside: Both will run

 2. **Chat behavior**: Should this chat respond to all messages or only when @mentioned?
-   - Main chat: Responds to all
-   - Other chats: Can configure `respondToAll: true` in registered_groups.json
+   - Main chat: Responds to all (set `requiresTrigger: false`)
+   - Other chats: Default requires trigger (`requiresTrigger: true`)

 ## Implementation

@@ -108,44 +123,51 @@ export function storeMessageDirect(msg: {
 }
 ```

-Also update the db.ts exports to include `storeMessageDirect`.
+This uses the existing `db` instance from `db.ts`. No additional imports needed.

 ### Step 3: Create Telegram Module

-Create `src/telegram.ts` with this content:
+Create `src/telegram.ts`. The Telegram module is a thin layer that stores incoming messages to the database. It does NOT call the agent directly; the existing `startMessageLoop()` in `src/index.ts` polls all registered group JIDs and picks up Telegram messages automatically.

 ```typescript
 import { Bot } from "grammy";
-import pino from "pino";
 import {
   ASSISTANT_NAME,
   TRIGGER_PATTERN,
-  MAIN_GROUP_FOLDER,
 } from "./config.js";
-import { RegisteredGroup, NewMessage } from "./types.js";
-import { storeChatMetadata, storeMessageDirect } from "./db.js";
-
-const logger = pino({
-  level: process.env.LOG_LEVEL || "info",
-  transport: { target: "pino-pretty", options: { colorize: true } },
-});
-
-export interface TelegramCallbacks {
-  onMessage: (
-    msg: NewMessage,
-    group: RegisteredGroup,
-  ) => Promise<string | null>;
-  getRegisteredGroups: () => Record<string, RegisteredGroup>;
-}
+import {
+  getAllRegisteredGroups,
+  storeChatMetadata,
+  storeMessageDirect,
+} from "./db.js";
+import { logger } from "./logger.js";

 let bot: Bot | null = null;
-let callbacks: TelegramCallbacks | null = null;

-export async function connectTelegram(
-  botToken: string,
-  cbs: TelegramCallbacks,
-): Promise<void> {
-  callbacks = cbs;
+/** Store a placeholder message for non-text content (photos, voice, etc.) */
+function storeNonTextMessage(ctx: any, placeholder: string): void {
+  const chatId = `tg:${ctx.chat.id}`;
+  const registeredGroups = getAllRegisteredGroups();
+  if (!registeredGroups[chatId]) return;
+
+  const timestamp = new Date(ctx.message.date * 1000).toISOString();
+  const senderName =
+    ctx.from?.first_name || ctx.from?.username || ctx.from?.id?.toString() || "Unknown";
+  const caption = ctx.message.caption ? ` ${ctx.message.caption}` : "";
+
+  storeChatMetadata(chatId, timestamp);
+  storeMessageDirect({
+    id: ctx.message.message_id.toString(),
+    chat_jid: chatId,
+    sender: ctx.from?.id?.toString() || "",
+    sender_name: senderName,
+    content: `${placeholder}${caption}`,
+    timestamp,
+    is_from_me: false,
+  });
+}
+
+export async function connectTelegram(botToken: string): Promise<void> {
   bot = new Bot(botToken);

   // Command to get chat ID (useful for registration)
@@ -173,7 +195,7 @@ export async function connectTelegram(
     if (ctx.message.text.startsWith("/")) return;

     const chatId = `tg:${ctx.chat.id}`;
-    const content = ctx.message.text;
+    let content = ctx.message.text;
     const timestamp = new Date(ctx.message.date * 1000).toISOString();
     const senderName =
       ctx.from?.first_name ||
@@ -189,11 +211,31 @@ export async function connectTelegram(
         ? senderName
         : (ctx.chat as any).title || chatId;

+    // Translate Telegram @bot_username mentions into TRIGGER_PATTERN format.
+    // Telegram @mentions (e.g., @andy_ai_bot) won't match TRIGGER_PATTERN
+    // (e.g., ^@Andy\b), so we prepend the trigger when the bot is @mentioned.
+    const botUsername = ctx.me?.username?.toLowerCase();
+    if (botUsername) {
+      const entities = ctx.message.entities || [];
+      const isBotMentioned = entities.some((entity) => {
+        if (entity.type === "mention") {
+          const mentionText = content
+            .substring(entity.offset, entity.offset + entity.length)
+            .toLowerCase();
+          return mentionText === `@${botUsername}`;
+        }
+        return false;
+      });
+      if (isBotMentioned && !TRIGGER_PATTERN.test(content)) {
+        content = `@${ASSISTANT_NAME} ${content}`;
+      }
+    }
+
     // Store chat metadata for discovery
     storeChatMetadata(chatId, timestamp, chatName);

     // Check if this chat is registered
-    const registeredGroups = callbacks!.getRegisteredGroups();
+    const registeredGroups = getAllRegisteredGroups();
     const group = registeredGroups[chatId];

     if (!group) {
@@ -204,7 +246,7 @@ export async function connectTelegram(
       return;
     }

-    // Store message for registered chats
+    // Store message; startMessageLoop() will pick it up
     storeMessageDirect({
       id: msgId,
       chat_jid: chatId,
@@ -215,59 +257,28 @@ export async function connectTelegram(
       is_from_me: false,
     });

-    const isMain = group.folder === MAIN_GROUP_FOLDER;
-    const respondToAll = (group as any).respondToAll === true;
-
-    // Check if bot is @mentioned in the message (Telegram native mention)
-    const botUsername = ctx.me?.username?.toLowerCase();
-    const entities = ctx.message.entities || [];
-    const isBotMentioned = entities.some((entity) => {
-      if (entity.type === "mention") {
-        const mentionText = content
-          .substring(entity.offset, entity.offset + entity.length)
-          .toLowerCase();
-        return mentionText === `@${botUsername}`;
-      }
-      return false;
-    });
-
-    // Respond if: main group, respondToAll group, bot is @mentioned, or trigger pattern matches
-    if (
-      !isMain &&
-      !respondToAll &&
-      !isBotMentioned &&
-      !TRIGGER_PATTERN.test(content)
-    ) {
-      return;
-    }
-
     logger.info(
       { chatId, chatName, sender: senderName },
-      "Processing Telegram message",
+      "Telegram message stored",
     );
-
-    // Send typing indicator
-    await ctx.replyWithChatAction("typing");
-
-    const msg: NewMessage = {
-      id: msgId,
-      chat_jid: chatId,
-      sender,
-      sender_name: senderName,
-      content,
-      timestamp,
-    };
-
-    try {
-      const response = await callbacks!.onMessage(msg, group);
-      if (response) {
-        await ctx.reply(`${ASSISTANT_NAME}: ${response}`);
-      }
-    } catch (err) {
-      logger.error({ err, chatId }, "Error processing Telegram message");
-    }
   });

+  // Handle non-text messages with placeholders so the agent knows something was sent
+  bot.on("message:photo", (ctx) => storeNonTextMessage(ctx, "[Photo]"));
+  bot.on("message:video", (ctx) => storeNonTextMessage(ctx, "[Video]"));
+  bot.on("message:voice", (ctx) => storeNonTextMessage(ctx, "[Voice message]"));
+  bot.on("message:audio", (ctx) => storeNonTextMessage(ctx, "[Audio]"));
+  bot.on("message:document", (ctx) => {
+    const name = ctx.message.document?.file_name || "file";
+    storeNonTextMessage(ctx, `[Document: ${name}]`);
+  });
+  bot.on("message:sticker", (ctx) => {
+    const emoji = ctx.message.sticker?.emoji || "";
+    storeNonTextMessage(ctx, `[Sticker ${emoji}]`);
+  });
+  bot.on("message:location", (ctx) => storeNonTextMessage(ctx, "[Location]"));
+  bot.on("message:contact", (ctx) => storeNonTextMessage(ctx, "[Contact]"));
+
   // Handle errors gracefully
   bot.catch((err) => {
     logger.error({ err: err.message }, "Telegram bot error");
@@ -298,15 +309,33 @@ export async function sendTelegramMessage(
   }

   try {
-    // Remove tg: prefix if present
     const numericId = chatId.replace(/^tg:/, "");
-    await bot.api.sendMessage(numericId, text);
+
+    // Telegram has a 4096 character limit per message; split if needed
+    const MAX_LENGTH = 4096;
+    if (text.length <= MAX_LENGTH) {
+      await bot.api.sendMessage(numericId, text);
+    } else {
+      for (let i = 0; i < text.length; i += MAX_LENGTH) {
+        await bot.api.sendMessage(numericId, text.slice(i, i + MAX_LENGTH));
+      }
+    }
     logger.info({ chatId, length: text.length }, "Telegram message sent");
   } catch (err) {
     logger.error({ chatId, err }, "Failed to send Telegram message");
   }
 }

+export async function setTelegramTyping(chatId: string): Promise<void> {
+  if (!bot) return;
+  try {
+    const numericId = chatId.replace(/^tg:/, "");
+    await bot.api.sendChatAction(numericId, "typing");
+  } catch (err) {
+    logger.debug({ chatId, err }, "Failed to send Telegram typing indicator");
+  }
+}
+
 export function isTelegramConnected(): boolean {
   return bot !== null;
 }
@@ -315,120 +344,140 @@ export function stopTelegram(): void {
   if (bot) {
     bot.stop();
     bot = null;
-    callbacks = null;
     logger.info("Telegram bot stopped");
   }
 }
 ```

+Key differences from WhatsApp message handling:
+- No `onMessage` callback: messages are stored to DB and the existing message loop picks them up
+- Registration check uses `getAllRegisteredGroups()` from `db.ts` directly
+- Trigger matching is handled by `startMessageLoop()` / `processGroupMessages()`, not the Telegram module
+
 ### Step 4: Update Main Application

 Modify `src/index.ts`:

-1. Add imports at the top:
+1. **Add imports** at the top:

 ```typescript
 import {
   connectTelegram,
   sendTelegramMessage,
   isTelegramConnected,
+  setTelegramTyping,
   stopTelegram,
 } from "./telegram.js";
 import { TELEGRAM_BOT_TOKEN, TELEGRAM_ONLY } from "./config.js";
 ```

-2. Update `sendMessage` function to route by channel. Find the `sendMessage` function and replace it with:
+2. **Update `sendMessage` function** to route Telegram messages. Find the `sendMessage` function and add a `tg:` prefix check before the WhatsApp path:

 ```typescript
 async function sendMessage(jid: string, text: string): Promise<void> {
+  // Route Telegram messages directly (no outgoing queue needed)
   if (jid.startsWith("tg:")) {
     await sendTelegramMessage(jid, text);
-  } else {
-    try {
-      await sock.sendMessage(jid, { text });
-      logger.info({ jid, length: text.length }, "Message sent");
-    } catch (err) {
-      logger.error({ jid, err }, "Failed to send message");
-    }
+    return;
   }
+
+  // WhatsApp path (with outgoing queue for reconnection)
+  if (!waConnected) {
+    outgoingQueue.push({ jid, text });
+    logger.info({ jid, length: text.length, queueSize: outgoingQueue.length }, 'WA disconnected, message queued');
+    return;
+  }
+  try {
+    await sock.sendMessage(jid, { text });
+    logger.info({ jid, length: text.length }, 'Message sent');
+  } catch (err) {
+    outgoingQueue.push({ jid, text });
+    logger.warn({ jid, err, queueSize: outgoingQueue.length }, 'Failed to send, message queued');
+  }
 }
 ```

-3. Update `main()` function. Find the `main()` function and update it to support Telegram. Add this before the `connectWhatsApp()` call:
+3. **Update `setTyping` function** to route Telegram typing indicators:

 ```typescript
-const hasTelegram = !!TELEGRAM_BOT_TOKEN;
-
-if (hasTelegram) {
-  await connectTelegram(TELEGRAM_BOT_TOKEN, {
-    onMessage: async (msg, group) => {
-      // Get messages since last agent interaction for context
-      const sinceTimestamp = lastAgentTimestamp[msg.chat_jid] || "";
-      const missedMessages = getMessagesSince(
-        msg.chat_jid,
-        sinceTimestamp,
-        ASSISTANT_NAME,
-      );
-
-      const lines = missedMessages.map((m) => {
-        const escapeXml = (s: string) =>
-          s
-            .replace(/&/g, "&amp;")
-            .replace(/</g, "&lt;")
-            .replace(/>/g, "&gt;")
-            .replace(/"/g, "&quot;");
-        return `<message sender="${escapeXml(m.sender_name)}" time="${m.timestamp}">${escapeXml(m.content)}</message>`;
-      });
-      const prompt = `<messages>\n${lines.join("\n")}\n</messages>`;
-
-      const group = registeredGroups[msg.chat_jid];
-      const isMain = group.folder === MAIN_GROUP_FOLDER;
-
-      const output = await runContainerAgent(group, {
-        prompt,
-        sessionId: sessions[group.folder],
-        groupFolder: group.folder,
-        chatJid: msg.chat_jid,
-        isMain,
-        isScheduledTask: false,
-      });
-
-      if (output.newSessionId) {
-        sessions[group.folder] = output.newSessionId;
-        saveJson(path.join(DATA_DIR, "sessions.json"), sessions);
-      }
-
-      lastAgentTimestamp[msg.chat_jid] = msg.timestamp;
-      saveState();
-
-      return output.status === "success" ? output.result : null;
-    },
-    getRegisteredGroups: () => registeredGroups,
-  });
+async function setTyping(jid: string, isTyping: boolean): Promise<void> {
+  if (jid.startsWith("tg:")) {
+    if (isTyping) await setTelegramTyping(jid);
+    return;
+  }
+  try {
+    await sock.sendPresenceUpdate(isTyping ? 'composing' : 'paused', jid);
+  } catch (err) {
+    logger.debug({ jid, err }, 'Failed to update typing status');
+  }
 }
 ```

-4. Wrap the `connectWhatsApp()` call to support Telegram-only mode. Replace:
+4. **Update `main()` function**. Add Telegram startup before `connectWhatsApp()` and wrap WhatsApp in a `TELEGRAM_ONLY` check:

 ```typescript
-await connectWhatsApp();
+async function main(): Promise<void> {
+  ensureContainerSystemRunning();
+  initDatabase();
+  logger.info('Database initialized');
+  loadState();
+
+  // Graceful shutdown handlers
+  const shutdown = async (signal: string) => {
+    logger.info({ signal }, 'Shutdown signal received');
+    stopTelegram();
+    await queue.shutdown(10000);
+    process.exit(0);
+  };
+  process.on('SIGTERM', () => shutdown('SIGTERM'));
+  process.on('SIGINT', () => shutdown('SIGINT'));
+
+  // Start Telegram bot if configured (independent of WhatsApp)
+  const hasTelegram = !!TELEGRAM_BOT_TOKEN;
+  if (hasTelegram) {
+    await connectTelegram(TELEGRAM_BOT_TOKEN);
+  }
+
+  if (!TELEGRAM_ONLY) {
+    await connectWhatsApp();
+  } else {
+    // Telegram-only mode: start all services that WhatsApp's connection.open normally starts
+    startSchedulerLoop({
+      registeredGroups: () => registeredGroups,
+      getSessions: () => sessions,
+      queue,
+      onProcess: (groupJid, proc, containerName, groupFolder) =>
+        queue.registerProcess(groupJid, proc, containerName, groupFolder),
+      sendMessage,
+      assistantName: ASSISTANT_NAME,
+    });
+    startIpcWatcher();
+    queue.setProcessMessagesFn(processGroupMessages);
+    recoverPendingMessages();
+    startMessageLoop();
+    logger.info(
+      `NanoClaw running (Telegram-only, trigger: @${ASSISTANT_NAME})`,
+    );
+  }
+}
 ```

-With:
+Note: When running alongside WhatsApp, the `connection.open` handler in `connectWhatsApp()` already starts the scheduler, IPC watcher, queue, and message loop, so no duplication is needed.
+
+5. **Update `getAvailableGroups` function** to include Telegram chats. The current filter only shows WhatsApp groups (`@g.us`). Update it to also include `tg:` chats so the agent can discover and register Telegram chats via IPC:

 ```typescript
-if (!TELEGRAM_ONLY) {
-  await connectWhatsApp();
-} else {
-  // Telegram-only mode: start scheduler and IPC without WhatsApp
-  startSchedulerLoop({
-    sendMessage,
-    registeredGroups: () => registeredGroups,
-    getSessions: () => sessions,
-  });
-  startIpcWatcher();
-  logger.info(
-    `NanoClaw running (Telegram-only, trigger: @${ASSISTANT_NAME})`,
-  );
+function getAvailableGroups(): AvailableGroup[] {
+  const chats = getAllChats();
+  const registeredJids = new Set(Object.keys(registeredGroups));
+
+  return chats
+    .filter((c) => c.jid !== '__group_sync__' && (c.jid.endsWith('@g.us') || c.jid.startsWith('tg:')))
+    .map((c) => ({
+      jid: c.jid,
+      name: c.name,
+      lastActivity: c.last_message_time,
+      isRegistered: registeredJids.has(c.jid),
+    }));
 }
 ```

@@ -443,44 +492,47 @@ TELEGRAM_BOT_TOKEN=YOUR_BOT_TOKEN_HERE
 # TELEGRAM_ONLY=true
 ```

+**Important**: After modifying `.env`, sync to the container environment:
+
+```bash
+cp .env data/env/env
+```
+
+The container reads environment from `data/env/env`, not `.env` directly.
+
 ### Step 6: Register a Telegram Chat

 After installing and starting the bot, tell the user:

 > 1. Send `/chatid` to your bot (in private chat or in a group)
 > 2. Copy the chat ID (e.g., `tg:123456789` or `tg:-1001234567890`)
-> 3. I'll add it to registered_groups.json
+> 3. I'll register it for you

-Then update `data/registered_groups.json`:
+Registration uses the `registerGroup()` function in `src/index.ts`, which writes to SQLite and creates the group folder structure. Call it like this (or add a one-time script):

-For private chat:
+```typescript
+// For private chat (main group):
+registerGroup("tg:123456789", {
+  name: "Personal",
+  folder: "main",
+  trigger: `@${ASSISTANT_NAME}`,
+  added_at: new Date().toISOString(),
+  requiresTrigger: false, // main group responds to all messages
+});

-```json
-{
-  "tg:123456789": {
-    "name": "Personal",
-    "folder": "main",
-    "trigger": "@Andy",
-    "added_at": "2026-02-05T12:00:00.000Z"
-  }
-}
+// For group chat (note negative ID for Telegram groups):
+registerGroup("tg:-1001234567890", {
+  name: "My Telegram Group",
+  folder: "telegram-group",
+  trigger: `@${ASSISTANT_NAME}`,
+  added_at: new Date().toISOString(),
+  requiresTrigger: true, // only respond when triggered
+});
 ```

-For group chat (note the negative ID for groups):
+The `RegisteredGroup` type requires a `trigger` string field and has an optional `requiresTrigger` boolean (defaults to `true`). Set `requiresTrigger: false` for chats that should respond to all messages.

-```json
-{
-  "tg:-1001234567890": {
-    "name": "My Telegram Group",
-    "folder": "telegram-group",
-    "trigger": "@Andy",
-    "added_at": "2026-02-05T12:00:00.000Z",
-    "respondToAll": false
-  }
-}
-```
-
-Set `respondToAll: true` if you want the bot to respond to all messages in that chat (not just when @mentioned or triggered).
+Alternatively, if the agent is already running in the main group, it can register new groups via IPC using the `register_group` task type.

 ### Step 7: Build and Restart

@@ -511,8 +563,10 @@ Tell the user:
 If user wants Telegram-only:

 1. Set `TELEGRAM_ONLY=true` in `.env`
-2. The WhatsApp connection code is automatically skipped
-3. Optionally remove `@whiskeysockets/baileys` dependency (but it's harmless to keep)
+2. Run `cp .env data/env/env` to sync to container
+3. The WhatsApp connection code is automatically skipped
+4. All services (scheduler, IPC watcher, queue, message loop) start independently
+5. Optionally remove `@whiskeysockets/baileys` dependency (but it's harmless to keep)

 ## Features

@@ -524,10 +578,13 @@ If user wants Telegram-only:
|
||||
### Trigger Options
|
||||
|
||||
The bot responds when:
|
||||
1. Message is in the main chat (folder: "main")
|
||||
2. Chat has `respondToAll: true` in registered_groups.json
|
||||
3. Bot is @mentioned using native Telegram mention (e.g., @your_bot_username)
|
||||
4. Message matches TRIGGER_PATTERN (e.g., starts with @Andy)
|
||||
1. Chat has `requiresTrigger: false` in its registration (e.g., main group)
|
||||
2. Bot is @mentioned in Telegram (translated to TRIGGER_PATTERN automatically)
|
||||
3. Message matches TRIGGER_PATTERN directly (e.g., starts with @Andy)
|
||||
|
||||
Telegram @mentions (e.g., `@andy_ai_bot`) are automatically translated: if the bot is @mentioned and the message doesn't already match TRIGGER_PATTERN, the trigger prefix is prepended before storing. This ensures @mentioning the bot always triggers a response.
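
The trigger rules above can be sketched as two small functions. This is a minimal illustration, not the project's code: `translateMention`, `shouldRespond`, the `Registration` shape, and the example pattern `/^@Andy\b/i` are all assumptions chosen to match the behaviour described here.

```typescript
// Illustrative only: function names, parameters, and the example pattern are
// assumptions, not NanoClaw's actual implementation.

interface Registration {
  trigger: string;
  requiresTrigger?: boolean; // treated as true when omitted
}

/** Prepend the trigger prefix when the bot is @mentioned but the pattern does not match yet. */
function translateMention(
  text: string,
  botUsername: string,
  triggerPattern: RegExp,
  triggerPrefix: string,
): string {
  const mentioned = text.toLowerCase().includes(`@${botUsername.toLowerCase()}`);
  if (mentioned && !triggerPattern.test(text)) {
    return `${triggerPrefix} ${text}`;
  }
  return text;
}

/** Decide whether a stored message should get a response. */
function shouldRespond(text: string, group: Registration, triggerPattern: RegExp): boolean {
  // Chats registered with requiresTrigger: false respond to everything.
  if (group.requiresTrigger === false) return true;
  return triggerPattern.test(text);
}
```

Because the translation happens before the message is stored, the later trigger check only ever needs to test one pattern, regardless of which channel the message came from.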
|
||||
|
||||
**Group Privacy**: The bot must have Group Privacy disabled in BotFather to see non-mention messages in groups. See Prerequisites step 4.
|
||||
|
||||
### Commands
|
||||
|
||||
@@ -539,11 +596,18 @@ The bot responds when:
|
||||
### Bot not responding
|
||||
|
||||
Check:
|
||||
1. `TELEGRAM_BOT_TOKEN` is set in `.env`
|
||||
2. Chat is registered in `data/registered_groups.json` with `tg:` prefix
|
||||
3. For non-main chats: message includes trigger or @mention
|
||||
1. `TELEGRAM_BOT_TOKEN` is set in `.env` AND synced to `data/env/env`
|
||||
2. Chat is registered in SQLite (check with: `sqlite3 store/messages.db "SELECT * FROM registered_groups WHERE jid LIKE 'tg:%'"`)
|
||||
3. For non-main chats: message includes trigger pattern
|
||||
4. Service is running: `launchctl list | grep nanoclaw`
|
||||
|
||||
### Bot only responds to @mentions in groups

The bot has Group Privacy enabled (default). It can only see messages that @mention it or are commands. To fix:
1. Open `@BotFather` in Telegram
2. `/mybots` > select bot > **Bot Settings** > **Group Privacy** > **Turn off**
3. Remove and re-add the bot to the group (required for the change to take effect)
|
||||
|
||||
### Getting chat ID
|
||||
|
||||
If `/chatid` doesn't work:
|
||||
@@ -566,9 +630,12 @@ To remove Telegram integration:
|
||||
|
||||
1. Delete `src/telegram.ts`
|
||||
2. Remove Telegram imports from `src/index.ts`
|
||||
3. Remove `sendTelegramMessage` logic from `sendMessage()` function
|
||||
4. Remove `connectTelegram()` call from `main()`
|
||||
5. Remove `storeMessageDirect` from `src/db.ts`
|
||||
6. Remove Telegram config from `src/config.ts`
|
||||
7. Uninstall: `npm uninstall grammy`
|
||||
8. Rebuild: `npm run build && launchctl kickstart -k gui/$(id -u)/com.nanoclaw`
|
||||
3. Remove `sendTelegramMessage` / `setTelegramTyping` routing from `sendMessage()` and `setTyping()` functions
|
||||
4. Remove `connectTelegram()` / `stopTelegram()` calls from `main()`
|
||||
5. Remove `TELEGRAM_ONLY` conditional in `main()`
|
||||
6. Revert `getAvailableGroups()` filter to only include `@g.us` chats
|
||||
7. Remove `storeMessageDirect` from `src/db.ts`
|
||||
8. Remove Telegram config (`TELEGRAM_BOT_TOKEN`, `TELEGRAM_ONLY`) from `src/config.ts`
|
||||
9. Remove Telegram registrations from SQLite: `sqlite3 store/messages.db "DELETE FROM registered_groups WHERE jid LIKE 'tg:%'"`
|
||||
10. Uninstall: `npm uninstall grammy`
|
||||
11. Rebuild: `npm run build && launchctl kickstart -k gui/$(id -u)/com.nanoclaw`
|
||||
|
||||
@@ -375,10 +375,11 @@ Tell the user:
|
||||
> ```json
|
||||
> "containerConfig": {
|
||||
> "additionalMounts": [
|
||||
> { "hostPath": "~/projects/my-app", "containerPath": "my-app", "readonly": false }
|
||||
> { "hostPath": "~/projects/my-app" }
|
||||
> ]
|
||||
> }
|
||||
> ```
|
||||
> The folder appears inside the container at `/workspace/extra/<folder-name>` (derived from the last segment of the path). Add `"readonly": false` for write access, or `"containerPath": "custom-name"` to override the default name.
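
A minimal sketch of how the defaults described above could be resolved. The field names follow the JSON in the quoted block; the `resolveMount` function, the `ResolvedMount` type, and the read-only default are assumptions inferred from the description rather than the project's actual code.

```typescript
import * as path from 'node:path';

// Field names follow the mount config shown above; everything else here is
// a hypothetical illustration.
interface AdditionalMount {
  hostPath: string;
  containerPath?: string;
  readonly?: boolean;
}

interface ResolvedMount {
  hostPath: string;
  containerPath: string;
  readonly: boolean;
}

function resolveMount(mount: AdditionalMount): ResolvedMount {
  // Default name: last segment of the host path (trailing slashes are ignored).
  const name = mount.containerPath ?? path.posix.basename(mount.hostPath);
  return {
    hostPath: mount.hostPath,
    containerPath: path.posix.join('/workspace/extra', name),
    // Assumed default: read-only unless "readonly": false is given.
    readonly: mount.readonly ?? true,
  };
}
```

With this shape, `{ "hostPath": "~/projects/my-app" }` is enough for the common case, and the two optional fields only appear when the defaults need overriding.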
|
||||
|
||||
## 8. Configure launchd Service
|
||||
|
||||
|
||||
9
.gitignore
vendored
@@ -9,6 +9,15 @@ store/
|
||||
data/
|
||||
logs/
|
||||
|
||||
# Groups - only track base structure and specific CLAUDE.md files
groups/*
!groups/main/
!groups/global/
groups/main/*
groups/global/*
!groups/main/CLAUDE.md
!groups/global/CLAUDE.md
|
||||
|
||||
# Secrets
|
||||
*.keys.json
|
||||
.env
|
||||
|
||||
11
CLAUDE.md
@@ -41,3 +41,14 @@ Service management:
|
||||
launchctl load ~/Library/LaunchAgents/com.nanoclaw.plist
|
||||
launchctl unload ~/Library/LaunchAgents/com.nanoclaw.plist
|
||||
```
|
||||
|
||||
## Container Build Cache

Apple Container's buildkit caches the build context aggressively. `--no-cache` alone does NOT invalidate COPY steps, because the builder's volume retains stale files. To force a truly clean rebuild:

```bash
container builder stop && container builder rm && container builder start
./container/build.sh
```

Always verify after rebuild: `container run -i --rm --entrypoint wc nanoclaw-agent:latest -l /app/src/index.ts`
|
||||
|
||||
20
README.md
@@ -6,6 +6,12 @@
|
||||
My personal Claude assistant that runs securely in containers. Lightweight and built to be understood and customized for your own needs.
|
||||
</p>
|
||||
|
||||
<p align="center">
|
||||
<a href="https://discord.gg/VGWXrf8x"><img src="https://img.shields.io/discord/1470188214710046894?label=Discord&logo=discord" alt="Discord"></a>
|
||||
</p>
|
||||
|
||||
**New:** First AI assistant to support [Agent Swarms](https://code.claude.com/docs/en/agent-teams). Spin up teams of agents that collaborate in your chat.
|
||||
|
||||
## Why I Built This
|
||||
|
||||
[OpenClaw](https://github.com/openclaw/openclaw) is an impressive project with a great vision. But I can't sleep well running software I don't understand with access to my life. OpenClaw has 52+ modules, 8 config management files, 45+ dependencies, and abstractions for 15 channel providers. Security is application-level (allowlists, pairing codes) rather than OS isolation. Everything runs in one Node process with shared memory.
|
||||
@@ -46,6 +52,7 @@ Then run `/setup`. Claude Code handles everything: dependencies, authentication,
|
||||
- **Scheduled tasks** - Recurring jobs that run Claude and can message you back
|
||||
- **Web access** - Search and fetch content
|
||||
- **Container isolation** - Agents sandboxed in Apple Container (macOS) or Docker (macOS/Linux)
|
||||
- **Agent Swarms** - Spin up teams of specialized agents that collaborate on complex tasks (first personal AI assistant to support this)
|
||||
- **Optional integrations** - Add Gmail (`/add-gmail`) and more via skills
|
||||
|
||||
## Usage
|
||||
@@ -114,13 +121,14 @@ Skills we'd love to see:
|
||||
WhatsApp (baileys) --> SQLite --> Polling loop --> Container (Claude Agent SDK) --> Response
|
||||
```
|
||||
|
||||
Single Node.js process. Agents execute in isolated Linux containers with mounted directories. IPC via filesystem. No daemons, no queues, no complexity.
|
||||
Single Node.js process. Agents execute in isolated Linux containers with mounted directories. Per-group message queue with concurrency control. IPC via filesystem.
|
||||
|
||||
Key files:
|
||||
- `src/index.ts` - Main app: WhatsApp connection, routing, IPC
|
||||
- `src/container-runner.ts` - Spawns agent containers
|
||||
- `src/index.ts` - Main app: WhatsApp connection, message loop, IPC
|
||||
- `src/group-queue.ts` - Per-group queue with global concurrency limit
|
||||
- `src/container-runner.ts` - Spawns streaming agent containers
|
||||
- `src/task-scheduler.ts` - Runs scheduled tasks
|
||||
- `src/db.ts` - SQLite operations
|
||||
- `src/db.ts` - SQLite operations (messages, groups, sessions, state)
|
||||
- `groups/*/CLAUDE.md` - Per-group memory
|
||||
|
||||
## FAQ
|
||||
@@ -161,6 +169,10 @@ Everything else (new capabilities, OS compatibility, hardware support, enhanceme
|
||||
|
||||
This keeps the base system minimal and lets every user customize their installation without inheriting features they don't want.
|
||||
|
||||
## Community
|
||||
|
||||
Questions? Ideas? [Join the Discord](https://discord.gg/VGWXrf8x).
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
|
||||
@@ -48,11 +48,13 @@ COPY agent-runner/ ./
|
||||
RUN npm run build
|
||||
|
||||
# Create workspace directories
|
||||
RUN mkdir -p /workspace/group /workspace/global /workspace/extra /workspace/ipc/messages /workspace/ipc/tasks
|
||||
RUN mkdir -p /workspace/group /workspace/global /workspace/extra /workspace/ipc/messages /workspace/ipc/tasks /workspace/ipc/input
|
||||
|
||||
# Create entrypoint script
|
||||
# Sources env from mounted /workspace/env-dir/env if it exists (workaround for Apple Container -i bug)
|
||||
RUN printf '#!/bin/bash\nset -e\n[ -f /workspace/env-dir/env ] && export $(cat /workspace/env-dir/env | xargs)\ncat > /tmp/input.json\nnode /app/dist/index.js < /tmp/input.json\n' > /app/entrypoint.sh && chmod +x /app/entrypoint.sh
|
||||
# Stdin is buffered to /tmp then piped (Apple Container requires EOF to flush stdin pipe)
|
||||
# Follow-up messages arrive via IPC files in /workspace/ipc/input/
|
||||
RUN printf '#!/bin/bash\nset -e\n[ -f /workspace/env-dir/env ] && export $(cat /workspace/env-dir/env | xargs)\ncd /app && npx tsc --outDir /tmp/dist 2>&1 >&2\nln -s /app/node_modules /tmp/dist/node_modules\nchmod -R a-w /tmp/dist\ncat > /tmp/input.json\nnode /tmp/dist/index.js < /tmp/input.json\n' > /app/entrypoint.sh && chmod +x /app/entrypoint.sh
|
||||
|
||||
# Set ownership to node user (non-root) for writable directories
|
||||
RUN chown -R node:node /workspace
|
||||
|
||||
1123
container/agent-runner/package-lock.json
generated
File diff suppressed because it is too large
@@ -9,7 +9,8 @@
|
||||
"start": "node dist/index.js"
|
||||
},
|
||||
"dependencies": {
|
||||
"@anthropic-ai/claude-agent-sdk": "0.2.29",
|
||||
"@anthropic-ai/claude-agent-sdk": "^0.2.34",
|
||||
"@modelcontextprotocol/sdk": "^1.12.1",
|
||||
"cron-parser": "^5.0.0",
|
||||
"zod": "^4.0.0"
|
||||
},
|
||||
|
||||
@@ -1,12 +1,23 @@
|
||||
/**
|
||||
* NanoClaw Agent Runner
|
||||
* Runs inside a container, receives config via stdin, outputs result to stdout
|
||||
*
|
||||
* Input protocol:
|
||||
* Stdin: Full ContainerInput JSON (read until EOF, like before)
|
||||
* IPC: Follow-up messages written as JSON files to /workspace/ipc/input/
|
||||
* Files: {type:"message", text:"..."}.json — polled and consumed
|
||||
* Sentinel: /workspace/ipc/input/_close — signals session end
|
||||
*
|
||||
* Stdout protocol:
|
||||
* Each result is wrapped in OUTPUT_START_MARKER / OUTPUT_END_MARKER pairs.
|
||||
* Multiple results may be emitted (one per agent teams result).
|
||||
* Final marker after loop ends signals completion.
|
||||
*/
|
||||
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
import { query, HookCallback, PreCompactHookInput } from '@anthropic-ai/claude-agent-sdk';
|
||||
import { createIpcMcp } from './ipc-mcp.js';
|
||||
import { fileURLToPath } from 'url';
|
||||
|
||||
interface ContainerInput {
|
||||
prompt: string;
|
||||
@@ -17,35 +28,9 @@ interface ContainerInput {
|
||||
isScheduledTask?: boolean;
|
||||
}
|
||||
|
||||
interface AgentResponse {
|
||||
outputType: 'message' | 'log';
|
||||
userMessage?: string;
|
||||
internalLog?: string;
|
||||
}
|
||||
|
||||
const AGENT_RESPONSE_SCHEMA = {
|
||||
type: 'object',
|
||||
properties: {
|
||||
outputType: {
|
||||
type: 'string',
|
||||
enum: ['message', 'log'],
|
||||
description: '"message": the userMessage field contains a message to send to the user or group. "log": the output will not be sent to the user or group.',
|
||||
},
|
||||
userMessage: {
|
||||
type: 'string',
|
||||
description: 'A message to send to the user or group. Include when outputType is "message".',
|
||||
},
|
||||
internalLog: {
|
||||
type: 'string',
|
||||
description: 'Information that will be logged internally but not sent to the user or group.',
|
||||
},
|
||||
},
|
||||
required: ['outputType'],
|
||||
} as const;
|
||||
|
||||
interface ContainerOutput {
|
||||
status: 'success' | 'error';
|
||||
result: AgentResponse | null;
|
||||
result: string | null;
|
||||
newSessionId?: string;
|
||||
error?: string;
|
||||
}
|
||||
@@ -61,6 +46,53 @@ interface SessionsIndex {
|
||||
entries: SessionEntry[];
|
||||
}
|
||||
|
||||
interface SDKUserMessage {
  type: 'user';
  message: { role: 'user'; content: string };
  parent_tool_use_id: null;
  session_id: string;
}

const IPC_INPUT_DIR = '/workspace/ipc/input';
const IPC_INPUT_CLOSE_SENTINEL = path.join(IPC_INPUT_DIR, '_close');
const IPC_POLL_MS = 500;

/**
 * Push-based async iterable for streaming user messages to the SDK.
 * Keeps the iterable alive until end() is called, preventing isSingleUserTurn.
 */
class MessageStream {
  private queue: SDKUserMessage[] = [];
  private waiting: (() => void) | null = null;
  private done = false;

  push(text: string): void {
    this.queue.push({
      type: 'user',
      message: { role: 'user', content: text },
      parent_tool_use_id: null,
      session_id: '',
    });
    this.waiting?.();
  }

  end(): void {
    this.done = true;
    this.waiting?.();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<SDKUserMessage> {
    while (true) {
      while (this.queue.length > 0) {
        yield this.queue.shift()!;
      }
      if (this.done) return;
      await new Promise<void>(r => { this.waiting = r; });
      this.waiting = null;
    }
  }
}
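
To see why a push-based iterable keeps the consumer alive, here is a small standalone sketch of the same pattern. `PushStream`, `collect`, and `demo` are hypothetical names; the class is a simplified, generic stand-in for `MessageStream` and does not involve the SDK.

```typescript
// Simplified, generic stand-in for MessageStream (hypothetical; no SDK involved).
class PushStream<T> {
  private queue: T[] = [];
  private wake: (() => void) | null = null;
  private done = false;

  push(item: T): void {
    this.queue.push(item);
    this.wake?.();
  }

  end(): void {
    this.done = true;
    this.wake?.();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<T> {
    while (true) {
      while (this.queue.length > 0) yield this.queue.shift()!;
      if (this.done) return;
      // Suspend until push() or end() wakes the generator up again.
      await new Promise<void>(resolve => { this.wake = resolve; });
      this.wake = null;
    }
  }
}

// Consumer: collects everything until the producer calls end().
async function collect<T>(stream: PushStream<T>): Promise<T[]> {
  const seen: T[] = [];
  for await (const item of stream) seen.push(item);
  return seen;
}

// Producer: the first item is available immediately; later ones arrive
// asynchronously, and the consumer stays suspended in between.
function demo(): Promise<string[]> {
  const stream = new PushStream<string>();
  stream.push('initial prompt');
  setTimeout(() => stream.push('follow-up'), 10);
  setTimeout(() => stream.end(), 20);
  return collect(stream);
}
```

The `for await` loop in `collect` does not finish after the first item; it only completes once `end()` is called, which is the property the agent runner relies on to keep a query open for follow-up input.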
|
||||
|
||||
async function readStdin(): Promise<string> {
|
||||
return new Promise((resolve, reject) => {
|
||||
let data = '';
|
||||
@@ -85,7 +117,6 @@ function log(message: string): void {
|
||||
}
|
||||
|
||||
function getSessionSummary(sessionId: string, transcriptPath: string): string | null {
|
||||
// sessions-index.json is in the same directory as the transcript
|
||||
const projectDir = path.dirname(transcriptPath);
|
||||
const indexPath = path.join(projectDir, 'sessions-index.json');
|
||||
|
||||
@@ -226,13 +257,200 @@ function formatTranscriptMarkdown(messages: ParsedMessage[], title?: string | nu
|
||||
return lines.join('\n');
|
||||
}
|
||||
|
||||
/**
 * Check for _close sentinel.
 */
function shouldClose(): boolean {
  if (fs.existsSync(IPC_INPUT_CLOSE_SENTINEL)) {
    try { fs.unlinkSync(IPC_INPUT_CLOSE_SENTINEL); } catch { /* ignore */ }
    return true;
  }
  return false;
}

/**
 * Drain all pending IPC input messages.
 * Returns messages found, or empty array.
 */
function drainIpcInput(): string[] {
  try {
    fs.mkdirSync(IPC_INPUT_DIR, { recursive: true });
    const files = fs.readdirSync(IPC_INPUT_DIR)
      .filter(f => f.endsWith('.json'))
      .sort();

    const messages: string[] = [];
    for (const file of files) {
      const filePath = path.join(IPC_INPUT_DIR, file);
      try {
        const data = JSON.parse(fs.readFileSync(filePath, 'utf-8'));
        fs.unlinkSync(filePath);
        if (data.type === 'message' && data.text) {
          messages.push(data.text);
        }
      } catch (err) {
        log(`Failed to process input file ${file}: ${err instanceof Error ? err.message : String(err)}`);
        try { fs.unlinkSync(filePath); } catch { /* ignore */ }
      }
    }
    return messages;
  } catch (err) {
    log(`IPC drain error: ${err instanceof Error ? err.message : String(err)}`);
    return [];
  }
}

/**
 * Wait for a new IPC message or _close sentinel.
 * Returns the messages as a single string, or null if _close.
 */
function waitForIpcMessage(): Promise<string | null> {
  return new Promise((resolve) => {
    const poll = () => {
      if (shouldClose()) {
        resolve(null);
        return;
      }
      const messages = drainIpcInput();
      if (messages.length > 0) {
        resolve(messages.join('\n'));
        return;
      }
      setTimeout(poll, IPC_POLL_MS);
    };
    poll();
  });
}
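
The other half of this protocol lives on the host, which is not shown in this hunk. A minimal sketch of what a host-side writer could look like, assuming it targets the directory mounted at `/workspace/ipc/input`: `writeIpcInput`, `requestClose`, and `roundTrip` are hypothetical names, and the file-naming scheme is an assumption chosen so that the agent's filename sort preserves send order.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical host-side helpers; only the {type, text} payload shape, the
// .json suffix, and the _close file name come from the agent code above.

let sequence = 0;

/** Queue a follow-up message for a running agent. */
function writeIpcInput(inputDir: string, text: string): string {
  fs.mkdirSync(inputDir, { recursive: true });
  // Zero-padded timestamp plus counter keeps lexicographic order equal to
  // send order, because the agent sorts file names before reading them.
  const name = `${String(Date.now()).padStart(15, '0')}-${String(sequence++).padStart(6, '0')}.json`;
  const finalPath = path.join(inputDir, name);
  // Write under a non-.json name first, then rename: the agent only picks up
  // *.json files, so it never sees a half-written message.
  const tempPath = `${finalPath}.tmp`;
  fs.writeFileSync(tempPath, JSON.stringify({ type: 'message', text }));
  fs.renameSync(tempPath, finalPath);
  return name;
}

/** Ask the agent to finish: an empty file named _close is the sentinel. */
function requestClose(inputDir: string): void {
  fs.mkdirSync(inputDir, { recursive: true });
  fs.writeFileSync(path.join(inputDir, '_close'), '');
}

/** Demo: write two messages into a temp dir and read them back in agent order. */
function roundTrip(): { texts: string[]; closeRequested: boolean } {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ipc-input-'));
  writeIpcInput(dir, 'first');
  writeIpcInput(dir, 'second');
  requestClose(dir);
  const texts = fs.readdirSync(dir)
    .filter(f => f.endsWith('.json'))
    .sort()
    .map(f => JSON.parse(fs.readFileSync(path.join(dir, f), 'utf-8')).text as string);
  return { texts, closeRequested: fs.existsSync(path.join(dir, '_close')) };
}
```

The temp-file-then-rename step mirrors what `writeIpcFile` does in the MCP server for the opposite direction, so both sides of the file-based channel avoid partial reads the same way.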
|
||||
|
||||
/**
|
||||
* Run a single query and stream results via writeOutput.
|
||||
* Uses MessageStream (AsyncIterable) to keep isSingleUserTurn=false,
|
||||
* allowing agent teams subagents to run to completion.
|
||||
* Also pipes IPC messages into the stream during the query.
|
||||
*/
|
||||
async function runQuery(
|
||||
prompt: string,
|
||||
sessionId: string | undefined,
|
||||
mcpServerPath: string,
|
||||
containerInput: ContainerInput,
|
||||
resumeAt?: string,
|
||||
): Promise<{ newSessionId?: string; lastAssistantUuid?: string; closedDuringQuery: boolean }> {
|
||||
const stream = new MessageStream();
|
||||
stream.push(prompt);
|
||||
|
||||
// Poll IPC for follow-up messages and _close sentinel during the query
|
||||
let ipcPolling = true;
|
||||
let closedDuringQuery = false;
|
||||
const pollIpcDuringQuery = () => {
|
||||
if (!ipcPolling) return;
|
||||
if (shouldClose()) {
|
||||
log('Close sentinel detected during query, ending stream');
|
||||
closedDuringQuery = true;
|
||||
stream.end();
|
||||
ipcPolling = false;
|
||||
return;
|
||||
}
|
||||
const messages = drainIpcInput();
|
||||
for (const text of messages) {
|
||||
log(`Piping IPC message into active query (${text.length} chars)`);
|
||||
stream.push(text);
|
||||
}
|
||||
setTimeout(pollIpcDuringQuery, IPC_POLL_MS);
|
||||
};
|
||||
setTimeout(pollIpcDuringQuery, IPC_POLL_MS);
|
||||
|
||||
let newSessionId: string | undefined;
|
||||
let lastAssistantUuid: string | undefined;
|
||||
let messageCount = 0;
|
||||
let resultCount = 0;
|
||||
|
||||
// Load global CLAUDE.md as additional system context (shared across all groups)
|
||||
const globalClaudeMdPath = '/workspace/global/CLAUDE.md';
|
||||
let globalClaudeMd: string | undefined;
|
||||
if (!containerInput.isMain && fs.existsSync(globalClaudeMdPath)) {
|
||||
globalClaudeMd = fs.readFileSync(globalClaudeMdPath, 'utf-8');
|
||||
}
|
||||
|
||||
for await (const message of query({
|
||||
prompt: stream,
|
||||
options: {
|
||||
cwd: '/workspace/group',
|
||||
resume: sessionId,
|
||||
resumeSessionAt: resumeAt,
|
||||
systemPrompt: globalClaudeMd
|
||||
? { type: 'preset' as const, preset: 'claude_code' as const, append: globalClaudeMd }
|
||||
: undefined,
|
||||
allowedTools: [
|
||||
'Bash',
|
||||
'Read', 'Write', 'Edit', 'Glob', 'Grep',
|
||||
'WebSearch', 'WebFetch',
|
||||
'Task', 'TaskOutput', 'TaskStop',
|
||||
'TeamCreate', 'TeamDelete', 'SendMessage',
|
||||
'TodoWrite', 'ToolSearch', 'Skill',
|
||||
'NotebookEdit',
|
||||
'mcp__nanoclaw__*'
|
||||
],
|
||||
permissionMode: 'bypassPermissions',
|
||||
allowDangerouslySkipPermissions: true,
|
||||
settingSources: ['project', 'user'],
|
||||
mcpServers: {
|
||||
nanoclaw: {
|
||||
command: 'node',
|
||||
args: [mcpServerPath],
|
||||
env: {
|
||||
NANOCLAW_CHAT_JID: containerInput.chatJid,
|
||||
NANOCLAW_GROUP_FOLDER: containerInput.groupFolder,
|
||||
NANOCLAW_IS_MAIN: containerInput.isMain ? '1' : '0',
|
||||
},
|
||||
},
|
||||
},
|
||||
hooks: {
|
||||
PreCompact: [{ hooks: [createPreCompactHook()] }]
|
||||
},
|
||||
}
|
||||
})) {
|
||||
messageCount++;
|
||||
const msgType = message.type === 'system' ? `system/${(message as { subtype?: string }).subtype}` : message.type;
|
||||
log(`[msg #${messageCount}] type=${msgType}`);
|
||||
|
||||
if (message.type === 'assistant' && 'uuid' in message) {
|
||||
lastAssistantUuid = (message as { uuid: string }).uuid;
|
||||
}
|
||||
|
||||
if (message.type === 'system' && message.subtype === 'init') {
|
||||
newSessionId = message.session_id;
|
||||
log(`Session initialized: ${newSessionId}`);
|
||||
}
|
||||
|
||||
if (message.type === 'system' && (message as { subtype?: string }).subtype === 'task_notification') {
|
||||
const tn = message as { task_id: string; status: string; summary: string };
|
||||
log(`Task notification: task=${tn.task_id} status=${tn.status} summary=${tn.summary}`);
|
||||
}
|
||||
|
||||
if (message.type === 'result') {
|
||||
resultCount++;
|
||||
const textResult = 'result' in message ? (message as { result?: string }).result : null;
|
||||
log(`Result #${resultCount}: subtype=${message.subtype}${textResult ? ` text=${textResult.slice(0, 200)}` : ''}`);
|
||||
writeOutput({
|
||||
status: 'success',
|
||||
result: textResult || null,
|
||||
newSessionId
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
ipcPolling = false;
|
||||
log(`Query done. Messages: ${messageCount}, results: ${resultCount}, lastAssistantUuid: ${lastAssistantUuid || 'none'}, closedDuringQuery: ${closedDuringQuery}`);
|
||||
return { newSessionId, lastAssistantUuid, closedDuringQuery };
|
||||
}
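
Since `runQuery` may call `writeOutput` several times per container run, the host has to pick results out of stdout as they arrive. The following is a sketch of such an incremental parser under stated assumptions: the marker strings are placeholders (the real `OUTPUT_START_MARKER` / `OUTPUT_END_MARKER` values are defined elsewhere and may differ), and `OutputParser` is a hypothetical name.

```typescript
// Placeholder marker strings: the real OUTPUT_START_MARKER / OUTPUT_END_MARKER
// values are defined elsewhere in the codebase and may differ.
const START = '<<<OUTPUT_START>>>';
const END = '<<<OUTPUT_END>>>';

interface ContainerOutput {
  status: 'success' | 'error';
  result: string | null;
  newSessionId?: string;
  error?: string;
}

/** Incremental parser: feed stdout chunks, get back every completed result. */
class OutputParser {
  private buffer = '';

  feed(chunk: string): ContainerOutput[] {
    this.buffer += chunk;
    const outputs: ContainerOutput[] = [];
    while (true) {
      const start = this.buffer.indexOf(START);
      if (start === -1) {
        // Keep only a tail long enough to hold a marker split across chunks.
        this.buffer = this.buffer.slice(-(START.length - 1));
        break;
      }
      const end = this.buffer.indexOf(END, start + START.length);
      if (end === -1) break; // payload incomplete; wait for more data
      const json = this.buffer.slice(start + START.length, end).trim();
      this.buffer = this.buffer.slice(end + END.length);
      try {
        outputs.push(JSON.parse(json) as ContainerOutput);
      } catch {
        // Malformed payload: skip it rather than abort the whole stream.
      }
    }
    return outputs;
  }
}
```

A caller would create one parser per container and pass each stdout `data` chunk to `feed`, forwarding any output whose `result` is non-null and using `newSessionId` to update the stored session.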
|
||||
|
||||
async function main(): Promise<void> {
|
||||
let input: ContainerInput;
|
||||
let containerInput: ContainerInput;
|
||||
|
||||
try {
|
||||
const stdinData = await readStdin();
|
||||
input = JSON.parse(stdinData);
|
||||
log(`Received input for group: ${input.groupFolder}`);
|
||||
containerInput = JSON.parse(stdinData);
|
||||
log(`Received input for group: ${containerInput.groupFolder}`);
|
||||
} catch (err) {
|
||||
writeOutput({
|
||||
status: 'error',
|
||||
@@ -242,98 +460,70 @@ async function main(): Promise<void> {
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const ipcMcp = createIpcMcp({
|
||||
chatJid: input.chatJid,
|
||||
groupFolder: input.groupFolder,
|
||||
isMain: input.isMain
|
||||
});
|
||||
const __dirname = path.dirname(fileURLToPath(import.meta.url));
|
||||
const mcpServerPath = path.join(__dirname, 'ipc-mcp-stdio.js');
|
||||
|
||||
let result: AgentResponse | null = null;
|
||||
let newSessionId: string | undefined;
|
||||
let sessionId = containerInput.sessionId;
|
||||
fs.mkdirSync(IPC_INPUT_DIR, { recursive: true });
|
||||
|
||||
// Add context for scheduled tasks
|
||||
let prompt = input.prompt;
|
||||
if (input.isScheduledTask) {
|
||||
prompt = `[SCHEDULED TASK - The following message was sent automatically and is not coming directly from the user or group.]\n\n${input.prompt}`;
|
||||
}
|
||||
|
||||
// Load global CLAUDE.md as additional system context (shared across all groups)
|
||||
const globalClaudeMdPath = '/workspace/global/CLAUDE.md';
|
||||
let globalClaudeMd: string | undefined;
|
||||
if (!input.isMain && fs.existsSync(globalClaudeMdPath)) {
|
||||
globalClaudeMd = fs.readFileSync(globalClaudeMdPath, 'utf-8');
|
||||
// Clean up stale _close sentinel from previous container runs
|
||||
try { fs.unlinkSync(IPC_INPUT_CLOSE_SENTINEL); } catch { /* ignore */ }
|
||||
|
||||
// Build initial prompt (drain any pending IPC messages too)
|
||||
let prompt = containerInput.prompt;
|
||||
if (containerInput.isScheduledTask) {
|
||||
prompt = `[SCHEDULED TASK - The following message was sent automatically and is not coming directly from the user or group.]\n\n${prompt}`;
|
||||
}
|
||||
const pending = drainIpcInput();
|
||||
if (pending.length > 0) {
|
||||
log(`Draining ${pending.length} pending IPC messages into initial prompt`);
|
||||
prompt += '\n' + pending.join('\n');
|
||||
}
|
||||
|
||||
// Query loop: run query → wait for IPC message → run new query → repeat
|
||||
let resumeAt: string | undefined;
|
||||
try {
|
||||
log('Starting agent...');
|
||||
while (true) {
|
||||
log(`Starting query (session: ${sessionId || 'new'}, resumeAt: ${resumeAt || 'latest'})...`);
|
||||
|
||||
for await (const message of query({
|
||||
prompt,
|
||||
options: {
|
||||
cwd: '/workspace/group',
|
||||
resume: input.sessionId,
|
||||
systemPrompt: globalClaudeMd
|
||||
? { type: 'preset' as const, preset: 'claude_code' as const, append: globalClaudeMd }
|
||||
: undefined,
|
||||
allowedTools: [
|
||||
'Bash',
|
||||
'Read', 'Write', 'Edit', 'Glob', 'Grep',
|
||||
'WebSearch', 'WebFetch',
|
||||
'mcp__nanoclaw__*'
|
||||
],
|
||||
permissionMode: 'bypassPermissions',
|
||||
allowDangerouslySkipPermissions: true,
|
||||
settingSources: ['project'],
|
||||
mcpServers: {
|
||||
nanoclaw: ipcMcp
|
||||
},
|
||||
hooks: {
|
||||
PreCompact: [{ hooks: [createPreCompactHook()] }]
|
||||
},
|
||||
outputFormat: {
|
||||
type: 'json_schema',
|
||||
schema: AGENT_RESPONSE_SCHEMA,
|
||||
}
|
||||
const queryResult = await runQuery(prompt, sessionId, mcpServerPath, containerInput, resumeAt);
|
||||
if (queryResult.newSessionId) {
|
||||
sessionId = queryResult.newSessionId;
|
||||
}
|
||||
})) {
|
||||
if (message.type === 'system' && message.subtype === 'init') {
|
||||
newSessionId = message.session_id;
|
||||
log(`Session initialized: ${newSessionId}`);
|
||||
if (queryResult.lastAssistantUuid) {
|
||||
resumeAt = queryResult.lastAssistantUuid;
|
||||
}
|
||||
|
||||
if (message.type === 'result') {
|
||||
if (message.subtype === 'success' && message.structured_output) {
|
||||
result = message.structured_output as AgentResponse;
|
||||
if (result.outputType === 'message' && !result.userMessage) {
|
||||
log('Warning: outputType is "message" but userMessage is missing, treating as "log"');
|
||||
result = { outputType: 'log', internalLog: result.internalLog };
|
||||
}
|
||||
log(`Agent result: outputType=${result.outputType}${result.internalLog ? `, log=${result.internalLog}` : ''}`);
|
||||
} else if (message.subtype === 'success' || message.subtype === 'error_max_structured_output_retries') {
|
||||
// Structured output missing or agent couldn't produce valid structured output — fall back to text
|
||||
log(`Structured output unavailable (subtype=${message.subtype}), falling back to text`);
|
||||
const textResult = 'result' in message ? (message as { result?: string }).result : null;
|
||||
if (textResult) {
|
||||
result = { outputType: 'message', userMessage: textResult };
|
||||
}
|
||||
}
|
||||
// If _close was consumed during the query, exit immediately.
|
||||
// Don't emit a session-update marker (it would reset the host's
|
||||
// idle timer and cause a 30-min delay before the next _close).
|
||||
if (queryResult.closedDuringQuery) {
|
||||
log('Close sentinel consumed during query, exiting');
|
||||
break;
|
||||
}
|
||||
|
||||
// Emit session update so host can track it
|
||||
writeOutput({ status: 'success', result: null, newSessionId: sessionId });
|
||||
|
||||
log('Query ended, waiting for next IPC message...');
|
||||
|
||||
// Wait for the next message or _close sentinel
|
||||
const nextMessage = await waitForIpcMessage();
|
||||
if (nextMessage === null) {
|
||||
log('Close sentinel received, exiting');
|
||||
break;
|
||||
}
|
||||
|
||||
log(`Got new message (${nextMessage.length} chars), starting new query`);
|
||||
prompt = nextMessage;
|
||||
}
|
||||
|
||||
log('Agent completed successfully');
|
||||
writeOutput({
|
||||
status: 'success',
|
||||
result: result ?? { outputType: 'log' },
|
||||
newSessionId
|
||||
});
|
||||
|
||||
} catch (err) {
|
||||
const errorMessage = err instanceof Error ? err.message : String(err);
|
||||
log(`Agent error: ${errorMessage}`);
|
||||
writeOutput({
|
||||
status: 'error',
|
||||
result: null,
|
||||
newSessionId,
|
||||
newSessionId: sessionId,
|
||||
error: errorMessage
|
||||
});
|
||||
process.exit(1);
|
||||
|
||||
275
container/agent-runner/src/ipc-mcp-stdio.ts
Normal file
@@ -0,0 +1,275 @@
|
||||
/**
 * Stdio MCP Server for NanoClaw
 * Standalone process that agent teams subagents can inherit.
 * Reads context from environment variables, writes IPC files for the host.
 */

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
import fs from 'fs';
import path from 'path';
import { CronExpressionParser } from 'cron-parser';

const IPC_DIR = '/workspace/ipc';
const MESSAGES_DIR = path.join(IPC_DIR, 'messages');
const TASKS_DIR = path.join(IPC_DIR, 'tasks');

// Context from environment variables (set by the agent runner)
const chatJid = process.env.NANOCLAW_CHAT_JID!;
const groupFolder = process.env.NANOCLAW_GROUP_FOLDER!;
const isMain = process.env.NANOCLAW_IS_MAIN === '1';

function writeIpcFile(dir: string, data: object): string {
  fs.mkdirSync(dir, { recursive: true });

  const filename = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}.json`;
  const filepath = path.join(dir, filename);

  // Atomic write: temp file then rename
  const tempPath = `${filepath}.tmp`;
  fs.writeFileSync(tempPath, JSON.stringify(data, null, 2));
  fs.renameSync(tempPath, filepath);

  return filename;
}

const server = new McpServer({
  name: 'nanoclaw',
  version: '1.0.0',
});
|
||||
|
||||
server.tool(
|
||||
'send_message',
|
||||
"Send a message to the user or group immediately while you're still running. Use this for progress updates or to send multiple messages. You can call this multiple times. Note: when running as a scheduled task, your final output is NOT sent to the user — use this tool if you need to communicate with the user or group.",
|
||||
{ text: z.string().describe('The message text to send') },
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'message',
|
||||
chatJid,
|
||||
text: args.text,
|
||||
groupFolder,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
writeIpcFile(MESSAGES_DIR, data);
|
||||
|
||||
return { content: [{ type: 'text' as const, text: 'Message sent.' }] };
|
||||
},
|
||||
);
|
||||
|
||||
server.tool(
|
||||
'schedule_task',
|
||||
`Schedule a recurring or one-time task. The task will run as a full agent with access to all tools.
|
||||
|
||||
CONTEXT MODE - Choose based on task type:
|
||||
\u2022 "group": Task runs in the group's conversation context, with access to chat history. Use for tasks that need context about ongoing discussions, user preferences, or recent interactions.
|
||||
\u2022 "isolated": Task runs in a fresh session with no conversation history. Use for independent tasks that don't need prior context. When using isolated mode, include all necessary context in the prompt itself.
|
||||
|
||||
If unsure which mode to use, you can ask the user. Examples:
|
||||
- "Remind me about our discussion" \u2192 group (needs conversation context)
|
||||
- "Check the weather every morning" \u2192 isolated (self-contained task)
|
||||
- "Follow up on my request" \u2192 group (needs to know what was requested)
|
||||
- "Generate a daily report" \u2192 isolated (just needs instructions in prompt)
|
||||
|
||||
MESSAGING BEHAVIOR - When running as a scheduled task, the agent's final output is only logged and is NOT sent to the user or group. To reach the user, the task agent must call send_message. Include guidance in the prompt about whether the agent should:
|
||||
\u2022 Always send a message (e.g., reminders, daily briefings)
|
||||
\u2022 Only send a message when there's something to report (e.g., "notify me if...")
|
||||
\u2022 Never send a message (background maintenance tasks)
|
||||
|
||||
SCHEDULE VALUE FORMAT (all times are LOCAL timezone):
|
||||
\u2022 cron: Standard cron expression (e.g., "*/5 * * * *" for every 5 minutes, "0 9 * * *" for daily at 9am LOCAL time)
|
||||
\u2022 interval: Milliseconds between runs (e.g., "300000" for 5 minutes, "3600000" for 1 hour)
|
||||
\u2022 once: Local time WITHOUT "Z" suffix (e.g., "2026-02-01T15:30:00"). Do NOT use UTC/Z suffix.`,
|
||||
{
|
||||
prompt: z.string().describe('What the agent should do when the task runs. For isolated mode, include all necessary context here.'),
|
||||
schedule_type: z.enum(['cron', 'interval', 'once']).describe('cron=recurring at specific times, interval=recurring every N ms, once=run once at specific time'),
|
||||
schedule_value: z.string().describe('cron: "*/5 * * * *" | interval: milliseconds like "300000" | once: local timestamp like "2026-02-01T15:30:00" (no Z suffix!)'),
|
||||
context_mode: z.enum(['group', 'isolated']).default('group').describe('group=runs with chat history and memory, isolated=fresh session (include context in prompt)'),
|
||||
target_group_jid: z.string().optional().describe('(Main group only) JID of the group to schedule the task for. Defaults to the current group.'),
|
||||
},
|
||||
async (args) => {
|
||||
// Validate schedule_value before writing IPC
|
||||
if (args.schedule_type === 'cron') {
|
||||
try {
|
||||
CronExpressionParser.parse(args.schedule_value);
|
||||
} catch {
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: `Invalid cron: "${args.schedule_value}". Use format like "0 9 * * *" (daily 9am) or "*/5 * * * *" (every 5 min).` }],
|
||||
isError: true,
|
||||
};
|
||||
}
|
||||
} else if (args.schedule_type === 'interval') {
|
||||
const ms = parseInt(args.schedule_value, 10);
|
||||
if (isNaN(ms) || ms <= 0) {
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: `Invalid interval: "${args.schedule_value}". Must be positive milliseconds (e.g., "300000" for 5 min).` }],
|
||||
isError: true,
|
||||
};
|
||||
}
|
||||
} else if (args.schedule_type === 'once') {
|
||||
const date = new Date(args.schedule_value);
|
||||
if (isNaN(date.getTime())) {
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: `Invalid timestamp: "${args.schedule_value}". Use local time without a "Z" suffix, like "2026-02-01T15:30:00".` }],
|
||||
isError: true,
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// Non-main groups can only schedule for themselves
|
||||
const targetJid = isMain && args.target_group_jid ? args.target_group_jid : chatJid;
|
||||
|
||||
const data = {
|
||||
type: 'schedule_task',
|
||||
prompt: args.prompt,
|
||||
schedule_type: args.schedule_type,
|
||||
schedule_value: args.schedule_value,
|
||||
context_mode: args.context_mode || 'group',
|
||||
targetJid,
|
||||
createdBy: groupFolder,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
const filename = writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: `Task scheduled (${filename}): ${args.schedule_type} - ${args.schedule_value}` }],
|
||||
};
|
||||
},
|
||||
);
|
||||
|
||||
server.tool(
|
||||
'list_tasks',
|
||||
"List all scheduled tasks. From main: shows all tasks. From other groups: shows only that group's tasks.",
|
||||
{},
|
||||
async () => {
|
||||
const tasksFile = path.join(IPC_DIR, 'current_tasks.json');
|
||||
|
||||
try {
|
||||
if (!fs.existsSync(tasksFile)) {
|
||||
return { content: [{ type: 'text' as const, text: 'No scheduled tasks found.' }] };
|
||||
}
|
||||
|
||||
const allTasks = JSON.parse(fs.readFileSync(tasksFile, 'utf-8'));
|
||||
|
||||
const tasks = isMain
|
||||
? allTasks
|
||||
: allTasks.filter((t: { groupFolder: string }) => t.groupFolder === groupFolder);
|
||||
|
||||
if (tasks.length === 0) {
|
||||
return { content: [{ type: 'text' as const, text: 'No scheduled tasks found.' }] };
|
||||
}
|
||||
|
||||
const formatted = tasks
|
||||
.map(
|
||||
(t: { id: string; prompt: string; schedule_type: string; schedule_value: string; status: string; next_run: string }) =>
|
||||
`- [${t.id}] ${t.prompt.slice(0, 50)}... (${t.schedule_type}: ${t.schedule_value}) - ${t.status}, next: ${t.next_run || 'N/A'}`,
|
||||
)
|
||||
.join('\n');
|
||||
|
||||
return { content: [{ type: 'text' as const, text: `Scheduled tasks:\n${formatted}` }] };
|
||||
} catch (err) {
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: `Error reading tasks: ${err instanceof Error ? err.message : String(err)}` }],
isError: true,
|
||||
};
|
||||
}
|
||||
},
|
||||
);
|
||||
|
||||
server.tool(
|
||||
'pause_task',
|
||||
'Pause a scheduled task. It will not run until resumed.',
|
||||
{ task_id: z.string().describe('The task ID to pause') },
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'pause_task',
|
||||
taskId: args.task_id,
|
||||
groupFolder,
|
||||
isMain,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return { content: [{ type: 'text' as const, text: `Task ${args.task_id} pause requested.` }] };
|
||||
},
|
||||
);
|
||||
|
||||
server.tool(
|
||||
'resume_task',
|
||||
'Resume a paused task.',
|
||||
{ task_id: z.string().describe('The task ID to resume') },
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'resume_task',
|
||||
taskId: args.task_id,
|
||||
groupFolder,
|
||||
isMain,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return { content: [{ type: 'text' as const, text: `Task ${args.task_id} resume requested.` }] };
|
||||
},
|
||||
);
|
||||
|
||||
server.tool(
|
||||
'cancel_task',
|
||||
'Cancel and delete a scheduled task.',
|
||||
{ task_id: z.string().describe('The task ID to cancel') },
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'cancel_task',
|
||||
taskId: args.task_id,
|
||||
groupFolder,
|
||||
isMain,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return { content: [{ type: 'text' as const, text: `Task ${args.task_id} cancellation requested.` }] };
|
||||
},
|
||||
);
|
||||
|
||||
server.tool(
|
||||
'register_group',
|
||||
`Register a new WhatsApp group so the agent can respond to messages there. Main group only.
|
||||
|
||||
Use available_groups.json to find the JID for a group. The folder name should be lowercase with hyphens (e.g., "family-chat").`,
|
||||
{
|
||||
jid: z.string().describe('The WhatsApp JID (e.g., "120363336345536173@g.us")'),
|
||||
name: z.string().describe('Display name for the group'),
|
||||
folder: z.string().describe('Folder name for group files (lowercase, hyphens, e.g., "family-chat")'),
|
||||
trigger: z.string().describe('Trigger word (e.g., "@Andy")'),
|
||||
},
|
||||
async (args) => {
|
||||
if (!isMain) {
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: 'Only the main group can register new groups.' }],
|
||||
isError: true,
|
||||
};
|
||||
}
|
||||
|
||||
const data = {
|
||||
type: 'register_group',
|
||||
jid: args.jid,
|
||||
name: args.name,
|
||||
folder: args.folder,
|
||||
trigger: args.trigger,
|
||||
timestamp: new Date().toISOString(),
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{ type: 'text' as const, text: `Group "${args.name}" registered. It will start receiving messages immediately.` }],
|
||||
};
|
||||
},
|
||||
);
|
||||
|
||||
// Start the stdio transport
|
||||
const transport = new StdioServerTransport();
|
||||
await server.connect(transport);
|
||||
@@ -1,320 +0,0 @@
|
||||
/**
|
||||
* IPC-based MCP Server for NanoClaw
|
||||
* Writes messages and tasks to files for the host process to pick up
|
||||
*/
|
||||
|
||||
import { createSdkMcpServer, tool } from '@anthropic-ai/claude-agent-sdk';
|
||||
import { z } from 'zod';
|
||||
import fs from 'fs';
|
||||
import path from 'path';
|
||||
import { CronExpressionParser } from 'cron-parser';
|
||||
|
||||
const IPC_DIR = '/workspace/ipc';
|
||||
const MESSAGES_DIR = path.join(IPC_DIR, 'messages');
|
||||
const TASKS_DIR = path.join(IPC_DIR, 'tasks');
|
||||
|
||||
export interface IpcMcpContext {
|
||||
chatJid: string;
|
||||
groupFolder: string;
|
||||
isMain: boolean;
|
||||
}
|
||||
|
||||
function writeIpcFile(dir: string, data: object): string {
|
||||
fs.mkdirSync(dir, { recursive: true });
|
||||
|
||||
const filename = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}.json`;
|
||||
const filepath = path.join(dir, filename);
|
||||
|
||||
// Atomic write: temp file then rename
|
||||
const tempPath = `${filepath}.tmp`;
|
||||
fs.writeFileSync(tempPath, JSON.stringify(data, null, 2));
|
||||
fs.renameSync(tempPath, filepath);
|
||||
|
||||
return filename;
|
||||
}
|
||||
|
||||
export function createIpcMcp(ctx: IpcMcpContext) {
|
||||
const { chatJid, groupFolder, isMain } = ctx;
|
||||
|
||||
return createSdkMcpServer({
|
||||
name: 'nanoclaw',
|
||||
version: '1.0.0',
|
||||
tools: [
|
||||
tool(
|
||||
'send_message',
|
||||
'Send a message to the user or group. The message is delivered immediately while you\'re still running. You can call this multiple times to send multiple messages.',
|
||||
{
|
||||
text: z.string().describe('The message text to send')
|
||||
},
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'message',
|
||||
chatJid,
|
||||
text: args.text,
|
||||
groupFolder,
|
||||
timestamp: new Date().toISOString()
|
||||
};
|
||||
|
||||
writeIpcFile(MESSAGES_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: 'Message sent.'
|
||||
}]
|
||||
};
|
||||
}
|
||||
),
|
||||
|
||||
tool(
|
||||
'schedule_task',
|
||||
`Schedule a recurring or one-time task. The task will run as a full agent with access to all tools.
|
||||
|
||||
CONTEXT MODE - Choose based on task type:
|
||||
• "group": Task runs in the group's conversation context, with access to chat history. Use for tasks that need context about ongoing discussions, user preferences, or recent interactions.
|
||||
• "isolated": Task runs in a fresh session with no conversation history. Use for independent tasks that don't need prior context. When using isolated mode, include all necessary context in the prompt itself.
|
||||
|
||||
If unsure which mode to use, you can ask the user. Examples:
|
||||
- "Remind me about our discussion" → group (needs conversation context)
|
||||
- "Check the weather every morning" → isolated (self-contained task)
|
||||
- "Follow up on my request" → group (needs to know what was requested)
|
||||
- "Generate a daily report" → isolated (just needs instructions in prompt)
|
||||
|
||||
SCHEDULE VALUE FORMAT (all times are LOCAL timezone):
|
||||
• cron: Standard cron expression (e.g., "*/5 * * * *" for every 5 minutes, "0 9 * * *" for daily at 9am LOCAL time)
|
||||
• interval: Milliseconds between runs (e.g., "300000" for 5 minutes, "3600000" for 1 hour)
|
||||
• once: Local time WITHOUT "Z" suffix (e.g., "2026-02-01T15:30:00"). Do NOT use UTC/Z suffix.`,
|
||||
{
|
||||
prompt: z.string().describe('What the agent should do when the task runs. For isolated mode, include all necessary context here.'),
|
||||
schedule_type: z.enum(['cron', 'interval', 'once']).describe('cron=recurring at specific times, interval=recurring every N ms, once=run once at specific time'),
|
||||
schedule_value: z.string().describe('cron: "*/5 * * * *" | interval: milliseconds like "300000" | once: local timestamp like "2026-02-01T15:30:00" (no Z suffix!)'),
|
||||
context_mode: z.enum(['group', 'isolated']).default('group').describe('group=runs with chat history and memory, isolated=fresh session (include context in prompt)'),
|
||||
...(isMain ? { target_group_jid: z.string().optional().describe('JID of the group to schedule the task for. The group must be registered — look up JIDs in /workspace/project/data/registered_groups.json (the keys are JIDs). If the group is not registered, let the user know and ask if they want to activate it. Defaults to the current group.') } : {}),
|
||||
},
|
||||
async (args) => {
|
||||
// Validate schedule_value before writing IPC
|
||||
if (args.schedule_type === 'cron') {
|
||||
try {
|
||||
CronExpressionParser.parse(args.schedule_value);
|
||||
} catch (err) {
|
||||
return {
|
||||
content: [{ type: 'text', text: `Invalid cron: "${args.schedule_value}". Use format like "0 9 * * *" (daily 9am) or "*/5 * * * *" (every 5 min).` }],
|
||||
isError: true
|
||||
};
|
||||
}
|
||||
} else if (args.schedule_type === 'interval') {
|
||||
const ms = parseInt(args.schedule_value, 10);
|
||||
if (isNaN(ms) || ms <= 0) {
|
||||
return {
|
||||
content: [{ type: 'text', text: `Invalid interval: "${args.schedule_value}". Must be positive milliseconds (e.g., "300000" for 5 min).` }],
|
||||
isError: true
|
||||
};
|
||||
}
|
||||
} else if (args.schedule_type === 'once') {
|
||||
const date = new Date(args.schedule_value);
|
||||
if (isNaN(date.getTime())) {
|
||||
return {
|
||||
content: [{ type: 'text', text: `Invalid timestamp: "${args.schedule_value}". Use ISO 8601 format like "2026-02-01T15:30:00.000Z".` }],
|
||||
isError: true
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// Non-main groups can only schedule for themselves
|
||||
const targetJid = isMain && args.target_group_jid ? args.target_group_jid : chatJid;
|
||||
|
||||
const data = {
|
||||
type: 'schedule_task',
|
||||
prompt: args.prompt,
|
||||
schedule_type: args.schedule_type,
|
||||
schedule_value: args.schedule_value,
|
||||
context_mode: args.context_mode || 'group',
|
||||
targetJid,
|
||||
createdBy: groupFolder,
|
||||
timestamp: new Date().toISOString()
|
||||
};
|
||||
|
||||
const filename = writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Task scheduled (${filename}): ${args.schedule_type} - ${args.schedule_value}`
|
||||
}]
|
||||
};
|
||||
}
|
||||
),
|
||||
|
||||
// Reads from current_tasks.json which host keeps updated
|
||||
tool(
|
||||
'list_tasks',
|
||||
'List all scheduled tasks. From main: shows all tasks. From other groups: shows only that group\'s tasks.',
|
||||
{},
|
||||
async () => {
|
||||
const tasksFile = path.join(IPC_DIR, 'current_tasks.json');
|
||||
|
||||
try {
|
||||
if (!fs.existsSync(tasksFile)) {
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: 'No scheduled tasks found.'
|
||||
}]
|
||||
};
|
||||
}
|
||||
|
||||
const allTasks = JSON.parse(fs.readFileSync(tasksFile, 'utf-8'));
|
||||
|
||||
const tasks = isMain
|
||||
? allTasks
|
||||
: allTasks.filter((t: { groupFolder: string }) => t.groupFolder === groupFolder);
|
||||
|
||||
if (tasks.length === 0) {
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: 'No scheduled tasks found.'
|
||||
}]
|
||||
};
|
||||
}
|
||||
|
||||
const formatted = tasks.map((t: { id: string; prompt: string; schedule_type: string; schedule_value: string; status: string; next_run: string }) =>
|
||||
`- [${t.id}] ${t.prompt.slice(0, 50)}... (${t.schedule_type}: ${t.schedule_value}) - ${t.status}, next: ${t.next_run || 'N/A'}`
|
||||
).join('\n');
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Scheduled tasks:\n${formatted}`
|
||||
}]
|
||||
};
|
||||
} catch (err) {
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Error reading tasks: ${err instanceof Error ? err.message : String(err)}`
|
||||
}]
|
||||
};
|
||||
}
|
||||
}
|
||||
),
|
||||
|
||||
tool(
|
||||
'pause_task',
|
||||
'Pause a scheduled task. It will not run until resumed.',
|
||||
{
|
||||
task_id: z.string().describe('The task ID to pause')
|
||||
},
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'pause_task',
|
||||
taskId: args.task_id,
|
||||
groupFolder,
|
||||
isMain,
|
||||
timestamp: new Date().toISOString()
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Task ${args.task_id} pause requested.`
|
||||
}]
|
||||
};
|
||||
}
|
||||
),
|
||||
|
||||
tool(
|
||||
'resume_task',
|
||||
'Resume a paused task.',
|
||||
{
|
||||
task_id: z.string().describe('The task ID to resume')
|
||||
},
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'resume_task',
|
||||
taskId: args.task_id,
|
||||
groupFolder,
|
||||
isMain,
|
||||
timestamp: new Date().toISOString()
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Task ${args.task_id} resume requested.`
|
||||
}]
|
||||
};
|
||||
}
|
||||
),
|
||||
|
||||
tool(
|
||||
'cancel_task',
|
||||
'Cancel and delete a scheduled task.',
|
||||
{
|
||||
task_id: z.string().describe('The task ID to cancel')
|
||||
},
|
||||
async (args) => {
|
||||
const data = {
|
||||
type: 'cancel_task',
|
||||
taskId: args.task_id,
|
||||
groupFolder,
|
||||
isMain,
|
||||
timestamp: new Date().toISOString()
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Task ${args.task_id} cancellation requested.`
|
||||
}]
|
||||
};
|
||||
}
|
||||
),
|
||||
|
||||
tool(
|
||||
'register_group',
|
||||
`Register a new WhatsApp group so the agent can respond to messages there. Main group only.
|
||||
|
||||
Use available_groups.json to find the JID for a group. The folder name should be lowercase with hyphens (e.g., "family-chat").`,
|
||||
{
|
||||
jid: z.string().describe('The WhatsApp JID (e.g., "120363336345536173@g.us")'),
|
||||
name: z.string().describe('Display name for the group'),
|
||||
folder: z.string().describe('Folder name for group files (lowercase, hyphens, e.g., "family-chat")'),
|
||||
trigger: z.string().describe('Trigger word (e.g., "@Andy")')
|
||||
},
|
||||
async (args) => {
|
||||
if (!isMain) {
|
||||
return {
|
||||
content: [{ type: 'text', text: 'Only the main group can register new groups.' }],
|
||||
isError: true
|
||||
};
|
||||
}
|
||||
|
||||
const data = {
|
||||
type: 'register_group',
|
||||
jid: args.jid,
|
||||
name: args.name,
|
||||
folder: args.folder,
|
||||
trigger: args.trigger,
|
||||
timestamp: new Date().toISOString()
|
||||
};
|
||||
|
||||
writeIpcFile(TASKS_DIR, data);
|
||||
|
||||
return {
|
||||
content: [{
|
||||
type: 'text',
|
||||
text: `Group "${args.name}" registered. It will start receiving messages immediately.`
|
||||
}]
|
||||
};
|
||||
}
|
||||
)
|
||||
]
|
||||
});
|
||||
}
|
||||
143
docs/DEBUG_CHECKLIST.md
Normal file
@@ -0,0 +1,143 @@
|
||||
# NanoClaw Debug Checklist
|
||||
|
||||
## Known Issues (2026-02-08)
|
||||
|
||||
### 1. [FIXED] Resume branches from stale tree position
|
||||
When agent teams spawn subagent CLI processes, they write to the same session JSONL. On subsequent `query()` resumes, the CLI reads the JSONL but may pick a stale branch tip (from before the subagent activity), causing the agent's response to land on a branch the host never receives a `result` for. **Fix**: pass `resumeSessionAt` with the last assistant message UUID to explicitly anchor each resume.
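A minimal sketch of how the host could track the anchor for each resume. `ResumeTracker`, `SdkMessageLike`, and `buildResumeOptions` are hypothetical names for illustration; only the `resume` and `resumeSessionAt` option names come from the SDK docs, and the sketch assumes that the `init` message carries `session_id` and that assistant messages carry a `uuid`.

```typescript
// Hypothetical shape: only the fields this sketch needs from an SDK message.
interface SdkMessageLike {
  type: string;
  uuid?: string;
  session_id?: string;
}

// Subset of query() options; both names appear in the SDK Options table.
interface ResumeOptions {
  resume?: string;
  resumeSessionAt?: string;
}

class ResumeTracker {
  private sessionId?: string;
  private lastAssistantUuid?: string;

  // Call this for every message yielded by query().
  observe(msg: SdkMessageLike): void {
    if (msg.type === "system" && msg.session_id) this.sessionId = msg.session_id;
    if (msg.type === "assistant" && msg.uuid) this.lastAssistantUuid = msg.uuid;
  }

  // Options for the next query() call; empty before any session exists.
  buildResumeOptions(): ResumeOptions {
    if (!this.sessionId) return {};
    return this.lastAssistantUuid
      ? { resume: this.sessionId, resumeSessionAt: this.lastAssistantUuid }
      : { resume: this.sessionId };
  }
}
```

Anchoring on the last assistant message the host actually observed means that subagent writes to the JSONL after that point should not change where the next resume attaches.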
|
||||
|
||||
### 2. IDLE_TIMEOUT == CONTAINER_TIMEOUT (both 30 min)
|
||||
Both timers fire at the same time, so containers always exit via hard SIGKILL (code 137) instead of graceful `_close` sentinel shutdown. The idle timeout should be shorter (e.g., 5 min) so containers wind down between messages, while container timeout stays at 30 min as a safety net for stuck agents.
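The interaction between the two timers can be sketched as a pure decision function. This is an illustration under assumptions, not the host's actual code: `decideShutdown` and `TimeoutConfig` are hypothetical names, and the sketch assumes the container timeout is measured from container start while the idle timeout is measured from the last activity.

```typescript
type Shutdown = "none" | "graceful" | "kill";

interface TimeoutConfig {
  idleTimeoutMs: number;      // quiet period before writing the _close sentinel
  containerTimeoutMs: number; // hard limit since start, safety net for stuck agents
}

function decideShutdown(
  nowMs: number,
  startedAtMs: number,
  lastActivityMs: number,
  cfg: TimeoutConfig,
): Shutdown {
  // The hard limit wins whenever both conditions hold at the same instant.
  if (nowMs - startedAtMs >= cfg.containerTimeoutMs) return "kill";
  if (nowMs - lastActivityMs >= cfg.idleTimeoutMs) return "graceful";
  return "none";
}

const MIN = 60_000;
// Equal timers: an idle container reaches both limits together and is killed.
const equalTimers = decideShutdown(30 * MIN, 0, 0, { idleTimeoutMs: 30 * MIN, containerTimeoutMs: 30 * MIN });
// Shorter idle timer: the same container winds down gracefully after 5 minutes.
const shortIdle = decideShutdown(5 * MIN, 0, 0, { idleTimeoutMs: 5 * MIN, containerTimeoutMs: 30 * MIN });
```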
|
||||
|
||||
### 3. Cursor advanced before agent succeeds
|
||||
`processGroupMessages` advances `lastAgentTimestamp` before the agent runs. If the container times out, retries find no messages (cursor already past them). Messages are permanently lost on timeout.
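One possible shape of a fix is to move the cursor only after the agent run succeeds. The sketch below is hypothetical: `processPending`, `Msg`, and the `runAgent` callback are illustrative names rather than the real `processGroupMessages` signature, and it assumes messages arrive sorted by timestamp strings that compare lexicographically (for example ISO 8601 in one timezone).

```typescript
interface Msg {
  timestamp: string;
  text: string;
}

async function processPending(
  messages: Msg[],
  cursor: string,
  runAgent: (batch: Msg[]) => Promise<boolean>,
): Promise<string> {
  const pending = messages.filter((m) => m.timestamp > cursor);
  if (pending.length === 0) return cursor;

  const ok = await runAgent(pending);
  // Advance only on success so that a retry still finds the same messages.
  return ok ? pending[pending.length - 1].timestamp : cursor;
}
```

With this ordering, a container timeout leaves the cursor untouched, so the retry path sees the original batch instead of an empty one.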
|
||||
|
||||
## Quick Status Check
|
||||
|
||||
```bash
|
||||
# 1. Is the service running?
|
||||
launchctl list | grep nanoclaw
|
||||
# Expected: "<PID> 0 com.nanoclaw" (numeric PID = running, "-" = not running, non-zero status = crashed)
|
||||
|
||||
# 2. Any running containers?
|
||||
container ls --format '{{.Names}} {{.Status}}' 2>/dev/null | grep nanoclaw
|
||||
|
||||
# 3. Any stopped/orphaned containers?
|
||||
container ls -a --format '{{.Names}} {{.Status}}' 2>/dev/null | grep nanoclaw
|
||||
|
||||
# 4. Recent errors in service log?
|
||||
grep -E 'ERROR|WARN' logs/nanoclaw.log | tail -20
|
||||
|
||||
# 5. Is WhatsApp connected? (look for last connection event)
|
||||
grep -E 'Connected to WhatsApp|Connection closed|connection.*close' logs/nanoclaw.log | tail -5
|
||||
|
||||
# 6. Are groups loaded?
|
||||
grep 'groupCount' logs/nanoclaw.log | tail -3
|
||||
```
|
||||
|
||||
## Session Transcript Branching
|
||||
|
||||
```bash
|
||||
# Check for concurrent CLI processes in session debug logs
|
||||
ls -la data/sessions/<group>/.claude/debug/
|
||||
|
||||
# Count unique SDK processes that handled messages
|
||||
# Each .txt file = one CLI subprocess. Multiple = concurrent queries.
|
||||
|
||||
# Check parentUuid branching in transcript
|
||||
python3 -c "
|
||||
import json, sys
|
||||
lines = open('data/sessions/<group>/.claude/projects/-workspace-group/<session>.jsonl').read().strip().split('\n')
|
||||
for i, line in enumerate(lines):
|
||||
try:
|
||||
d = json.loads(line)
|
||||
if d.get('type') == 'user' and d.get('message'):
|
||||
parent = (d.get('parentUuid') or 'ROOT')[:8]
|
||||
content = str(d['message'].get('content', ''))[:60]
|
||||
print(f'L{i+1} parent={parent} {content}')
|
||||
except: pass
|
||||
"
|
||||
```
|
||||
|
||||
## Container Timeout Investigation
|
||||
|
||||
```bash
|
||||
# Check for recent timeouts
|
||||
grep -E 'Container timeout|timed out' logs/nanoclaw.log | tail -10
|
||||
|
||||
# Check container log files for the timed-out container
|
||||
ls -lt groups/*/logs/container-*.log | head -10
|
||||
|
||||
# Read the most recent container log (replace path)
|
||||
cat groups/<group>/logs/container-<timestamp>.log
|
||||
|
||||
# Check if retries were scheduled and what happened
|
||||
grep -E 'Scheduling retry|retry|Max retries' logs/nanoclaw.log | tail -10
|
||||
```
|
||||
|
||||
## Agent Not Responding
|
||||
|
||||
```bash
|
||||
# Check if messages are being received from WhatsApp
|
||||
grep 'New messages' logs/nanoclaw.log | tail -10
|
||||
|
||||
# Check if messages are being processed (container spawned)
|
||||
grep -E 'Processing messages|Spawning container' logs/nanoclaw.log | tail -10
|
||||
|
||||
# Check if messages are being piped to active container
|
||||
grep -E 'Piped messages|sendMessage' logs/nanoclaw.log | tail -10
|
||||
|
||||
# Check the queue state — any active containers?
|
||||
grep -E 'Starting container|Container active|concurrency limit' logs/nanoclaw.log | tail -10
|
||||
|
||||
# Check lastAgentTimestamp vs latest message timestamp
|
||||
sqlite3 store/messages.db "SELECT chat_jid, MAX(timestamp) as latest FROM messages GROUP BY chat_jid ORDER BY latest DESC LIMIT 5;"
|
||||
```
|
||||
|
||||
## Container Mount Issues
|
||||
|
||||
```bash
|
||||
# Check mount validation logs (shows on container spawn)
|
||||
grep -E 'Mount validated|Mount.*REJECTED|mount' logs/nanoclaw.log | tail -10
|
||||
|
||||
# Verify the mount allowlist is readable
|
||||
cat ~/.config/nanoclaw/mount-allowlist.json
|
||||
|
||||
# Check group's container_config in DB
|
||||
sqlite3 store/messages.db "SELECT name, container_config FROM registered_groups;"
|
||||
|
||||
# Test-run a container to check mounts (dry run)
|
||||
# Replace <group-folder> with the group's folder name
|
||||
container run -i --rm --entrypoint ls nanoclaw-agent:latest /workspace/extra/
|
||||
```
|
||||
|
||||
## WhatsApp Auth Issues
|
||||
|
||||
```bash
|
||||
# Check if QR code was requested (means auth expired)
|
||||
grep -E 'QR|authentication required|qr' logs/nanoclaw.log | tail -5
|
||||
|
||||
# Check auth files exist
|
||||
ls -la store/auth/
|
||||
|
||||
# Re-authenticate if needed
|
||||
npm run auth
|
||||
```
|
||||
|
||||
## Service Management
|
||||
|
||||
```bash
|
||||
# Restart the service
|
||||
launchctl kickstart -k gui/$(id -u)/com.nanoclaw
|
||||
|
||||
# View live logs
|
||||
tail -f logs/nanoclaw.log
|
||||
|
||||
# Stop the service (careful — running containers are detached, not killed)
|
||||
launchctl bootout gui/$(id -u)/com.nanoclaw
|
||||
|
||||
# Start the service
|
||||
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.nanoclaw.plist
|
||||
|
||||
# Rebuild after code changes
|
||||
npm run build && launchctl kickstart -k gui/$(id -u)/com.nanoclaw
|
||||
```
|
||||
@@ -122,7 +122,7 @@ A personal Claude assistant accessible via WhatsApp, with minimal custom code.
|
||||
|
||||
### Group Management
|
||||
- New groups are added explicitly via the main channel
|
||||
-- Groups are registered by editing `data/registered_groups.json`
+- Groups are registered in SQLite (via the main channel or IPC `register_group` command)
|
||||
- Each group gets a dedicated folder under `groups/`
|
||||
- Groups can have additional directories mounted via `containerConfig`
|
||||
|
||||
|
||||
643
docs/SDK_DEEP_DIVE.md
Normal file
@@ -0,0 +1,643 @@
|
||||
# Claude Agent SDK Deep Dive
|
||||
|
||||
Findings from reverse-engineering `@anthropic-ai/claude-agent-sdk` v0.2.29–0.2.34 to understand how `query()` works, why agent teams subagents were being killed, and how to fix it. Supplemented with official SDK reference docs.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
Agent Runner (our code)
└── query() → SDK (sdk.mjs)
    └── spawns CLI subprocess (cli.js)
        └── Claude API calls, tool execution
        └── Task tool → spawns subagent subprocesses
|
||||
```
|
||||
|
||||
The SDK spawns `cli.js` as a child process with `--output-format stream-json --input-format stream-json --print --verbose` flags. Communication happens via JSON-lines on stdin/stdout.
|
||||
|
||||
`query()` returns a `Query` object extending `AsyncGenerator<SDKMessage, void>`. Internally:
|
||||
|
||||
- SDK spawns CLI as a child process, communicates via stdin/stdout JSON lines
|
||||
- SDK's `readMessages()` reads from CLI stdout, enqueues into internal stream
|
||||
- `readSdkMessages()` async generator yields from that stream
|
||||
- `[Symbol.asyncIterator]` returns `readSdkMessages()`
|
||||
- Iterator returns `done: true` only when CLI closes stdout
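The read path described above can be approximated with a small async generator. This is a sketch of the general JSON-lines technique, not the SDK's own `readMessages()` / `readSdkMessages()` code; `readJsonLines` is a hypothetical name, and the input is modelled as any `AsyncIterable<string>` standing in for the CLI's stdout.

```typescript
async function* readJsonLines(
  chunks: AsyncIterable<string>,
): AsyncGenerator<unknown, void> {
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += chunk;
    let newline = buffer.indexOf("\n");
    while (newline !== -1) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (line) yield JSON.parse(line); // one complete message per line
      newline = buffer.indexOf("\n");
    }
  }
  // The source has closed: flush a final line that had no trailing newline.
  if (buffer.trim()) yield JSON.parse(buffer);
}
```

The generator only finishes when the underlying stream ends, which mirrors the point that iteration reports `done: true` only once the CLI closes stdout.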
|
||||
|
||||
Both V1 (`query()`) and V2 (`createSession`/`send`/`stream`) use the exact same three-layer architecture:
|
||||
|
||||
```
|
||||
SDK (sdk.mjs)                CLI Process (cli.js)
--------------               --------------------
XX Transport      ------>    stdin reader (bd1)
  (spawn cli.js)                   |
$X Query          <------    stdout writer
  (JSON-lines)                     |
                             EZ() recursive generator
                                   |
                             Anthropic Messages API
|
||||
```
|
||||
|
||||
## The Core Agent Loop (EZ)
|
||||
|
||||
Inside the CLI, the agentic loop is a **recursive async generator called `EZ()`**, not an iterative while loop:
|
||||
|
||||
```
|
||||
EZ({ messages, systemPrompt, canUseTool, maxTurns, turnCount=1, ... })
|
||||
```
|
||||
|
||||
Each invocation = one API call to Claude (one "turn").
|
||||
|
||||
### Flow per turn:
|
||||
|
||||
1. **Prepare messages** — trim context, run compaction if needed
|
||||
2. **Call the Anthropic API** (via `mW1` streaming function)
|
||||
3. **Extract tool_use blocks** from the response
|
||||
4. **Branch:**
|
||||
- If **no tool_use blocks** → stop (run stop hooks, return)
|
||||
- If **tool_use blocks present** → execute tools, increment turnCount, recurse
|
||||
|
||||
All complex logic — the agent loop, tool execution, background tasks, teammate orchestration — runs inside the CLI subprocess. `query()` is a thin transport wrapper.
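The per-turn flow above can be sketched as a recursive async generator. This is an illustration of the pattern only, not the minified `EZ()` source: `agentLoop`, `LoopDeps`, `callModel`, and `runTool` are hypothetical names, and context trimming, compaction, and stop hooks are left out.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input: unknown };

interface Turn {
  role: "assistant" | "tool";
  content: Block[] | string;
}

interface LoopDeps {
  callModel: (history: Turn[]) => Promise<Block[]>; // one API call per turn
  runTool: (name: string, input: unknown) => Promise<string>;
  maxTurns: number;
}

async function* agentLoop(
  history: Turn[],
  deps: LoopDeps,
  turnCount = 1,
): AsyncGenerator<Turn, void> {
  const blocks = await deps.callModel(history);
  const assistant: Turn = { role: "assistant", content: blocks };
  yield assistant;

  const toolUses = blocks.filter(
    (b): b is Extract<Block, { type: "tool_use" }> => b.type === "tool_use",
  );
  // Stop when no tools are requested or the turn budget is spent.
  if (toolUses.length === 0 || turnCount >= deps.maxTurns) return;

  const results: Turn[] = [];
  for (const use of toolUses) {
    results.push({ role: "tool", content: await deps.runTool(use.name, use.input) });
  }
  for (const result of results) yield result;

  // Recurse for the next turn instead of looping.
  yield* agentLoop([...history, assistant, ...results], deps, turnCount + 1);
}
```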
|
||||
|
||||
## query() Options
|
||||
|
||||
Full `Options` type from the official docs:
|
||||
|
||||
| Property | Type | Default | Description |
|----------|------|---------|-------------|
| `abortController` | `AbortController` | `new AbortController()` | Controller for cancelling operations |
| `additionalDirectories` | `string[]` | `[]` | Additional directories Claude can access |
| `agents` | `Record<string, AgentDefinition>` | `undefined` | Programmatically define subagents (not agent teams — no orchestration) |
| `allowDangerouslySkipPermissions` | `boolean` | `false` | Required when using `permissionMode: 'bypassPermissions'` |
| `allowedTools` | `string[]` | All tools | List of allowed tool names |
| `betas` | `SdkBeta[]` | `[]` | Beta features (e.g., `['context-1m-2025-08-07']` for 1M context) |
| `canUseTool` | `CanUseTool` | `undefined` | Custom permission function for tool usage |
| `continue` | `boolean` | `false` | Continue the most recent conversation |
| `cwd` | `string` | `process.cwd()` | Current working directory |
| `disallowedTools` | `string[]` | `[]` | List of disallowed tool names |
| `enableFileCheckpointing` | `boolean` | `false` | Enable file change tracking for rewinding |
| `env` | `Dict<string>` | `process.env` | Environment variables |
| `executable` | `'bun' \| 'deno' \| 'node'` | Auto-detected | JavaScript runtime |
| `fallbackModel` | `string` | `undefined` | Model to use if primary fails |
| `forkSession` | `boolean` | `false` | When resuming, fork to a new session ID instead of continuing original |
| `hooks` | `Partial<Record<HookEvent, HookCallbackMatcher[]>>` | `{}` | Hook callbacks for events |
| `includePartialMessages` | `boolean` | `false` | Include partial message events (streaming) |
| `maxBudgetUsd` | `number` | `undefined` | Maximum budget in USD for the query |
| `maxThinkingTokens` | `number` | `undefined` | Maximum tokens for thinking process |
| `maxTurns` | `number` | `undefined` | Maximum conversation turns |
| `mcpServers` | `Record<string, McpServerConfig>` | `{}` | MCP server configurations |
| `model` | `string` | Default from CLI | Claude model to use |
| `outputFormat` | `{ type: 'json_schema', schema: JSONSchema }` | `undefined` | Structured output format |
| `pathToClaudeCodeExecutable` | `string` | Uses built-in | Path to Claude Code executable |
| `permissionMode` | `PermissionMode` | `'default'` | Permission mode |
| `plugins` | `SdkPluginConfig[]` | `[]` | Load custom plugins from local paths |
| `resume` | `string` | `undefined` | Session ID to resume |
| `resumeSessionAt` | `string` | `undefined` | Resume session at a specific message UUID |
| `sandbox` | `SandboxSettings` | `undefined` | Sandbox behavior configuration |
| `settingSources` | `SettingSource[]` | `[]` (none) | Which filesystem settings to load. Must include `'project'` to load CLAUDE.md |
| `stderr` | `(data: string) => void` | `undefined` | Callback for stderr output |
| `systemPrompt` | `string \| { type: 'preset'; preset: 'claude_code'; append?: string }` | `undefined` | System prompt. Use preset to get Claude Code's prompt, with optional `append` |
| `tools` | `string[] \| { type: 'preset'; preset: 'claude_code' }` | `undefined` | Tool configuration |
|
||||
|
||||
### PermissionMode
|
||||
|
||||
```typescript
|
||||
type PermissionMode = 'default' | 'acceptEdits' | 'bypassPermissions' | 'plan';
|
||||
```
|
||||
|
||||
### SettingSource
|
||||
|
||||
```typescript
|
||||
type SettingSource = 'user' | 'project' | 'local';
|
||||
// 'user' → ~/.claude/settings.json
|
||||
// 'project' → .claude/settings.json (version controlled)
|
||||
// 'local' → .claude/settings.local.json (gitignored)
|
||||
```
|
||||
|
||||
When omitted, SDK loads NO filesystem settings (isolation by default). Precedence: local > project > user. Programmatic options always override filesystem settings.
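The precedence rule can be illustrated with a small merge function. This is a sketch under assumptions rather than the SDK's loader: `resolveSettings` is a hypothetical name, the settings files are passed in as plain objects, and a shallow key-by-key merge is assumed.

```typescript
type SettingSource = "user" | "project" | "local";
type Settings = Record<string, unknown>;

// Lowest to highest precedence, following "local > project > user".
const PRECEDENCE: SettingSource[] = ["user", "project", "local"];

function resolveSettings(
  enabled: SettingSource[],
  files: Partial<Record<SettingSource, Settings>>,
  programmatic: Settings,
): Settings {
  let merged: Settings = {};
  for (const source of PRECEDENCE) {
    // Sources that are not listed are never read (isolation by default).
    if (enabled.includes(source)) merged = { ...merged, ...(files[source] ?? {}) };
  }
  // Programmatic options always override filesystem settings.
  return { ...merged, ...programmatic };
}
```

Passing an empty `enabled` list returns only the programmatic options, which corresponds to omitting `settingSources`.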
|
||||
|
||||
### AgentDefinition
|
||||
|
||||
Programmatic subagents (NOT agent teams — these are simpler, no inter-agent coordination):
|
||||
|
||||
```typescript
|
||||
type AgentDefinition = {
|
||||
description: string; // When to use this agent
|
||||
tools?: string[]; // Allowed tools (inherits all if omitted)
|
||||
prompt: string; // Agent's system prompt
|
||||
model?: 'sonnet' | 'opus' | 'haiku' | 'inherit';
|
||||
}
|
||||
```

### McpServerConfig

```typescript
type McpServerConfig =
  | { type?: 'stdio'; command: string; args?: string[]; env?: Record<string, string> }
  | { type: 'sse'; url: string; headers?: Record<string, string> }
  | { type: 'http'; url: string; headers?: Record<string, string> }
  | { type: 'sdk'; name: string; instance: McpServer } // in-process
```

### SdkBeta

```typescript
type SdkBeta = 'context-1m-2025-08-07';
// Enables 1M token context window for Opus 4.6, Sonnet 4.5, Sonnet 4
```

### CanUseTool

```typescript
type CanUseTool = (
  toolName: string,
  input: ToolInput,
  options: { signal: AbortSignal; suggestions?: PermissionUpdate[] }
) => Promise<PermissionResult>;

type PermissionResult =
  | { behavior: 'allow'; updatedInput: ToolInput; updatedPermissions?: PermissionUpdate[] }
  | { behavior: 'deny'; message: string; interrupt?: boolean };
```
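
A minimal handler with this shape might look like the following. The policy (blocking `rm -rf`) is an invented example, and `ToolInput` is simplified here to a plain record rather than the SDK's union of tool input schemas.

```typescript
// Simplified local types; the SDK's ToolInput is a richer union.
type ToolInput = Record<string, unknown>;
type PermissionResult =
  | { behavior: 'allow'; updatedInput: ToolInput }
  | { behavior: 'deny'; message: string; interrupt?: boolean };

// Hypothetical policy: block obviously destructive shell commands,
// allow everything else with its input unchanged.
async function canUseTool(
  toolName: string,
  input: ToolInput,
  options: { signal: AbortSignal },
): Promise<PermissionResult> {
  if (options.signal.aborted) {
    return { behavior: 'deny', message: 'Request was aborted', interrupt: true };
  }
  const command = typeof input.command === 'string' ? input.command : '';
  if (toolName === 'Bash' && /\brm\s+-rf\b/.test(command)) {
    return { behavior: 'deny', message: 'rm -rf is not allowed in this workspace' };
  }
  return { behavior: 'allow', updatedInput: input };
}
```

Returning the original `input` as `updatedInput` keeps the call unchanged; a handler could also rewrite the input before allowing it.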

## SDKMessage Types

`query()` can yield 16 message types. The official docs show a simplified union of 7, but `sdk.d.ts` has the full set:

| Type | Subtype | Purpose |
|------|---------|---------|
| `system` | `init` | Session initialized, contains session_id, tools, model |
| `system` | `task_notification` | Background agent completed/failed/stopped |
| `system` | `compact_boundary` | Conversation was compacted |
| `system` | `status` | Status change (e.g. compacting) |
| `system` | `hook_started` | Hook execution started |
| `system` | `hook_progress` | Hook progress output |
| `system` | `hook_response` | Hook completed |
| `system` | `files_persisted` | Files saved |
| `assistant` | — | Claude's response (text + tool calls) |
| `user` | — | User message (internal) |
| `user` (replay) | — | Replayed user message on resume |
| `result` | `success` / `error_*` | Final result of a prompt processing round |
| `stream_event` | — | Partial streaming (when includePartialMessages) |
| `tool_progress` | — | Long-running tool progress |
| `auth_status` | — | Authentication state changes |
| `tool_use_summary` | — | Summary of preceding tool uses |

### SDKTaskNotificationMessage (sdk.d.ts:1507)

```typescript
type SDKTaskNotificationMessage = {
  type: 'system';
  subtype: 'task_notification';
  task_id: string;
  status: 'completed' | 'failed' | 'stopped';
  output_file: string;
  summary: string;
  uuid: UUID;
  session_id: string;
};
```
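
One way a consumer might keep track of these notifications is sketched below. `BackgroundTasks` is a hypothetical bookkeeping class, not an SDK type, and how the caller learns a task ID when the task starts is left open.

```typescript
// Subset of the notification fields shown above.
type TaskNotification = {
  type: 'system';
  subtype: 'task_notification';
  task_id: string;
  status: 'completed' | 'failed' | 'stopped';
  output_file: string;
  summary: string;
};

// Hypothetical bookkeeping: remember which background tasks are pending
// and clear them as notifications arrive.
class BackgroundTasks {
  private pending = new Set<string>();

  started(taskId: string): void {
    this.pending.add(taskId);
  }

  // Removes the task from the pending set and returns a one-line log entry.
  finished(msg: TaskNotification): string {
    this.pending.delete(msg.task_id);
    return `[${msg.status}] ${msg.task_id}: ${msg.summary} (${msg.output_file})`;
  }

  get remaining(): number {
    return this.pending.size;
  }
}
```

A host could use `remaining` to decide whether it is safe to end the input stream.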

### SDKResultMessage (sdk.d.ts:1375)

Two variants with shared fields:

```typescript
// Shared fields on both variants:
// uuid, session_id, duration_ms, duration_api_ms, is_error, num_turns,
// total_cost_usd, usage: NonNullableUsage, modelUsage, permission_denials

// Success:
type SDKResultSuccess = {
  type: 'result';
  subtype: 'success';
  result: string;
  structured_output?: unknown;
  // ...shared fields
};

// Error:
type SDKResultError = {
  type: 'result';
  subtype: 'error_during_execution' | 'error_max_turns' | 'error_max_budget_usd' | 'error_max_structured_output_retries';
  errors: string[];
  // ...shared fields
};
```

Useful fields on result: `total_cost_usd`, `duration_ms`, `num_turns`, `modelUsage` (per-model breakdown with `costUSD`, `inputTokens`, `outputTokens`, `contextWindow`).
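
These fields lend themselves to a compact log line. The formatter below is a sketch: `summarizeResult` and the trimmed `ResultLike` type are hypothetical, and only the field names listed above are assumed to exist.

```typescript
// Trimmed, local view of the result fields named above.
type ModelUsage = { costUSD: number; inputTokens: number; outputTokens: number };
type ResultLike = {
  type: 'result';
  subtype: string;
  num_turns: number;
  duration_ms: number;
  total_cost_usd: number;
  modelUsage: Record<string, ModelUsage>;
};

// Hypothetical formatter: one line with totals and a per-model breakdown.
function summarizeResult(msg: ResultLike): string {
  const perModel = Object.keys(msg.modelUsage)
    .map((model) => {
      const u = msg.modelUsage[model];
      return `${model}: ${u.inputTokens} in / ${u.outputTokens} out, $${u.costUSD.toFixed(4)}`;
    })
    .join('; ');
  const seconds = (msg.duration_ms / 1000).toFixed(1);
  return (
    `${msg.subtype} after ${msg.num_turns} turns in ${seconds}s, ` +
    `$${msg.total_cost_usd.toFixed(4)} total (${perModel})`
  );
}
```

Logging this for every `result` message gives a per-round cost trail without storing the full message.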

### SDKAssistantMessage

```typescript
type SDKAssistantMessage = {
  type: 'assistant';
  uuid: UUID;
  session_id: string;
  message: APIAssistantMessage; // From Anthropic SDK
  parent_tool_use_id: string | null; // Non-null when from subagent
};
```

### SDKSystemMessage (init)

```typescript
type SDKSystemMessage = {
  type: 'system';
  subtype: 'init';
  uuid: UUID;
  session_id: string;
  apiKeySource: ApiKeySource;
  cwd: string;
  tools: string[];
  mcp_servers: { name: string; status: string }[];
  model: string;
  permissionMode: PermissionMode;
  slash_commands: string[];
  output_style: string;
};
```

## Turn Behavior: When the Agent Stops vs Continues

### When the Agent STOPS (no more API calls)

**1. No tool_use blocks in response (THE PRIMARY CASE)**

Claude responded with text only — it decided it has completed the task. The API's `stop_reason` will be `"end_turn"`. The SDK does NOT make this decision — it's entirely driven by Claude's model output.

**2. Max turns exceeded** — Results in `SDKResultError` with `subtype: "error_max_turns"`.

**3. Abort signal** — User interruption via `abortController`.

**4. Budget exceeded** — `totalCost >= maxBudgetUsd` → `"error_max_budget_usd"`.

**5. Stop hook prevents continuation** — Hook returns `{preventContinuation: true}`.

### When the Agent CONTINUES (makes another API call)

**1. Response contains tool_use blocks (THE PRIMARY CASE)** — Execute tools, increment `turnCount`, recurse into `EZ` (the minified name of the core agentic loop in cli.js).

**2. max_output_tokens recovery** — Up to 3 retries with a "break your work into smaller pieces" context message.

**3. Stop hook blocking errors** — Errors fed back as context messages, loop continues.

**4. Model fallback** — Retry with fallback model (one-time).
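
The main stop and continue conditions can be condensed into a small pure function. This is a simplified restatement, not the CLI's actual code: the order in which the checks run is an assumption, `'interrupted'` is a placeholder label for the abort case, and stop hooks, `max_tokens` recovery, and model fallback are left out.

```typescript
type TurnState = {
  hasToolUse: boolean;
  turnCount: number;
  maxTurns?: number;
  totalCost: number;
  maxBudgetUsd?: number;
  aborted: boolean;
};

type Decision =
  | { action: 'continue' }
  | { action: 'stop'; result: 'success' | 'error_max_turns' | 'error_max_budget_usd' | 'interrupted' };

// Limits and aborts are checked first (assumed order); after that, the
// presence of tool_use blocks decides between continuing and stopping.
function decide(s: TurnState): Decision {
  if (s.aborted) return { action: 'stop', result: 'interrupted' };
  if (s.maxTurns !== undefined && s.turnCount > s.maxTurns) {
    return { action: 'stop', result: 'error_max_turns' };
  }
  if (s.maxBudgetUsd !== undefined && s.totalCost >= s.maxBudgetUsd) {
    return { action: 'stop', result: 'error_max_budget_usd' };
  }
  return s.hasToolUse ? { action: 'continue' } : { action: 'stop', result: 'success' };
}
```

The point of the sketch is that, absent limits, the only thing separating "continue" from "success" is whether the model emitted a tool call.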

### Decision Table

| Condition | Action | Result Type |
|-----------|--------|-------------|
| Response has `tool_use` blocks | Execute tools, recurse into `EZ` | continues |
| Response has NO `tool_use` blocks | Run stop hooks, return | `success` |
| `turnCount > maxTurns` | Yield max_turns_reached | `error_max_turns` |
| `totalCost >= maxBudgetUsd` | Yield budget error | `error_max_budget_usd` |
| `abortController.signal.aborted` | Yield interrupted msg | depends on context |
| `stop_reason === "max_tokens"` (output) | Retry up to 3x with recovery prompt | continues |
| Stop hook `preventContinuation` | Return immediately | `success` |
| Stop hook blocking error | Feed error back, recurse | continues |
| Model fallback error | Retry with fallback model (one-time) | continues |

## Subagent Execution Modes

### Case 1: Synchronous Subagents (`run_in_background: false`) — BLOCKS

Parent agent calls Task tool → `VR()` runs `EZ()` for subagent → parent waits for full result → tool result returned to parent → parent continues.

The subagent runs the full recursive EZ loop. The parent's tool execution is suspended via `await`. There is a mid-execution "promotion" mechanism: a synchronous subagent can be promoted to background via `Promise.race()` against a `backgroundSignal` promise.

### Case 2: Background Tasks (`run_in_background: true`) — DOES NOT WAIT

- **Bash tool:** Command spawned, tool returns immediately with empty result + `backgroundTaskId`
- **Task/Agent tool:** Subagent launched in fire-and-forget wrapper (`g01()`), tool returns immediately with `status: "async_launched"` + `outputFile` path

There is no "wait for background tasks" logic before the `type: "result"` message is emitted. When a background task completes, an `SDKTaskNotificationMessage` is emitted separately.

### Case 3: Agent Teams (TeammateTool / SendMessage) — RESULT FIRST, THEN POLLING

The team leader runs its normal EZ loop, which includes spawning teammates. When the leader's EZ loop finishes, `type: "result"` is emitted. Then the leader enters a post-result polling loop:

```javascript
while (true) {
  // Check if no active teammates AND no running tasks → break
  // Check for unread messages from teammates → re-inject as new prompt, restart EZ loop
  // If stdin closed with active teammates → inject shutdown prompt
  // Poll every 500ms
}
```

From the SDK consumer's perspective: you receive the initial `type: "result"`, but the AsyncGenerator may continue yielding more messages as the team leader processes teammate responses and re-enters the agent loop. The generator only truly finishes when all teammates have shut down.

## The isSingleUserTurn Problem

From sdk.mjs:

```javascript
QK = typeof X === "string" // isSingleUserTurn = true when prompt is a string
```

When `isSingleUserTurn` is true and the first `result` message arrives:

```javascript
if (this.isSingleUserTurn) {
  this.transport.endInput(); // closes stdin to CLI
}
```

This triggers a chain reaction:

1. SDK closes CLI stdin
2. CLI detects stdin close
3. Polling loop sees `D = true` (stdin closed) with active teammates
4. Injects shutdown prompt → leader sends `shutdown_request` to all teammates
5. **Teammates get killed mid-research**

The shutdown prompt (found via `BGq` variable in minified cli.js):

```
You are running in non-interactive mode and cannot return a response
to the user until your team is shut down.

You MUST shut down your team before preparing your final response:
1. Use requestShutdown to ask each team member to shut down gracefully
2. Wait for shutdown approvals
3. Use the cleanup operation to clean up the team
4. Only then provide your final response to the user
```

### The practical problem

With V1 `query()` + string prompt + agent teams:

1. Leader spawns teammates, they start researching
2. Leader's EZ loop ends ("I've dispatched the team, they're working on it")
3. `type: "result"` emitted
4. SDK sees `isSingleUserTurn = true` → closes stdin immediately
5. Polling loop detects stdin closed + active teammates → injects shutdown prompt
6. Leader sends `shutdown_request` to all teammates
7. **Teammates could be 10 seconds into a 5-minute research task and they get told to stop**

## The Fix: Streaming Input Mode

Instead of passing a string prompt (which sets `isSingleUserTurn = true`), pass an `AsyncIterable<SDKUserMessage>`:

```typescript
// Before (broken for agent teams):
query({ prompt: "do something" })

// After (keeps CLI alive):
query({ prompt: asyncIterableOfMessages })
```

When prompt is an `AsyncIterable`:

- `isSingleUserTurn = false`
- SDK does NOT close stdin after first result
- CLI stays alive, continues processing
- Background agents keep running
- `task_notification` messages flow through the iterator
- We control when to end the iterable
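
A push-based iterable is one way to get this behavior. The class below is a minimal sketch, not the project's actual implementation: `MessageStream` is a hypothetical name, and the `SDKUserMessage` shape (including the empty `session_id`) is an assumption about what the SDK accepts.

```typescript
// Assumed shape of a user message accepted by query() in streaming mode.
type SDKUserMessage = {
  type: 'user';
  message: { role: 'user'; content: string };
  parent_tool_use_id: string | null;
  session_id: string;
};

// Hypothetical queue: push() adds messages, end() finishes the iterable.
class MessageStream implements AsyncIterable<SDKUserMessage> {
  private queue: SDKUserMessage[] = [];
  private waiting: (() => void) | null = null;
  private done = false;

  push(text: string): void {
    this.queue.push({
      type: 'user',
      message: { role: 'user', content: text },
      parent_tool_use_id: null,
      session_id: '',
    });
    this.wake();
  }

  end(): void {
    this.done = true;
    this.wake();
  }

  // Resolve the pending wait, if the consumer is currently blocked.
  private wake(): void {
    const w = this.waiting;
    this.waiting = null;
    if (w) w();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<SDKUserMessage> {
    while (true) {
      // Drain everything queued so far before checking for completion.
      while (this.queue.length > 0) {
        yield this.queue.shift()!;
      }
      if (this.done) return;
      await new Promise<void>((resolve) => {
        this.waiting = resolve;
      });
    }
  }
}
```

An instance would be passed as `prompt`, with `push()` called for each follow-up message and `end()` called only once the session should close, since ending the iterable is what lets stdin close.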

### Additional Benefit: Streaming New Messages

With the async iterable approach, we can push new incoming WhatsApp messages into the iterable while the agent is still working. Instead of queuing messages until the container exits and spawning a new container, we stream them directly into the running session.

### Intended Lifecycle with Agent Teams

With the async iterable fix (`isSingleUserTurn = false`), stdin stays open so the CLI never hits the teammate check or shutdown prompt injection:

```
1. system/init          → session initialized
2. assistant/user       → Claude reasoning, tool calls, tool results
3. ...                  → more assistant/user turns (spawning subagents, etc.)
4. result #1            → lead agent's first response (capture)
5. task_notification(s) → background agents complete/fail/stop
6. assistant/user       → lead agent continues (processing subagent results)
7. result #2            → lead agent's follow-up response (capture)
8. [iterator done]      → CLI closed stdout, all done
```

All results are meaningful — capture every one, not just the first.
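
A consumer that follows this advice forwards each result as it arrives rather than returning after the first. The sketch below uses a trimmed message union and a hypothetical `forwardResults` helper; the real `SDKMessage` union is larger.

```typescript
// Trimmed local message union for the sketch.
type Msg =
  | { type: 'result'; subtype: string; result?: string }
  | { type: 'system'; subtype: string }
  | { type: 'assistant' }
  | { type: 'user' };

// Hypothetical consumer: hand every successful result to the callback
// and keep iterating until the generator itself finishes.
async function forwardResults(
  messages: AsyncIterable<Msg>,
  onResult: (text: string) => void,
): Promise<number> {
  let count = 0;
  for await (const msg of messages) {
    if (msg.type === 'result' && msg.subtype === 'success' && msg.result) {
      count += 1;
      onResult(msg.result);
    }
  }
  return count; // reached only when the iterator is done
}
```

In this project `onResult` would be whatever sends text to the chat; the returned count is only for logging.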

## V1 vs V2 API

### V1: `query()` — One-shot async generator

```typescript
const q = query({ prompt: "...", options: {...} });
for await (const msg of q) { /* process events */ }
```

- When `prompt` is a string: `isSingleUserTurn = true` → stdin auto-closes after first result
- For multi-turn: must pass an `AsyncIterable<SDKUserMessage>` and manage coordination yourself

### V2: `createSession()` + `send()` / `stream()` — Persistent session

```typescript
await using session = unstable_v2_createSession({ model: "..." });
await session.send("first message");
for await (const msg of session.stream()) { /* events */ }
await session.send("follow-up");
for await (const msg of session.stream()) { /* events */ }
```

- `isSingleUserTurn = false` always → stdin stays open
- `send()` enqueues into an async queue (`QX`)
- `stream()` yields from the same message generator, stopping on `result` type
- Multi-turn is natural — just alternate `send()` / `stream()`
- V2 does NOT call V1 `query()` internally — both independently create Transport + Query

### Comparison Table

| Aspect | V1 | V2 |
|--------|----|----|
| `isSingleUserTurn` | `true` for string prompt | always `false` |
| Multi-turn | Requires managing `AsyncIterable` | Just call `send()`/`stream()` |
| stdin lifecycle | Auto-closes after first result | Stays open until `close()` |
| Agentic loop | Identical `EZ()` | Identical `EZ()` |
| Stop conditions | Same | Same |
| Session persistence | Must pass `resume` to new `query()` | Built-in via session object |
| API stability | Stable | Unstable preview (`unstable_v2_*` prefix) |

**Key finding: Zero difference in turn behavior.** Both use the same CLI process, the same `EZ()` recursive generator, and the same decision logic.

## Hook Events

```typescript
type HookEvent =
  | 'PreToolUse' // Before tool execution
  | 'PostToolUse' // After successful tool execution
  | 'PostToolUseFailure' // After failed tool execution
  | 'Notification' // Notification messages
  | 'UserPromptSubmit' // User prompt submitted
  | 'SessionStart' // Session started (startup/resume/clear/compact)
  | 'SessionEnd' // Session ended
  | 'Stop' // Agent stopping
  | 'SubagentStart' // Subagent spawned
  | 'SubagentStop' // Subagent stopped
  | 'PreCompact' // Before conversation compaction
  | 'PermissionRequest'; // Permission being requested
```

### Hook Configuration

```typescript
interface HookCallbackMatcher {
  matcher?: string; // Optional tool name matcher
  hooks: HookCallback[];
}

type HookCallback = (
  input: HookInput,
  toolUseID: string | undefined,
  options: { signal: AbortSignal }
) => Promise<HookJSONOutput>;
```

### Hook Return Values

```typescript
type HookJSONOutput = AsyncHookJSONOutput | SyncHookJSONOutput;

type AsyncHookJSONOutput = { async: true; asyncTimeout?: number };

type SyncHookJSONOutput = {
  continue?: boolean;
  suppressOutput?: boolean;
  stopReason?: string;
  decision?: 'approve' | 'block';
  systemMessage?: string;
  reason?: string;
  hookSpecificOutput?:
    | { hookEventName: 'PreToolUse'; permissionDecision?: 'allow' | 'deny' | 'ask'; updatedInput?: Record<string, unknown> }
    | { hookEventName: 'UserPromptSubmit'; additionalContext?: string }
    | { hookEventName: 'SessionStart'; additionalContext?: string }
    | { hookEventName: 'PostToolUse'; additionalContext?: string };
};
```
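
As an example of a synchronous return value, the hook below denies file writes outside a workspace directory. It is a sketch under assumptions: `guardWrites` and the `/workspace/` rule are invented, and the `tool_name` / `tool_input` fields on the PreToolUse input are assumed rather than shown in the types above.

```typescript
// Assumed subset of the PreToolUse hook input.
type PreToolUseInput = {
  hook_event_name: 'PreToolUse';
  tool_name: string;
  tool_input: Record<string, unknown>;
};

// Subset of SyncHookJSONOutput used by this sketch.
type HookOutput = {
  continue?: boolean;
  reason?: string;
  hookSpecificOutput?: {
    hookEventName: 'PreToolUse';
    permissionDecision?: 'allow' | 'deny' | 'ask';
  };
};

// Hypothetical hook: deny Write/Edit calls that target paths outside /workspace/.
async function guardWrites(input: PreToolUseInput): Promise<HookOutput> {
  const path = input.tool_input.file_path;
  const isWrite = input.tool_name === 'Write' || input.tool_name === 'Edit';
  if (isWrite && typeof path === 'string' && path.indexOf('/workspace/') !== 0) {
    return {
      reason: `Writes outside /workspace are blocked: ${path}`,
      hookSpecificOutput: { hookEventName: 'PreToolUse', permissionDecision: 'deny' },
    };
  }
  return { continue: true };
}
```

A callback like this would be registered in a `HookCallbackMatcher` for the `PreToolUse` event, optionally with a `matcher` so it only runs for the relevant tools.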

### Subagent Hooks (from sdk.d.ts)

```typescript
type SubagentStartHookInput = BaseHookInput & {
  hook_event_name: 'SubagentStart';
  agent_id: string;
  agent_type: string;
};

type SubagentStopHookInput = BaseHookInput & {
  hook_event_name: 'SubagentStop';
  stop_hook_active: boolean;
  agent_id: string;
  agent_transcript_path: string;
  agent_type: string;
};

// BaseHookInput = { session_id, transcript_path, cwd, permission_mode? }
```

## Query Interface Methods

The `Query` object (sdk.d.ts:931). Official docs list these public methods:

```typescript
interface Query extends AsyncGenerator<SDKMessage, void> {
  interrupt(): Promise<void>; // Stop current execution (streaming input mode only)
  rewindFiles(userMessageUuid: string): Promise<void>; // Restore files to state at message (needs enableFileCheckpointing)
  setPermissionMode(mode: PermissionMode): Promise<void>; // Change permissions (streaming input mode only)
  setModel(model?: string): Promise<void>; // Change model (streaming input mode only)
  setMaxThinkingTokens(max: number | null): Promise<void>; // Change thinking tokens (streaming input mode only)
  supportedCommands(): Promise<SlashCommand[]>; // Available slash commands
  supportedModels(): Promise<ModelInfo[]>; // Available models
  mcpServerStatus(): Promise<McpServerStatus[]>; // MCP server connection status
  accountInfo(): Promise<AccountInfo>; // Authenticated user info
}
```

Found in sdk.d.ts but NOT in official docs (may be internal):

- `streamInput(stream)` — stream additional user messages
- `close()` — forcefully end the query
- `setMcpServers(servers)` — dynamically add/remove MCP servers

## Sandbox Configuration

```typescript
type SandboxSettings = {
  enabled?: boolean;
  autoAllowBashIfSandboxed?: boolean;
  excludedCommands?: string[];
  allowUnsandboxedCommands?: boolean;
  network?: {
    allowLocalBinding?: boolean;
    allowUnixSockets?: string[];
    allowAllUnixSockets?: boolean;
    httpProxyPort?: number;
    socksProxyPort?: number;
  };
  ignoreViolations?: {
    file?: string[];
    network?: string[];
  };
};
```

When `allowUnsandboxedCommands` is true, the model can set `dangerouslyDisableSandbox: true` in Bash tool input, which falls back to the `canUseTool` permission handler.
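
Because such requests fall back to the permission handler, that handler is where an escape policy would live. The sketch below is one possible policy, with an invented allowlist and a simplified `ToolInput`; it is not how NanoClaw is configured.

```typescript
// Simplified local types; see CanUseTool above for the full signature.
type ToolInput = Record<string, unknown>;
type PermissionResult =
  | { behavior: 'allow'; updatedInput: ToolInput }
  | { behavior: 'deny'; message: string };

// Hypothetical allowlist of commands that may run outside the sandbox.
const UNSANDBOXED_OK = ['git fetch', 'git pull'];

// Only Bash calls that ask to disable the sandbox are inspected;
// everything else is allowed unchanged.
async function reviewUnsandboxed(toolName: string, input: ToolInput): Promise<PermissionResult> {
  const wantsEscape = toolName === 'Bash' && input.dangerouslyDisableSandbox === true;
  if (!wantsEscape) return { behavior: 'allow', updatedInput: input };

  const command = typeof input.command === 'string' ? input.command.trim() : '';
  if (UNSANDBOXED_OK.indexOf(command) !== -1) {
    return { behavior: 'allow', updatedInput: input };
  }
  return { behavior: 'deny', message: `Refusing to run outside the sandbox: ${command}` };
}
```

Exact-match allowlisting is deliberately strict here; a real policy would need to decide how to treat arguments and chained commands.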

## MCP Server Helpers

### tool()

Creates type-safe MCP tool definitions with Zod schemas:

```typescript
function tool<Schema extends ZodRawShape>(
  name: string,
  description: string,
  inputSchema: Schema,
  handler: (args: z.infer<ZodObject<Schema>>, extra: unknown) => Promise<CallToolResult>
): SdkMcpToolDefinition<Schema>
```

### createSdkMcpServer()

Creates an in-process MCP server (we use stdio instead for subagent inheritance):

```typescript
function createSdkMcpServer(options: {
  name: string;
  version?: string;
  tools?: Array<SdkMcpToolDefinition<any>>;
}): McpSdkServerConfigWithInstance
```

## Internals Reference

### Key minified identifiers (sdk.mjs)

| Minified | Purpose |
|----------|---------|
| `s_` | V1 `query()` export |
| `e_` | `unstable_v2_createSession` |
| `Xx` | `unstable_v2_resumeSession` |
| `Qx` | `unstable_v2_prompt` |
| `U9` | V2 Session class (`send`/`stream`/`close`) |
| `XX` | ProcessTransport (spawns cli.js) |
| `$X` | Query class (JSON-line routing, async iterable) |
| `QX` | AsyncQueue (input stream buffer) |

### Key minified identifiers (cli.js)

| Minified | Purpose |
|----------|---------|
| `EZ` | Core recursive agentic loop (async generator) |
| `_t4` | Stop hook handler (runs when no tool_use blocks) |
| `PU1` | Streaming tool executor (parallel during API response) |
| `TP6` | Standard tool executor (after API response) |
| `GU1` | Individual tool executor |
| `lTq` | SDK session runner (calls EZ directly) |
| `bd1` | stdin reader (JSON-lines from transport) |
| `mW1` | Anthropic API streaming caller |

## Key Files

- `sdk.d.ts` — All type definitions (1777 lines)
- `sdk-tools.d.ts` — Tool input schemas
- `sdk.mjs` — SDK runtime (minified, 376KB)
- `cli.js` — CLI executable (minified, runs as subprocess)

110
docs/SPEC.md
@@ -98,11 +98,13 @@ nanoclaw/
├── .gitignore
│
├── src/
│ ├── index.ts # Main application (WhatsApp + routing)
│ ├── index.ts # Main application (WhatsApp + routing + message loop)
│ ├── config.ts # Configuration constants
│ ├── types.ts # TypeScript interfaces
│ ├── utils.ts # Generic utility functions
│ ├── db.ts # Database initialization and queries
│ ├── logger.ts # Pino logger setup
│ ├── db.ts # SQLite database initialization and queries
│ ├── group-queue.ts # Per-group queue with global concurrency limit
│ ├── mount-security.ts # Mount allowlist validation for containers
│ ├── whatsapp-auth.ts # Standalone WhatsApp authentication
│ ├── task-scheduler.ts # Runs scheduled tasks when due
│ └── container-runner.ts # Spawns agents in Apple Containers
@@ -114,8 +116,8 @@ nanoclaw/
│ │ ├── package.json
│ │ ├── tsconfig.json
│ │ └── src/
│ │ ├── index.ts # Entry point (reads JSON, runs agent)
│ │ └── ipc-mcp.ts # MCP server for host communication
│ │ ├── index.ts # Entry point (query loop, IPC polling, session resume)
│ │ └── ipc-mcp-stdio.ts # Stdio-based MCP server for host communication
│ └── skills/
│ └── agent-browser.md # Browser automation skill
│
@@ -123,12 +125,15 @@ nanoclaw/
│
├── .claude/
│ └── skills/
│ ├── setup/
│ │ └── SKILL.md # /setup skill
│ ├── customize/
│ │ └── SKILL.md # /customize skill
│ └── debug/
│ └── SKILL.md # /debug skill (container debugging)
│ ├── setup/SKILL.md # /setup - First-time installation
│ ├── customize/SKILL.md # /customize - Add capabilities
│ ├── debug/SKILL.md # /debug - Container debugging
│ ├── add-telegram/SKILL.md # /add-telegram - Telegram channel
│ ├── add-gmail/SKILL.md # /add-gmail - Gmail integration
│ ├── add-voice-transcription/ # /add-voice-transcription - Whisper
│ ├── x-integration/SKILL.md # /x-integration - X/Twitter
│ ├── convert-to-docker/SKILL.md # /convert-to-docker - Docker runtime
│ └── add-parallel/SKILL.md # /add-parallel - Parallel agents
│
├── groups/
│ ├── CLAUDE.md # Global memory (all groups read this)
@@ -142,12 +147,10 @@ nanoclaw/
│
├── store/ # Local data (gitignored)
│ ├── auth/ # WhatsApp authentication state
│ └── messages.db # SQLite database (messages, scheduled_tasks, task_run_logs)
│ └── messages.db # SQLite database (messages, chats, scheduled_tasks, task_run_logs, registered_groups, sessions, router_state)
│
├── data/ # Application state (gitignored)
│ ├── sessions.json # Active session IDs per group
│ ├── registered_groups.json # Group JID → folder mapping
│ ├── router_state.json # Last processed timestamp + last agent timestamps
│ ├── sessions/ # Per-group session data (.claude/ dirs with JSONL transcripts)
│ ├── env/env # Copy of .env for container mounting
│ └── ipc/ # Container IPC (messages/, tasks/)
│
@@ -181,8 +184,10 @@ export const DATA_DIR = path.resolve(PROJECT_ROOT, 'data');

// Container configuration
export const CONTAINER_IMAGE = process.env.CONTAINER_IMAGE || 'nanoclaw-agent:latest';
export const CONTAINER_TIMEOUT = parseInt(process.env.CONTAINER_TIMEOUT || '300000', 10);
export const CONTAINER_TIMEOUT = parseInt(process.env.CONTAINER_TIMEOUT || '1800000', 10); // 30min default
export const IPC_POLL_INTERVAL = 1000;
export const IDLE_TIMEOUT = parseInt(process.env.IDLE_TIMEOUT || '1800000', 10); // 30min — keep container alive after last result
export const MAX_CONCURRENT_CONTAINERS = Math.max(1, parseInt(process.env.MAX_CONCURRENT_CONTAINERS || '5', 10) || 5);

export const TRIGGER_PATTERN = new RegExp(`^@${ASSISTANT_NAME}\\b`, 'i');
```
@@ -191,27 +196,25 @@ export const TRIGGER_PATTERN = new RegExp(`^@${ASSISTANT_NAME}\\b`, 'i');

### Container Configuration

Groups can have additional directories mounted via `containerConfig` in `data/registered_groups.json`:
Groups can have additional directories mounted via `containerConfig` in the SQLite `registered_groups` table (stored as JSON in the `container_config` column). Example registration:

```json
{
  "1234567890@g.us": {
    "name": "Dev Team",
    "folder": "dev-team",
    "trigger": "@Andy",
    "added_at": "2026-01-31T12:00:00Z",
    "containerConfig": {
      "additionalMounts": [
        {
          "hostPath": "~/projects/webapp",
          "containerPath": "webapp",
          "readonly": false
        }
      ],
      "timeout": 600000
    }
  }
}
```typescript
registerGroup("1234567890@g.us", {
  name: "Dev Team",
  folder: "dev-team",
  trigger: "@Andy",
  added_at: new Date().toISOString(),
  containerConfig: {
    additionalMounts: [
      {
        hostPath: "~/projects/webapp",
        containerPath: "webapp",
        readonly: false,
      },
    ],
    timeout: 600000,
  },
});
```

Additional mounts appear at `/workspace/extra/{containerPath}` inside the container.
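
The mapping from a configured mount to its in-container location can be sketched as follows. `resolveMount` is a hypothetical helper, and expanding a leading `~` against a supplied home directory is an assumption about how host paths are handled; the mount allowlist validation is not modeled.

```typescript
type AdditionalMount = { hostPath: string; containerPath: string; readonly: boolean };
type ResolvedMount = { host: string; container: string; readonly: boolean };

// Hypothetical helper: expand a leading "~/" and place the mount
// under /workspace/extra inside the container.
function resolveMount(mount: AdditionalMount, homeDir: string): ResolvedMount {
  const host =
    mount.hostPath.indexOf('~/') === 0 ? homeDir + mount.hostPath.slice(1) : mount.hostPath;
  return {
    host,
    container: `/workspace/extra/${mount.containerPath}`,
    readonly: mount.readonly,
  };
}

console.log(
  resolveMount({ hostPath: '~/projects/webapp', containerPath: 'webapp', readonly: false }, '/Users/me'),
);
```

For the example registration above, the `webapp` mount would therefore be visible to the agent at `/workspace/extra/webapp`.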

@@ -233,7 +236,7 @@ The token can be extracted from `~/.claude/.credentials.json` if you're logged i
ANTHROPIC_API_KEY=sk-ant-api03-...
```

Only the authentication variables (`CLAUDE_CODE_OAUTH_TOKEN` and `ANTHROPIC_API_KEY`) are extracted from `.env` and mounted into the container at `/workspace/env-dir/env`, then sourced by the entrypoint script. This ensures other environment variables in `.env` are not exposed to the agent. This workaround is needed because Apple Container loses `-e` environment variables when using `-i` (interactive mode with piped stdin).
Only the authentication variables (`CLAUDE_CODE_OAUTH_TOKEN` and `ANTHROPIC_API_KEY`) are extracted from `.env` and written to `data/env/env`, then mounted into the container at `/workspace/env-dir/env` and sourced by the entrypoint script. This ensures other environment variables in `.env` are not exposed to the agent. This workaround is needed because Apple Container loses `-e` environment variables when using `-i` (interactive mode with piped stdin).

### Changing the Assistant Name

@@ -295,17 +298,10 @@ Sessions enable conversation continuity - Claude remembers what you talked about

### How Sessions Work

1. Each group has a session ID stored in `data/sessions.json`
1. Each group has a session ID stored in SQLite (`sessions` table, keyed by `group_folder`)
2. Session ID is passed to Claude Agent SDK's `resume` option
3. Claude continues the conversation with full context

**data/sessions.json:**
```json
{
  "main": "session-abc123",
  "Family Chat": "session-def456"
}
```
4. Session transcripts are stored as JSONL files in `data/sessions/{group}/.claude/`

---

@@ -327,8 +323,8 @@ Sessions enable conversation continuity - Claude remembers what you talked about
│
▼
5. Router checks:
├── Is chat_jid in registered_groups.json? → No: ignore
└── Does message start with @Assistant? → No: ignore
├── Is chat_jid in registered groups (SQLite)? → No: ignore
└── Does message match trigger pattern? → No: store but don't process
│
▼
6. Router catches up conversation:
@@ -484,13 +480,15 @@ NanoClaw runs as a single macOS launchd service.
### Startup Sequence

When NanoClaw starts, it:
1. **Ensures Apple Container system is running** - Automatically starts it if needed (survives reboots)
2. Initializes the SQLite database
3. Loads state (registered groups, sessions, router state)
4. Connects to WhatsApp
5. Starts the message polling loop
6. Starts the scheduler loop
7. Starts the IPC watcher for container messages
1. **Ensures Apple Container system is running** - Automatically starts it if needed; kills orphaned NanoClaw containers from previous runs
2. Initializes the SQLite database (migrates from JSON files if they exist)
3. Loads state from SQLite (registered groups, sessions, router state)
4. Connects to WhatsApp (on `connection.open`):
- Starts the scheduler loop
- Starts the IPC watcher for container messages
- Sets up the per-group queue with `processGroupMessages`
- Recovers any unprocessed messages from before shutdown
- Starts the message polling loop

### Service: com.nanoclaw

@@ -605,7 +603,7 @@ chmod 700 groups/
| No response to messages | Service not running | Check `launchctl list | grep nanoclaw` |
| "Claude Code process exited with code 1" | Apple Container failed to start | Check logs; NanoClaw auto-starts container system but may fail |
| "Claude Code process exited with code 1" | Session mount path wrong | Ensure mount is to `/home/node/.claude/` not `/root/.claude/` |
| Session not continuing | Session ID not saved | Check `data/sessions.json` |
| Session not continuing | Session ID not saved | Check SQLite: `sqlite3 store/messages.db "SELECT * FROM sessions"` |
| Session not continuing | Mount path mismatch | Container user is `node` with HOME=/home/node; sessions must be at `/home/node/.claude/` |
| "QR code expired" | WhatsApp session expired | Delete store/auth/ and restart |
| "No groups registered" | Haven't added groups | Use `@Andy add group "Name"` in main |

@@ -14,14 +14,25 @@ You are Andy, a personal assistant. You help with tasks, answer questions, and c

 ## Communication

-You have two ways to send messages to the user or group:
+Your output is sent to the user or group.

-- **mcp__nanoclaw__send_message tool** — Sends a message to the user or group immediately, while you're still running. You can call it multiple times.
-- **Output userMessage** — When your outputType is "message", this is sent to the user or group.
+You also have `mcp__nanoclaw__send_message` which sends a message immediately while you're still working. This is useful when you want to acknowledge a request before starting longer work.

-Your output **internalLog** is information that will be logged internally but not sent to the user or group.
+### Internal thoughts

-For requests that can take time, consider sending a quick acknowledgment if appropriate via mcp__nanoclaw__send_message so the user knows you're working on it.
+If part of your output is internal reasoning rather than something for the user, wrap it in `<internal>` tags:

+```
+<internal>Compiled all three reports, ready to summarize.</internal>
+
+Here are the key findings from the research...
+```
+
+Text inside `<internal>` tags is logged but not sent to the user. If you've already sent the key information via `send_message`, you can wrap the recap in `<internal>` to avoid sending it again.
+
+### Sub-agents and teammates
+
+When working as a sub-agent or teammate, only use `send_message` if instructed to by the main agent.
+
 ## Your Workspace

@@ -14,14 +14,25 @@ You are Andy, a personal assistant. You help with tasks, answer questions, and c

 ## Communication

-You have two ways to send messages to the user or group:
+Your output is sent to the user or group.

-- **mcp__nanoclaw__send_message tool** — Sends a message to the user or group immediately, while you're still running. You can call it multiple times.
-- **Output userMessage** — When your outputType is "message", this is sent to the user or group.
+You also have `mcp__nanoclaw__send_message` which sends a message immediately while you're still working. This is useful when you want to acknowledge a request before starting longer work.

-Your output **internalLog** is information that will be logged internally but not sent to the user or group.
+### Internal thoughts

-For requests that can take time, consider sending a quick acknowledgment if appropriate via mcp__nanoclaw__send_message so the user knows you're working on it.
+If part of your output is internal reasoning rather than something for the user, wrap it in `<internal>` tags:

+```
+<internal>Compiled all three reports, ready to summarize.</internal>
+
+Here are the key findings from the research...
+```
+
+Text inside `<internal>` tags is logged but not sent to the user. If you've already sent the key information via `send_message`, you can wrap the recap in `<internal>` to avoid sending it again.
+
+### Sub-agents and teammates
+
+When working as a sub-agent or teammate, only use `send_message` if instructed to by the main agent.
+
 ## Memory

@@ -60,7 +71,7 @@ Main has access to the entire project:

 Key paths inside the container:
 - `/workspace/project/store/messages.db` - SQLite database
-- `/workspace/project/data/registered_groups.json` - Group config
+- `/workspace/project/store/messages.db` (registered_groups table) - Group config
 - `/workspace/project/groups/` - All group folders

 ---
@@ -23,7 +23,7 @@ export const MAIN_GROUP_FOLDER = 'main';
 export const CONTAINER_IMAGE =
   process.env.CONTAINER_IMAGE || 'nanoclaw-agent:latest';
 export const CONTAINER_TIMEOUT = parseInt(
-  process.env.CONTAINER_TIMEOUT || '300000',
+  process.env.CONTAINER_TIMEOUT || '1800000',
   10,
 );
 export const CONTAINER_MAX_OUTPUT_SIZE = parseInt(
@@ -31,6 +31,10 @@ export const CONTAINER_MAX_OUTPUT_SIZE = parseInt(
   10,
 ); // 10MB default
 export const IPC_POLL_INTERVAL = 1000;
+export const IDLE_TIMEOUT = parseInt(
+  process.env.IDLE_TIMEOUT || '1800000',
+  10,
+); // 30min default — how long to keep container alive after last result
 export const MAX_CONCURRENT_CONTAINERS = Math.max(
   1,
   parseInt(process.env.MAX_CONCURRENT_CONTAINERS || '5', 10) || 5,
@@ -38,17 +38,12 @@ export interface ContainerInput {
   groupFolder: string;
   chatJid: string;
   isMain: boolean;
-}
-
-export interface AgentResponse {
-  outputType: 'message' | 'log';
-  userMessage?: string;
-  internalLog?: string;
+  isScheduledTask?: boolean;
 }

 export interface ContainerOutput {
   status: 'success' | 'error';
-  result: AgentResponse | null;
+  result: string | null;
   newSessionId?: string;
   error?: string;
 }
@@ -110,6 +105,29 @@ function buildVolumeMounts(
     '.claude',
   );
   fs.mkdirSync(groupSessionsDir, { recursive: true });
+  const settingsFile = path.join(groupSessionsDir, 'settings.json');
+  if (!fs.existsSync(settingsFile)) {
+    fs.writeFileSync(settingsFile, JSON.stringify({
+      env: { CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS: '1' },
+    }, null, 2) + '\n');
+  }
+
+  // Sync skills from container/skills/ into each group's .claude/skills/
+  const skillsSrc = path.join(process.cwd(), 'container', 'skills');
+  const skillsDst = path.join(groupSessionsDir, 'skills');
+  if (fs.existsSync(skillsSrc)) {
+    for (const skillDir of fs.readdirSync(skillsSrc)) {
+      const srcDir = path.join(skillsSrc, skillDir);
+      if (!fs.statSync(srcDir).isDirectory()) continue;
+      const dstDir = path.join(skillsDst, skillDir);
+      fs.mkdirSync(dstDir, { recursive: true });
+      for (const file of fs.readdirSync(srcDir)) {
+        const srcFile = path.join(srcDir, file);
+        const dstFile = path.join(dstDir, file);
+        fs.copyFileSync(srcFile, dstFile);
+      }
+    }
+  }
   mounts.push({
     hostPath: groupSessionsDir,
     containerPath: '/home/node/.claude',
@@ -121,6 +139,7 @@ function buildVolumeMounts(
   const groupIpcDir = path.join(DATA_DIR, 'ipc', group.folder);
   fs.mkdirSync(path.join(groupIpcDir, 'messages'), { recursive: true });
   fs.mkdirSync(path.join(groupIpcDir, 'tasks'), { recursive: true });
+  fs.mkdirSync(path.join(groupIpcDir, 'input'), { recursive: true });
   mounts.push({
     hostPath: groupIpcDir,
     containerPath: '/workspace/ipc',
@@ -154,6 +173,15 @@ function buildVolumeMounts(
     }
   }

+  // Mount agent-runner source from host — recompiled on container startup.
+  // Bypasses Apple Container's sticky build cache for code changes.
+  const agentRunnerSrc = path.join(projectRoot, 'container', 'agent-runner', 'src');
+  mounts.push({
+    hostPath: agentRunnerSrc,
+    containerPath: '/app/src',
+    readonly: true,
+  });
+
   // Additional mounts validated against external allowlist (tamper-proof from containers)
   if (group.containerConfig?.additionalMounts) {
     const validatedMounts = validateAdditionalMounts(
@@ -191,6 +219,7 @@ export async function runContainerAgent(
   group: RegisteredGroup,
   input: ContainerInput,
   onProcess: (proc: ChildProcess, containerName: string) => void,
+  onOutput?: (output: ContainerOutput) => Promise<void>,
 ): Promise<ContainerOutput> {
   const startTime = Date.now();

@@ -240,22 +269,63 @@ export async function runContainerAgent(
     let stdoutTruncated = false;
     let stderrTruncated = false;

     // Write input and close stdin (Apple Container doesn't flush pipe without EOF)
     container.stdin.write(JSON.stringify(input));
     container.stdin.end();

+    // Streaming output: parse OUTPUT_START/END marker pairs as they arrive
+    let parseBuffer = '';
+    let newSessionId: string | undefined;
+    let outputChain = Promise.resolve();
+
     container.stdout.on('data', (data) => {
-      if (stdoutTruncated) return;
       const chunk = data.toString();
-      const remaining = CONTAINER_MAX_OUTPUT_SIZE - stdout.length;
-      if (chunk.length > remaining) {
-        stdout += chunk.slice(0, remaining);
-        stdoutTruncated = true;
-        logger.warn(
-          { group: group.name, size: stdout.length },
-          'Container stdout truncated due to size limit',
-        );
-      } else {
-        stdout += chunk;
+
+      // Always accumulate for logging
+      if (!stdoutTruncated) {
+        const remaining = CONTAINER_MAX_OUTPUT_SIZE - stdout.length;
+        if (chunk.length > remaining) {
+          stdout += chunk.slice(0, remaining);
+          stdoutTruncated = true;
+          logger.warn(
+            { group: group.name, size: stdout.length },
+            'Container stdout truncated due to size limit',
+          );
+        } else {
+          stdout += chunk;
+        }
+      }
+
+      // Stream-parse for output markers
+      if (onOutput) {
+        parseBuffer += chunk;
+        let startIdx: number;
+        while ((startIdx = parseBuffer.indexOf(OUTPUT_START_MARKER)) !== -1) {
+          const endIdx = parseBuffer.indexOf(OUTPUT_END_MARKER, startIdx);
+          if (endIdx === -1) break; // Incomplete pair, wait for more data
+
+          const jsonStr = parseBuffer
+            .slice(startIdx + OUTPUT_START_MARKER.length, endIdx)
+            .trim();
+          parseBuffer = parseBuffer.slice(endIdx + OUTPUT_END_MARKER.length);
+
+          try {
+            const parsed: ContainerOutput = JSON.parse(jsonStr);
+            if (parsed.newSessionId) {
+              newSessionId = parsed.newSessionId;
+            }
+            // Activity detected — reset the hard timeout
+            resetTimeout();
+            // Call onOutput for all markers (including null results)
+            // so idle timers start even for "silent" query completions.
+            outputChain = outputChain.then(() => onOutput(parsed));
+          } catch (err) {
+            logger.warn(
+              { group: group.name, error: err },
+              'Failed to parse streamed output chunk',
+            );
+          }
+        }
       }
     });

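The marker-pair parsing in the hunk above can be exercised in isolation. The following is a minimal sketch, not the runner's actual code: the marker strings and the `extractMarkedPayloads` helper are assumptions made for illustration (the diff does not show the real values of `OUTPUT_START_MARKER` and `OUTPUT_END_MARKER`). It shows how an incomplete pair is left in the buffer until a later chunk completes it.

```typescript
// Placeholder marker values chosen for this sketch; the real constants are not shown in the diff.
const OUTPUT_START_MARKER = '<<<OUTPUT_START>>>';
const OUTPUT_END_MARKER = '<<<OUTPUT_END>>>';

interface ExtractResult {
  payloads: string[]; // trimmed text found between complete marker pairs
  rest: string; // unconsumed tail to carry over into the next chunk
}

// Hypothetical helper: pull every complete START/END pair out of a buffer.
function extractMarkedPayloads(buffer: string): ExtractResult {
  const payloads: string[] = [];
  let rest = buffer;
  let startIdx: number;
  while ((startIdx = rest.indexOf(OUTPUT_START_MARKER)) !== -1) {
    const endIdx = rest.indexOf(OUTPUT_END_MARKER, startIdx);
    if (endIdx === -1) break; // incomplete pair: keep the buffer and wait for more data
    payloads.push(
      rest.slice(startIdx + OUTPUT_START_MARKER.length, endIdx).trim(),
    );
    rest = rest.slice(endIdx + OUTPUT_END_MARKER.length);
  }
  return { payloads, rest };
}

// Simulate one result split across two stdout chunks.
const first = extractMarkedPayloads('noise<<<OUTPUT_START>>>{"status":"suc');
const second = extractMarkedPayloads(
  first.rest + 'cess"}<<<OUTPUT_END>>>tail',
);
console.log(first.payloads.length, second.payloads, second.rest);
```

Returning the unconsumed tail keeps the function pure, which makes the chunk-boundary case easy to test without spawning a container.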
@@ -265,6 +335,8 @@ export async function runContainerAgent(
       for (const line of lines) {
         if (line) logger.debug({ container: group.folder }, line);
       }
+      // Don't reset timeout on stderr — SDK writes debug logs continuously.
+      // Timeout only resets on actual output (OUTPUT_MARKER in stdout).
       if (stderrTruncated) return;
       const remaining = CONTAINER_MAX_OUTPUT_SIZE - stderr.length;
       if (chunk.length > remaining) {
@@ -280,18 +352,26 @@ export async function runContainerAgent(
     });

     let timedOut = false;
+    const timeoutMs = group.containerConfig?.timeout || CONTAINER_TIMEOUT;

-    const timeout = setTimeout(() => {
+    const killOnTimeout = () => {
       timedOut = true;
       logger.error({ group: group.name, containerName }, 'Container timeout, stopping gracefully');
       // Graceful stop: sends SIGTERM, waits, then SIGKILL — lets --rm fire
       exec(`container stop ${containerName}`, { timeout: 15000 }, (err) => {
         if (err) {
           logger.warn({ group: group.name, containerName, err }, 'Graceful stop failed, force killing');
           container.kill('SIGKILL');
         }
       });
-    }, group.containerConfig?.timeout || CONTAINER_TIMEOUT);
+    };
+
+    let timeout = setTimeout(killOnTimeout, timeoutMs);
+
+    // Reset the timeout whenever there's activity (streaming output)
+    const resetTimeout = () => {
+      clearTimeout(timeout);
+      timeout = setTimeout(killOnTimeout, timeoutMs);
+    };

     container.on('close', (code) => {
       clearTimeout(timeout);
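The resettable hard timeout introduced above follows a common watchdog pattern. As a rough sketch, assuming a hypothetical `createResettableTimeout` helper that is not part of this PR, the same idea can be packaged so that the handle is never shared outside the closure:

```typescript
// Hypothetical helper (not in the PR): a watchdog that fires onExpire unless
// reset() is called again within `ms` milliseconds.
function createResettableTimeout(onExpire: () => void, ms: number) {
  let handle = setTimeout(onExpire, ms);
  return {
    // Activity observed: push the deadline back by a full interval.
    reset(): void {
      clearTimeout(handle);
      handle = setTimeout(onExpire, ms);
    },
    // Work finished: make sure the callback never fires.
    cancel(): void {
      clearTimeout(handle);
    },
  };
}

// Usage: reset on each parsed output marker, cancel when the process closes.
let expired = 0;
const watchdog = createResettableTimeout(() => {
  expired += 1;
}, 60_000);
watchdog.reset();
watchdog.cancel();
```

In the hunk, `resetTimeout` plays the role of `reset()` and the `clearTimeout(timeout)` in the `close` handler plays the role of `cancel()`.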
@@ -324,8 +404,7 @@ export async function runContainerAgent(

       const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
       const logFile = path.join(logsDir, `container-${timestamp}.log`);
-      const isVerbose =
-        process.env.LOG_LEVEL === 'debug' || process.env.LOG_LEVEL === 'trace';
+      const isVerbose = process.env.LOG_LEVEL === 'debug' || process.env.LOG_LEVEL === 'trace';

       const logLines = [
         `=== Container Run Log ===`,
@@ -401,6 +480,23 @@ export async function runContainerAgent(
         return;
       }

+      // Streaming mode: wait for output chain to settle, return completion marker
+      if (onOutput) {
+        outputChain.then(() => {
+          logger.info(
+            { group: group.name, duration, newSessionId },
+            'Container completed (streaming mode)',
+          );
+          resolve({
+            status: 'success',
+            result: null,
+            newSessionId,
+          });
+        });
+        return;
+      }
+
+      // Legacy mode: parse the last output marker pair from accumulated stdout
       try {
         // Extract JSON between sentinel markers for robust parsing
         const startIdx = stdout.indexOf(OUTPUT_START_MARKER);
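The `outputChain` used in streaming mode serializes asynchronous callbacks by chaining each one onto the previous promise. A minimal sketch of that pattern follows; `createSerialChain` is a hypothetical name, and the `.catch` that logs and swallows task errors is an assumption of this sketch rather than something the hunks above do. Without such a handler, a single rejected task would leave the chain rejected and skip every later `.then`.

```typescript
// Hypothetical helper: run async tasks strictly one after another.
function createSerialChain() {
  let chain: Promise<void> = Promise.resolve();
  return {
    push(task: () => Promise<void>): void {
      // Assumption of this sketch: log and swallow errors so one failed task
      // does not prevent later tasks (or the final settle) from running.
      chain = chain.then(task).catch((err) => {
        console.error('task failed', err);
      });
    },
    // Resolves once every task pushed so far has finished.
    settled(): Promise<void> {
      return chain;
    },
  };
}

const results: string[] = [];
const serial = createSerialChain();
serial.push(async () => {
  await new Promise<void>((r) => setTimeout(r, 20));
  results.push('slow');
});
serial.push(async () => {
  results.push('fast');
});
// The second task waits for the first, so this logs "slow -> fast".
serial.settled().then(() => console.log(results.join(' -> ')));
```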
@@ -1,6 +1,8 @@
-import { ChildProcess, exec } from 'child_process';
+import { ChildProcess } from 'child_process';
+import fs from 'fs';
+import path from 'path';

-import { MAX_CONCURRENT_CONTAINERS } from './config.js';
+import { DATA_DIR, MAX_CONCURRENT_CONTAINERS } from './config.js';
 import { logger } from './logger.js';

 interface QueuedTask {
@@ -18,6 +20,7 @@ interface GroupState {
   pendingTasks: QueuedTask[];
   process: ChildProcess | null;
   containerName: string | null;
+  groupFolder: string | null;
   retryCount: number;
 }

@@ -38,6 +41,7 @@ export class GroupQueue {
         pendingTasks: [],
         process: null,
         containerName: null,
+        groupFolder: null,
         retryCount: 0,
       };
       this.groups.set(groupJid, state);
@@ -108,10 +112,49 @@ export class GroupQueue {
     this.runTask(groupJid, { id: taskId, groupJid, fn });
   }

-  registerProcess(groupJid: string, proc: ChildProcess, containerName: string): void {
+  registerProcess(groupJid: string, proc: ChildProcess, containerName: string, groupFolder?: string): void {
     const state = this.getGroup(groupJid);
     state.process = proc;
     state.containerName = containerName;
+    if (groupFolder) state.groupFolder = groupFolder;
   }
+
+  /**
+   * Send a follow-up message to the active container via IPC file.
+   * Returns true if the message was written, false if no active container.
+   */
+  sendMessage(groupJid: string, text: string): boolean {
+    const state = this.getGroup(groupJid);
+    if (!state.active || !state.groupFolder) return false;
+
+    const inputDir = path.join(DATA_DIR, 'ipc', state.groupFolder, 'input');
+    try {
+      fs.mkdirSync(inputDir, { recursive: true });
+      const filename = `${Date.now()}-${Math.random().toString(36).slice(2, 6)}.json`;
+      const filepath = path.join(inputDir, filename);
+      const tempPath = `${filepath}.tmp`;
+      fs.writeFileSync(tempPath, JSON.stringify({ type: 'message', text }));
+      fs.renameSync(tempPath, filepath);
+      return true;
+    } catch {
+      return false;
+    }
+  }
+
+  /**
+   * Signal the active container to wind down by writing a close sentinel.
+   */
+  closeStdin(groupJid: string): void {
+    const state = this.getGroup(groupJid);
+    if (!state.active || !state.groupFolder) return;
+
+    const inputDir = path.join(DATA_DIR, 'ipc', state.groupFolder, 'input');
+    try {
+      fs.mkdirSync(inputDir, { recursive: true });
+      fs.writeFileSync(path.join(inputDir, '_close'), '');
+    } catch {
+      // ignore
+    }
+  }

   private async runForGroup(
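`sendMessage` above writes to a `.tmp` file and then renames it, so a poller that only looks at `*.json` never sees a half-written message. The sketch below reproduces that write path against a temporary directory. The reader side (`readPendingMessages`) is an assumption for illustration only, since the agent's polling loop is not part of this diff.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Write JSON under a temporary name, then rename it into place. On the same
// filesystem the rename is atomic, so readers see either nothing or the
// complete file.
function writeJsonAtomically(dir: string, payload: unknown): string {
  fs.mkdirSync(dir, { recursive: true });
  const name = `${Date.now()}-${Math.random().toString(36).slice(2, 6)}.json`;
  const finalPath = path.join(dir, name);
  const tempPath = `${finalPath}.tmp`;
  fs.writeFileSync(tempPath, JSON.stringify(payload));
  fs.renameSync(tempPath, finalPath);
  return finalPath;
}

// Hypothetical reader: the real agent-side poller is not shown in this diff.
// It ignores *.tmp files and processes messages in filename (timestamp) order.
function readPendingMessages(dir: string): unknown[] {
  return fs
    .readdirSync(dir)
    .filter((f) => f.endsWith('.json'))
    .sort()
    .map((f) => JSON.parse(fs.readFileSync(path.join(dir, f), 'utf-8')));
}

const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ipc-demo-'));
writeJsonAtomically(demoDir, { type: 'message', text: 'hello' });
const pending = readPendingMessages(demoDir);
console.log(pending);
```

The timestamp prefix in the filename gives a natural ordering, and the short random suffix reduces the chance of a collision when two messages are written in the same millisecond.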
@@ -144,6 +187,7 @@ export class GroupQueue {
       state.active = false;
       state.process = null;
       state.containerName = null;
+      state.groupFolder = null;
       this.activeCount--;
       this.drainGroup(groupJid);
     }
@@ -167,6 +211,7 @@ export class GroupQueue {
       state.active = false;
       state.process = null;
       state.containerName = null;
+      state.groupFolder = null;
       this.activeCount--;
       this.drainGroup(groupJid);
     }
@@ -236,65 +281,22 @@ export class GroupQueue {
     }
   }

-  async shutdown(gracePeriodMs: number): Promise<void> {
+  async shutdown(_gracePeriodMs: number): Promise<void> {
     this.shuttingDown = true;
-    logger.info(
-      { activeCount: this.activeCount, gracePeriodMs },
-      'GroupQueue shutting down',
-    );

-    // Collect all active processes
-    const activeProcs: Array<{ jid: string; proc: ChildProcess; containerName: string | null }> = [];
+    // Count active containers but don't kill them — they'll finish on their own
+    // via idle timeout or container timeout. The --rm flag cleans them up on exit.
+    // This prevents WhatsApp reconnection restarts from killing working agents.
+    const activeContainers: string[] = [];
     for (const [jid, state] of this.groups) {
-      if (state.process && !state.process.killed) {
-        activeProcs.push({ jid, proc: state.process, containerName: state.containerName });
+      if (state.process && !state.process.killed && state.containerName) {
+        activeContainers.push(state.containerName);
       }
     }

-    if (activeProcs.length === 0) return;
-
-    // Stop all active containers gracefully
-    for (const { jid, proc, containerName } of activeProcs) {
-      if (containerName) {
-        // Defense-in-depth: re-sanitize before shell interpolation.
-        // Primary sanitization is in container-runner.ts when building the name,
-        // but we sanitize again here since exec() runs through a shell.
-        const safeName = containerName.replace(/[^a-zA-Z0-9-]/g, '');
-        logger.info({ jid, containerName: safeName }, 'Stopping container');
-        exec(`container stop ${safeName}`, (err) => {
-          if (err) {
-            logger.warn({ jid, containerName: safeName, err: err.message }, 'container stop failed');
-          }
-        });
-      } else {
-        logger.info({ jid, pid: proc.pid }, 'Sending SIGTERM to process');
-        proc.kill('SIGTERM');
-      }
-    }
-
-    // Wait for grace period
-    await new Promise<void>((resolve) => {
-      const checkInterval = setInterval(() => {
-        const alive = activeProcs.filter(
-          ({ proc }) => !proc.killed && proc.exitCode === null,
-        );
-        if (alive.length === 0) {
-          clearInterval(checkInterval);
-          resolve();
-        }
-      }, 500);
-
-      setTimeout(() => {
-        clearInterval(checkInterval);
-        // SIGKILL survivors
-        for (const { jid, proc } of activeProcs) {
-          if (!proc.killed && proc.exitCode === null) {
-            logger.warn({ jid, pid: proc.pid }, 'Sending SIGKILL to container');
-            proc.kill('SIGKILL');
-          }
-        }
-        resolve();
-      }, gracePeriodMs);
-    });
+    logger.info(
+      { activeCount: this.activeCount, detachedContainers: activeContainers },
+      'GroupQueue shutting down (containers detached, not killed)',
+    );
   }
 }
250
src/index.ts
@@ -13,6 +13,7 @@ import { CronExpressionParser } from 'cron-parser';
 import {
   ASSISTANT_NAME,
   DATA_DIR,
+  IDLE_TIMEOUT,
   IPC_POLL_INTERVAL,
   MAIN_GROUP_FOLDER,
   POLL_INTERVAL,
@@ -21,8 +22,8 @@ import {
   TRIGGER_PATTERN,
 } from './config.js';
 import {
-  AgentResponse,
   AvailableGroup,
+  ContainerOutput,
   runContainerAgent,
   writeGroupsSnapshot,
   writeTasksSnapshot,
@@ -51,7 +52,7 @@ import {
 } from './db.js';
 import { GroupQueue } from './group-queue.js';
 import { startSchedulerLoop } from './task-scheduler.js';
-import { RegisteredGroup } from './types.js';
+import { NewMessage, RegisteredGroup } from './types.js';
 import { logger } from './logger.js';

 const GROUP_SYNC_INTERVAL_MS = 24 * 60 * 60 * 1000; // 24 hours
@@ -67,6 +68,9 @@ let lidToPhoneMap: Record<string, string> = {};
 let messageLoopRunning = false;
 let ipcWatcherRunning = false;
 let groupSyncTimerStarted = false;
+// WhatsApp connection state and outgoing message queue
+let waConnected = false;
+const outgoingQueue: Array<{ jid: string; text: string }> = [];

 const queue = new GroupQueue();

@@ -189,9 +193,28 @@ function getAvailableGroups(): AvailableGroup[] {
   }));
 }

+function escapeXml(s: string): string {
+  return s
+    .replace(/&/g, '&amp;')
+    .replace(/</g, '&lt;')
+    .replace(/>/g, '&gt;')
+    .replace(/"/g, '&quot;');
+}
+
+function formatMessages(messages: NewMessage[]): string {
+  const lines = messages.map((m) =>
+    `<message sender="${escapeXml(m.sender_name)}" time="${m.timestamp}">${escapeXml(m.content)}</message>`,
+  );
+  return `<messages>\n${lines.join('\n')}\n</messages>`;
+}
+
 /**
  * Process all pending messages for a group.
  * Called by the GroupQueue when it's this group's turn.
+ *
+ * Uses streaming output: agent results are sent to WhatsApp as they arrive.
+ * The container stays alive for IDLE_TIMEOUT after each result, allowing
+ * rapid-fire messages to be piped in without spawning a new container.
  */
 async function processGroupMessages(chatJid: string): Promise<boolean> {
   const group = registeredGroups[chatJid];
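To make the escaping behaviour of the extracted `escapeXml` and `formatMessages` helpers concrete, here is a small standalone sketch. The `ChatMessage` interface is a stand-in for the project's `NewMessage` type (only the three fields used here are assumed), and the sample message is invented for the example.

```typescript
// Stand-in for NewMessage; only the fields used by formatMessages are assumed.
interface ChatMessage {
  sender_name: string;
  content: string;
  timestamp: string;
}

function escapeXml(s: string): string {
  // '&' must be replaced first; otherwise the '&' introduced by the later
  // replacements (e.g. in '&lt;') would itself be escaped a second time.
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function formatMessages(messages: ChatMessage[]): string {
  const lines = messages.map(
    (m) =>
      `<message sender="${escapeXml(m.sender_name)}" time="${m.timestamp}">${escapeXml(m.content)}</message>`,
  );
  return `<messages>\n${lines.join('\n')}\n</messages>`;
}

const formatted = formatMessages([
  {
    sender_name: 'A & B',
    content: 'is 1 < 2? "yes"',
    timestamp: '2026-01-01T00:00:00Z',
  },
]);
console.log(formatted);
```

Hoisting the helper to module scope, as the hunk does, lets both the container-start path and the IPC piping path build the same prompt format.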
@@ -217,47 +240,64 @@ async function processGroupMessages(chatJid: string): Promise<boolean> {
     if (!hasTrigger) return true;
   }

-  const lines = missedMessages.map((m) => {
-    const escapeXml = (s: string) =>
-      s
-        .replace(/&/g, '&amp;')
-        .replace(/</g, '&lt;')
-        .replace(/>/g, '&gt;')
-        .replace(/"/g, '&quot;');
-    return `<message sender="${escapeXml(m.sender_name)}" time="${m.timestamp}">${escapeXml(m.content)}</message>`;
-  });
-  const prompt = `<messages>\n${lines.join('\n')}\n</messages>`;
+  const prompt = formatMessages(missedMessages);

+  // Advance cursor so the piping path in startMessageLoop won't re-fetch
+  // these messages. Save the old cursor so we can roll back on error.
+  const previousCursor = lastAgentTimestamp[chatJid] || '';
+  lastAgentTimestamp[chatJid] =
+    missedMessages[missedMessages.length - 1].timestamp;
+  saveState();
+
   logger.info(
     { group: group.name, messageCount: missedMessages.length },
     'Processing messages',
   );

+  // Track idle timer for closing stdin when agent is idle
+  let idleTimer: ReturnType<typeof setTimeout> | null = null;
+
+  const resetIdleTimer = () => {
+    if (idleTimer) clearTimeout(idleTimer);
+    idleTimer = setTimeout(() => {
+      logger.debug({ group: group.name }, 'Idle timeout, closing container stdin');
+      queue.closeStdin(chatJid);
+    }, IDLE_TIMEOUT);
+  };
+
   await setTyping(chatJid, true);
-  const response = await runAgent(group, prompt, chatJid);
+  let hadError = false;
+
+  const output = await runAgent(group, prompt, chatJid, async (result) => {
+    // Streaming output callback — called for each agent result
+    if (result.result) {
+      const raw = typeof result.result === 'string' ? result.result : JSON.stringify(result.result);
+      // Strip <internal>...</internal> blocks — agent uses these for internal reasoning
+      const text = raw.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
+      logger.info({ group: group.name }, `Agent output: ${raw.slice(0, 200)}`);
+      if (text) {
+        await sendMessage(chatJid, `${ASSISTANT_NAME}: ${text}`);
+      }
+      // Only reset idle timer on actual results, not session-update markers (result: null)
+      resetIdleTimer();
+    }
+
+    if (result.status === 'error') {
+      hadError = true;
+    }
+  });
+
   await setTyping(chatJid, false);
+  if (idleTimer) clearTimeout(idleTimer);

-  if (response === 'error') {
-    // Container or agent error — signal failure so queue can retry with backoff
+  if (output === 'error' || hadError) {
+    // Roll back cursor so retries can re-process these messages
+    lastAgentTimestamp[chatJid] = previousCursor;
+    saveState();
+    logger.warn({ group: group.name }, 'Agent error, rolled back message cursor for retry');
     return false;
   }

-  // Agent processed messages successfully (whether it responded or stayed silent)
-  lastAgentTimestamp[chatJid] =
-    missedMessages[missedMessages.length - 1].timestamp;
-  saveState();
-
-  if (response.outputType === 'message' && response.userMessage) {
-    await sendMessage(chatJid, `${ASSISTANT_NAME}: ${response.userMessage}`);
-  }
-
-  if (response.internalLog) {
-    logger.info(
-      { group: group.name, outputType: response.outputType },
-      `Agent: ${response.internalLog}`,
-    );
-  }
-
   return true;
 }

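The `<internal>` stripping in the streaming callback above relies on a non-greedy match across newlines. A minimal sketch of just that step, with `stripInternal` as a hypothetical helper name and invented sample strings:

```typescript
// Hypothetical helper wrapping the same regex used in the streaming callback.
// [\s\S]*? matches across newlines and stops at the first closing tag, so
// several <internal> blocks in one output are removed independently.
function stripInternal(raw: string): string {
  return raw.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
}

const sample =
  '<internal>Compiled all three reports.</internal>\n\nHere are the key findings...';
console.log(stripInternal(sample));

// Output consisting only of internal notes becomes empty, which is why the
// callback checks `if (text)` before sending anything to the chat.
console.log(JSON.stringify(stripInternal('<internal>only notes</internal>')));
```

A greedy `[\s\S]*` would instead delete everything between the first opening tag and the last closing tag, including user-facing text in between.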
@@ -265,7 +305,8 @@ async function runAgent(
   group: RegisteredGroup,
   prompt: string,
   chatJid: string,
-): Promise<AgentResponse | 'error'> {
+  onOutput?: (output: ContainerOutput) => Promise<void>,
+): Promise<'success' | 'error'> {
   const isMain = group.folder === MAIN_GROUP_FOLDER;
   const sessionId = sessions[group.folder];

@@ -294,6 +335,17 @@ async function runAgent(
     new Set(Object.keys(registeredGroups)),
   );

+  // Wrap onOutput to track session ID from streamed results
+  const wrappedOnOutput = onOutput
+    ? async (output: ContainerOutput) => {
+        if (output.newSessionId) {
+          sessions[group.folder] = output.newSessionId;
+          setSession(group.folder, output.newSessionId);
+        }
+        await onOutput(output);
+      }
+    : undefined;
+
   try {
     const output = await runContainerAgent(
       group,
@@ -304,7 +356,8 @@ async function runAgent(
         chatJid,
         isMain,
       },
-      (proc, containerName) => queue.registerProcess(chatJid, proc, containerName),
+      (proc, containerName) => queue.registerProcess(chatJid, proc, containerName, group.folder),
+      wrappedOnOutput,
     );

     if (output.newSessionId) {
@@ -320,7 +373,7 @@ async function runAgent(
       return 'error';
     }

-    return output.result ?? { outputType: 'log' };
+    return 'success';
   } catch (err) {
     logger.error({ group: group.name, err }, 'Agent error');
     return 'error';
@@ -328,11 +381,36 @@ async function runAgent(
 }

 async function sendMessage(jid: string, text: string): Promise<void> {
+  if (!waConnected) {
+    outgoingQueue.push({ jid, text });
+    logger.info({ jid, length: text.length, queueSize: outgoingQueue.length }, 'WA disconnected, message queued');
+    return;
+  }
   try {
     await sock.sendMessage(jid, { text });
     logger.info({ jid, length: text.length }, 'Message sent');
   } catch (err) {
-    logger.error({ jid, err }, 'Failed to send message');
+    // If send fails, queue it for retry on reconnect
+    outgoingQueue.push({ jid, text });
+    logger.warn({ jid, err, queueSize: outgoingQueue.length }, 'Failed to send, message queued');
   }
 }
+
+let flushing = false;
+async function flushOutgoingQueue(): Promise<void> {
+  if (flushing || outgoingQueue.length === 0) return;
+  flushing = true;
+  try {
+    logger.info({ count: outgoingQueue.length }, 'Flushing outgoing message queue');
+    // Process one at a time — sendMessage re-queues on failure internally.
+    // Shift instead of splice so unattempted messages stay in the queue
+    // if an unexpected error occurs.
+    while (outgoingQueue.length > 0) {
+      const item = outgoingQueue.shift()!;
+      await sendMessage(item.jid, item.text);
+    }
+  } finally {
+    flushing = false;
+  }
+}

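The queue-while-disconnected behaviour above can be sketched as a small class with an injectable transport, which makes it testable without a WhatsApp socket. `OutgoingQueue` and `Transport` are hypothetical names for this sketch. One deliberate difference from the hunk: each flush pass here is bounded to the items present when it starts, an assumption made so that a message which keeps failing (and is re-queued by `send`) cannot keep a single pass looping.

```typescript
type Transport = (jid: string, text: string) => Promise<void>;

// Hypothetical standalone version of the outgoing queue, for illustration only.
class OutgoingQueue {
  private readonly items: Array<{ jid: string; text: string }> = [];
  private readonly transport: Transport;
  private flushing = false;
  connected = false;

  constructor(transport: Transport) {
    this.transport = transport;
  }

  get size(): number {
    return this.items.length;
  }

  async send(jid: string, text: string): Promise<void> {
    if (!this.connected) {
      this.items.push({ jid, text }); // offline: hold for the next flush
      return;
    }
    try {
      await this.transport(jid, text);
    } catch {
      this.items.push({ jid, text }); // failed: re-queue for a later flush
    }
  }

  async flush(): Promise<void> {
    if (this.flushing || this.items.length === 0) return;
    this.flushing = true;
    try {
      // Bound the pass to the current backlog so re-queued failures wait
      // for the next flush instead of being retried immediately.
      const pendingCount = this.items.length;
      for (let i = 0; i < pendingCount; i++) {
        const item = this.items.shift()!;
        await this.send(item.jid, item.text);
      }
    } finally {
      this.flushing = false;
    }
  }
}

const delivered: string[] = [];
const outbox = new OutgoingQueue(async (_jid, text) => {
  delivered.push(text);
});
void outbox.send('group@example', 'queued while offline');
outbox.connected = true; // connection reopened
void outbox.flush();
```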
@@ -710,18 +788,27 @@ async function connectWhatsApp(): Promise<void> {
     }

     if (connection === 'close') {
+      waConnected = false;
       const reason = (lastDisconnect?.error as any)?.output?.statusCode;
       const shouldReconnect = reason !== DisconnectReason.loggedOut;
-      logger.info({ reason, shouldReconnect }, 'Connection closed');
+      logger.info({ reason, shouldReconnect, queuedMessages: outgoingQueue.length }, 'Connection closed');

       if (shouldReconnect) {
         logger.info('Reconnecting...');
-        connectWhatsApp();
+        connectWhatsApp().catch((err) => {
+          logger.error({ err }, 'Failed to reconnect, retrying in 5s');
+          setTimeout(() => {
+            connectWhatsApp().catch((err2) => {
+              logger.error({ err: err2 }, 'Reconnection retry failed');
+            });
+          }, 5000);
+        });
       } else {
         logger.info('Logged out. Run /setup to re-authenticate.');
         process.exit(0);
       }
     } else if (connection === 'open') {
+      waConnected = true;
       logger.info('Connected to WhatsApp');

       // Build LID to phone mapping from auth state for self-chat translation
@@ -734,6 +821,11 @@ async function connectWhatsApp(): Promise<void> {
         }
       }

+      // Flush any messages queued while disconnected
+      flushOutgoingQueue().catch((err) =>
+        logger.error({ err }, 'Failed to flush outgoing queue'),
+      );
+
       // Sync group metadata on startup (respects 24h cache)
       syncGroupMetadata().catch((err) =>
         logger.error({ err }, 'Initial group sync failed'),
@@ -748,11 +840,12 @@ async function connectWhatsApp(): Promise<void> {
         }, GROUP_SYNC_INTERVAL_MS);
       }
       startSchedulerLoop({
-        sendMessage,
         registeredGroups: () => registeredGroups,
         getSessions: () => sessions,
         queue,
-        onProcess: (groupJid, proc, containerName) => queue.registerProcess(groupJid, proc, containerName),
+        onProcess: (groupJid, proc, containerName, groupFolder) => queue.registerProcess(groupJid, proc, containerName, groupFolder),
+        sendMessage,
+        assistantName: ASSISTANT_NAME,
       });
       startIpcWatcher();
       queue.setProcessMessagesFn(processGroupMessages);
@@ -817,14 +910,57 @@ async function startMessageLoop(): Promise<void> {
         lastTimestamp = newTimestamp;
         saveState();

-        // Deduplicate by group and enqueue
-        const groupsWithMessages = new Set<string>();
+        // Deduplicate by group
+        const messagesByGroup = new Map<string, NewMessage[]>();
         for (const msg of messages) {
-          groupsWithMessages.add(msg.chat_jid);
+          const existing = messagesByGroup.get(msg.chat_jid);
+          if (existing) {
+            existing.push(msg);
+          } else {
+            messagesByGroup.set(msg.chat_jid, [msg]);
+          }
         }

-        for (const chatJid of groupsWithMessages) {
-          queue.enqueueMessageCheck(chatJid);
+        for (const [chatJid, groupMessages] of messagesByGroup) {
+          const group = registeredGroups[chatJid];
+          if (!group) continue;
+
+          const isMainGroup = group.folder === MAIN_GROUP_FOLDER;
+          const needsTrigger = !isMainGroup && group.requiresTrigger !== false;
+
+          // For non-main groups, only act on trigger messages.
+          // Non-trigger messages accumulate in DB and get pulled as
+          // context when a trigger eventually arrives.
+          if (needsTrigger) {
+            const hasTrigger = groupMessages.some((m) =>
+              TRIGGER_PATTERN.test(m.content.trim()),
+            );
+            if (!hasTrigger) continue;
+          }
+
+          // Pull all messages since lastAgentTimestamp so non-trigger
+          // context that accumulated between triggers is included.
+          const allPending = getMessagesSince(
+            chatJid,
+            lastAgentTimestamp[chatJid] || '',
+            ASSISTANT_NAME,
+          );
+          const messagesToSend =
+            allPending.length > 0 ? allPending : groupMessages;
+          const formatted = formatMessages(messagesToSend);
+
+          if (queue.sendMessage(chatJid, formatted)) {
+            logger.debug(
+              { chatJid, count: messagesToSend.length },
+              'Piped messages to active container',
+            );
+            lastAgentTimestamp[chatJid] =
+              messagesToSend[messagesToSend.length - 1].timestamp;
+            saveState();
+          } else {
+            // No active container — enqueue for a new one
+            queue.enqueueMessageCheck(chatJid);
+          }
         }
       }
     } catch (err) {
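Review note: the grouping and trigger gate in the hunk above can be exercised in isolation. The following is a minimal sketch, not the project's code: `groupByChat`, `chatsToProcess`, the trimmed-down `NewMessage` shape, and the `@Andy` pattern are all assumptions made for illustration (the real `TRIGGER_PATTERN` comes from config).

```typescript
// Hypothetical message shape; only the fields this sketch needs.
interface NewMessage {
  chat_jid: string;
  content: string;
  timestamp: string;
}

// Assumed trigger pattern for illustration only.
const TRIGGER_PATTERN = /^@Andy\b/i;

/** Bucket messages by chat, preserving arrival order within each chat. */
function groupByChat(messages: NewMessage[]): Map<string, NewMessage[]> {
  const byChat = new Map<string, NewMessage[]>();
  for (const msg of messages) {
    const bucket = byChat.get(msg.chat_jid);
    if (bucket) bucket.push(msg);
    else byChat.set(msg.chat_jid, [msg]);
  }
  return byChat;
}

/** Return the chats that should be acted on in this polling pass. */
function chatsToProcess(
  messages: NewMessage[],
  needsTrigger: (chatJid: string) => boolean,
): string[] {
  const ready: string[] = [];
  for (const [chatJid, group] of groupByChat(messages)) {
    const triggered = group.some((m) => TRIGGER_PATTERN.test(m.content.trim()));
    if (!needsTrigger(chatJid) || triggered) ready.push(chatJid);
  }
  return ready;
}

const batch: NewMessage[] = [
  { chat_jid: 'family', content: 'lunch at noon?', timestamp: '1' },
  { chat_jid: 'work', content: 'status update', timestamp: '2' },
  { chat_jid: 'family', content: '@Andy book a table', timestamp: '3' },
];
console.log(chatsToProcess(batch, () => true)); // [ 'family' ]
```

A chat is selected as soon as any message in its bucket matches, which is why the earlier non-trigger message in `family` does not block it; in the diff those earlier messages are then pulled in as context via `getMessagesSince`.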
@@ -891,22 +1027,26 @@ function ensureContainerSystemRunning(): void {
     }
   }

-  // Clean up stopped NanoClaw containers from previous runs
+  // Kill and clean up orphaned NanoClaw containers from previous runs
   try {
-    const output = execSync('container ls -a --format {{.Names}}', {
+    const output = execSync('container ls --format json', {
       stdio: ['pipe', 'pipe', 'pipe'],
       encoding: 'utf-8',
     });
-    const stale = output
-      .split('\n')
-      .map((n) => n.trim())
-      .filter((n) => n.startsWith('nanoclaw-'));
-    if (stale.length > 0) {
-      execSync(`container rm ${stale.join(' ')}`, { stdio: 'pipe' });
-      logger.info({ count: stale.length }, 'Cleaned up stopped containers');
+    const containers: { status: string; configuration: { id: string } }[] = JSON.parse(output || '[]');
+    const orphans = containers
+      .filter((c) => c.status === 'running' && c.configuration.id.startsWith('nanoclaw-'))
+      .map((c) => c.configuration.id);
+    for (const name of orphans) {
+      try {
+        execSync(`container stop ${name}`, { stdio: 'pipe' });
+      } catch { /* already stopped */ }
     }
-  } catch {
-    // No stopped containers or ls/rm not supported
+    if (orphans.length > 0) {
+      logger.info({ count: orphans.length, names: orphans }, 'Stopped orphaned containers');
+    }
+  } catch (err) {
+    logger.warn({ err }, 'Failed to clean up orphaned containers');
   }
 }

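Review note: the orphan selection above is a pure filter over the JSON listing, so it can be separated from the `execSync` calls and checked on its own. A minimal sketch follows, assuming the listing has exactly the `status` and `configuration.id` fields the diff reads; `findOrphans`, `ContainerInfo`, and the sample data are hypothetical and not captured from a real run.

```typescript
// Shape assumed from the fields the diff reads; real CLI output may carry more.
interface ContainerInfo {
  status: string;
  configuration: { id: string };
}

/** Pick running containers whose id starts with the given prefix. */
function findOrphans(lsJson: string, prefix = 'nanoclaw-'): string[] {
  // Empty output is treated as an empty list, as in the diff.
  const containers: ContainerInfo[] = JSON.parse(lsJson || '[]');
  return containers
    .filter((c) => c.status === 'running' && c.configuration.id.startsWith(prefix))
    .map((c) => c.configuration.id);
}

// Hypothetical sample listing for illustration.
const sample = JSON.stringify([
  { status: 'running', configuration: { id: 'nanoclaw-main-1' } },
  { status: 'stopped', configuration: { id: 'nanoclaw-old-2' } },
  { status: 'running', configuration: { id: 'postgres' } },
]);
console.log(findOrphans(sample)); // [ 'nanoclaw-main-1' ]
```

Stopped containers are skipped by this filter, which matches the new behavior in the hunk: it stops running orphans and no longer removes stopped ones.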
@@ -4,3 +4,13 @@ export const logger = pino({
   level: process.env.LOG_LEVEL || 'info',
   transport: { target: 'pino-pretty', options: { colorize: true } },
 });
+
+// Route uncaught errors through pino so they get timestamps in stderr
+process.on('uncaughtException', (err) => {
+  logger.fatal({ err }, 'Uncaught exception');
+  process.exit(1);
+});
+
+process.on('unhandledRejection', (reason) => {
+  logger.error({ err: reason }, 'Unhandled rejection');
+});
@@ -221,6 +221,7 @@ export interface MountValidationResult {
   allowed: boolean;
   reason: string;
   realHostPath?: string;
+  resolvedContainerPath?: string;
   effectiveReadonly?: boolean;
 }

@@ -242,11 +243,14 @@ export function validateMount(
     };
   }

-  // Validate container path first (cheap check)
-  if (!isValidContainerPath(mount.containerPath)) {
+  // Derive containerPath from hostPath basename if not specified
+  const containerPath = mount.containerPath || path.basename(mount.hostPath);
+
+  // Validate container path (cheap check)
+  if (!isValidContainerPath(containerPath)) {
     return {
       allowed: false,
-      reason: `Invalid container path: "${mount.containerPath}" - must be relative, non-empty, and not contain ".."`,
+      reason: `Invalid container path: "${containerPath}" - must be relative, non-empty, and not contain ".."`,
     };
   }

@@ -318,6 +322,7 @@ export function validateMount(
     allowed: true,
     reason: `Allowed under root "${allowedRoot.path}"${allowedRoot.description ? ` (${allowedRoot.description})` : ''}`,
     realHostPath: realPath,
+    resolvedContainerPath: containerPath,
     effectiveReadonly,
   };
 }
@@ -348,7 +353,7 @@ export function validateAdditionalMounts(
     if (result.allowed) {
       validatedMounts.push({
         hostPath: result.realHostPath!,
-        containerPath: `/workspace/extra/${mount.containerPath}`,
+        containerPath: `/workspace/extra/${result.resolvedContainerPath}`,
         readonly: result.effectiveReadonly!,
       });

@@ -356,7 +361,7 @@ export function validateAdditionalMounts(
         {
           group: groupName,
           hostPath: result.realHostPath,
-          containerPath: mount.containerPath,
+          containerPath: result.resolvedContainerPath,
           readonly: result.effectiveReadonly,
           reason: result.reason,
         },
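Review note: the mount hunks above make `containerPath` optional and fall back to the basename of `hostPath` before validating. A minimal sketch of that resolution is below. Everything in it is an assumption for illustration: `MountSpec`, `resolveMountPoint`, and `baseName` are hypothetical names, `baseName` is a simplified POSIX-only stand-in for `path.basename`, and this `isValidContainerPath` only encodes the three rules quoted in the error message (relative, non-empty, no `..`), which may differ from the real helper.

```typescript
// Hypothetical mount shape mirroring the optional containerPath in this change.
interface MountSpec {
  hostPath: string;
  containerPath?: string;
}

// Simplified stand-in for path.basename (POSIX separators only).
function baseName(p: string): string {
  const parts = p.split('/').filter((s) => s.length > 0);
  return parts.length > 0 ? parts[parts.length - 1] : '';
}

// Assumed rule set taken from the error message: relative, non-empty, no "..".
function isValidContainerPath(p: string): boolean {
  if (p.trim() === '') return false;
  if (p.startsWith('/')) return false;
  return !p.split('/').includes('..');
}

/** Resolve the in-container mount point, or null when the path is rejected. */
function resolveMountPoint(mount: MountSpec): string | null {
  // Same fallback as the diff: explicit containerPath wins, else the basename.
  const containerPath = mount.containerPath || baseName(mount.hostPath);
  return isValidContainerPath(containerPath)
    ? `/workspace/extra/${containerPath}`
    : null;
}

console.log(resolveMountPoint({ hostPath: '/home/user/projects/notes' }));
// /workspace/extra/notes
console.log(resolveMountPoint({ hostPath: '/data', containerPath: '../etc' }));
// null
```

Because validation runs on the derived value, a host path with no usable basename (for example `/`) is rejected by the non-empty rule rather than mounted at `/workspace/extra/`.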
@@ -4,13 +4,13 @@ import fs from 'fs';
 import path from 'path';

 import {
-  ASSISTANT_NAME,
   GROUPS_DIR,
+  IDLE_TIMEOUT,
   MAIN_GROUP_FOLDER,
   SCHEDULER_POLL_INTERVAL,
   TIMEZONE,
 } from './config.js';
-import { runContainerAgent, writeTasksSnapshot } from './container-runner.js';
+import { ContainerOutput, runContainerAgent, writeTasksSnapshot } from './container-runner.js';
 import {
   getAllTasks,
   getDueTasks,
@@ -23,11 +23,12 @@ import { logger } from './logger.js';
 import { RegisteredGroup, ScheduledTask } from './types.js';

 export interface SchedulerDependencies {
-  sendMessage: (jid: string, text: string) => Promise<void>;
   registeredGroups: () => Record<string, RegisteredGroup>;
   getSessions: () => Record<string, string>;
   queue: GroupQueue;
-  onProcess: (groupJid: string, proc: ChildProcess, containerName: string) => void;
+  onProcess: (groupJid: string, proc: ChildProcess, containerName: string, groupFolder: string) => void;
+  sendMessage: (jid: string, text: string) => Promise<void>;
+  assistantName: string;
 }

 async function runTask(
@@ -89,6 +90,18 @@ async function runTask(
   const sessionId =
     task.context_mode === 'group' ? sessions[task.group_folder] : undefined;

+  // Idle timer: writes _close sentinel after IDLE_TIMEOUT of no output,
+  // so the container exits instead of hanging at waitForIpcMessage forever.
+  let idleTimer: ReturnType<typeof setTimeout> | null = null;
+
+  const resetIdleTimer = () => {
+    if (idleTimer) clearTimeout(idleTimer);
+    idleTimer = setTimeout(() => {
+      logger.debug({ taskId: task.id }, 'Scheduled task idle timeout, closing container stdin');
+      deps.queue.closeStdin(task.chat_jid);
+    }, IDLE_TIMEOUT);
+  };
+
   try {
     const output = await runContainerAgent(
       group,
@@ -98,17 +111,33 @@ async function runTask(
         groupFolder: task.group_folder,
         chatJid: task.chat_jid,
         isMain,
+        isScheduledTask: true,
       },
+      (proc, containerName) => deps.onProcess(task.chat_jid, proc, containerName, task.group_folder),
+      async (streamedOutput: ContainerOutput) => {
+        if (streamedOutput.result) {
+          result = streamedOutput.result;
+          // Forward result to user (strip <internal> tags)
+          const text = streamedOutput.result.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
+          if (text) {
+            await deps.sendMessage(task.chat_jid, `${deps.assistantName}: ${text}`);
+          }
+          // Only reset idle timer on actual results, not session-update markers
+          resetIdleTimer();
+        }
+        if (streamedOutput.status === 'error') {
+          error = streamedOutput.error || 'Unknown error';
+        }
+      },
-      (proc, containerName) => deps.onProcess(task.chat_jid, proc, containerName),
     );

+    if (idleTimer) clearTimeout(idleTimer);
+
     if (output.status === 'error') {
       error = output.error || 'Unknown error';
     } else if (output.result) {
-      if (output.result.outputType === 'message' && output.result.userMessage) {
-        await deps.sendMessage(task.chat_jid, `${ASSISTANT_NAME}: ${output.result.userMessage}`);
-      }
-      result = output.result.userMessage || output.result.internalLog || null;
+      // Messages are sent via MCP tool (IPC), result text is just logged
+      result = output.result;
     }

     logger.info(
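Review note: the streaming callback above strips `<internal>` spans before forwarding a result. The sketch below isolates that one expression so its edge cases are easy to see; `stripInternal` is a hypothetical wrapper name, while the regular expression is copied from the diff.

```typescript
/** Remove <internal>...</internal> spans (non-greedy, across newlines), then trim. */
function stripInternal(result: string): string {
  return result.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
}

// Internal notes are dropped and the user-facing text survives.
console.log(stripInternal('<internal>plan: check calendar</internal>\nYou are free at 3pm.'));
// You are free at 3pm.

// A result made only of internal notes becomes empty, so nothing would be sent.
console.log(stripInternal('<internal>only notes</internal>') === ''); // true

// An unclosed tag does not match the pattern and passes through unchanged.
console.log(stripInternal('<internal>oops'));
// <internal>oops
```

The last case may be worth a decision during review: with this pattern, a truncated or malformed `<internal>` block would be forwarded to the chat as-is.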
@@ -116,6 +145,7 @@ async function runTask(
       'Task completed',
     );
   } catch (err) {
+    if (idleTimer) clearTimeout(idleTimer);
     error = err instanceof Error ? err.message : String(err);
     logger.error({ taskId: task.id, error }, 'Task failed');
   }
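Review note: the idle-timer logic in `runTask` (arm on each result, clear on success and in the `catch`) follows a common resettable-timer pattern. A minimal sketch of that pattern is below, assuming nothing about the real queue or container APIs; `createIdleTimer` and its `reset`/`cancel`/`isArmed` methods are hypothetical names, and `onIdle` stands in for the `closeStdin` call.

```typescript
/** Resettable one-shot timer: fires onIdle once if reset() is not called again within timeoutMs. */
function createIdleTimer(timeoutMs: number, onIdle: () => void) {
  let handle: ReturnType<typeof setTimeout> | null = null;
  return {
    /** (Re)start the countdown; any pending countdown is discarded first. */
    reset(): void {
      if (handle !== null) clearTimeout(handle);
      handle = setTimeout(() => {
        handle = null;
        onIdle();
      }, timeoutMs);
    },
    /** Stop the countdown without firing. Safe to call when not armed. */
    cancel(): void {
      if (handle !== null) clearTimeout(handle);
      handle = null;
    },
    /** True while a countdown is pending. */
    isArmed(): boolean {
      return handle !== null;
    },
  };
}

// Usage: arm on each streamed result, cancel once the run settles.
let closed = 0;
const idle = createIdleTimer(30 * 60 * 1000, () => { closed += 1; });
idle.reset();  // first result arrived
idle.reset();  // another result: countdown starts over
idle.cancel(); // run finished before the timeout
```

As in the diff, the timer is only armed when `reset()` is first called, so a run that never produces a result is not covered by it. Wrapping the run in `try`/`finally` with a single `cancel()` would be one way to replace the two separate `clearTimeout` calls, if that reads better to the maintainers.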
@@ -1,6 +1,6 @@
 export interface AdditionalMount {
   hostPath: string; // Absolute path on host (supports ~ for home)
-  containerPath: string; // Path inside container (under /workspace/extra/)
+  containerPath?: string; // Optional — defaults to basename of hostPath. Mounted at /workspace/extra/{value}
   readonly?: boolean; // Default: true for safety
 }
