Regolith

Author	SHA1	Message	Date
gavrielc	6f02ee530b	Adds Agent Swarms * feat: streaming container mode, IPC messaging, agent teams support Major architectural shift from single-shot container runs to long-lived streaming containers with IPC-based message injection. - Agent runner: query loop with AsyncIterable prompt to keep stdin open for agent teams (fixes isSingleUserTurn premature shutdown) - New standalone stdio MCP server (ipc-mcp-stdio.ts) inheritable by subagents, with send_message and schedule_task tools - Streaming output: parse OUTPUT_START/END markers in real-time, send results to WhatsApp as they arrive - IPC file-based messaging: host writes to ipc/{group}/input/, agent polls for follow-up messages without respawning containers - Per-group settings.json with CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 - SDK bumped to 0.2.34 for TeamCreate tool support - Container idle timeout (30min) with _close sentinel for shutdown - Orphaned container cleanup on startup - alwaysRespond flag for groups that skip trigger pattern check - Uncaught exception/rejection handlers with timestamps in logger - Combined SDK documentation into single deep dive reference Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: remove unused ipc-mcp.ts (replaced by ipc-mcp-stdio.ts) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: clarify agent communication model in docs and tool descriptions - CLAUDE.md (main + global): split communication instructions into "responding to messages" vs "scheduled tasks" sections - send_message tool: note that scheduled task output is not sent to user - Remove structured output (outputFormat) — not needed with current flow - Regular output is sent to WhatsApp; scheduled task output is only logged Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * chore: ignore dynamic group data while preserving base structure Only track groups/main/CLAUDE.md and groups/global/CLAUDE.md. All other group directories and files are ignored to prevent tracking user-specific session data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve critical bugs in streaming container mode Bug 1 (scheduled task hang): Task scheduler now passes onOutput callback with idle timer that writes _close sentinel after IDLE_TIMEOUT, so containers exit cleanly instead of blocking queue slots for 30 minutes. Scheduled tasks stay alive for interactive follow-up via IPC. Bug 2 (timeout disabled): Remove resetTimeout() from stderr handler. SDK writes debug logs continuously, resetting the timer on every line. Timeout now only resets on actual output markers in stdout. Bug 3 (trigger bypass): Piped messages in startMessageLoop now check trigger pattern for non-main groups. Non-trigger messages accumulate in DB and are pulled as context via getMessagesSince when a trigger arrives. Bug 7 (non-atomic IPC writes): GroupQueue.sendMessage uses temp file + rename for atomic writes, matching ipc-mcp-stdio.ts pattern. Also: flip isVerbose back to false (debug leftover), add isScheduledTask to host-side ContainerInput interface. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: idle timer not starting + scheduled task groupFolder missing Two bugs that prevented the scheduled task idle timeout fix from working: 1. onOutput was only called when parsed.result !== null, but session update markers have result: null. The idle timer never started for "silent" query completions, leaving containers parked at waitForIpcMessage until hard timeout. 2. Scheduler's onProcess callback didn't pass groupFolder to queue.registerProcess, so closeStdin no-oped (groupFolder was null). The _close sentinel was never written even when the idle timer fired. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: duplicate messages and timestamp rollback in piping path Two bugs introduced by the trigger context accumulation change: 1. processGroupMessages didn't advance lastAgentTimestamp until after the container finished. The piping path's getMessagesSince(lastAgent Timestamp) re-fetched messages already sent as the initial prompt, causing duplicates. 2. processGroupMessages overwrote lastAgentTimestamp with the original batch timestamp on completion, rolling back any advancement made by the piping path while the container was running. Fix: advance lastAgentTimestamp immediately after building the prompt, before starting the container. This matches the piping path behavior and eliminates both the overlap and the rollback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: container idles 30 extra minutes after _close during query When _close was detected during pollIpcDuringQuery, it was consumed (deleted) and stream.end() was called. But after runQuery returned, main() still emitted a session-update marker (resetting the host's idle timer) and called waitForIpcMessage (which polled forever since _close was already gone). The container had to wait for a second _close. Fix: runQuery now returns closedDuringQuery. When true, main() skips the session-update marker and waitForIpcMessage, exiting immediately. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resume branching, internal tags, and output forwarding - Fix resume branching: pass resumeSessionAt with last assistant UUID to anchor each query loop resume to the correct conversation tree position. Prevents agent responses landing on invisible branches when agent teams subagents create parallel JSONL entries. - Add <internal> tag stripping: agent can wrap internal reasoning in <internal> tags which are logged but not sent to WhatsApp. Prevents duplicate messages and internal monologue reaching users. - Forward scheduled task output: scheduled tasks now send result text to WhatsApp (with <internal> stripping), matching regular message behavior. No more special-case instructions. - Update Communication guidance in CLAUDE.md: simplified to "your output is sent to the user or group" with soft guidance on <internal> tags and send_message usage. - Add messaging behavior docs to schedule_task tool: prompts the scheduling agent to include guidance on whether the task should always/conditionally/never message the user. - Mount security: containerPath now optional, defaults to basename of hostPath. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: cursor rollback on error, flush guard, verbose logging - Roll back lastAgentTimestamp on container error so retries can re-process the messages instead of silently losing them. - Add guard flag to flushOutgoingQueue to prevent duplicate sends from concurrent flushes during rapid WA reconnects. - Revert isVerbose from hardcoded false back to env-based check (LOG_LEVEL=debug\|trace). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: orphan container cleanup was silently failing The startup cleanup used `container ls --format {{.Names}}` which is Docker Go-template syntax. Apple Container only supports `--format json` or `--format table`. The command errored with exit code 64, but the catch block silently swallowed it — orphan containers were never cleaned up on restart. Fixed to use `--format json` and parse `configuration.id` from the JSON output. Also filters by `status: running` and logs a warning on failure instead of silently catching. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add Discord badge and community section Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: idle timer reset on null results and flush queue message loss - Only reset idle timer on actual results (non-null), not session-update markers. Prevents containers staying alive 30 extra minutes after the agent finishes work. - flushOutgoingQueue now uses shift() instead of splice(0) so unattempted messages stay in the queue if an unexpected error bails the loop. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add Agent Swarms to README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update Telegram skill for current architecture Rewrite integration instructions to match the per-group queue/SQLite architecture: remove onMessage callback pattern (store to DB, let message loop pick up), fix startSchedulerLoop signature, add TELEGRAM_ONLY service startup, SQLite registration, data/env/env sync, @mention-to-trigger translation, and BotFather group privacy docs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Telegram skill message chunking, media placeholders, chat discovery - Split long messages at Telegram's 4096 char limit to prevent silent send failures - Store placeholder text for non-text messages (photos, voice, stickers, etc.) so the agent knows media was sent - Update getAvailableGroups filter to include tg: chats so the agent can discover and register Telegram chats via IPC - Fix removal step numbering Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update REQUIREMENTS.md and SPEC.md for SQLite architecture - Replace all registered_groups.json / sessions.json / router_state.json references with SQLite equivalents - Fix CONTAINER_TIMEOUT default (300000 → 1800000) - Add missing config exports (IDLE_TIMEOUT, MAX_CONCURRENT_CONTAINERS) - Update folder structure: add missing src files (logger, group-queue, mount-security), remove non-existent utils.ts, list all skills - Fix agent-runner entry (ipc-mcp.ts → ipc-mcp-stdio.ts) - Update startup sequence to reflect per-group queue architecture - Fix env mounting description (data/env/env, not extracted vars) - Update troubleshooting to use sqlite3 commands Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: fix README architecture description, revert SPEC.md env error - README: update architecture blurb to mention per-group queue, add group-queue.ts to key files, update file descriptions - SPEC.md: restore correct credential filtering description (only auth vars are extracted from .env, not the full file) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-09 02:50:43 +02:00
gavrielc	f26468c9b0	fix: setup skill reliability, requiresTrigger option, agent-browser visibility Setup skill fixes: - Run QR auth in foreground with long timeout, not background - Replace fragile message-based registration with DB group sync lookup - Personal chats: ask for phone number instead of querying empty DB - Consolidate trigger word + security model + channel selection into one step - Remove `timeout` shell command (unavailable on macOS), use Bash tool timeout - Query 40 groups, display 10 at a time, support name lookup requiresTrigger support: - Add requiresTrigger field to RegisteredGroup type and DB schema - Skip trigger check when requiresTrigger is false (for solo/personal chats) - Main group still always processes all messages (unchanged) Agent-browser visibility: - Append global CLAUDE.md to non-main agent system prompts via SDK - Add browser tool docs to global and main CLAUDE.md - Update skill description to be broader (not just "web testing") - Reference agent-browser.md in root CLAUDE.md key files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-07 01:39:31 +02:00
gavrielc	8dd27bc58d	fix: defend against missing structured output and message without content - Fall back to text result when success subtype has no structured_output - Treat outputType 'message' without userMessage as 'log' with warning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 20:29:04 +02:00
gavrielc	44f0b3d99c	fix: improve agent output schema, tool descriptions, and shutdown robustness - Rename status→outputType, responded/silent→message/log for clarity - Remove scheduled task special-casing: userMessage now sent for all contexts - Update schema, tool, and CLAUDE.md descriptions to be clear and non-contradictory about communication mechanisms - Use full tool name mcp__nanoclaw__send_message in docs - Change schedule_task target_group to accept JID instead of folder name - Only show target_group_jid parameter to main group agents - Add defense-in-depth sanitization and error callback to exec() in shutdown - Use "user or group" consistently (supports both 1:1 and group chats) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-06 20:22:45 +02:00
gavrielc	ae177156ec	feat: per-group queue, SQLite state, graceful shutdown (#111 ) * fix: wire up queue processMessagesFn before recovery to prevent silent message loss recoverPendingMessages() was called after startMessageLoop(), which meant: 1. Recovery could race with the message loop's first iteration 2. processMessagesFn was set inside startMessageLoop, so recovery enqueues would fire runForGroup with processMessagesFn still null, silently skipping message processing Move setProcessMessagesFn and recoverPendingMessages before startMessageLoop so the queue is fully wired before any messages are enqueued. https://claude.ai/code/session_01PCY8zNjDa2N29jvBAV5vfL * feat: structured agent output to fix infinite retry on silent responses (#113) Use Agent SDK's outputFormat with json_schema to get typed responses from the agent. The agent now returns { status: 'responded' \| 'silent', userMessage?, internalLog? } instead of a plain string. This fixes a critical bug where a null/empty agent response caused infinite 5-second retry loops by conflating "nothing to say" with "error". - Agent runner: add AGENT_RESPONSE_SCHEMA and parse structured_output - Host: advance lastAgentTimestamp on both responded AND silent status - GroupQueue: add exponential backoff (5s-80s) with max 5 retries for actual errors, replacing unbounded fixed-interval retries https://claude.ai/code/session_014SLc8MxP9BYhEhDCLox9U8 Co-authored-by: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-06 18:54:26 +02:00
Gavriel	4711ec435a	Add register_group IPC command for dynamic group registration Main agent can now register new groups via MCP tool without restart. Host updates both in-memory state and JSON file, creates group folders. Authorization enforced at both agent and host level. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 00:08:40 +02:00
gavrielc	05a29d562f	Security improvements: per-group session isolation, remove built-in Gmail - Isolate Claude sessions per-group (data/sessions/{group}/.claude/) to prevent cross-group access to conversation history - Remove Gmail MCP from built-in (now available via /add-gmail skill) - Add SECURITY.md documenting the security model - Move docs to docs/ folder (SPEC.md, REQUIREMENTS.md, SECURITY.md) - Update documentation to reflect changes Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 00:07:59 +02:00
Gavriel	5760b75fa9	Fix timezone handling and message filtering - Add TIMEZONE config using system timezone for cron expressions - Filter bot messages by content prefix instead of is_from_me (user shares WhatsApp account with bot) - Format messages as XML for cleaner agent parsing - Update schedule_task tool to clarify local time usage Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 22:54:44 +02:00
Gavriel	572338b9a6	Add context_mode option for scheduled tasks Scheduled tasks can now run in either: - "group" mode: uses the group's conversation session for context - "isolated" mode: runs with a fresh session (previous behavior) The tool description guides the agent on when to use each mode and prompts them to ask the user if unclear. Group mode is now the default. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 22:23:50 +02:00
gavrielc	6745a1c54b	Apply fixes from closed PRs: sentinel markers, JID lookup, schedule validation - PR #10: Add sentinel markers for robust JSON parsing between container and host. Fallback to last-line parsing for backwards compatibility. - PR #5: Look up target JID from registeredGroups instead of trusting IPC payload, fixing cross-group scheduled tasks getting wrong chat_jid. - PR #8: Add lightweight schedule validation in container MCP that returns errors to agents (cron syntax, positive interval, valid ISO timestamp). Also defensive validation on host side. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 20:49:57 +02:00
Gavriel	2dedd18491	Fix scheduled tasks and improve task scheduling UX - Fix Apple Container mount issue: move groups/CLAUDE.md to groups/global/ directory (Apple Container only supports directory mounts, not file mounts) - Fix scheduled tasks for main group: properly detect isMain based on group_folder instead of always setting false - Add isScheduledTask flag so agent knows when running as scheduled task - Improve schedule_task tool description with clear format examples for cron, interval, and once schedule types - Update global CLAUDE.md with instructions for scheduled tasks to use mcp__nanoclaw__send_message when needed Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 17:24:12 +02:00
gavrielc	f25e0f9a10	Remove redundant comments throughout codebase Keep only comments that explain non-obvious behavior or add context not apparent from reading the code. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 16:00:44 +02:00
gavrielc	552b26cc95	Add PreCompact hook for conversation archiving, remove /clear command - Add PreCompact hook in agent-runner that archives conversations before compaction, using session summary from sessions-index.json for filename - Remove /clear command (programmatic compaction not supported by SDK) - Add /add-clear to RFS for future implementation - Update CLAUDE.md templates with memory system instructions Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 15:37:13 +02:00
Gavriel	8ca4c95517	Fix session persistence and auto-start container system - Fix session mount path: ~/.claude/ now mounts to /home/node/.claude/ (container runs as 'node' user with HOME=/home/node, not root) - Fix ~/.gmail-mcp/ mount path similarly - Use absolute paths for GROUPS_DIR and DATA_DIR (required for container mounts) - Auto-start Apple Container system on NanoClaw startup - Update debug skill with session troubleshooting guide - Update spec.md with startup sequence and troubleshooting Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 11:31:52 +02:00
Gavriel	67e0295d82	Fix container execution and add debug tooling Container fixes: - Run as non-root 'node' user (required for --dangerously-skip-permissions) - Add allowDangerouslySkipPermissions: true to SDK options - Mount .env file to work around Apple Container -i env var bug - Use --mount for readonly, -v for read-write (Apple Container quirk) - Bump SDK to 0.2.29, zod to v4 - Install Claude Code CLI globally in container Logging improvements: - Write per-run logs to groups/{folder}/logs/container-*.log - Add debug-level logging for mounts and container args Documentation: - Add /debug skill with comprehensive troubleshooting guide - Update /setup skill with API key configuration step - Update SPEC.md with container details, mount syntax, security notes Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 10:35:08 +02:00
gavrielc	09c0e8142e	Add containerized agent execution with Apple Container - Agents run in isolated Linux VMs via Apple Container - All groups get Bash access (safe - sandboxed in container) - Browser automation via agent-browser + Chromium - Per-group configurable additional directory mounts - File-based IPC for messages and scheduled tasks - Container image with Node.js 22, Chromium, agent-browser Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-31 22:55:57 +02:00

16 Commits