Initial commit: Discord-Claude Gateway with event-driven agent runtime

2026-02-22 13:34:26 -05:00
parent 82b0905a98
commit 68f24d50e1
5 changed files with 602 additions and 338 deletions
--- a/docs/PROCESS-FLOW.md
+++ b/docs/PROCESS-FLOW.md
@@ -0,0 +1,437 @@
+# Aetheel — Process Flow
+
+How a Discord message becomes an AI response, step by step.
+
+## The Big Picture
+
+```
+Discord User                    Aetheel Gateway                     Claude Code CLI
+     │                               │                                    │
+     │  @Aetheel what's 2+2?         │                                    │
+     ├──────────────────────────────► │                                    │
+     │                                │  1. Extract prompt "what's 2+2?"  │
+     │                                │  2. Check concurrency limit       │
+     │                                │  3. Enqueue message event         │
+     │                                │  4. Read config/*.md files        │
+     │                                │  5. Assemble system prompt        │
+     │                                │  6. Write prompt to temp file     │
+     │                                │  7. Spawn CLI process             │
+     │                                │                                    │
+     │                                │  claude -p "what's 2+2?"          │
+     │                                │  --output-format json             │
+     │                                │  --append-system-prompt-file ...  │
+     │                                │  --dangerously-skip-permissions   │
+     │                                ├──────────────────────────────────► │
+     │                                │                                    │
+     │                                │  ◄── JSON stream (init, result)   │
+     │                                │ ◄─────────────────────────────────┤
+     │                                │                                    │
+     │                                │  8. Parse session_id from init    │
+     │                                │  9. Parse result text             │
+     │                                │  10. Split if > 2000 chars        │
+     │  "2 + 2 = 4"                   │                                    │
+     │ ◄──────────────────────────────┤                                    │
+     │                                │  11. Save session for channel     │
+```
+
+## Step-by-Step: Discord Message → Response
+
+### Step 1: Message Arrives in Discord
+
+A user types `@Aetheel what's the weather like?` in a Discord channel.
+
+Discord delivers this to the bot as a `messageCreate` event via the WebSocket gateway.
+
+**File:** `src/discord-bot.ts` → `setupMessageHandler()`
+
+```
+Raw message content: "<@1473096872372600978> what's the weather like?"
+Author: tanmay11k6417 (bot: false)
+Channel: 1475008084022788312
+```
+
+### Step 2: Message Filtering & Prompt Extraction
+
+The bot checks:
+1. Is the author a bot? → Skip (prevents feedback loops)
+2. Does the message mention the bot? → Continue (otherwise skip)
+3. Extract the prompt by stripping all mention tags
+
+**File:** `src/discord-bot.ts` → `extractPromptFromMention()`
+
+```
+Input:  "<@1473096872372600978> what's the weather like?"
+Output: "what's the weather like?"
+```
+
+The regex `/<@[!&]?\d+>/g` strips user mentions (`<@ID>`), nickname mentions (`<@!ID>`), and role mentions (`<@&ID>`).
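+
+A minimal sketch of the stripping step, assuming `extractPromptFromMention()` takes the raw message content and returns the trimmed remainder (the actual signature in `src/discord-bot.ts` may differ):
+
+```typescript
+// Hypothetical sketch; the real implementation lives in src/discord-bot.ts.
+const MENTION_PATTERN = /<@[!&]?\d+>/g;
+
+function extractPromptFromMention(content: string): string {
+  // Remove every user, nickname, and role mention, then trim leftover whitespace.
+  return content.replace(MENTION_PATTERN, "").trim();
+}
+
+console.log(extractPromptFromMention("<@1473096872372600978> what's the weather like?"));
+// prints "what's the weather like?"
+```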
+
+### Step 3: Prompt Handler (Gateway Core)
+
+The extracted prompt is wrapped in a `Prompt` object and passed to the gateway core's `onPrompt` handler.
+
+**File:** `src/gateway-core.ts` → `onPrompt` callback
+
+```typescript
+{
+  text: "what's the weather like?",
+  channelId: "1475008084022788312",
+  userId: "123456789",
+  guildId: "987654321"
+}
+```
+
+The handler checks:
+- Is the gateway shutting down? → Reply "Gateway is shutting down"
+- Is `activeQueryCount >= maxConcurrentQueries` (default 5)? → Reply "System is busy"
+- Otherwise: increment counter, send typing indicator, enqueue event
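+
+The three checks can be sketched as a small gate object. This is an illustration, not the code in `src/gateway-core.ts`: `PromptGate`, `tryAdmit()`, `release()`, and `beginShutdown()` are hypothetical names, and the decrement in `release()` is an assumption. Only `activeQueryCount`, `maxConcurrentQueries`, and the reply strings come from the description above.
+
+```typescript
+// Hypothetical sketch of the admission checks described above.
+type Admission =
+  | { accepted: true }
+  | { accepted: false; reply: string };
+
+class PromptGate {
+  private activeQueryCount = 0;
+  private shuttingDown = false;
+
+  constructor(private readonly maxConcurrentQueries = 5) {}
+
+  tryAdmit(): Admission {
+    if (this.shuttingDown) {
+      return { accepted: false, reply: "Gateway is shutting down" };
+    }
+    if (this.activeQueryCount >= this.maxConcurrentQueries) {
+      return { accepted: false, reply: "System is busy" };
+    }
+    this.activeQueryCount++; // Reserve a slot before enqueueing the event.
+    return { accepted: true };
+  }
+
+  // Assumed counterpart: call once the event finishes, on success or failure.
+  release(): void {
+    this.activeQueryCount = Math.max(0, this.activeQueryCount - 1);
+  }
+
+  beginShutdown(): void {
+    this.shuttingDown = true;
+  }
+}
+```
+
+A caller would reply with `reply` when `accepted` is false, and otherwise send the typing indicator and enqueue the event.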
+
+### Step 4: Event Queue
+
+The prompt becomes a **message event** in the unified event queue.
+
+**File:** `src/event-queue.ts`
+
+```typescript
+{
+  id: 2,                    // Monotonically increasing
+  type: "message",
+  payload: {
+    prompt: {
+      text: "what's the weather like?",
+      channelId: "1475008084022788312",
+      userId: "123456789",
+      guildId: "987654321"
+    }
+  },
+  timestamp: "2026-02-22T10:30:00.000Z",
+  source: "discord"
+}
+```
+
+The queue processes events one at a time in FIFO order. If a heartbeat, cron, or hook event is ahead in the queue, the message waits.
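+
+A minimal sketch of such a queue, assuming a single async handler consumes events. The class shape, `enqueue()`, and `drain()` are hypothetical; the event fields follow the example above.
+
+```typescript
+// Hypothetical sketch; not the code in src/event-queue.ts.
+type EventType = "message" | "heartbeat" | "cron" | "hook";
+
+interface GatewayEvent {
+  id: number;
+  type: EventType;
+  payload: unknown;
+  timestamp: string;
+  source: string;
+}
+
+class EventQueue {
+  private nextId = 1;
+  private readonly pending: GatewayEvent[] = [];
+  private draining = false;
+
+  constructor(private readonly handler: (event: GatewayEvent) => Promise<void>) {}
+
+  enqueue(type: EventType, payload: unknown, source: string): GatewayEvent {
+    const event: GatewayEvent = {
+      id: this.nextId++, // Monotonically increasing
+      type,
+      payload,
+      timestamp: new Date().toISOString(),
+      source,
+    };
+    this.pending.push(event);
+    void this.drain();
+    return event;
+  }
+
+  private async drain(): Promise<void> {
+    if (this.draining) return; // A drain loop is already running.
+    this.draining = true;
+    let next: GatewayEvent | undefined;
+    while ((next = this.pending.shift()) !== undefined) {
+      try {
+        await this.handler(next); // One event at a time, in arrival order.
+      } catch (err) {
+        console.error(`Event ${next.id} failed:`, err); // Keep draining after a failure.
+      }
+    }
+    this.draining = false;
+  }
+}
+```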
+
+### Step 5: Agent Runtime — Read Config Files
+
+When the event reaches the front of the queue, the Agent Runtime reads ALL markdown config files fresh from disk.
+
+**File:** `src/markdown-config-loader.ts` → `loadAll()`
+
+```
+config/
+├── identity.md    → Agent name, role, vibe
+├── soul.md        → Personality, tone, values
+├── agents.md      → Operating rules, safety boundaries
+├── user.md        → Info about the human
+├── memory.md      → Long-term memory (agent can write to this)
+└── tools.md       → Tool configs, API notes
+```
+
+Files are read fresh every time — edit them while the gateway is running and the next event picks up changes.
+
+If `memory.md` doesn't exist, it's auto-created with `# Memory\n`.
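+
+A sketch of an uncached loader, under stated assumptions: `loadAll()` is taken to accept the config directory and return one entry per file, with `null` for a missing file. The file names come from the listing above; the return shape is hypothetical.
+
+```typescript
+import { promises as fs } from "node:fs";
+import * as path from "node:path";
+
+// Hypothetical sketch; not the code in src/markdown-config-loader.ts.
+const CONFIG_FILES = ["identity", "soul", "agents", "user", "memory", "tools"] as const;
+type ConfigKey = (typeof CONFIG_FILES)[number];
+type MarkdownConfigs = Record<ConfigKey, string | null>;
+
+async function loadAll(configDir: string): Promise<MarkdownConfigs> {
+  const configs = {} as MarkdownConfigs;
+  for (const key of CONFIG_FILES) {
+    const filePath = path.join(configDir, `${key}.md`);
+    try {
+      // No caching: every call hits the disk, so edits apply to the next event.
+      configs[key] = await fs.readFile(filePath, "utf8");
+    } catch (err) {
+      if ((err as { code?: string }).code !== "ENOENT") throw err;
+      if (key === "memory") {
+        // Auto-create the memory file with its default heading.
+        await fs.writeFile(filePath, "# Memory\n", "utf8");
+        configs[key] = "# Memory\n";
+      } else {
+        configs[key] = null; // Assumed: a missing file is reported as null.
+      }
+    }
+  }
+  return configs;
+}
+```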
+
+### Step 6: Assemble System Prompt
+
+The markdown file contents are concatenated into a single system prompt with section headers.
+
+**File:** `src/system-prompt-assembler.ts` → `assemble()`
+
+The assembled prompt looks like this:
+
+```
+You may update your long-term memory by writing to memory.md using the Write tool.
+Use this to persist important facts, lessons learned, and context across sessions.
+
+## Identity
+
+# Identity
+
+- **Name:** Aetheel
+- **Vibe:** Helpful, sharp, slightly witty
+- **Emoji:** ⚡
+
+## Personality
+
+# Soul
+
+Be genuinely helpful. Have opinions. Be resourceful before asking.
+Keep responses concise for Discord. Use markdown formatting.
+
+## Operating Rules
+
+# Operating Rules
+
+Be helpful and concise. Keep Discord messages short.
+
+## Cron Jobs
+...
+
+## User Context
+
+# User Context
+
+- **Name:** Tanmay
+- **Timezone:** IST
+...
+
+## Long-Term Memory
+
+# Memory
+
+- Tanmay prefers short responses
+- Project aetheel-2 is the Discord gateway
+...
+
+## Tool Configuration
+
+# Tool Configuration
+
+(empty or tool-specific notes)
+```
+
+Sections with null or empty content are omitted entirely.
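+
+The assembly rule can be sketched as follows. The section titles and preamble text are copied from the example above; the `MarkdownConfigs` shape and the exact spacing between sections are assumptions.
+
+```typescript
+// Hypothetical sketch; not the code in src/system-prompt-assembler.ts.
+interface MarkdownConfigs {
+  identity: string | null;
+  soul: string | null;
+  agents: string | null;
+  user: string | null;
+  memory: string | null;
+  tools: string | null;
+}
+
+const PREAMBLE =
+  "You may update your long-term memory by writing to memory.md using the Write tool.\n" +
+  "Use this to persist important facts, lessons learned, and context across sessions.";
+
+const SECTIONS: ReadonlyArray<[keyof MarkdownConfigs, string]> = [
+  ["identity", "Identity"],
+  ["soul", "Personality"],
+  ["agents", "Operating Rules"],
+  ["user", "User Context"],
+  ["memory", "Long-Term Memory"],
+  ["tools", "Tool Configuration"],
+];
+
+function assemble(configs: MarkdownConfigs): string {
+  const parts = [PREAMBLE];
+  for (const [key, title] of SECTIONS) {
+    const body = configs[key]?.trim();
+    if (!body) continue; // Omit null or empty sections entirely.
+    parts.push(`## ${title}\n\n${body}`);
+  }
+  return parts.join("\n\n");
+}
+```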
+
+### Step 7: Write System Prompt to Temp File
+
+The assembled system prompt is written to a temporary file because it can run to thousands of characters, which is unwieldy to pass and quote as a single CLI argument.
+
+**File:** `src/agent-runtime.ts` → `executeClaude()`
+
+```
+/tmp/aetheel-prompt-1d6c77f1-4a4e-49f8-ae9b-cff6fb47b971.txt
+```
+
+This file is deleted after the CLI process completes.
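+
+One way to guarantee the cleanup is a `try`/`finally` wrapper. The sketch below is hypothetical (`withPromptFile()` is not a name from the source) and assumes the file goes in the OS temp directory with a UUID in its name, as the example path suggests.
+
+```typescript
+import { randomUUID } from "node:crypto";
+import { promises as fs } from "node:fs";
+import { tmpdir } from "node:os";
+import * as path from "node:path";
+
+// Hypothetical helper: writes the prompt, runs `use`, and always cleans up.
+async function withPromptFile<T>(
+  systemPrompt: string,
+  use: (filePath: string) => Promise<T>,
+): Promise<T> {
+  const filePath = path.join(tmpdir(), `aetheel-prompt-${randomUUID()}.txt`);
+  await fs.writeFile(filePath, systemPrompt, "utf8");
+  try {
+    return await use(filePath);
+  } finally {
+    // Remove the file even if the CLI run throws; ignore "already gone".
+    await fs.rm(filePath, { force: true });
+  }
+}
+```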
+
+### Step 8: Spawn Claude CLI
+
+The gateway spawns the Claude Code CLI as a child process.
+
+**File:** `src/agent-runtime.ts` → `runClaude()`
+
+The actual command:
+
+```bash
+claude \
+  -p "what's the weather like?" \
+  --output-format json \
+  --dangerously-skip-permissions \
+  --append-system-prompt-file /tmp/aetheel-prompt-xxx.txt \
+  --allowedTools Read \
+  --allowedTools Write \
+  --allowedTools Edit \
+  --allowedTools Glob \
+  --allowedTools Grep \
+  --allowedTools WebSearch \
+  --allowedTools WebFetch \
+  --max-turns 25
+```
+
+Key flags:
+- `-p` — Print mode (non-interactive, exits after response)
+- `--output-format json` — Returns JSON array of message objects
+- `--dangerously-skip-permissions` — No interactive permission prompts
+- `--append-system-prompt-file` — Appends our persona/memory to Claude's default prompt
+- `--allowedTools` — Which tools Claude can use (one flag per tool)
+- `--max-turns` — Prevents runaway agent loops
+- `--resume SESSION_ID` — Added when resuming an existing conversation
+
+The process runs with `cwd` set to the `config/` directory, so Claude can read/write files there (like `memory.md`).
+
+`stdin` is set to `"ignore"` to prevent the CLI from waiting for interactive input.
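+
+The argument list can be built as a plain array, which avoids shell quoting of the prompt. A minimal sketch: `ClaudeRunOptions` and `buildClaudeArgs()` are hypothetical names, while the flags are the ones listed above.
+
+```typescript
+// Hypothetical sketch; not the code in src/agent-runtime.ts.
+interface ClaudeRunOptions {
+  prompt: string;
+  systemPromptFile: string;
+  allowedTools: string[];
+  maxTurns: number;
+  sessionId?: string;
+}
+
+function buildClaudeArgs(opts: ClaudeRunOptions): string[] {
+  const args = [
+    "-p", opts.prompt,
+    "--output-format", "json",
+    "--dangerously-skip-permissions",
+    "--append-system-prompt-file", opts.systemPromptFile,
+  ];
+  for (const tool of opts.allowedTools) {
+    args.push("--allowedTools", tool); // One flag per tool.
+  }
+  args.push("--max-turns", String(opts.maxTurns));
+  if (opts.sessionId) {
+    args.push("--resume", opts.sessionId); // Only when resuming a conversation.
+  }
+  return args;
+}
+```
+
+The result could then be passed to Node's `spawn("claude", args, { cwd: configDir, stdio: ["ignore", "pipe", "pipe"] })`, where `configDir` stands for the `config/` directory.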
+
+### Step 9: Session Resumption
+
+If the channel already has a conversation, the session manager holds a stored session ID for it.
+
+**File:** `src/session-manager.ts`
+
+```
+config/sessions.json:
+{
+  "1475008084022788312": "37336c32-73cb-4cf5-9771-1c8f694398ff"
+}
+```
+
+When a session ID exists, `--resume 37336c32-73cb-4cf5-9771-1c8f694398ff` is added to the CLI args. Claude loads the full conversation history from `~/.claude/` and continues the conversation.
+
+### Step 10: Parse CLI Output (Streaming)
+
+The CLI returns a JSON array on stdout. The gateway parses it as chunks arrive.
+
+**File:** `src/agent-runtime.ts` → `runClaude()` stdout handler
+
+Example CLI output:
+
+```json
+[
+  {
+    "type": "system",
+    "subtype": "init",
+    "session_id": "37336c32-73cb-4cf5-9771-1c8f694398ff",
+    "tools": ["Read", "Write", "Edit", "Bash", "Glob", "Grep", "WebSearch", "WebFetch"]
+  },
+  {
+    "type": "assistant",
+    "message": { "content": [{ "type": "text", "text": "Let me check..." }] }
+  },
+  {
+    "type": "result",
+    "subtype": "success",
+    "result": "I don't have access to real-time weather data, but I can help you check! Try asking me to search the web for current weather in your area.",
+    "session_id": "37336c32-73cb-4cf5-9771-1c8f694398ff",
+    "is_error": false,
+    "cost_usd": 0.003
+  }
+]
+```
+
+The parser extracts:
+1. `session_id` from the `init` message → stored for future resumption
+2. `result` from the `result` message → sent to Discord
+
+When streaming is active, result text is sent to Discord immediately as it's parsed, before the CLI process exits.
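+
+A simplified sketch of the extraction, assuming the complete stdout is available as one string (the gateway itself parses incrementally, which is not shown here). `parseCliOutput()` and both interfaces are hypothetical; the field names follow the example output above.
+
+```typescript
+// Hypothetical sketch; not the stdout handler in src/agent-runtime.ts.
+interface CliMessage {
+  type: string;
+  subtype?: string;
+  session_id?: string;
+  result?: string;
+  is_error?: boolean;
+}
+
+interface ParsedOutput {
+  sessionId?: string;
+  resultText?: string;
+  isError: boolean;
+}
+
+function parseCliOutput(stdout: string): ParsedOutput {
+  const data: unknown = JSON.parse(stdout);
+  if (!Array.isArray(data)) {
+    throw new Error("Expected a JSON array of messages");
+  }
+  const parsed: ParsedOutput = { isError: false };
+  for (const msg of data as CliMessage[]) {
+    if (msg.type === "system" && msg.subtype === "init") {
+      parsed.sessionId = msg.session_id; // Stored for future --resume.
+    } else if (msg.type === "result") {
+      parsed.resultText = msg.result; // Sent to Discord.
+      parsed.isError = msg.is_error === true;
+    }
+  }
+  return parsed;
+}
+```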
+
+### Step 11: Response Formatting & Delivery
+
+The result text is split into Discord-safe chunks (max 2000 characters each).
+
+**File:** `src/response-formatter.ts` → `splitMessage()`
+
+If the response contains code blocks that span a split boundary, the formatter closes the code block with ` ``` ` at the end of one chunk and reopens it with ` ``` ` at the start of the next.
+
+The chunks are sent sequentially to the Discord channel via the bot.
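+
+A line-based sketch of fence-aware splitting. It is an illustration, not the code in `src/response-formatter.ts`: `hardWrap()` is a hypothetical helper, the reopened block does not carry over a language tag, and length is measured with JavaScript's `string.length`.
+
+```typescript
+// Hypothetical sketch of splitting text without breaking code blocks.
+const FENCE = "`".repeat(3);
+
+// Break a single over-long line into fixed-width pieces.
+function hardWrap(line: string, width: number): string[] {
+  const pieces: string[] = [];
+  for (let i = 0; i < line.length; i += width) {
+    pieces.push(line.slice(i, i + width));
+  }
+  return pieces.length > 0 ? pieces : [""];
+}
+
+function splitMessage(text: string, maxLength = 2000): string[] {
+  if (text.length === 0) return [];
+  // Leave room for a reopening fence line and a closing fence line.
+  const width = maxLength - 2 * (FENCE.length + 1);
+  if (width <= 0) throw new RangeError("maxLength is too small");
+
+  const lines: string[] = [];
+  for (const line of text.split("\n")) {
+    lines.push(...hardWrap(line, width));
+  }
+
+  const chunks: string[] = [];
+  let current: string[] = [];
+  let size = 0; // Length of current.join("\n").
+  let inFence = false;
+
+  for (const line of lines) {
+    const isFenceLine = line.trim().startsWith(FENCE);
+    const inFenceAfter = isFenceLine ? !inFence : inFence;
+    // While inside a code block, reserve space for "\n" plus a closing fence.
+    const reserve = inFenceAfter ? FENCE.length + 1 : 0;
+    const separator = current.length > 0 ? 1 : 0;
+
+    if (size + separator + line.length + reserve > maxLength) {
+      if (inFence) current.push(FENCE); // Close the block in this chunk.
+      chunks.push(current.join("\n"));
+      current = inFence ? [FENCE] : []; // Reopen it in the next chunk.
+      size = inFence ? FENCE.length : 0;
+    }
+
+    size += (current.length > 0 ? 1 : 0) + line.length;
+    current.push(line);
+    inFence = inFenceAfter;
+  }
+
+  chunks.push(current.join("\n"));
+  return chunks;
+}
+```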
+
+### Step 12: Session Persistence
+
+The session ID is saved to `config/sessions.json` so it survives gateway restarts.
+
+**File:** `src/session-manager.ts` → `saveToDisk()`
+
+Next time the user sends a message in the same channel, the conversation continues from where it left off.
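+
+A sketch of a JSON-backed channel map. Only `saveToDisk()` and the file format come from the text above; `loadFromDisk()`, `getSessionId()`, and `setSessionId()` are hypothetical names.
+
+```typescript
+import { promises as fs } from "node:fs";
+
+// Hypothetical sketch; not the code in src/session-manager.ts.
+class SessionManager {
+  private readonly sessions = new Map<string, string>();
+
+  constructor(private readonly filePath: string) {}
+
+  async loadFromDisk(): Promise<void> {
+    try {
+      const raw = await fs.readFile(this.filePath, "utf8");
+      const data = JSON.parse(raw) as Record<string, string>;
+      for (const channelId of Object.keys(data)) {
+        this.sessions.set(channelId, data[channelId]);
+      }
+    } catch (err) {
+      // A missing file just means no channel has a session yet.
+      if ((err as { code?: string }).code !== "ENOENT") throw err;
+    }
+  }
+
+  getSessionId(channelId: string): string | undefined {
+    return this.sessions.get(channelId);
+  }
+
+  async setSessionId(channelId: string, sessionId: string): Promise<void> {
+    this.sessions.set(channelId, sessionId);
+    await this.saveToDisk();
+  }
+
+  async saveToDisk(): Promise<void> {
+    const data: Record<string, string> = {};
+    this.sessions.forEach((sessionId, channelId) => {
+      data[channelId] = sessionId;
+    });
+    await fs.writeFile(this.filePath, JSON.stringify(data, null, 2), "utf8");
+  }
+}
+```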
+
+---
+
+## Other Event Types
+
+### Heartbeat Flow
+
+```
+Timer fires (every N seconds)
+  → HeartbeatScheduler creates heartbeat event
+  → Event enters queue
+  → AgentRuntime reads config files, assembles prompt
+  → CLI runs with heartbeat instruction as prompt
+  → Response sent to OUTPUT_CHANNEL_ID
+```
+
+Example heartbeat.md:
+```markdown
+## check-email
+Interval: 1800
+Instruction: Check my inbox for anything urgent. If nothing, reply HEARTBEAT_OK.
+```
+
+The instruction becomes the `-p` argument to the CLI.
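+
+Parsing this format can be sketched as below. The function and interface names are hypothetical, and treating `Interval` as seconds is inferred from the flow above ("every N seconds"); entries missing either field are skipped in this sketch.
+
+```typescript
+// Hypothetical sketch; not the code in src/heartbeat-scheduler.ts.
+interface HeartbeatCheck {
+  name: string;
+  intervalSeconds: number;
+  instruction: string;
+}
+
+function parseHeartbeats(markdown: string): HeartbeatCheck[] {
+  const checks: HeartbeatCheck[] = [];
+  // Each "## name" heading starts a new check; text before the first one is ignored.
+  for (const block of markdown.split(/^## /m).slice(1)) {
+    const [heading, ...body] = block.split("\n");
+    const fields = new Map<string, string>();
+    for (const line of body) {
+      const match = /^(\w+):\s*(.+)$/.exec(line.trim());
+      if (match) fields.set(match[1].toLowerCase(), match[2]);
+    }
+    const interval = Number(fields.get("interval"));
+    const instruction = fields.get("instruction");
+    // Skip entries without a positive numeric interval and an instruction.
+    if (!Number.isFinite(interval) || interval <= 0 || !instruction) continue;
+    checks.push({ name: heading.trim(), intervalSeconds: interval, instruction });
+  }
+  return checks;
+}
+```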
+
+### Cron Job Flow
+
+```
+Cron expression matches (e.g., "0 9 * * *" = 9am daily)
+  → CronScheduler creates cron event
+  → Event enters queue
+  → AgentRuntime reads config files, assembles prompt
+  → CLI runs with cron instruction as prompt
+  → Response sent to OUTPUT_CHANNEL_ID
+```
+
+Cron jobs are defined in `config/agents.md`:
+```markdown
+## Cron Jobs
+
+### morning-briefing
+Cron: 0 9 * * *
+Instruction: Good morning! Check email and give me a brief summary.
+```
+
+### Hook Flow
+
+```
+Lifecycle event occurs (startup, shutdown)
+  → HookManager creates hook event
+  → Event enters queue
+  → AgentRuntime reads config files, assembles prompt
+  → CLI runs with hook instruction as prompt
+  → Response sent to OUTPUT_CHANNEL_ID
+```
+
+Hooks are defined in `config/agents.md`:
+```markdown
+## Hooks
+
+### startup
+Instruction: Say hello, you just came online.
+
+### shutdown
+Instruction: Save important context to memory.md before shutting down.
+```
+
+`agent_begin` and `agent_stop` hooks fire inline (not through the queue) before and after every non-hook event.
+
+---
+
+## What Gets Sent to Claude
+
+For every event, Claude receives:
+
+1. **Default Claude Code system prompt** (built-in, from the CLI)
+2. **Appended system prompt** (from our assembled markdown files):
+   - Preamble about writing to memory.md
+   - Identity (who the agent is)
+   - Personality (how it behaves)
+   - Operating rules (safety, workflows)
+   - User context (who it's helping)
+   - Long-term memory (persistent facts)
+   - Tool configuration (API notes)
+3. **The prompt text** (user message, or heartbeat, cron, or hook instruction)
+4. **Session history** (if resuming via `--resume`)
+5. **Allowed tools** (Read, Write, Edit, Glob, Grep, WebSearch, WebFetch)
+
+Claude runs in the `config/` directory, so it can read and write files there — including updating `memory.md` with new facts.
+
+---
+
+## File Map
+
+```
+src/
+├── index.ts                   ← Entry point: creates GatewayCore, registers shutdown handler
+├── gateway-core.ts            ← Orchestrator: wires everything, manages lifecycle
+├── config.ts                  ← Reads env vars (DISCORD_BOT_TOKEN, etc.)
+├── discord-bot.ts             ← Discord.js wrapper: messages, slash commands, typing
+├── event-queue.ts             ← FIFO queue: all events (message, heartbeat, cron, hook)
+├── agent-runtime.ts           ← Core engine: reads configs, spawns CLI, parses output
+├── markdown-config-loader.ts  ← Reads config/*.md files fresh each event
+├── system-prompt-assembler.ts ← Concatenates markdown into system prompt with headers
+├── session-manager.ts         ← Channel → session ID mapping (persisted to JSON)
+├── response-formatter.ts      ← Splits long text for Discord's 2000 char limit
+├── error-formatter.ts         ← Sanitizes errors (strips keys, paths, stacks)
+├── heartbeat-scheduler.ts     ← setInterval timers from heartbeat.md
+├── cron-scheduler.ts          ← node-cron jobs from agents.md
+├── hook-manager.ts            ← Lifecycle hooks from agents.md
+├── bootstrap-manager.ts       ← First-run: validates/creates config files
+├── channel-queue.ts           ← Per-channel sequential processing
+└── shutdown-handler.ts        ← SIGTERM/SIGINT → graceful shutdown
+
+config/
+├── identity.md                ← Agent name, role, specialization
+├── soul.md                    ← Personality, tone, values
+├── agents.md                  ← Rules, cron jobs, hooks
+├── user.md                    ← Human's info and preferences
+├── memory.md                  ← Long-term memory (agent-writable)
+├── tools.md                   ← Tool configs and notes
+├── heartbeat.md               ← Proactive check definitions
+├── boot.md                    ← Bootstrap parameters (optional)
+└── sessions.json              ← Channel → session ID map (auto-generated)
+```