# Design Document: Discord-Claude Gateway

## Overview

The Discord-Claude Gateway is a TypeScript event-driven agent runtime that bridges Discord's messaging platform with the Claude Agent SDK. Inspired by the OpenClaw architecture, it goes beyond a simple chat bridge: it is a long-running process that accepts inputs from multiple sources (Discord messages, heartbeat timers, cron jobs, lifecycle hooks), routes them through a unified event queue, and dispatches them to an AI agent runtime for processing.

The agent's personality, identity, user context, long-term memory, tool configuration, and operating rules are all defined in local markdown files. The runtime reads these files fresh on each event processing cycle, assembles a dynamic system prompt, and passes it to the Agent SDK's `query()` function via the `systemPrompt` option. The agent can write back to `memory.md` using the Write tool, completing a read-process-write loop that persists state across sessions.

### Key Design Decisions

- **discord.js** for the Discord bot client: the most mature and widely used Discord library for Node.js/TypeScript.
- **Unified event queue**: all inputs (messages, heartbeats, crons, hooks) enter a single in-memory FIFO queue, ensuring consistent ordering and preventing race conditions.
- **Markdown files as single source of truth**: all agent state and configuration lives in markdown files on disk. No database, no external state store. The runtime reconstructs full context by reading CONFIG_DIR on startup.
- **Fresh reads per event**: markdown config files are read from disk on each event processing cycle, so edits take effect immediately without restarts.
- **Per-channel session binding**: each Discord channel maps to at most one Agent SDK session, enabling conversational continuity.
- **Sequential event processing**: the event queue processes one event at a time to avoid concurrent writes to markdown files and ensure deterministic behavior.
- **Environment-variable-driven configuration**: all deployment settings are read from environment variables with sensible defaults.
- **node-cron** for cron scheduling: a lightweight, well-maintained cron-expression scheduler for Node.js.
- **Agent SDK `systemPrompt` option**: the assembled markdown content is injected via the `systemPrompt` option in the `query()` function call.

## Architecture

```mermaid
graph TD
    A[Discord Users] -->|Messages / Slash Commands| B[DiscordBot]
    B -->|message Event| C[EventQueue]
    D[HeartbeatScheduler] -->|heartbeat Event| C
    E[CronScheduler] -->|cron Event| C
    F[HookManager] -->|hook Event| C
    C -->|Dequeue FIFO| G[AgentRuntime]
    G -->|Read configs| H[MarkdownConfigLoader]
    H -->|soul.md, identity.md, etc.| I[CONFIG_DIR]
    G -->|Assemble prompt| J[SystemPromptAssembler]
    J -->|systemPrompt option| K[Agent SDK query]
    K -->|Response Stream| L[ResponseFormatter]
    L -->|Split & send| B
    G -->|Agent writes memory.md| I
    G -->|Signal complete| C
    M[BootstrapManager] -->|Validate/create files| I
    N[ConfigLoader] -->|Env vars| G
    O[SessionManager] -->|Channel Bindings| G
    P[GatewayCore] -->|Orchestrates| B
    P -->|Orchestrates| C
    P -->|Orchestrates| D
    P -->|Orchestrates| E
    P -->|Orchestrates| F
    P -->|Orchestrates| M
    P -->|Shutdown| Q[ShutdownHandler]
```

The system is composed of the following layers:

1. **Input Layer**: Multiple event sources feed into the unified queue:
   - **DiscordBot**: Handles bot authentication, message/interaction reception, typing indicators, and message sending.
   - **HeartbeatScheduler**: Manages recurring timers that fire heartbeat events at configured intervals.
   - **CronScheduler**: Manages cron-expression-based scheduled events.
   - **HookManager**: Fires lifecycle hook events (startup, agent_begin, agent_stop, shutdown).
2. **Event Queue**: A single in-memory FIFO queue that accepts all event types and dispatches them one at a time to the Agent Runtime.
3. **Agent Runtime**: The core processing engine that:
   - Dequeues events from the EventQueue.
   - Reads markdown config files via MarkdownConfigLoader.
   - Assembles the system prompt via SystemPromptAssembler.
   - Calls the Agent SDK `query()` with the assembled `systemPrompt` option.
   - Routes responses back to the appropriate output channel.
   - Signals the EventQueue when processing is complete.
4. **Configuration Layer**:
   - **ConfigLoader**: Reads and validates environment variables at startup.
   - **MarkdownConfigLoader**: Reads markdown files from CONFIG_DIR on each event cycle.
   - **SystemPromptAssembler**: Concatenates markdown file contents with section headers into the system prompt.
   - **BootstrapManager**: Validates and creates missing markdown files on first run.
5. **Session & Response Layer**:
   - **SessionManager**: Maintains the mapping between Discord channel IDs and Agent SDK session IDs.
   - **ResponseFormatter**: Splits long responses at safe boundaries (respecting code blocks and the 2000-char limit).
6. **Lifecycle Layer**:
   - **GatewayCore**: The main orchestrator that wires all components and manages the startup/shutdown sequence.
   - **ShutdownHandler**: Listens for SIGTERM/SIGINT, fires the shutdown hook, drains the queue, and disconnects cleanly.

## Components and Interfaces

### ConfigLoader

Responsible for reading and validating environment variables at startup.
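The validation behavior described for this component (report every missing required variable at once, split `ALLOWED_TOOLS` on commas) could be sketched roughly as follows. The helper names `findMissingRequired`, `parseAllowedTools`, and `assertRequired`, the `Env` type, and the error message wording are assumptions made for illustration; the design does not prescribe them.

```typescript
// Hypothetical helpers: names and message format are illustrative only.
type Env = Record<string, string | undefined>;

const REQUIRED_VARS = ["DISCORD_BOT_TOKEN", "ANTHROPIC_API_KEY"];

// Returns the names of every required variable that is unset or blank,
// so the caller can report them all in a single error.
function findMissingRequired(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Splits a comma-separated ALLOWED_TOOLS value into an array, trimming
// whitespace and dropping empty entries. Uses the fallback when unset.
function parseAllowedTools(raw: string | undefined, fallback: string[]): string[] {
  if (raw === undefined || raw.trim() === "") return fallback;
  return raw
    .split(",")
    .map((tool) => tool.trim())
    .filter((tool) => tool.length > 0);
}

// Throws one error naming all missing variables (assumed message format).
function assertRequired(env: Env): void {
  const missing = findMissingRequired(env);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```

Passing the environment in as a plain record, rather than reading `process.env` inside the helpers, keeps them easy to exercise with generated inputs in the property tests for Properties 1 and 2.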
```typescript
interface GatewayConfig {
  discordBotToken: string;       // DISCORD_BOT_TOKEN (required)
  anthropicApiKey: string;       // ANTHROPIC_API_KEY (required)
  allowedTools: string[];        // ALLOWED_TOOLS, default: ["Read","Write","Edit","Glob","Grep","WebSearch","WebFetch"]
  permissionMode: string;        // PERMISSION_MODE, default: "bypassPermissions"
  queryTimeoutMs: number;        // QUERY_TIMEOUT_MS, default: 120000
  maxConcurrentQueries: number;  // MAX_CONCURRENT_QUERIES, default: 5
  configDir: string;             // CONFIG_DIR, default: "./config"
  maxQueueDepth: number;         // MAX_QUEUE_DEPTH, default: 100
  outputChannelId?: string;      // OUTPUT_CHANNEL_ID, optional; default channel for heartbeat/cron output
}

declare function loadConfig(): GatewayConfig;
// Throws with a descriptive message listing missing required vars if validation fails.
```

### DiscordBot

Wraps the discord.js `Client`, registers slash commands, and exposes event handlers.

```typescript
interface DiscordBot {
  start(token: string): Promise<void>;
  // Authenticates and waits for ready state. Logs username and guild count.

  registerCommands(): Promise<void>;
  // Registers /claude and /claude-reset slash commands.

  sendMessage(channelId: string, content: string): Promise<void>;
  // Sends a message to a channel. Logs errors if Discord API rejects.

  sendTyping(channelId: string): Promise<void>;
  // Sends a typing indicator to a channel.

  destroy(): Promise<void>;
  // Disconnects the bot from Discord.

  onPrompt(handler: (prompt: Prompt) => void): void;
  // Registers a callback for incoming prompts (from mentions or /claude).

  onReset(handler: (channelId: string) => void): void;
  // Registers a callback for /claude-reset commands.
}

interface Prompt {
  text: string;
  channelId: string;
  userId: string;
  guildId: string | null;
}
```

### EventQueue

Unified in-memory FIFO queue that accepts all event types and dispatches them sequentially to the AgentRuntime.
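One way to obtain the sequential-dispatch and depth-limit behavior described here is a single "pump" loop guarded by a flag. The sketch below is a simplified illustration, not the gateway's implementation: the class name `SerialEventQueue`, the reduced `QueuedEvent` shape, and the choice to log and continue on handler errors are all assumptions.

```typescript
// Simplified event shape for this sketch (the design's Event has more fields).
interface QueuedEvent {
  id: number;
  type: string;
  timestamp: Date;
}

type Handler = (event: QueuedEvent) => Promise<void>;

// Hypothetical class name; illustrates FIFO dispatch with one event in flight.
class SerialEventQueue {
  private readonly maxDepth: number;
  private items: QueuedEvent[] = [];
  private nextId = 1;
  private processing = false;
  private handler: Handler | null = null;
  private idleWaiters: Array<() => void> = [];

  constructor(maxDepth: number) {
    this.maxDepth = maxDepth;
  }

  // Assigns id and timestamp; returns null when the queue is already full.
  enqueue(type: string): QueuedEvent | null {
    if (this.items.length >= this.maxDepth) return null;
    const event: QueuedEvent = { id: this.nextId++, type, timestamp: new Date() };
    this.items.push(event);
    void this.pump();
    return event;
  }

  size(): number {
    return this.items.length;
  }

  onEvent(handler: Handler): void {
    this.handler = handler;
    void this.pump();
  }

  // Resolves once nothing is queued and nothing is being processed.
  drain(): Promise<void> {
    if (!this.processing && this.items.length === 0) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.idleWaiters.push(() => resolve());
    });
  }

  // Processes queued events one at a time, awaiting each handler call.
  private async pump(): Promise<void> {
    const handler = this.handler;
    if (this.processing || handler === null) return;
    this.processing = true;
    while (this.items.length > 0) {
      const event = this.items.shift()!;
      try {
        await handler(event);
      } catch (err) {
        // Assumption: a failing event is logged and does not stall the queue.
        console.error(`event ${event.id} failed`, err);
      }
    }
    this.processing = false;
    for (const wake of this.idleWaiters.splice(0)) wake();
  }
}
```

Because `pump()` returns early while another pump is running, events enqueued during processing are picked up by the existing loop, so handlers never overlap.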
```typescript
interface Event {
  id: number;             // Monotonically increasing sequence number
  type: EventType;        // "message" | "heartbeat" | "cron" | "hook" | "webhook"
  payload: EventPayload;  // Type-specific payload
  timestamp: Date;        // Enqueue timestamp
  source: string;         // Source identifier (e.g., "discord", "heartbeat-scheduler", "cron-scheduler")
}

type EventType = "message" | "heartbeat" | "cron" | "hook" | "webhook";

interface MessagePayload {
  prompt: Prompt;         // The Discord prompt
}

interface HeartbeatPayload {
  instruction: string;    // The heartbeat check instruction
  checkName: string;      // Name of the heartbeat check
}

interface CronPayload {
  instruction: string;    // The cron job instruction
  jobName: string;        // Name of the cron job
}

interface HookPayload {
  hookType: HookType;     // "startup" | "agent_begin" | "agent_stop" | "shutdown"
  instruction?: string;   // Optional instruction prompt from agents.md
}

type HookType = "startup" | "agent_begin" | "agent_stop" | "shutdown";

type EventPayload = MessagePayload | HeartbeatPayload | CronPayload | HookPayload;

interface EventQueue {
  enqueue(event: Omit<Event, "id" | "timestamp">): Event | null;
  // Assigns sequence number and timestamp. Returns the event, or null if queue is full.

  dequeue(): Event | undefined;
  // Returns the next event in FIFO order, or undefined if empty.

  size(): number;
  // Returns current queue depth.

  onEvent(handler: (event: Event) => Promise<void>): void;
  // Registers the processing handler. The queue calls this for each event
  // and waits for the promise to resolve before dispatching the next.

  drain(): Promise<void>;
  // Returns a promise that resolves when the queue is empty and no event is processing.
}
```

### AgentRuntime

Core processing engine that reads markdown configs, assembles the system prompt, and calls the Agent SDK.

```typescript
interface AgentRuntime {
  processEvent(event: Event): Promise<EventResult>;
  // Main entry point. Reads markdown configs, assembles system prompt,
  // calls Agent SDK query(), and returns the result.
  // For message events: uses the prompt text and resumes/creates sessions.
  // For heartbeat/cron events: uses the instruction text as the prompt.
  // For hook events: uses the hook instruction (if any) as the prompt.
}

interface EventResult {
  responseText?: string;     // The agent's response text (if any)
  targetChannelId?: string;  // Discord channel to send the response to
  sessionId?: string;        // Session ID (for message events)
}
```

### MarkdownConfigLoader

Reads markdown configuration files from CONFIG_DIR. Files are read fresh on each call (no caching) so that edits take effect immediately.

```typescript
interface MarkdownConfigs {
  soul: string | null;      // soul.md content, null if missing
  identity: string | null;  // identity.md content, null if missing
  agents: string | null;    // agents.md content, null if missing
  user: string | null;      // user.md content, null if missing
  memory: string | null;    // memory.md content, null if missing
  tools: string | null;     // tools.md content, null if missing
}

interface MarkdownConfigLoader {
  loadAll(configDir: string): Promise<MarkdownConfigs>;
  // Reads all markdown config files. Returns null for missing files.
  // Logs warnings for missing soul.md, identity.md, agents.md, user.md, tools.md.
  // Creates memory.md with "# Memory" header if missing.

  loadFile(configDir: string, filename: string): Promise<string | null>;
  // Reads a single markdown file. Returns null if missing.
}
```

### SystemPromptAssembler

Assembles the system prompt from markdown config file contents.

```typescript
interface SystemPromptAssembler {
  assemble(configs: MarkdownConfigs): string;
  // Concatenates markdown file contents in order:
  // 1. Identity (## Identity)
  // 2. Soul (## Personality)
  // 3. Agents (## Operating Rules)
  // 4. User (## User Context)
  // 5. Memory (## Long-Term Memory)
  // 6. Tools (## Tool Configuration)
  //
  // Each section is wrapped: "## [Section Name]\n\n[content]\n\n"
  // Sections with null/empty content are omitted.
  // A preamble is prepended instructing the agent it may update memory.md.
}
```

### HeartbeatScheduler

Manages recurring heartbeat timers based on heartbeat.md configuration.

```typescript
interface HeartbeatCheck {
  name: string;             // Check name/identifier
  instruction: string;      // Instruction prompt for the agent
  intervalSeconds: number;  // Interval between checks (minimum 60)
}

interface HeartbeatScheduler {
  start(checks: HeartbeatCheck[], enqueue: (event: Omit<Event, "id" | "timestamp">) => Event | null): void;
  // Starts a recurring timer for each check. On each tick, creates a heartbeat
  // event and enqueues it. Rejects checks with interval < 60 seconds.

  stop(): void;
  // Stops all heartbeat timers.

  parseConfig(content: string): HeartbeatCheck[];
  // Parses heartbeat.md content into check definitions.
}
```

### CronScheduler

Manages cron-expression-based scheduled events parsed from agents.md.

```typescript
interface CronJob {
  name: string;         // Job name/identifier
  expression: string;   // Cron expression (e.g., "0 9 * * 1")
  instruction: string;  // Instruction prompt for the agent
}

interface CronScheduler {
  start(jobs: CronJob[], enqueue: (event: Omit<Event, "id" | "timestamp">) => Event | null): void;
  // Schedules each cron job. On each trigger, creates a cron event and enqueues it.
  // Logs warning and skips jobs with invalid cron expressions.

  stop(): void;
  // Stops all cron jobs.

  parseConfig(content: string): CronJob[];
  // Parses the "Cron Jobs" section from agents.md into job definitions.
}
```

### HookManager

Fires lifecycle hook events at appropriate points in the Gateway lifecycle.

```typescript
interface HookConfig {
  startup?: string;      // Instruction prompt for startup hook
  agent_begin?: string;  // Instruction prompt for agent_begin hook
  agent_stop?: string;   // Instruction prompt for agent_stop hook
  shutdown?: string;     // Instruction prompt for shutdown hook
}

interface HookManager {
  fire(hookType: HookType, enqueue: (event: Omit<Event, "id" | "timestamp">) => Event | null): void;
  // Creates a hook event and enqueues it.
  // agent_begin and agent_stop are processed inline (not re-enqueued).

  fireInline(hookType: HookType, runtime: AgentRuntime): Promise<void>;
  // For agent_begin/agent_stop: processes the hook inline without going through the queue.

  parseConfig(content: string): HookConfig;
  // Parses the "Hooks" section from agents.md into hook configuration.
}
```

### BootstrapManager

Handles first-run setup: validates that markdown files exist and creates missing ones with defaults.

```typescript
interface BootConfig {
  requiredFiles: string[];           // Files that must exist (default: ["soul.md", "identity.md"])
  optionalFiles: string[];           // Files created with defaults if missing
  defaults: Record<string, string>;  // Default content for each file
}

interface BootstrapManager {
  run(configDir: string): Promise<BootstrapResult>;
  // 1. Reads boot.md for bootstrap parameters (or uses built-in defaults).
  // 2. Verifies all required markdown files exist.
  // 3. Creates missing optional files with default content.
  // 4. Logs loaded files and any files created with defaults.

  parseBootConfig(content: string | null): BootConfig;
  // Parses boot.md content into bootstrap parameters.
  // Returns built-in defaults if content is null.
}

interface BootstrapResult {
  loadedFiles: string[];   // Files that existed and were loaded
  createdFiles: string[];  // Files that were created with defaults
}
```

### SessionManager

Manages the mapping between Discord channels and Agent SDK sessions.

```typescript
interface SessionManager {
  getSessionId(channelId: string): string | undefined;
  setSessionId(channelId: string, sessionId: string): void;
  removeSession(channelId: string): void;
  clear(): void;
}
```

### ResponseFormatter

Splits long response text into Discord-safe chunks.

```typescript
declare function splitMessage(text: string, maxLength?: number): string[];
// Splits text into chunks of at most maxLength (default 2000) characters.
// Preserves code block formatting: if a split occurs inside a code block,
// the chunk is closed with ``` and the next chunk reopens with ```.
// Splits prefer line boundaries over mid-line breaks.
```

### GatewayCore

The main orchestrator that wires all components together and manages the full lifecycle.

```typescript
interface GatewayCore {
  start(): Promise<void>;
  // 1. Load config (ConfigLoader)
  // 2. Run bootstrap (BootstrapManager)
  // 3. Start Discord bot (DiscordBot)
  // 4. Initialize EventQueue and AgentRuntime
  // 5. Parse heartbeat.md → start HeartbeatScheduler
  // 6. Parse agents.md → start CronScheduler, load HookConfig
  // 7. Fire startup hook
  // 8. Begin accepting events

  shutdown(): Promise<void>;
  // 1. Stop accepting new events from Discord
  // 2. Stop HeartbeatScheduler and CronScheduler
  // 3. Fire shutdown hook (enqueue and wait for processing)
  // 4. Drain EventQueue
  // 5. Disconnect DiscordBot
  // 6. Exit with code 0
}
```

### Agent SDK Integration

The gateway calls the Agent SDK `query()` function with the assembled system prompt:

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

// Only one of the following query() calls runs for a given event;
// they are shown together for comparison.

// For a message event (new session):
const stream = query({
  prompt: event.payload.prompt.text,
  options: {
    allowedTools: config.allowedTools,
    permissionMode: config.permissionMode,
    systemPrompt: assembledSystemPrompt,  // Injected via systemPrompt option
  }
});

// For a message event (resumed session):
const stream = query({
  prompt: event.payload.prompt.text,
  options: {
    resume: sessionId,
    allowedTools: config.allowedTools,
    permissionMode: config.permissionMode,
    systemPrompt: assembledSystemPrompt,
  }
});

// For a heartbeat/cron event:
const stream = query({
  prompt: event.payload.instruction,
  options: {
    allowedTools: config.allowedTools,
    permissionMode: config.permissionMode,
    systemPrompt: assembledSystemPrompt,
  }
});

// Processing the stream:
for await (const message of stream) {
  // The init message carries the session ID to bind to the channel.
  if (message.type === "system" && message.subtype === "init") {
    sessionManager.setSessionId(channelId, message.session_id);
  }
  // The result message carries the final response text.
  if ("result" in message) {
    const chunks = splitMessage(message.result);
    for (const chunk of chunks) {
      await discordBot.sendMessage(targetChannelId, chunk);
    }
  }
}
```

## Data Models

### Event

```typescript
interface Event {
  id: number;             // Monotonically increasing sequence number
  type: EventType;        // "message" | "heartbeat" | "cron" | "hook" | "webhook"
  payload: EventPayload;  // Type-specific payload
  timestamp: Date;        // Enqueue timestamp
  source: string;         // Source identifier
}
```

### Channel Binding

```typescript
// In-memory Map<string, string>
// Key: Discord channel ID
// Value: Agent SDK session ID
type ChannelBindings = Map<string, string>;
```

### Prompt

```typescript
interface Prompt {
  text: string;            // The extracted prompt text
  channelId: string;       // Discord channel ID where the prompt originated
  userId: string;          // Discord user ID of the sender
  guildId: string | null;  // Discord guild ID (null for DMs)
}
```

### Markdown Configs

```typescript
interface MarkdownConfigs {
  soul: string | null;      // soul.md content
  identity: string | null;  // identity.md content
  agents: string | null;    // agents.md content
  user: string | null;      // user.md content
  memory: string | null;    // memory.md content
  tools: string | null;     // tools.md content
}
```

### Gateway State

```typescript
interface GatewayState {
  config: GatewayConfig;
  channelBindings: ChannelBindings;
  activeQueryCount: number;  // Number of currently executing Agent SDK queries
  isShuttingDown: boolean;   // True after receiving shutdown signal
  eventQueue: EventQueue;
  nextEventId: number;       // Next sequence number for events
}
```

### Heartbeat and Cron Definitions

```typescript
interface HeartbeatCheck {
  name: string;
  instruction: string;
  intervalSeconds: number;  // Minimum 60
}

interface CronJob {
  name: string;
  expression: string;  // Standard cron expression
  instruction: string;
}
```

### Hook Configuration

```typescript
interface HookConfig {
  startup?: string;
  agent_begin?: string;
  agent_stop?: string;
  shutdown?: string;
}
```

### Bootstrap Configuration

```typescript
interface BootConfig {
  requiredFiles: string[];
  optionalFiles: string[];
  defaults: Record<string, string>;
}

interface BootstrapResult {
  loadedFiles: string[];
  createdFiles: string[];
}
```

## Correctness Properties

*A property is a characteristic or behavior that should hold true across all valid executions of a system; essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*

### Property 1: Config loading round-trip

*For any* set of valid environment variable values for DISCORD_BOT_TOKEN, ANTHROPIC_API_KEY, ALLOWED_TOOLS, PERMISSION_MODE, QUERY_TIMEOUT_MS, CONFIG_DIR, and MAX_QUEUE_DEPTH, calling `loadConfig()` should produce a `GatewayConfig` whose fields match the corresponding environment variable values, with ALLOWED_TOOLS correctly split from a comma-separated string into an array.

**Validates: Requirements 8.1, 2.3**

### Property 2: Missing required config reports all missing values

*For any* subset of the required environment variables (DISCORD_BOT_TOKEN, ANTHROPIC_API_KEY) that are unset, calling `loadConfig()` should throw an error whose message contains the names of all missing variables.

**Validates: Requirements 8.3, 1.2, 2.2**

### Property 3: Mention prompt extraction

*For any* Discord message containing a bot mention and arbitrary surrounding text, the prompt extraction function should return the message text with the mention removed and leading/trailing whitespace trimmed.

**Validates: Requirements 3.1**

### Property 4: Slash command prompt extraction

*For any* slash command interaction with a prompt option value, the prompt extraction function should return exactly the option value as the prompt text.

**Validates: Requirements 3.2**

### Property 5: Bot message filtering

*For any* Discord message where the author has the bot flag set, the gateway should not process it as a prompt. Conversely, for any message from a non-bot user that mentions the bot, the gateway should process it.

**Validates: Requirements 3.3**

### Property 6: Query arguments correctness

*For any* prompt text, gateway configuration, and assembled system prompt, the arguments passed to the Agent SDK `query()` function should include the prompt text, the configured allowed tools array, the configured permission mode, and the assembled system prompt via the `systemPrompt` option.

**Validates: Requirements 4.1, 4.4, 12.3**

### Property 7: Session resume on existing binding

*For any* channel that has a stored session ID in the channel bindings, when a new prompt is received for that channel, the `query()` call should include the `resume` option set to the stored session ID.

**Validates: Requirements 4.2, 6.2**

### Property 8: New session creation and storage

*For any* channel without an existing channel binding, when a prompt is processed and the Agent SDK returns an init message with a session_id, the session manager should store that session_id for the channel. A subsequent lookup for that channel should return the stored session_id.

**Validates: Requirements 4.3, 6.1**

### Property 9: Message splitting with code block preservation

*For any* string of arbitrary length, `splitMessage(text)` should produce chunks where: (a) every chunk is at most 2000 characters, (b) concatenating all chunks reproduces the original text (modulo inserted code block delimiters), and (c) if the original text contains a code block that spans a split boundary, each chunk is a valid markdown fragment with properly opened and closed code fences.

**Validates: Requirements 5.3, 5.4**

### Property 10: Reset removes channel binding

*For any* channel with a stored session ID, invoking the reset handler for that channel should result in the session manager returning `undefined` for that channel's session ID.

**Validates: Requirements 6.3**

### Property 11: Concurrent session isolation

*For any* two distinct channel IDs with different stored session IDs, operations on one channel (setting, getting, or removing its session) should not affect the session ID stored for the other channel.

**Validates: Requirements 6.4**

### Property 12: Error message formatting

*For any* error thrown by the Agent SDK (with a type/name and a message), the user-facing error message produced by the gateway should contain the error type/name but should not contain stack traces, API keys, or file paths from the server environment.

**Validates: Requirements 7.1**

### Property 13: Sequential per-channel queue ordering

*For any* sequence of tasks enqueued for the same channel, the tasks should execute in FIFO order, and no two tasks for the same channel should execute concurrently.

**Validates: Requirements 9.2**

### Property 14: Concurrency limit rejection

*For any* gateway state where the active query count is at or above the configured maximum, attempting to process a new prompt should be rejected with a "system busy" response rather than being forwarded to the Agent SDK.

**Validates: Requirements 9.3**

### Property 15: Event queue accepts all event types with monotonic IDs

*For any* sequence of events of mixed types (message, heartbeat, cron, hook, webhook), enqueuing them should succeed (when below max depth), and each event should be assigned a strictly increasing sequence number and a timestamp no earlier than the previous event's timestamp.

**Validates: Requirements 11.1, 11.2**

### Property 16: Event queue FIFO dispatch with sequential processing

*For any* sequence of events enqueued in the EventQueue, the processing handler should be called with events in the exact order they were enqueued, and the handler for event N+1 should not be called until the handler for event N has completed.

**Validates: Requirements 11.3, 11.4, 12.5**

### Property 17: Event queue depth rejection

*For any* EventQueue that has reached its configured maximum depth, attempting to enqueue a new event should return null (rejection) and the queue size should remain at the maximum.

**Validates: Requirements 11.5**

### Property 18: Non-message events use instruction as prompt

*For any* heartbeat or cron event with an instruction string, when the AgentRuntime processes it, the Agent SDK `query()` call should use the instruction string as the prompt text and include the assembled system prompt via the `systemPrompt` option.

**Validates: Requirements 12.4, 17.4, 18.4**

### Property 19: System prompt assembly with section headers and ordering

*For any* set of MarkdownConfigs where at least one field is non-null, the assembled system prompt should: (a) wrap each non-null config in a section with the format `## [Section Name]\n\n[content]`, (b) order sections as Identity, Personality, Operating Rules, User Context, Long-Term Memory, Tool Configuration, and (c) include a preamble instructing the agent it may update memory.md using the Write tool.

**Validates: Requirements 22.1, 22.2, 22.4, 12.2, 14.2, 15.2, 15.3, 16.2**

### Property 20: Missing or empty configs are omitted from system prompt

*For any* set of MarkdownConfigs where some fields are null or empty strings, the assembled system prompt should not contain section headers for those missing/empty configs, and the number of section headers in the output should equal the number of non-null, non-empty config fields.

**Validates: Requirements 22.3, 13.4, 14.3, 16.3**

### Property 21: System prompt assembly round-trip

*For any* valid set of MarkdownConfigs (with non-null, non-empty values), assembling the system prompt and then parsing the section headers from the output should produce the same set of section names as the non-null input config fields.

**Validates: Requirements 22.5**

### Property 22: Heartbeat config parsing

*For any* valid heartbeat.md content containing check definitions with names, instructions, and intervals, parsing the content should produce HeartbeatCheck objects whose fields match the defined values.

**Validates: Requirements 17.1**

### Property 23: Heartbeat minimum interval enforcement

*For any* heartbeat check definition with an interval less than 60 seconds, the HeartbeatScheduler should reject the definition and not start a timer for it.

**Validates: Requirements 17.6**

### Property 24: Cron job config parsing

*For any* valid agents.md content containing a "Cron Jobs" section with job definitions (name, cron expression, instruction), parsing the content should produce CronJob objects whose fields match the defined values.

**Validates: Requirements 18.1**

### Property 25: Invalid cron expression rejection

*For any* cron job definition with a syntactically invalid cron expression, the CronScheduler should skip scheduling for that job without affecting other valid jobs.

**Validates: Requirements 18.5**

### Property 26: Lifecycle hooks fire for every event

*For any* event processed by the AgentRuntime, the agent_begin hook should fire before the main processing and the agent_stop hook should fire after the main processing completes, both processed inline.

**Validates: Requirements 19.3, 19.4**

### Property 27: Bootstrap creates missing files with defaults

*For any* set of required files specified in BootConfig where some files are missing from CONFIG_DIR, the bootstrap process should create each missing file with its default content, and after bootstrap all required files should exist.

**Validates: Requirements 20.2, 20.3**

### Property 28: State reconstruction after restart

*For any* set of markdown configuration files in CONFIG_DIR, reading all configs and assembling the system prompt should produce the same result regardless of whether it's the first read or a subsequent read after a simulated restart (i.e., the markdown files are the complete source of truth with no in-memory state dependency).

**Validates: Requirements 21.3**

## Error Handling

### Startup Errors

- **Missing required config**: `loadConfig()` throws with a message listing all missing required environment variables. The process exits with code 1.
- **Invalid Discord token**: The discord.js client emits an error event. The gateway catches it, logs the error, and exits with code 1.
- **Network failures on startup**: If Discord or the API is unreachable, the gateway logs the connection error and exits with code 1.
- **Missing boot.md**: The BootstrapManager falls back to built-in defaults (require soul.md and identity.md, create missing optional files with empty headers).
- **Missing required markdown files**: The BootstrapManager creates them with default content and logs which files were created.

### Runtime Errors

- **Agent SDK query errors**: Caught per-event. For message events, the gateway formats a user-friendly message (error type only, no internals) and sends it to the Discord channel. For heartbeat/cron events, the error is logged.
- **Session corruption**: If the Agent SDK returns an error indicating the session is invalid, the gateway removes the channel binding and informs the user to retry.
- **Query timeout**: Implemented as a `Promise.race` between the query stream processing and a timeout timer. On timeout, the gateway sends a timeout notification to the channel (for message events) and aborts the stream iteration.
- **Discord API send failures**: Caught and logged with channel ID and content length. The event processing continues.
- **Concurrency limit exceeded**: New prompts are immediately rejected with a "system busy" message. No query is started.
- **Event queue overflow**: When the queue reaches max depth, new events are rejected (enqueue returns null). For message events, a "system busy" message is sent. For heartbeat/cron events, the rejection is logged.
- **Markdown file read errors**: If a config file cannot be read (permissions, I/O error), the MarkdownConfigLoader logs the error and returns null for that file. The system prompt is assembled without the failed section.
- **Invalid heartbeat interval**: Heartbeat checks with interval < 60 seconds are rejected with a warning log. Other valid checks still start.
- **Invalid cron expression**: Invalid cron jobs are skipped with a warning log. Other valid jobs still schedule.
- **Memory.md write failures**: If the agent's Write tool call to memory.md fails, the error is logged. The event still completes, but the memory update is lost.

### Shutdown Errors

- **In-flight event timeout during shutdown**: The gateway waits up to the configured timeout for the current event to complete. If it doesn't complete, it is abandoned after the timeout.
- **Shutdown hook processing failure**: If the shutdown hook event fails, the error is logged and shutdown continues.
- **Discord disconnect failure**: Logged but does not prevent process exit.

## Testing Strategy

### Property-Based Testing

Use **fast-check** as the property-based testing library for TypeScript. Each correctness property from the design document maps to a single property-based test. Tests should be configured with a minimum of 100 iterations per property.

Each test must be tagged with a comment in the format: `// Feature: discord-claude-gateway, Property {number}: {property_text}`

Key property tests organized by component:

**Config & Startup (Properties 1, 2)**:

- Generate random env var combinations and verify config loading behavior.
- Generate subsets of missing required vars and verify error messages.

**Prompt Handling (Properties 3, 4, 5)**:

- Generate random message content with mentions and verify extraction.
- Generate random slash command option values and verify extraction.
- Generate messages with random bot/non-bot authors and verify filtering.

**Agent SDK Integration (Properties 6, 7, 8, 18)**:

- Generate random prompts, configs, and system prompts; verify query arguments.
- Generate random channel/session pairs; verify resume behavior.
- Generate random heartbeat/cron instructions; verify they're used as prompts.

**Response Formatting (Property 9)**:

- Generate strings of varying lengths with and without code blocks; verify chunk sizes and formatting.

**Session Management (Properties 10, 11)**:

- Generate random channel IDs and session IDs; verify CRUD operations and isolation.

**Error Handling (Property 12)**:

- Generate random error objects; verify formatted message excludes sensitive data.

**Concurrency (Properties 13, 14)**:

- Enqueue tasks with observable side effects; verify execution order.
- Generate random active query counts; verify rejection behavior.

**Event Queue (Properties 15, 16, 17)**:

- Generate mixed-type event sequences; verify monotonic IDs and FIFO dispatch.
- Generate queue-at-capacity scenarios; verify rejection.

**System Prompt Assembly (Properties 19, 20, 21)**:

- Generate random MarkdownConfigs with various null/non-null combinations; verify section headers, ordering, omission of empty sections, and round-trip parsing.

**Config Parsing (Properties 22, 23, 24, 25)**:

- Generate random heartbeat.md content; verify parsing and interval enforcement.
- Generate random agents.md cron sections; verify parsing and invalid expression rejection.

**Lifecycle & Bootstrap (Properties 26, 27, 28)**:

- Generate event sequences; verify agent_begin/agent_stop hooks fire for each.
- Generate file sets with missing files; verify bootstrap creates them.
- Generate markdown file sets; verify state reconstruction produces consistent results.

### Unit Testing

Unit tests complement property tests by covering specific examples, edge cases, and integration points:

- **Config defaults** (Req 8.2): Verify specific default values when optional env vars are unset.
- **Startup logging** (Req 1.3): Verify log output contains bot username and guild count.
- **Typing indicator** (Req 3.4, 5.2): Verify typing indicator is sent when processing starts and maintained during streaming.
- **Timeout handling** (Req 7.2): Verify timeout notification is sent after the configured period.
- **Discord API error logging** (Req 7.3): Verify log contains channel ID and content length.
- **Session corruption recovery** (Req 7.4): Verify binding removal and user notification on session error.
- **Shutdown sequence** (Req 10.1, 10.2, 10.3): Verify the gateway stops accepting prompts, waits for in-flight queries, and disconnects.
- **Result forwarding** (Req 5.1): Verify result messages are sent to the correct channel.
- **Markdown file reading** (Req 13.1-13.3, 14.1, 15.1, 16.1): Verify each config file is read from the correct path.
- **Hot-reload behavior** (Req 13.5, 14.4, 16.4): Verify modified files are picked up on next event cycle.
- **Memory.md auto-creation** (Req 15.5): Verify memory.md is created with "# Memory" header when missing.
- **Heartbeat timer startup** (Req 17.2, 17.3): Verify timers start and fire events.
- **Missing heartbeat.md** (Req 17.5): Verify gateway operates without heartbeat events.
- **Cron job scheduling** (Req 18.2, 18.3): Verify cron jobs schedule and fire events.
- **Hook types** (Req 19.1): Verify all four hook types are supported.
- **Startup hook** (Req 19.2): Verify startup hook fires after initialization.
- **Shutdown hook** (Req 19.5): Verify shutdown hook fires and is processed before exit.
- **Hook config parsing** (Req 19.6): Verify hooks section is parsed from agents.md.
- **Boot.md reading** (Req 20.1): Verify boot config is read.
- **Bootstrap logging** (Req 20.4): Verify log lists loaded and created files.
- **Missing boot.md defaults** (Req 20.5): Verify built-in defaults are used.
- **Memory write ordering** (Req 21.2): Verify memory changes are written before event completion signal.

### Test Organization

```
tests/
  unit/
    config-loader.test.ts
    discord-bot.test.ts
    session-manager.test.ts
    response-formatter.test.ts
    gateway-core.test.ts
    event-queue.test.ts
    agent-runtime.test.ts
    markdown-config-loader.test.ts
    system-prompt-assembler.test.ts
    heartbeat-scheduler.test.ts
    cron-scheduler.test.ts
    hook-manager.test.ts
    bootstrap-manager.test.ts
  property/
    config.property.test.ts
    prompt-extraction.property.test.ts
    message-splitting.property.test.ts
    session-manager.property.test.ts
    error-formatting.property.test.ts
    channel-queue.property.test.ts
    concurrency.property.test.ts
    event-queue.property.test.ts
    system-prompt.property.test.ts
    heartbeat-config.property.test.ts
    cron-config.property.test.ts
    lifecycle-hooks.property.test.ts
    bootstrap.property.test.ts
    state-reconstruction.property.test.ts
```

### Testing Tools

- **vitest** as the test runner
- **fast-check** for property-based testing
- The discord.js client and the Agent SDK `query()` function should be mocked in unit tests using vitest mocks
- File system operations should be mocked using `memfs` or vitest's `vi.mock` for markdown config tests
- **node-cron** should be mocked for cron scheduler tests to avoid real timer dependencies
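
As an illustration of how one of the event queue properties might be written, the sketch below checks monotonic IDs, FIFO dispatch, and rejection at capacity against a toy queue. `SketchEventQueue`, `GatewayEvent`, and `checkQueueProperty` are hypothetical names introduced here, not the gateway's actual classes, and the toy queue only assumes the behavior stated under Error Handling (enqueue returns null at max depth). The property body is a plain function so it runs without dependencies; the trailing comment shows roughly how it could be handed to fast-check.

```typescript
// Sketch only: all names below are hypothetical stand-ins for the real
// EventQueue; only the "enqueue returns null at max depth" rule is from the doc.
type EventType = "message" | "heartbeat" | "cron" | "hook";

interface GatewayEvent {
  id: number;
  type: EventType;
}

class SketchEventQueue {
  private items: GatewayEvent[] = [];
  private nextId = 1;
  private maxDepth: number;

  constructor(maxDepth: number) {
    this.maxDepth = maxDepth;
  }

  // Returns the queued event, or null when the queue is at max depth.
  enqueue(type: EventType): GatewayEvent | null {
    if (this.items.length >= this.maxDepth) return null;
    const event: GatewayEvent = { id: this.nextId++, type };
    this.items.push(event);
    return event;
  }

  // Removes and returns the oldest event, or undefined when empty.
  dequeue(): GatewayEvent | undefined {
    return this.items.shift();
  }
}

// Property body: for any sequence of event types and any max depth,
// events past capacity are rejected, accepted IDs strictly increase,
// and dequeue order matches enqueue order.
function checkQueueProperty(types: EventType[], maxDepth: number): boolean {
  const queue = new SketchEventQueue(maxDepth);
  const accepted: GatewayEvent[] = [];
  let previousId = 0;

  for (const type of types) {
    const shouldReject = accepted.length >= maxDepth;
    const result = queue.enqueue(type);
    if ((result === null) !== shouldReject) return false;
    if (result !== null) {
      if (result.id <= previousId) return false; // IDs must be monotonic
      previousId = result.id;
      accepted.push(result);
    }
  }

  for (const expected of accepted) {
    const actual = queue.dequeue();
    if (actual === undefined) return false;
    if (actual.id !== expected.id || actual.type !== expected.type) return false;
  }
  return queue.dequeue() === undefined; // nothing left after draining
}

// With fast-check installed, the same body could be wired up roughly as:
//
//   // Feature: discord-claude-gateway, Property {number}: {property_text}
//   fc.assert(
//     fc.property(
//       fc.array(fc.constantFrom("message", "heartbeat", "cron", "hook")),
//       fc.integer({ min: 1, max: 50 }),
//       (types, maxDepth) => checkQueueProperty(types, maxDepth),
//     ),
//     { numRuns: 100 },
//   );
```

Keeping the property body separate from the fast-check wiring is one possible design choice: the same function can be reused in a unit test with hand-picked inputs, and `numRuns: 100` matches the minimum iteration count stated above.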