Initial commit: Discord-Claude Gateway with event-driven agent runtime

2026-02-22 00:31:25 -05:00
commit 77d7c74909
58 changed files with 11772 additions and 0 deletions

@@ -0,0 +1,908 @@
# Design Document: Discord-Claude Gateway
## Overview
The Discord-Claude Gateway is an event-driven agent runtime, written in TypeScript, that bridges Discord with the Claude Agent SDK. Inspired by the OpenClaw architecture, it goes beyond a simple chat bridge: it is a long-running process that accepts inputs from multiple sources (Discord messages, heartbeat timers, cron jobs, lifecycle hooks), routes them through a unified event queue, and dispatches them one at a time to an AI agent runtime for processing.
The agent's personality, identity, user context, long-term memory, tool configuration, and operating rules are all defined in local markdown files. The runtime reads these files fresh on each event processing cycle, assembles a dynamic system prompt, and passes it to the Agent SDK's `query()` function via the `systemPrompt` option. The agent can write back to `memory.md` using the Write tool, completing a read-process-write loop that persists state across sessions.
### Key Design Decisions
- **discord.js** for the Discord bot client — the most mature and widely-used Discord library for Node.js/TypeScript.
- **Unified event queue** — all inputs (messages, heartbeats, crons, hooks) enter a single in-memory FIFO queue, ensuring consistent ordering and preventing race conditions.
- **Markdown files as single source of truth** — all agent state and configuration lives in markdown files on disk. No database, no external state store. The runtime reconstructs full context by reading CONFIG_DIR on startup.
- **Fresh reads per event** — markdown config files are read from disk on each event processing cycle, so edits take effect immediately without restarts.
- **Per-channel session binding** — each Discord channel maps to at most one Agent SDK session, enabling conversational continuity.
- **Sequential event processing** — the event queue processes one event at a time to avoid concurrent writes to markdown files and ensure deterministic behavior.
- **Environment-variable-driven configuration** — all deployment settings are read from environment variables with sensible defaults.
- **node-cron** for cron scheduling — lightweight, well-maintained cron expression parser for Node.js.
- **Agent SDK `systemPrompt` option** — the assembled markdown content is injected via the `systemPrompt` option in the `query()` function call.
## Architecture
```mermaid
graph TD
A[Discord Users] -->|Messages / Slash Commands| B[DiscordBot]
B -->|message Event| C[EventQueue]
D[HeartbeatScheduler] -->|heartbeat Event| C
E[CronScheduler] -->|cron Event| C
F[HookManager] -->|hook Event| C
C -->|Dequeue FIFO| G[AgentRuntime]
G -->|Read configs| H[MarkdownConfigLoader]
H -->|soul.md, identity.md, etc.| I[CONFIG_DIR]
G -->|Assemble prompt| J[SystemPromptAssembler]
J -->|systemPrompt option| K[Agent SDK query]
K -->|Response Stream| L[ResponseFormatter]
L -->|Split & send| B
G -->|Agent writes memory.md| I
G -->|Signal complete| C
M[BootstrapManager] -->|Validate/create files| I
N[ConfigLoader] -->|Env vars| G
O[SessionManager] -->|Channel Bindings| G
P[GatewayCore] -->|Orchestrates| B
P -->|Orchestrates| C
P -->|Orchestrates| D
P -->|Orchestrates| E
P -->|Orchestrates| F
P -->|Orchestrates| M
P -->|Shutdown| Q[ShutdownHandler]
```
The system is composed of the following layers:
1. **Input Layer** — Multiple event sources feed into the unified queue:
- **DiscordBot**: Handles bot authentication, message/interaction reception, typing indicators, and message sending.
- **HeartbeatScheduler**: Manages recurring timers that fire heartbeat events at configured intervals.
- **CronScheduler**: Manages cron-expression-based scheduled events.
- **HookManager**: Fires lifecycle hook events (startup, agent_begin, agent_stop, shutdown).
2. **Event Queue** — A single in-memory FIFO queue that accepts all event types and dispatches them one at a time to the Agent Runtime.
3. **Agent Runtime** — The core processing engine that:
- Dequeues events from the EventQueue.
- Reads markdown config files via MarkdownConfigLoader.
- Assembles the system prompt via SystemPromptAssembler.
- Calls the Agent SDK `query()` with the assembled `systemPrompt` option.
- Routes responses back to the appropriate output channel.
- Signals the EventQueue when processing is complete.
4. **Configuration Layer**:
- **ConfigLoader**: Reads and validates environment variables at startup.
- **MarkdownConfigLoader**: Reads markdown files from CONFIG_DIR on each event cycle.
- **SystemPromptAssembler**: Concatenates markdown file contents with section headers into the system prompt.
- **BootstrapManager**: Validates and creates missing markdown files on first run.
5. **Session & Response Layer**:
- **SessionManager**: Maintains the mapping between Discord channel IDs and Agent SDK session IDs.
- **ResponseFormatter**: Splits long responses at safe boundaries, keeping code blocks intact and each chunk within Discord's 2000-character message limit.
6. **Lifecycle Layer**:
- **GatewayCore**: The main orchestrator that wires all components and manages the startup/shutdown sequence.
- **ShutdownHandler**: Listens for SIGTERM/SIGINT, fires shutdown hook, drains the queue, and disconnects cleanly.
## Components and Interfaces
### ConfigLoader
Responsible for reading and validating environment variables at startup.
```typescript
interface GatewayConfig {
discordBotToken: string; // DISCORD_BOT_TOKEN (required)
anthropicApiKey: string; // ANTHROPIC_API_KEY (required)
allowedTools: string[]; // ALLOWED_TOOLS, default: ["Read","Write","Edit","Glob","Grep","WebSearch","WebFetch"]
permissionMode: string; // PERMISSION_MODE, default: "bypassPermissions"
queryTimeoutMs: number; // QUERY_TIMEOUT_MS, default: 120000
maxConcurrentQueries: number; // MAX_CONCURRENT_QUERIES, default: 5
configDir: string; // CONFIG_DIR, default: "./config"
maxQueueDepth: number; // MAX_QUEUE_DEPTH, default: 100
outputChannelId?: string; // OUTPUT_CHANNEL_ID, optional — default channel for heartbeat/cron output
}
function loadConfig(): GatewayConfig;
// Throws with descriptive message listing missing required vars if validation fails.
```
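The validation and defaulting rules above can be sketched as a pure function over an environment map. This is a minimal sketch, not the gateway's actual implementation: `parseConfig`, `Env`, and `intOrDefault` are hypothetical names, and falling back to the default when a numeric variable is unset or invalid is an assumption. The document's `loadConfig()` would presumably call something like this with `process.env`.

```typescript
interface GatewayConfig {
  discordBotToken: string;
  anthropicApiKey: string;
  allowedTools: string[];
  permissionMode: string;
  queryTimeoutMs: number;
  maxConcurrentQueries: number;
  configDir: string;
  maxQueueDepth: number;
  outputChannelId?: string;
}

// Hypothetical alias for an environment map such as process.env.
type Env = Record<string, string | undefined>;

const DEFAULT_TOOLS = ["Read", "Write", "Edit", "Glob", "Grep", "WebSearch", "WebFetch"];

// Parses a positive integer; falls back when unset or invalid (an assumption).
function intOrDefault(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Hypothetical helper holding the logic behind loadConfig().
function parseConfig(env: Env): GatewayConfig {
  const required = ["DISCORD_BOT_TOKEN", "ANTHROPIC_API_KEY"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    // Report every missing variable at once, not just the first one.
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const tools = env.ALLOWED_TOOLS
    ? env.ALLOWED_TOOLS.split(",").map((t) => t.trim()).filter((t) => t.length > 0)
    : DEFAULT_TOOLS;
  return {
    discordBotToken: env.DISCORD_BOT_TOKEN as string,
    anthropicApiKey: env.ANTHROPIC_API_KEY as string,
    allowedTools: tools,
    permissionMode: env.PERMISSION_MODE || "bypassPermissions",
    queryTimeoutMs: intOrDefault(env.QUERY_TIMEOUT_MS, 120000),
    maxConcurrentQueries: intOrDefault(env.MAX_CONCURRENT_QUERIES, 5),
    configDir: env.CONFIG_DIR || "./config",
    maxQueueDepth: intOrDefault(env.MAX_QUEUE_DEPTH, 100),
    outputChannelId: env.OUTPUT_CHANNEL_ID || undefined,
  };
}
```

Keeping the parsing separate from `process.env` makes the function easy to exercise with generated inputs.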
### DiscordBot
Wraps the discord.js `Client`, registers slash commands, and exposes event handlers.
```typescript
interface DiscordBot {
start(token: string): Promise<void>;
// Authenticates and waits for ready state. Logs username and guild count.
registerCommands(): Promise<void>;
// Registers /claude and /claude-reset slash commands.
sendMessage(channelId: string, content: string): Promise<void>;
// Sends a message to a channel. Logs errors if Discord API rejects.
sendTyping(channelId: string): Promise<void>;
// Sends a typing indicator to a channel.
destroy(): Promise<void>;
// Disconnects the bot from Discord.
onPrompt(handler: (prompt: Prompt) => void): void;
// Registers a callback for incoming prompts (from mentions or /claude).
onReset(handler: (channelId: string) => void): void;
// Registers a callback for /claude-reset commands.
}
interface Prompt {
text: string;
channelId: string;
userId: string;
guildId: string | null;
}
```
### EventQueue
Unified in-memory FIFO queue that accepts all event types and dispatches them sequentially to the AgentRuntime.
```typescript
interface Event {
id: number; // Monotonically increasing sequence number
type: EventType; // "message" | "heartbeat" | "cron" | "hook" | "webhook"
payload: EventPayload; // Type-specific payload
timestamp: Date; // Enqueue timestamp
source: string; // Source identifier (e.g., "discord", "heartbeat-scheduler", "cron-scheduler")
}
type EventType = "message" | "heartbeat" | "cron" | "hook" | "webhook";
interface MessagePayload {
prompt: Prompt; // The Discord prompt
}
interface HeartbeatPayload {
instruction: string; // The heartbeat check instruction
checkName: string; // Name of the heartbeat check
}
interface CronPayload {
instruction: string; // The cron job instruction
jobName: string; // Name of the cron job
}
interface HookPayload {
hookType: HookType; // "startup" | "agent_begin" | "agent_stop" | "shutdown"
instruction?: string; // Optional instruction prompt from agents.md
}
type HookType = "startup" | "agent_begin" | "agent_stop" | "shutdown";
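type EventPayload = MessagePayload | HeartbeatPayload | CronPayload | HookPayload;
// Note: "webhook" is listed in EventType, but this document does not define a webhook payload type.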
interface EventQueue {
enqueue(event: Omit<Event, "id" | "timestamp">): Event | null;
// Assigns sequence number and timestamp. Returns the event, or null if queue is full.
dequeue(): Event | undefined;
// Returns the next event in FIFO order, or undefined if empty.
size(): number;
// Returns current queue depth.
onEvent(handler: (event: Event) => Promise<void>): void;
// Registers the processing handler. The queue calls this for each event
// and waits for the promise to resolve before dispatching the next.
drain(): Promise<void>;
// Returns a promise that resolves when the queue is empty and no event is processing.
}
```
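One way the queue contract above might be realized is sketched below. It is illustrative only: `InMemoryEventQueue`, `QueuedEvent`, and `NewEvent` are hypothetical names (`QueuedEvent` stands in for the document's `Event`, renamed to avoid clashing with the global DOM `Event` type in a standalone file), and logging handler errors before moving on to the next event is an assumption.

```typescript
interface QueuedEvent {
  id: number;
  type: string;
  payload: unknown;
  timestamp: Date;
  source: string;
}
type NewEvent = Omit<QueuedEvent, "id" | "timestamp">;
type EventHandler = (ev: QueuedEvent) => Promise<void>;

class InMemoryEventQueue {
  private items: QueuedEvent[] = [];
  private nextId = 1;
  private processing = false;
  private handler: EventHandler | null = null;
  private idleWaiters: Array<() => void> = [];
  private maxDepth: number;

  constructor(maxDepth: number) {
    this.maxDepth = maxDepth;
  }

  enqueue(input: NewEvent): QueuedEvent | null {
    if (this.items.length >= this.maxDepth) return null; // queue full: reject
    const ev: QueuedEvent = { ...input, id: this.nextId++, timestamp: new Date() };
    this.items.push(ev);
    void this.pump();
    return ev;
  }

  dequeue(): QueuedEvent | undefined {
    return this.items.shift();
  }

  size(): number {
    return this.items.length;
  }

  onEvent(handler: EventHandler): void {
    this.handler = handler;
    void this.pump();
  }

  drain(): Promise<void> {
    if (!this.processing && this.items.length === 0) return Promise.resolve();
    return new Promise<void>((resolve) => this.idleWaiters.push(resolve));
  }

  // Dispatches events one at a time; a second call while busy is a no-op.
  private async pump(): Promise<void> {
    const handler = this.handler;
    if (this.processing || handler === null) return;
    this.processing = true;
    try {
      let next = this.dequeue();
      while (next !== undefined) {
        try {
          await handler(next);
        } catch (err) {
          console.error(`event ${next.id} failed`, err); // assumption: log and continue
        }
        next = this.dequeue();
      }
    } finally {
      this.processing = false;
      for (const resolve of this.idleWaiters.splice(0)) resolve();
    }
  }
}
```

The `processing` flag is what enforces the one-at-a-time guarantee: events that arrive while a handler is running are only appended, and the running loop picks them up in order.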
### AgentRuntime
Core processing engine that reads markdown configs, assembles the system prompt, and calls the Agent SDK.
```typescript
interface AgentRuntime {
processEvent(event: Event): Promise<EventResult>;
// Main entry point. Reads markdown configs, assembles system prompt,
// calls Agent SDK query(), and returns the result.
// For message events: uses the prompt text and resumes/creates sessions.
// For heartbeat/cron events: uses the instruction text as the prompt.
// For hook events: uses the hook instruction (if any) as the prompt.
}
interface EventResult {
responseText?: string; // The agent's response text (if any)
targetChannelId?: string; // Discord channel to send the response to
sessionId?: string; // Session ID (for message events)
}
```
### MarkdownConfigLoader
Reads markdown configuration files from CONFIG_DIR. Files are read fresh on each call (no caching) so that edits take effect immediately.
```typescript
interface MarkdownConfigs {
soul: string | null; // soul.md content, null if missing
identity: string | null; // identity.md content, null if missing
agents: string | null; // agents.md content, null if missing
user: string | null; // user.md content, null if missing
memory: string | null; // memory.md content, null if missing
tools: string | null; // tools.md content, null if missing
}
interface MarkdownConfigLoader {
loadAll(configDir: string): Promise<MarkdownConfigs>;
// Reads all markdown config files. Returns null for missing files.
// Logs warnings for missing soul.md, identity.md, agents.md, user.md, tools.md.
// Creates memory.md with "# Memory" header if missing.
loadFile(configDir: string, filename: string): Promise<string | null>;
// Reads a single markdown file. Returns null if missing.
}
```
### SystemPromptAssembler
Assembles the system prompt from markdown config file contents.
```typescript
interface SystemPromptAssembler {
assemble(configs: MarkdownConfigs): string;
// Concatenates markdown file contents in order:
// 1. Identity (## Identity)
// 2. Soul (## Personality)
// 3. Agents (## Operating Rules)
// 4. User (## User Context)
// 5. Memory (## Long-Term Memory)
// 6. Tools (## Tool Configuration)
//
// Each section is wrapped: "## [Section Name]\n\n[content]\n\n"
// Sections with null/empty content are omitted.
// A preamble is prepended instructing the agent it may update memory.md.
}
```
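The assembly rules in the comment above could look roughly like this. It is a sketch under stated assumptions: the preamble wording is invented for illustration, and treating whitespace-only content as empty is an assumption the document does not spell out.

```typescript
interface MarkdownConfigs {
  soul: string | null;
  identity: string | null;
  agents: string | null;
  user: string | null;
  memory: string | null;
  tools: string | null;
}

// Section order and titles as listed in the interface comment.
const SECTIONS: Array<[keyof MarkdownConfigs, string]> = [
  ["identity", "Identity"],
  ["soul", "Personality"],
  ["agents", "Operating Rules"],
  ["user", "User Context"],
  ["memory", "Long-Term Memory"],
  ["tools", "Tool Configuration"],
];

// Hypothetical preamble text; the real wording is not given in this document.
const PREAMBLE =
  "You may update memory.md using the Write tool to persist information across sessions.\n\n";

function assemble(configs: MarkdownConfigs): string {
  let prompt = PREAMBLE;
  for (const [key, title] of SECTIONS) {
    const content = configs[key];
    // Omit sections whose file is missing or empty (whitespace-only counts as empty here).
    if (content === null || content.trim() === "") continue;
    prompt += `## ${title}\n\n${content}\n\n`;
  }
  return prompt;
}
```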
### HeartbeatScheduler
Manages recurring heartbeat timers based on heartbeat.md configuration.
```typescript
interface HeartbeatCheck {
name: string; // Check name/identifier
instruction: string; // Instruction prompt for the agent
intervalSeconds: number; // Interval between checks (minimum 60)
}
interface HeartbeatScheduler {
start(checks: HeartbeatCheck[], enqueue: (event: Omit<Event, "id" | "timestamp">) => Event | null): void;
// Starts a recurring timer for each check. On each tick, creates a heartbeat
// event and enqueues it. Rejects checks with interval < 60 seconds.
stop(): void;
// Stops all heartbeat timers.
parseConfig(content: string): HeartbeatCheck[];
// Parses heartbeat.md content into check definitions.
}
```
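A minimal sketch of the timer setup and the 60-second minimum follows. `startHeartbeats`, `HeartbeatEvent`, and the returned `{ started, stop }` shape are hypothetical and differ from the `HeartbeatScheduler` interface above; they only illustrate the reject-then-schedule logic.

```typescript
interface HeartbeatCheck {
  name: string;
  instruction: string;
  intervalSeconds: number;
}

// Hypothetical event shape, matching the heartbeat payload fields used in this document.
interface HeartbeatEvent {
  type: "heartbeat";
  source: string;
  payload: { instruction: string; checkName: string };
}

const MIN_INTERVAL_SECONDS = 60;

function startHeartbeats(
  checks: HeartbeatCheck[],
  enqueue: (ev: HeartbeatEvent) => unknown,
): { started: string[]; stop: () => void } {
  const timers: Array<ReturnType<typeof setInterval>> = [];
  const started: string[] = [];
  for (const check of checks) {
    // The negated comparison also rejects NaN intervals.
    if (!(check.intervalSeconds >= MIN_INTERVAL_SECONDS)) {
      console.warn(`heartbeat "${check.name}" rejected: interval below ${MIN_INTERVAL_SECONDS}s`);
      continue; // other valid checks still start
    }
    const timer = setInterval(() => {
      enqueue({
        type: "heartbeat",
        source: "heartbeat-scheduler",
        payload: { instruction: check.instruction, checkName: check.name },
      });
    }, check.intervalSeconds * 1000);
    timers.push(timer);
    started.push(check.name);
  }
  return {
    started,
    stop: () => {
      for (const timer of timers) clearInterval(timer);
      timers.length = 0;
    },
  };
}
```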
### CronScheduler
Manages cron-expression-based scheduled events parsed from agents.md.
```typescript
interface CronJob {
name: string; // Job name/identifier
expression: string; // Cron expression (e.g., "0 9 * * 1")
instruction: string; // Instruction prompt for the agent
}
interface CronScheduler {
start(jobs: CronJob[], enqueue: (event: Omit<Event, "id" | "timestamp">) => Event | null): void;
// Schedules each cron job. On each trigger, creates a cron event and enqueues it.
// Logs warning and skips jobs with invalid cron expressions.
stop(): void;
// Stops all cron jobs.
parseConfig(content: string): CronJob[];
// Parses the "Cron Jobs" section from agents.md into job definitions.
}
```
### HookManager
Fires lifecycle hook events at appropriate points in the Gateway lifecycle.
```typescript
interface HookConfig {
startup?: string; // Instruction prompt for startup hook
agent_begin?: string; // Instruction prompt for agent_begin hook
agent_stop?: string; // Instruction prompt for agent_stop hook
shutdown?: string; // Instruction prompt for shutdown hook
}
interface HookManager {
fire(hookType: HookType, enqueue: (event: Omit<Event, "id" | "timestamp">) => Event | null): void;
// Creates a hook event and enqueues it. Intended for the startup and shutdown hooks;
// agent_begin and agent_stop are processed inline via fireInline() instead of being enqueued.
fireInline(hookType: HookType, runtime: AgentRuntime): Promise<void>;
// For agent_begin/agent_stop: processes the hook inline without going through the queue.
parseConfig(content: string): HookConfig;
// Parses the "Hooks" section from agents.md into hook configuration.
}
```
### BootstrapManager
Handles first-run setup: validates markdown files exist, creates missing ones with defaults.
```typescript
interface BootConfig {
requiredFiles: string[]; // Files that must exist (default: ["soul.md", "identity.md"])
optionalFiles: string[]; // Files created with defaults if missing
defaults: Record<string, string>; // Default content for each file
}
interface BootstrapManager {
run(configDir: string): Promise<BootstrapResult>;
// 1. Reads boot.md for bootstrap parameters (or uses built-in defaults).
// 2. Verifies all required markdown files exist.
// 3. Creates missing optional files with default content.
// 4. Logs loaded files and any files created with defaults.
parseBootConfig(content: string | null): BootConfig;
// Parses boot.md content into bootstrap parameters.
// Returns built-in defaults if content is null.
}
interface BootstrapResult {
loadedFiles: string[]; // Files that existed and were loaded
createdFiles: string[]; // Files that were created with defaults
}
```
### SessionManager
Manages the mapping between Discord channels and Agent SDK sessions.
```typescript
interface SessionManager {
getSessionId(channelId: string): string | undefined;
setSessionId(channelId: string, sessionId: string): void;
removeSession(channelId: string): void;
clear(): void;
}
```
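Since channel bindings are described later as an in-memory `Map<string, string>`, the interface can be backed by a thin wrapper. The class name `InMemorySessionManager` is hypothetical; this is a sketch rather than the gateway's actual code.

```typescript
class InMemorySessionManager {
  // Key: Discord channel ID. Value: Agent SDK session ID.
  private bindings = new Map<string, string>();

  getSessionId(channelId: string): string | undefined {
    return this.bindings.get(channelId);
  }

  setSessionId(channelId: string, sessionId: string): void {
    this.bindings.set(channelId, sessionId);
  }

  removeSession(channelId: string): void {
    this.bindings.delete(channelId);
  }

  clear(): void {
    this.bindings.clear();
  }
}
```

Because the map is keyed by channel ID, operations on one channel cannot touch another channel's binding. The bindings live only in memory, so they do not survive a restart.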
### ResponseFormatter
Splits long response text into Discord-safe chunks.
```typescript
function splitMessage(text: string, maxLength?: number): string[];
// Splits text into chunks of at most maxLength (default 2000) characters.
// Preserves code block formatting: if a split occurs inside a code block,
// the chunk is closed with ``` and the next chunk reopens with ```.
// Splits prefer line boundaries over mid-line breaks.
```
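The splitting behavior described above might be sketched as follows. Several simplifications are assumptions rather than the author's design: a fence is any line that starts with three backticks after optional whitespace, a reopened block uses a bare fence without the original language tag, and a single line longer than the limit is cut mid-line as a last resort.

```typescript
const FENCE = "`".repeat(3); // three backticks

function splitMessage(text: string, maxLength = 2000): string[] {
  const overhead = 2 * (FENCE.length + 1); // room for a reopened fence and a closing fence
  if (maxLength <= overhead) throw new RangeError(`maxLength must exceed ${overhead}`);
  if (text.length === 0) return [];
  const pieceLimit = maxLength - overhead;

  const chunks: string[] = [];
  let lines: string[] = []; // lines of the chunk being built
  let used = 0;             // length of lines.join("\n")
  let inFence = false;      // true while the current position is inside a code block

  // Emits the current chunk, closing an open code block and reopening it in the next chunk.
  const flush = (): void => {
    if (lines.length === 0) return;
    chunks.push(lines.join("\n") + (inFence ? "\n" + FENCE : ""));
    lines = inFence ? [FENCE] : [];
    used = inFence ? FENCE.length : 0;
  };

  const add = (line: string, isLineStart: boolean): void => {
    const toggles = isLineStart && line.trimStart().startsWith(FENCE);
    const fenceAfter = toggles ? !inFence : inFence;
    const reserve = fenceAfter ? FENCE.length + 1 : 0; // keep room to close the block later
    if (lines.length > 0 && used + 1 + line.length + reserve > maxLength) flush();
    used += (lines.length > 0 ? 1 : 0) + line.length;
    lines.push(line);
    inFence = fenceAfter;
  };

  for (const line of text.split("\n")) {
    if (line.length <= pieceLimit) {
      add(line, true); // preferred case: break only at line boundaries
      continue;
    }
    // Last resort: cut an over-long line; every piece except the last ends its chunk.
    for (let i = 0; i < line.length; i += pieceLimit) {
      add(line.slice(i, i + pieceLimit), i === 0);
      if (i + pieceLimit < line.length) flush();
    }
  }
  flush();
  return chunks;
}
```

Reserving space for the closing fence before adding each line is what keeps every emitted chunk within `maxLength` even when a code block has to be closed at the split point.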
### GatewayCore
The main orchestrator that wires all components together and manages the full lifecycle.
```typescript
interface GatewayCore {
start(): Promise<void>;
// 1. Load config (ConfigLoader)
// 2. Run bootstrap (BootstrapManager)
// 3. Start Discord bot (DiscordBot)
// 4. Initialize EventQueue and AgentRuntime
// 5. Parse heartbeat.md → start HeartbeatScheduler
// 6. Parse agents.md → start CronScheduler, load HookConfig
// 7. Fire startup hook
// 8. Begin accepting events
shutdown(): Promise<void>;
// 1. Stop accepting new events from Discord
// 2. Stop HeartbeatScheduler and CronScheduler
// 3. Fire shutdown hook (enqueue and wait for processing)
// 4. Drain EventQueue
// 5. Disconnect DiscordBot
// 6. Exit with code 0
}
```
### Agent SDK Integration
The gateway calls the Agent SDK `query()` function with the assembled system prompt:
```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";
// The three query() calls below are alternatives; exactly one is made per event.
// For a message event (new session):
const stream = query({
prompt: event.payload.prompt.text,
options: {
allowedTools: config.allowedTools,
permissionMode: config.permissionMode,
systemPrompt: assembledSystemPrompt, // Injected via systemPrompt option
}
});
// For a message event (resumed session):
const stream = query({
prompt: event.payload.prompt.text,
options: {
resume: sessionId,
allowedTools: config.allowedTools,
permissionMode: config.permissionMode,
systemPrompt: assembledSystemPrompt,
}
});
// For a heartbeat/cron event:
const stream = query({
prompt: event.payload.instruction,
options: {
allowedTools: config.allowedTools,
permissionMode: config.permissionMode,
systemPrompt: assembledSystemPrompt,
}
});
// Processing the stream:
for await (const message of stream) {
if (message.type === "system" && message.subtype === "init") {
sessionManager.setSessionId(channelId, message.session_id);
}
if ("result" in message) {
const chunks = splitMessage(message.result);
for (const chunk of chunks) {
await discordBot.sendMessage(targetChannelId, chunk);
}
}
}
```
## Data Models
### Event
```typescript
interface Event {
id: number; // Monotonically increasing sequence number
type: EventType; // "message" | "heartbeat" | "cron" | "hook" | "webhook"
payload: EventPayload; // Type-specific payload
timestamp: Date; // Enqueue timestamp
source: string; // Source identifier
}
```
### Channel Binding
```typescript
// In-memory Map<string, string>
// Key: Discord channel ID
// Value: Agent SDK session ID
type ChannelBindings = Map<string, string>;
```
### Prompt
```typescript
interface Prompt {
text: string; // The extracted prompt text
channelId: string; // Discord channel ID where the prompt originated
userId: string; // Discord user ID of the sender
guildId: string | null; // Discord guild ID (null for DMs)
}
```
### Markdown Configs
```typescript
interface MarkdownConfigs {
soul: string | null; // soul.md content
identity: string | null; // identity.md content
agents: string | null; // agents.md content
user: string | null; // user.md content
memory: string | null; // memory.md content
tools: string | null; // tools.md content
}
```
### Gateway State
```typescript
interface GatewayState {
config: GatewayConfig;
channelBindings: ChannelBindings;
activeQueryCount: number; // Number of currently executing Agent SDK queries
isShuttingDown: boolean; // True after receiving shutdown signal
eventQueue: EventQueue;
nextEventId: number; // Next sequence number for events
}
```
### Heartbeat and Cron Definitions
```typescript
interface HeartbeatCheck {
name: string;
instruction: string;
intervalSeconds: number; // Minimum 60
}
interface CronJob {
name: string;
expression: string; // Standard cron expression
instruction: string;
}
```
### Hook Configuration
```typescript
interface HookConfig {
startup?: string;
agent_begin?: string;
agent_stop?: string;
shutdown?: string;
}
```
### Bootstrap Configuration
```typescript
interface BootConfig {
requiredFiles: string[];
optionalFiles: string[];
defaults: Record<string, string>;
}
interface BootstrapResult {
loadedFiles: string[];
createdFiles: string[];
}
```
## Correctness Properties
*A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*
### Property 1: Config loading round-trip
*For any* set of valid environment variable values for DISCORD_BOT_TOKEN, ANTHROPIC_API_KEY, ALLOWED_TOOLS, PERMISSION_MODE, QUERY_TIMEOUT_MS, CONFIG_DIR, and MAX_QUEUE_DEPTH, calling `loadConfig()` should produce a `GatewayConfig` whose fields match the corresponding environment variable values, with ALLOWED_TOOLS correctly split from a comma-separated string into an array.
**Validates: Requirements 8.1, 2.3**
### Property 2: Missing required config reports all missing values
*For any* subset of the required environment variables (DISCORD_BOT_TOKEN, ANTHROPIC_API_KEY) that are unset, calling `loadConfig()` should throw an error whose message contains the names of all missing variables.
**Validates: Requirements 8.3, 1.2, 2.2**
### Property 3: Mention prompt extraction
*For any* Discord message containing a bot mention and arbitrary surrounding text, the prompt extraction function should return the message text with the mention removed and leading/trailing whitespace trimmed.
**Validates: Requirements 3.1**
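A sketch of the extraction this property describes, assuming the message content carries raw mention tokens of the form `<@ID>` or the legacy `<@!ID>`. The function name `extractMentionPrompt` and the snowflake check are hypothetical.

```typescript
function extractMentionPrompt(content: string, botUserId: string): string {
  // Discord user IDs are numeric snowflakes; rejecting anything else keeps the regex safe.
  if (!/^\d+$/.test(botUserId)) throw new Error("botUserId must be a numeric ID");
  const mention = new RegExp(`<@!?${botUserId}>`, "g");
  // Remove every mention of the bot, then trim leading/trailing whitespace.
  return content.replace(mention, "").trim();
}
```

Mentions of other users are left in place, since only the bot's own ID is matched.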
### Property 4: Slash command prompt extraction
*For any* slash command interaction with a prompt option value, the prompt extraction function should return exactly the option value as the prompt text.
**Validates: Requirements 3.2**
### Property 5: Bot message filtering
*For any* Discord message where the author has the bot flag set, the gateway should not process it as a prompt. Conversely, for any message from a non-bot user that mentions the bot, the gateway should process it.
**Validates: Requirements 3.3**
### Property 6: Query arguments correctness
*For any* prompt text, gateway configuration, and assembled system prompt, the arguments passed to the Agent SDK `query()` function should include the prompt text, the configured allowed tools array, the configured permission mode, and the assembled system prompt via the `systemPrompt` option.
**Validates: Requirements 4.1, 4.4, 12.3**
### Property 7: Session resume on existing binding
*For any* channel that has a stored session ID in the channel bindings, when a new prompt is received for that channel, the `query()` call should include the `resume` option set to the stored session ID.
**Validates: Requirements 4.2, 6.2**
### Property 8: New session creation and storage
*For any* channel without an existing channel binding, when a prompt is processed and the Agent SDK returns an init message with a session_id, the session manager should store that session_id for the channel. A subsequent lookup for that channel should return the stored session_id.
**Validates: Requirements 4.3, 6.1**
### Property 9: Message splitting with code block preservation
*For any* string of arbitrary length, `splitMessage(text)` should produce chunks where: (a) every chunk is at most 2000 characters, (b) concatenating all chunks reproduces the original text (modulo inserted code block delimiters), and (c) if the original text contains a code block that spans a split boundary, each chunk is a valid markdown fragment with properly opened and closed code fences.
**Validates: Requirements 5.3, 5.4**
### Property 10: Reset removes channel binding
*For any* channel with a stored session ID, invoking the reset handler for that channel should result in the session manager returning `undefined` for that channel's session ID.
**Validates: Requirements 6.3**
### Property 11: Concurrent session isolation
*For any* two distinct channel IDs with different stored session IDs, operations on one channel (setting, getting, or removing its session) should not affect the session ID stored for the other channel.
**Validates: Requirements 6.4**
### Property 12: Error message formatting
*For any* error thrown by the Agent SDK (with a type/name and a message), the user-facing error message produced by the gateway should contain the error type/name but should not contain stack traces, API keys, or file paths from the server environment.
**Validates: Requirements 7.1**
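One conservative way to satisfy this property is to expose only a sanitized error name and never the message or stack. The function name `formatUserError` and the reply wording are assumptions for illustration.

```typescript
function formatUserError(err: unknown): string {
  const rawName = err instanceof Error ? err.name : "UnknownError";
  // Keep only identifier-like characters so a crafted name cannot carry paths or secrets.
  const safeName = rawName.replace(/[^A-Za-z0-9_]/g, "").slice(0, 64) || "UnknownError";
  // The error message and stack are deliberately never included.
  return `Something went wrong while processing your request (${safeName}). Please try again.`;
}
```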
### Property 13: Sequential per-channel queue ordering
*For any* sequence of tasks enqueued for the same channel, the tasks should execute in FIFO order, and no two tasks for the same channel should execute concurrently.
**Validates: Requirements 9.2**
### Property 14: Concurrency limit rejection
*For any* gateway state where the active query count is at or above the configured maximum, attempting to process a new prompt should be rejected with a "system busy" response rather than being forwarded to the Agent SDK.
**Validates: Requirements 9.3**
### Property 15: Event queue accepts all event types with monotonic IDs
*For any* sequence of events of mixed types (message, heartbeat, cron, hook, webhook), enqueuing them should succeed (when below max depth), and each event should be assigned a strictly increasing sequence number and a timestamp no earlier than the previous event's timestamp.
**Validates: Requirements 11.1, 11.2**
### Property 16: Event queue FIFO dispatch with sequential processing
*For any* sequence of events enqueued in the EventQueue, the processing handler should be called with events in the exact order they were enqueued, and the handler for event N+1 should not be called until the handler for event N has completed.
**Validates: Requirements 11.3, 11.4, 12.5**
### Property 17: Event queue depth rejection
*For any* EventQueue that has reached its configured maximum depth, attempting to enqueue a new event should return null (rejection) and the queue size should remain at the maximum.
**Validates: Requirements 11.5**
### Property 18: Non-message events use instruction as prompt
*For any* heartbeat or cron event with an instruction string, when the AgentRuntime processes it, the Agent SDK `query()` call should use the instruction string as the prompt text and include the assembled system prompt via the `systemPrompt` option.
**Validates: Requirements 12.4, 17.4, 18.4**
### Property 19: System prompt assembly with section headers and ordering
*For any* set of MarkdownConfigs where at least one field is non-null, the assembled system prompt should: (a) wrap each non-null config in a section with the format `## [Section Name]\n\n[content]`, (b) order sections as Identity, Personality, Operating Rules, User Context, Long-Term Memory, Tool Configuration, and (c) include a preamble instructing the agent it may update memory.md using the Write tool.
**Validates: Requirements 22.1, 22.2, 22.4, 12.2, 14.2, 15.2, 15.3, 16.2**
### Property 20: Missing or empty configs are omitted from system prompt
*For any* set of MarkdownConfigs where some fields are null or empty strings, the assembled system prompt should not contain section headers for those missing/empty configs, and the number of section headers in the output should equal the number of non-null, non-empty config fields.
**Validates: Requirements 22.3, 13.4, 14.3, 16.3**
### Property 21: System prompt assembly round-trip
*For any* valid set of MarkdownConfigs (with non-null, non-empty values), assembling the system prompt and then parsing the section headers from the output should produce the same set of section names as the non-null input config fields.
**Validates: Requirements 22.5**
### Property 22: Heartbeat config parsing
*For any* valid heartbeat.md content containing check definitions with names, instructions, and intervals, parsing the content should produce HeartbeatCheck objects whose fields match the defined values.
**Validates: Requirements 17.1**
### Property 23: Heartbeat minimum interval enforcement
*For any* heartbeat check definition with an interval less than 60 seconds, the HeartbeatScheduler should reject the definition and not start a timer for it.
**Validates: Requirements 17.6**
### Property 24: Cron job config parsing
*For any* valid agents.md content containing a "Cron Jobs" section with job definitions (name, cron expression, instruction), parsing the content should produce CronJob objects whose fields match the defined values.
**Validates: Requirements 18.1**
### Property 25: Invalid cron expression rejection
*For any* cron job definition with a syntactically invalid cron expression, the CronScheduler should skip scheduling for that job without affecting other valid jobs.
**Validates: Requirements 18.5**
### Property 26: Lifecycle hooks fire for every event
*For any* event processed by the AgentRuntime, the agent_begin hook should fire before the main processing and the agent_stop hook should fire after the main processing completes, both processed inline.
**Validates: Requirements 19.3, 19.4**
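The ordering in this property can be sketched as a small wrapper around the main processing step. `processWithHooks` and `InlineHook` are hypothetical names, and firing `agent_stop` even when the main work throws is an assumption that the property text does not settle.

```typescript
type InlineHook = "agent_begin" | "agent_stop";

async function processWithHooks<T>(
  fireInline: (hook: InlineHook) => Promise<void>,
  mainWork: () => Promise<T>,
): Promise<T> {
  await fireInline("agent_begin"); // runs before the main processing
  try {
    return await mainWork();
  } finally {
    await fireInline("agent_stop"); // runs after processing, on success or failure (assumption)
  }
}
```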
### Property 27: Bootstrap creates missing files with defaults
*For any* set of required files specified in BootConfig where some files are missing from CONFIG_DIR, the bootstrap process should create each missing file with its default content, and after bootstrap all required files should exist.
**Validates: Requirements 20.2, 20.3**
### Property 28: State reconstruction after restart
*For any* set of markdown configuration files in CONFIG_DIR, reading all configs and assembling the system prompt should produce the same result regardless of whether it's the first read or a subsequent read after a simulated restart (i.e., the markdown files are the complete source of truth with no in-memory state dependency).
**Validates: Requirements 21.3**
## Error Handling
### Startup Errors
- **Missing required config**: `loadConfig()` throws with a message listing all missing required environment variables. The process exits with code 1.
- **Invalid Discord token**: The discord.js client emits an error event. The gateway catches it, logs the error, and exits with code 1.
- **Network failures on startup**: If Discord or the Anthropic API is unreachable, the gateway logs the connection error and exits with code 1.
- **Missing boot.md**: The BootstrapManager falls back to built-in defaults (require soul.md and identity.md, create missing optional files with empty headers).
- **Missing required markdown files**: The BootstrapManager creates them with default content and logs which files were created.
### Runtime Errors
- **Agent SDK query errors**: Caught per-event. For message events, the gateway formats a user-friendly message (error type only, no internals) and sends it to the Discord channel. For heartbeat/cron events, the error is logged.
- **Session corruption**: If the Agent SDK returns an error indicating the session is invalid, the gateway removes the channel binding and informs the user to retry.
- **Query timeout**: Query stream processing is raced against a timeout timer with `Promise.race`. On timeout, the gateway aborts the stream iteration and, for message events, sends a timeout notification to the channel.
- **Discord API send failures**: Caught and logged with channel ID and content length. The event processing continues.
- **Concurrency limit exceeded**: New prompts are immediately rejected with a "system busy" message. No query is started.
- **Event queue overflow**: When the queue reaches max depth, new events are rejected (enqueue returns null). For message events, a "system busy" message is sent. For heartbeat/cron events, the rejection is logged.
- **Markdown file read errors**: If a config file cannot be read (permissions, I/O error), the MarkdownConfigLoader logs the error and returns null for that file. The system prompt is assembled without the failed section.
- **Invalid heartbeat interval**: Heartbeat checks with interval < 60 seconds are rejected with a warning log. Other valid checks still start.
- **Invalid cron expression**: Invalid cron jobs are skipped with a warning log. Other valid jobs still schedule.
- **Memory.md write failures**: If the agent's Write tool call to memory.md fails, the error is logged. The event still completes, but the memory update is lost.
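
The query-timeout handling above can be sketched with a generic helper. `withTimeout`, `QueryTimeoutError`, and the `onTimeout` callback are hypothetical; the callback marks the place where a caller could abort the stream iteration and notify the channel.

```typescript
class QueryTimeoutError extends Error {
  constructor(ms: number) {
    super(`Query timed out after ${ms} ms`);
    this.name = "QueryTimeoutError";
  }
}

async function withTimeout<T>(work: Promise<T>, ms: number, onTimeout?: () => void): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      onTimeout?.(); // hook for aborting the in-flight stream (caller-supplied)
      reject(new QueryTimeoutError(ms));
    }, ms);
  });
  try {
    // Whichever settles first wins: the work or the timeout rejection.
    return await Promise.race([work, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer); // avoid a stray timer after the work finishes
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying work, which is why the sketch exposes a separate `onTimeout` hook.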
### Shutdown Errors
- **In-flight event timeout during shutdown**: The gateway waits up to the configured timeout for the current event to complete. If the event has not finished by then, it is abandoned and shutdown proceeds.
- **Shutdown hook processing failure**: If the shutdown hook event fails, the error is logged and shutdown continues.
- **Discord disconnect failure**: Logged but does not prevent process exit.
## Testing Strategy
### Property-Based Testing
Use **fast-check** as the property-based testing library for TypeScript.
Each correctness property in this design document maps to a single property-based test. Tests should be configured to run at least 100 iterations per property.
Each test must be tagged with a comment in the format:
`// Feature: discord-claude-gateway, Property {number}: {property_text}`
Key property tests organized by component:
**Config & Startup (Properties 1, 2)**:
- Generate random env var combinations and verify config loading behavior.
- Generate subsets of missing required vars and verify error messages.
**Prompt Handling (Properties 3, 4, 5)**:
- Generate random message content with mentions and verify extraction.
- Generate random slash command option values and verify extraction.
- Generate messages with random bot/non-bot authors and verify filtering.
**Agent SDK Integration (Properties 6, 7, 8, 18)**:
- Generate random prompts, configs, and system prompts; verify query arguments.
- Generate random channel/session pairs; verify resume behavior.
- Generate random heartbeat/cron instructions; verify they're used as prompts.
**Response Formatting (Property 9)**:
- Generate strings of varying lengths with and without code blocks; verify chunk sizes and formatting.
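
As an illustration of what the chunk-size check could look like, here is a minimal sketch of a splitter together with a predicate that a property test could assert. It is an assumption-laden example: `splitMessage` and `chunksAreValid` are hypothetical names, the predicate is only one plausible reading of Property 9, and code-block-aware splitting is deliberately left out. With fast-check, the predicate could be wrapped as `fc.assert(fc.property(fc.string(), (s) => chunksAreValid(s, 2000)), { numRuns: 100 })`.

```typescript
// Discord rejects message content longer than 2000 characters.
const DISCORD_MAX_LENGTH = 2000;

// Splits text into chunks of at most maxLength characters, preferring to
// break right after a newline. Code-fence handling is intentionally omitted.
function splitMessage(text: string, maxLength: number = DISCORD_MAX_LENGTH): string[] {
  if (maxLength <= 0) throw new RangeError("maxLength must be positive");

  const chunks: string[] = [];
  let rest = text;
  while (rest.length > maxLength) {
    // Last newline that still fits inside the current window, if any.
    const newlineAt = rest.lastIndexOf("\n", maxLength - 1);
    const cut = newlineAt > 0 ? newlineAt + 1 : maxLength;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Property predicate: every chunk is non-empty and within the limit, and
// concatenating the chunks reproduces the original text exactly.
function chunksAreValid(text: string, maxLength: number): boolean {
  const chunks = splitMessage(text, maxLength);
  const sizesOk = chunks.every((c) => c.length > 0 && c.length <= maxLength);
  return sizesOk && chunks.join("") === text;
}
```
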
**Session Management (Properties 10, 11)**:
- Generate random channel IDs and session IDs; verify CRUD operations and isolation.
**Error Handling (Property 12)**:
- Generate random error objects; verify formatted message excludes sensitive data.
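
The error-formatting rule (error type only, no internals) can be illustrated with a short sketch. `formatUserError` and the message wording are hypothetical; the point is only that the error's `message` and `stack` never reach the Discord channel.

```typescript
// Builds the user-facing text from the error's type name alone.
// The message and stack are never included, since they may carry
// file paths, tokens, or other internal details.
function formatUserError(error: unknown): string {
  const type = error instanceof Error && error.name ? error.name : "UnknownError";
  return `Something went wrong while processing your request (${type}). Please try again.`;
}
```

A property test would generate errors with arbitrary messages and assert that the formatted string contains neither the message nor the stack.
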
**Concurrency (Properties 13, 14)**:
- Enqueue tasks with observable side effects; verify execution order.
- Generate random active query counts; verify rejection behavior.
**Event Queue (Properties 15, 16, 17)**:
- Generate mixed-type event sequences; verify monotonic IDs and FIFO dispatch.
- Generate queue-at-capacity scenarios; verify rejection.
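
The behaviours these properties target (monotonic IDs, FIFO dispatch, rejection at capacity) fit in a small sketch. The class shape, the `EventType` union, and the choice to consume an ID only when an event is accepted are assumptions made for illustration, not the gateway's confirmed design.

```typescript
type EventType = "message" | "heartbeat" | "cron" | "hook";

interface QueuedEvent {
  id: number;
  type: EventType;
  payload: unknown;
}

// In-memory FIFO queue with a fixed maximum depth.
class EventQueue {
  private readonly items: QueuedEvent[] = [];
  private readonly maxDepth: number;
  private nextId = 1;

  constructor(maxDepth: number) {
    this.maxDepth = maxDepth;
  }

  // Returns the queued event, or null when the queue is at capacity.
  enqueue(type: EventType, payload: unknown): QueuedEvent | null {
    if (this.items.length >= this.maxDepth) return null;
    const event: QueuedEvent = { id: this.nextId++, type, payload };
    this.items.push(event);
    return event;
  }

  // Removes and returns the oldest event, or undefined when empty.
  dequeue(): QueuedEvent | undefined {
    return this.items.shift();
  }

  get depth(): number {
    return this.items.length;
  }
}
```

On a `null` result, the caller decides how to report the rejection: a "system busy" reply for message events, a log entry for heartbeat and cron events.
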
**System Prompt Assembly (Properties 19, 20, 21)**:
- Generate random MarkdownConfigs with various null/non-null combinations; verify section headers, ordering, omission of empty sections, and round-trip parsing.
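
A minimal sketch of the assembly behaviour under test is shown below. The section names, their order, and the `## ` header format are placeholders chosen for the example; the real `MarkdownConfig` shape and headers may differ. Round-trip parsing is not covered here.

```typescript
// Hypothetical section names and order, used only for this illustration.
const SECTION_ORDER = ["Identity", "User", "Memory", "Tools"] as const;
type SectionName = (typeof SECTION_ORDER)[number];

// A null value means the corresponding file was missing or unreadable.
type MarkdownConfig = Record<SectionName, string | null>;

// Emits sections in a fixed order, each under its own header, and
// skips any section that is null or contains only whitespace.
function assembleSystemPrompt(config: MarkdownConfig): string {
  const parts: string[] = [];
  for (const name of SECTION_ORDER) {
    const body = config[name]?.trim();
    if (!body) continue;
    parts.push(`## ${name}\n\n${body}`);
  }
  return parts.join("\n\n");
}
```
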
**Config Parsing (Properties 22, 23, 24, 25)**:
- Generate random heartbeat.md content; verify parsing and interval enforcement.
- Generate random agents.md cron sections; verify parsing and invalid expression rejection.
**Lifecycle & Bootstrap (Properties 26, 27, 28)**:
- Generate event sequences; verify agent_begin/agent_stop hooks fire for each.
- Generate file sets with missing files; verify bootstrap creates them.
- Generate markdown file sets; verify state reconstruction produces consistent results.
### Unit Testing
Unit tests complement property tests by covering specific examples, edge cases, and integration points:
- **Config defaults** (Req 8.2): Verify specific default values when optional env vars are unset.
- **Startup logging** (Req 1.3): Verify log output contains bot username and guild count.
- **Typing indicator** (Req 3.4, 5.2): Verify typing indicator is sent when processing starts and maintained during streaming.
- **Timeout handling** (Req 7.2): Verify timeout notification is sent after the configured period.
- **Discord API error logging** (Req 7.3): Verify log contains channel ID and content length.
- **Session corruption recovery** (Req 7.4): Verify binding removal and user notification on session error.
- **Shutdown sequence** (Req 10.1, 10.2, 10.3): Verify the gateway stops accepting prompts, waits for in-flight queries, and disconnects.
- **Result forwarding** (Req 5.1): Verify result messages are sent to the correct channel.
- **Markdown file reading** (Req 13.1-13.3, 14.1, 15.1, 16.1): Verify each config file is read from the correct path.
- **Hot-reload behavior** (Req 13.5, 14.4, 16.4): Verify modified files are picked up on next event cycle.
- **Memory.md auto-creation** (Req 15.5): Verify memory.md is created with "# Memory" header when missing.
- **Heartbeat timer startup** (Req 17.2, 17.3): Verify timers start and fire events.
- **Missing heartbeat.md** (Req 17.5): Verify gateway operates without heartbeat events.
- **Cron job scheduling** (Req 18.2, 18.3): Verify cron jobs schedule and fire events.
- **Hook types** (Req 19.1): Verify all four hook types are supported.
- **Startup hook** (Req 19.2): Verify startup hook fires after initialization.
- **Shutdown hook** (Req 19.5): Verify shutdown hook fires and is processed before exit.
- **Hook config parsing** (Req 19.6): Verify hooks section is parsed from agents.md.
- **Boot.md reading** (Req 20.1): Verify boot config is read.
- **Bootstrap logging** (Req 20.4): Verify log lists loaded and created files.
- **Missing boot.md defaults** (Req 20.5): Verify built-in defaults are used.
- **Memory write ordering** (Req 21.2): Verify memory changes are written before event completion signal.
### Test Organization
```
tests/
  unit/
    config-loader.test.ts
    discord-bot.test.ts
    session-manager.test.ts
    response-formatter.test.ts
    gateway-core.test.ts
    event-queue.test.ts
    agent-runtime.test.ts
    markdown-config-loader.test.ts
    system-prompt-assembler.test.ts
    heartbeat-scheduler.test.ts
    cron-scheduler.test.ts
    hook-manager.test.ts
    bootstrap-manager.test.ts
  property/
    config.property.test.ts
    prompt-extraction.property.test.ts
    message-splitting.property.test.ts
    session-manager.property.test.ts
    error-formatting.property.test.ts
    channel-queue.property.test.ts
    concurrency.property.test.ts
    event-queue.property.test.ts
    system-prompt.property.test.ts
    heartbeat-config.property.test.ts
    cron-config.property.test.ts
    lifecycle-hooks.property.test.ts
    bootstrap.property.test.ts
    state-reconstruction.property.test.ts
```
### Testing Tools
- **vitest** as the test runner
- **fast-check** for property-based testing
- The discord.js client and the Agent SDK `query()` function should be mocked in unit tests using vitest mocks
- File system operations should be mocked using `memfs` or vitest's `vi.mock` for markdown config tests
- **node-cron** should be mocked for cron scheduler tests to avoid real timer dependencies