tanmay11k 77d7c74909 Initial commit: Discord-Claude Gateway with event-driven agent runtime

2026-02-22 00:31:25 -05:00

Requirements Document

Introduction

The Discord-Claude Gateway is an agent runtime platform, inspired by OpenClaw, that connects a Discord bot to Claude via the Claude Agent SDK. At its core, it is a long-running process that accepts messages from Discord, routes them through a unified event queue, and dispatches them to an AI agent runtime for processing. The agent's personality, identity, user context, long-term memory, tool configuration, and operating rules are all defined in local markdown files that the runtime reads on each wake-up. The platform supports five input types: Messages (user prompts from Discord), Heartbeats (timer-based proactive checks), Cron Jobs (scheduled events), Hooks (internal state change triggers), and Webhooks (external system events). All inputs enter the unified event queue and are processed by the agent runtime, which persists state back to the markdown files, completing the event loop.

Glossary

Gateway: The long-running middleware process that accepts connections from Discord, manages the event queue, and routes events to the Agent_Runtime.
Agent_Runtime: The core processing engine that dequeues events, assembles context from markdown configuration files, executes Agent SDK queries, performs actions using tools, and persists state changes.
Discord_Bot: The Discord.js bot client that listens for messages and slash commands in Discord channels and guilds.
Agent_SDK: The Claude Agent SDK (@anthropic-ai/claude-agent-sdk) TypeScript library that provides the query() function with a systemPrompt option for sending prompts to Claude and receiving streamed responses with built-in tool execution.
Event_Queue: The unified in-memory queue that receives all input events (messages, heartbeats, crons, hooks, webhooks) and dispatches them to the Agent_Runtime for sequential processing.
Event: A discrete unit of work entering the Event_Queue, carrying a type (message, heartbeat, cron, hook, webhook), payload, timestamp, and source metadata.
Session: A stateful conversation context maintained by the Agent SDK, identified by a session ID, that preserves conversation history across multiple exchanges.
Prompt: A text message submitted by a Discord user intended to be forwarded to Claude via the Agent SDK.
Response_Stream: The async iterable of messages returned by the Agent SDK's query() function, containing assistant text, tool use events, and result messages.
Channel_Binding: The association between a Discord channel and an active Agent SDK session, enabling conversational continuity.
Allowed_Tools: The configurable list of Agent SDK built-in tools (Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch) that Claude is permitted to use when processing prompts.
Permission_Mode: The Agent SDK permission configuration that controls tool execution approval behavior (bypassPermissions, acceptEdits, plan, default).
Soul_Config: The markdown file (soul.md) that defines the agent's personality, tone, values, and behavior defaults.
Identity_Config: The markdown file (identity.md) that defines the agent's name, role, and specialization.
Agents_Config: The markdown file (agents.md) that defines operating rules, workflows, and safety/permission boundaries.
User_Config: The markdown file (user.md) that stores information about the user: identity, preferences, and life/work context.
Memory_Store: The markdown file (memory.md) that stores long-term durable facts, IDs, lessons learned, and other persistent knowledge across sessions.
Tools_Config: The markdown file (tools.md) that documents API configurations, tool usage limits, and gotchas.
Heartbeat_Config: The markdown file (heartbeat.md) that defines proactive check instructions, intervals, and prompts for timer-based events.
Boot_Config: The markdown file (boot.md) that defines bootstrap configuration and initial setup parameters.
Bootstrap_Process: The initial setup sequence that loads Boot_Config, validates markdown files, and prepares the Agent_Runtime for operation.
Heartbeat: A timer-based event that fires at configurable intervals, triggering the agent to perform proactive checks (e.g., check email, review calendar).
Cron_Job: A scheduled event with exact timing (cron expression) and custom instructions that enters the Event_Queue at the specified time.
Hook: An internal state change trigger (e.g., startup, agent_begin, agent_stop) that fires when specific lifecycle events occur within the Gateway.
System_Prompt: The assembled prompt text constructed by combining Soul_Config, Identity_Config, Agents_Config, User_Config, Memory_Store, and Tools_Config content, passed to the Agent SDK query() function via the systemPrompt option.

Requirements

Requirement 1: Discord Bot Initialization

User Story: As an operator, I want the Gateway to connect to Discord on startup, so that it can begin receiving user messages.

Acceptance Criteria

WHEN the Gateway starts, THE Discord_Bot SHALL authenticate with Discord using a configured bot token and transition to a ready state.
IF the bot token is missing or invalid, THEN THE Gateway SHALL log a descriptive error message and terminate with a non-zero exit code.
WHEN the Discord_Bot reaches the ready state, THE Gateway SHALL log the bot's username and the number of guilds it has joined.

Requirement 2: Agent SDK Initialization

User Story: As an operator, I want the Gateway to validate Claude Agent SDK credentials on startup, so that I know the Claude integration is functional before accepting user prompts.

Acceptance Criteria

WHEN the Gateway starts, THE Gateway SHALL verify that the ANTHROPIC_API_KEY environment variable is set.
IF the ANTHROPIC_API_KEY environment variable is missing, THEN THE Gateway SHALL log a descriptive error message and terminate with a non-zero exit code.
THE Gateway SHALL load Allowed_Tools and Permission_Mode from a configuration source (environment variables or config file).

Requirement 3: Prompt Reception via Discord

User Story: As a Discord user, I want to send prompts to Claude by mentioning the bot or using a slash command, so that I can interact with Claude from within Discord.

Acceptance Criteria

WHEN a user sends a message that mentions the Discord_Bot, THE Gateway SHALL extract the text content (excluding the mention) and treat it as a Prompt.
WHEN a user invokes the /claude slash command with a prompt argument, THE Gateway SHALL extract the prompt argument and treat it as a Prompt.
THE Gateway SHALL ignore messages sent by other bots to prevent feedback loops.
WHEN a Prompt is received, THE Gateway SHALL send a typing indicator in the originating Discord channel while processing.
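
The mention-stripping and bot-filtering criteria above can be sketched as two pure functions. This is a minimal illustration, not the Gateway's actual code: `extractPrompt` and `shouldHandle` are hypothetical names, and the sketch assumes the bot's user ID is a numeric Discord snowflake and that mentions arrive in the `<@ID>` or `<@!ID>` form.

```typescript
// Hypothetical helper: returns the prompt text with the bot's mention removed,
// or null when the bot is not mentioned or nothing remains after stripping.
// Assumes botUserId is a numeric snowflake, so it needs no regex escaping.
function extractPrompt(content: string, botUserId: string): string | null {
  // Discord encodes user mentions as <@ID>; some clients send <@!ID>.
  const mention = new RegExp(`<@!?${botUserId}>`, "g");
  const stripped = content.replace(mention, " ");
  if (stripped === content) return null; // this bot was not mentioned
  const prompt = stripped.replace(/\s+/g, " ").trim();
  return prompt.length > 0 ? prompt : null;
}

// Hypothetical guard for the feedback-loop rule: only non-bot authors
// produce Prompts.
function shouldHandle(authorIsBot: boolean): boolean {
  return !authorIsBot;
}
```

In a message handler, `shouldHandle` would typically run first, and a `null` result from `extractPrompt` would mean the message is ignored.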

Requirement 4: Prompt Forwarding to Claude Agent SDK

User Story: As a Discord user, I want my prompts forwarded to Claude through the Agent SDK, so that I receive intelligent responses powered by Claude's agent capabilities.

Acceptance Criteria

WHEN a valid Prompt is received, THE Gateway SHALL call the Agent SDK query() function with the Prompt text, the configured Allowed_Tools, and the configured Permission_Mode.
WHILE a Channel_Binding exists for the originating Discord channel, THE Gateway SHALL resume the existing Session by passing the session ID to the query() function.
WHEN a new Prompt is received in a channel without a Channel_Binding, THE Gateway SHALL create a new Session and store the Channel_Binding.
THE Gateway SHALL pass the assembled System_Prompt to the Agent SDK query() function via the systemPrompt option to inject personality, identity, user context, and memory.
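
One way to read these criteria is as a function that builds the options object for a single query() call. The sketch below uses a local `QueryOptions` stand-in rather than the SDK's own types; the field names (`allowedTools`, `permissionMode`, `systemPrompt`, `resume`) follow the options named in this document and my understanding of the SDK, and should be checked against the SDK reference. `buildQueryOptions` and `GatewaySettings` are hypothetical names.

```typescript
type PermissionMode = "default" | "acceptEdits" | "bypassPermissions" | "plan";

// Local stand-in for the subset of SDK options this document mentions.
interface QueryOptions {
  allowedTools: string[];
  permissionMode: PermissionMode;
  systemPrompt: string;
  resume?: string; // session ID to resume; left out to start a new Session
}

interface GatewaySettings {
  allowedTools: string[];
  permissionMode: PermissionMode;
}

// Builds the options for one query. A stored Channel_Binding adds `resume`;
// without one, the options describe a fresh Session.
function buildQueryOptions(
  channelId: string,
  bindings: Map<string, string>, // channel ID -> session ID
  settings: GatewaySettings,
  systemPrompt: string,
): QueryOptions {
  const options: QueryOptions = {
    allowedTools: [...settings.allowedTools],
    permissionMode: settings.permissionMode,
    systemPrompt,
  };
  const sessionId = bindings.get(channelId);
  if (sessionId !== undefined) options.resume = sessionId;
  return options;
}
```

The returned object would be passed as the `options` argument alongside the Prompt text.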

Requirement 5: Response Streaming and Delivery

User Story: As a Discord user, I want to see Claude's responses in my Discord channel, so that I can read and act on the information Claude provides.

Acceptance Criteria

WHEN the Agent SDK emits a result message in the Response_Stream, THE Gateway SHALL send the result text as a message in the originating Discord channel.
WHILE the Response_Stream is active, THE Gateway SHALL maintain the typing indicator in the Discord channel.
IF the response text exceeds 2000 characters (Discord's message limit), THEN THE Gateway SHALL split the response into multiple sequential messages, each within the character limit, preserving code block formatting across splits.
WHEN the Response_Stream contains markdown code blocks, THE Gateway SHALL preserve the markdown formatting in the Discord message.
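
The splitting rule is the subtle part of this requirement, so here is a rough line-based sketch. `splitMessage` and `hardSplit` are hypothetical names. The sketch assumes fences are lines that start with three backticks, closes an open code block at the end of a chunk, and reopens it (with the same info string) at the start of the next. Overlong single lines are cut at arbitrary positions, which a production splitter would likely refine.

```typescript
const FENCE = "```";

// Breaks one overlong line into pieces of at most `max` characters.
function hardSplit(line: string, max: number): string[] {
  if (line.length <= max) return [line];
  const pieces: string[] = [];
  for (let i = 0; i < line.length; i += max) pieces.push(line.slice(i, i + max));
  return pieces;
}

function splitMessage(text: string, limit: number = 2000): string[] {
  if (text.length === 0) return [];
  const reserve = FENCE.length + 1; // room for "\n```" if a chunk must close a block
  const chunks: string[] = [];
  let lines: string[] = [];
  let length = 0; // length of lines.join("\n")
  let hasContent = false; // false while the chunk holds only a reopened fence
  let openFence: string | null = null; // opening fence line while inside a block

  const push = (line: string): void => {
    length += (lines.length > 0 ? 1 : 0) + line.length;
    lines.push(line);
  };

  const flush = (): void => {
    if (hasContent) {
      const body = lines.join("\n");
      chunks.push(openFence !== null ? `${body}\n${FENCE}` : body);
    }
    lines = [];
    length = 0;
    hasContent = false;
    if (openFence !== null) push(openFence); // reopen the block in the next chunk
  };

  for (const rawLine of text.split("\n")) {
    const isFence = rawLine.trimStart().startsWith(FENCE);
    const isClosing = isFence && openFence !== null;
    // A closing fence may use the reserved space, so it never starts a new chunk.
    const budget = isClosing ? limit : limit - reserve;
    const header = openFence !== null ? openFence.length + 1 : 0;
    const maxPiece = limit - reserve - header;
    if (maxPiece < 1) throw new Error("limit is too small for this content");
    for (const piece of hardSplit(rawLine, maxPiece)) {
      const needed = (lines.length > 0 ? 1 : 0) + piece.length;
      if (length + needed > budget) flush();
      push(piece);
      hasContent = true;
    }
    if (isFence) openFence = openFence === null ? rawLine.trim() : null;
  }
  flush();
  return chunks;
}
```

Each returned chunk would be sent as its own Discord message, in order.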

Requirement 6: Session Management

User Story: As a Discord user, I want my conversation context preserved across messages in the same channel, so that I can have multi-turn conversations with Claude.

Acceptance Criteria

WHEN the Agent SDK returns an init message with a session_id, THE Gateway SHALL store the session_id in the Channel_Binding for the originating Discord channel.
WHILE a Channel_Binding exists, THE Gateway SHALL use the stored session_id to resume the Session on subsequent prompts from the same channel.
WHEN a user invokes the /claude-reset slash command, THE Gateway SHALL remove the Channel_Binding for that channel, causing the next prompt to start a new Session.
THE Gateway SHALL support concurrent Sessions across multiple Discord channels without interference.
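
A small sketch of how Channel_Bindings might be tracked while consuming the Response_Stream. The `StreamMessage` union is a simplified stand-in for the SDK's message types, modeled on the init and result messages this document describes; the real message shapes carry more fields and should be taken from the SDK. `ChannelBindings` is a hypothetical class name.

```typescript
// Simplified stand-ins for the stream messages described in this document.
type StreamMessage =
  | { type: "system"; subtype: "init"; session_id: string }
  | { type: "assistant"; text: string }
  | { type: "result"; subtype: "success"; result: string };

class ChannelBindings {
  private readonly sessions = new Map<string, string>();

  // Handles one stream message for a channel. Stores the session ID from an
  // init message and returns the text to deliver when a result arrives.
  observe(channelId: string, message: StreamMessage): string | null {
    if (message.type === "system") {
      this.sessions.set(channelId, message.session_id);
      return null;
    }
    if (message.type === "result") return message.result;
    return null;
  }

  // Session ID to resume for this channel, if any.
  sessionFor(channelId: string): string | undefined {
    return this.sessions.get(channelId);
  }

  // Backs /claude-reset: the next prompt starts a new Session.
  reset(channelId: string): boolean {
    return this.sessions.delete(channelId);
  }
}
```

Because bindings are keyed by channel ID, Sessions in different channels do not share state. In practice `observe` would be called from a `for await` loop over the stream.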

Requirement 7: Error Handling

User Story: As a Discord user, I want to see clear error messages when something goes wrong, so that I understand why my prompt was not processed.

Acceptance Criteria

IF the Agent SDK query() call throws an error, THEN THE Gateway SHALL send a user-friendly error message to the originating Discord channel that includes the error type without exposing internal details.
IF the Agent SDK does not produce a result within a configurable timeout period, THEN THE Gateway SHALL send a timeout notification to the Discord channel and cancel the pending query.
IF the Discord API rejects a message send operation, THEN THE Gateway SHALL log the error with the channel ID and message content length.
IF a Session becomes invalid or corrupted, THEN THE Gateway SHALL remove the Channel_Binding and tell the user to retry; the retry starts a new Session.
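
As an illustration of "includes the error type without exposing internal details", the sketch below reports only the error's `name` to the user, and builds the log line for a rejected Discord send from the channel ID and content length. Both function names and the message wording are hypothetical.

```typescript
// User-facing text: exposes the error type (its name) but never the
// message or stack, which may contain internal details.
function formatUserError(error: unknown): string {
  const kind = error instanceof Error ? error.name : "UnknownError";
  return `Sorry, something went wrong while processing your prompt (${kind}). Please try again.`;
}

// Log line for a rejected Discord send: records the channel ID and the
// content length rather than the content itself.
function describeSendFailure(channelId: string, content: string, error: unknown): string {
  const kind = error instanceof Error ? error.name : "UnknownError";
  return `discord send failed: channel=${channelId} length=${content.length} error=${kind}`;
}
```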

Requirement 8: Configuration

User Story: As an operator, I want to configure the Gateway's behavior through environment variables, so that I can customize it for different deployment environments.

Acceptance Criteria

THE Gateway SHALL read the following configuration from environment variables: DISCORD_BOT_TOKEN, ANTHROPIC_API_KEY, ALLOWED_TOOLS (comma-separated list), PERMISSION_MODE, QUERY_TIMEOUT_MS, and CONFIG_DIR (path to the markdown configuration directory).
THE Gateway SHALL apply default values for optional configuration: ALLOWED_TOOLS defaults to "Read,Glob,Grep,WebSearch,WebFetch", PERMISSION_MODE defaults to "bypassPermissions", QUERY_TIMEOUT_MS defaults to "120000", and CONFIG_DIR defaults to "./config".
IF a required configuration value (DISCORD_BOT_TOKEN, ANTHROPIC_API_KEY) is missing, THEN THE Gateway SHALL terminate with a descriptive error message listing the missing values.
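
The configuration rules above map naturally onto a single loader function. The following is a sketch under stated assumptions: `loadConfig` and `GatewayConfig` are hypothetical names, the environment is passed in as a plain object (for example `process.env`), and the validation of PERMISSION_MODE and QUERY_TIMEOUT_MS goes slightly beyond what the criteria require.

```typescript
const PERMISSION_MODES = ["default", "acceptEdits", "bypassPermissions", "plan"] as const;
type PermissionMode = (typeof PERMISSION_MODES)[number];

interface GatewayConfig {
  discordBotToken: string;
  anthropicApiKey: string;
  allowedTools: string[];
  permissionMode: PermissionMode;
  queryTimeoutMs: number;
  configDir: string;
}

function loadConfig(env: Record<string, string | undefined>): GatewayConfig {
  // Report every missing required value at once.
  const missing = ["DISCORD_BOT_TOKEN", "ANTHROPIC_API_KEY"].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required configuration: ${missing.join(", ")}`);
  }

  const mode = env["PERMISSION_MODE"] ?? "bypassPermissions";
  if (!(PERMISSION_MODES as readonly string[]).includes(mode)) {
    throw new Error(`Invalid PERMISSION_MODE: ${mode}`);
  }

  const timeout = Number(env["QUERY_TIMEOUT_MS"] ?? "120000");
  if (!Number.isFinite(timeout) || timeout <= 0) {
    throw new Error("QUERY_TIMEOUT_MS must be a positive number");
  }

  return {
    discordBotToken: env["DISCORD_BOT_TOKEN"] as string,
    anthropicApiKey: env["ANTHROPIC_API_KEY"] as string,
    allowedTools: (env["ALLOWED_TOOLS"] ?? "Read,Glob,Grep,WebSearch,WebFetch")
      .split(",")
      .map((tool) => tool.trim())
      .filter((tool) => tool.length > 0),
    permissionMode: mode as PermissionMode,
    queryTimeoutMs: timeout,
    configDir: env["CONFIG_DIR"] ?? "./config",
  };
}
```

A caller would catch the thrown error at startup, log it, and exit with a non-zero code.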

Requirement 9: Concurrency and Rate Limiting

User Story: As an operator, I want the Gateway to handle multiple simultaneous requests safely, so that it remains stable under load.

Acceptance Criteria

THE Gateway SHALL process prompts from different Discord channels concurrently without blocking.
WHILE a prompt is being processed for a given channel, THE Gateway SHALL queue subsequent prompts from the same channel and process them sequentially to maintain conversation coherence.
IF the number of concurrent active queries exceeds a configurable maximum (default: 5), THEN THE Gateway SHALL reply to new prompts with a message stating that the system is busy and asking the user to retry later.
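
These three rules can be modeled as a small admission state machine, separate from the code that actually runs queries. In this sketch `ChannelScheduler` is a hypothetical class: `submit` decides whether a prompt runs now, waits behind its channel's active prompt, or is refused as busy, and `complete` hands back the next waiting prompt for that channel. It interprets the limit as "refuse a new channel once the maximum number of channels is active", which is one reading of the criterion.

```typescript
type Admission = "run" | "queued" | "busy";

class ChannelScheduler<T> {
  private readonly active = new Set<string>(); // channels with a query in flight
  private readonly pending = new Map<string, T[]>(); // per-channel FIFO backlog
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number = 5) {
    this.maxConcurrent = maxConcurrent;
  }

  submit(channelId: string, job: T): Admission {
    if (this.active.has(channelId)) {
      // Same channel: keep order by waiting behind the active prompt.
      const backlog = this.pending.get(channelId) ?? [];
      backlog.push(job);
      this.pending.set(channelId, backlog);
      return "queued";
    }
    if (this.active.size >= this.maxConcurrent) return "busy";
    this.active.add(channelId);
    return "run";
  }

  // Call when a channel's query finishes. Returns the next job to run for
  // that channel, or undefined if the channel is now idle.
  complete(channelId: string): T | undefined {
    const next = this.pending.get(channelId)?.shift();
    if (next === undefined) {
      this.active.delete(channelId);
      this.pending.delete(channelId);
    }
    return next;
  }
}
```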

Requirement 10: Graceful Shutdown

User Story: As an operator, I want the Gateway to shut down cleanly, so that in-flight requests complete and resources are released.

Acceptance Criteria

WHEN the Gateway receives a SIGTERM or SIGINT signal, THE Gateway SHALL stop accepting new prompts from Discord.
WHEN the Gateway receives a shutdown signal, THE Gateway SHALL wait for all in-flight Agent SDK queries to complete or time out before terminating.
WHEN all in-flight queries have resolved, THE Gateway SHALL disconnect the Discord_Bot and terminate with exit code 0.

Requirement 11: Event Queue

User Story: As an operator, I want all inputs to flow through a unified event queue, so that the system processes events in a consistent, ordered manner regardless of their source.

Acceptance Criteria

THE Event_Queue SHALL accept events of type: message, heartbeat, cron, hook, and webhook.
WHEN an Event is enqueued, THE Event_Queue SHALL assign a monotonically increasing sequence number and record the enqueue timestamp.
THE Event_Queue SHALL dispatch events to the Agent_Runtime in first-in-first-out order.
WHILE the Agent_Runtime is processing an Event, THE Event_Queue SHALL hold subsequent events until the current event completes processing.
IF the Event_Queue exceeds a configurable maximum depth (default: 100), THEN THE Event_Queue SHALL reject new events and log a warning with the current queue depth.
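
A minimal in-memory sketch of the Event_Queue as specified above. The type and method names (`GatewayEvent`, `QueuedEvent`, `dequeue`, `complete`) are hypothetical, and the shape of `payload` and `source` is an assumption, since this document does not define them in detail.

```typescript
type EventType = "message" | "heartbeat" | "cron" | "hook" | "webhook";

interface GatewayEvent {
  type: EventType;
  payload: unknown;
  source: string; // assumed free-form source metadata
}

interface QueuedEvent extends GatewayEvent {
  sequence: number; // monotonically increasing
  enqueuedAt: number; // epoch milliseconds
}

class EventQueue {
  private readonly items: QueuedEvent[] = [];
  private readonly maxDepth: number;
  private nextSequence = 1;
  private processing = false;

  constructor(maxDepth: number = 100) {
    this.maxDepth = maxDepth;
  }

  // Returns the queued event, or null when the queue is full.
  enqueue(event: GatewayEvent): QueuedEvent | null {
    if (this.items.length >= this.maxDepth) {
      console.warn(`event queue full (depth ${this.items.length}); rejecting ${event.type} event`);
      return null;
    }
    const queued: QueuedEvent = { ...event, sequence: this.nextSequence++, enqueuedAt: Date.now() };
    this.items.push(queued);
    return queued;
  }

  // Hands out the oldest event, but only while no other event is in progress.
  dequeue(): QueuedEvent | undefined {
    if (this.processing) return undefined;
    const next = this.items.shift();
    if (next !== undefined) this.processing = true;
    return next;
  }

  // Called by the runtime when the current event has finished processing.
  complete(): void {
    this.processing = false;
  }
}
```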

Requirement 12: Agent Runtime

User Story: As an operator, I want a core agent runtime that processes events from the queue, so that all inputs are handled by the AI agent with full context.

Acceptance Criteria

WHEN the Agent_Runtime dequeues an Event, THE Agent_Runtime SHALL read all markdown configuration files (Soul_Config, Identity_Config, Agents_Config, User_Config, Memory_Store, Tools_Config) from the CONFIG_DIR.
WHEN the Agent_Runtime dequeues an Event, THE Agent_Runtime SHALL assemble the System_Prompt by concatenating the content of the markdown configuration files, each under its own section header, in the order defined in Requirement 22 (Identity_Config, Soul_Config, Agents_Config, User_Config, Memory_Store, Tools_Config).
WHEN the Agent_Runtime processes a message Event, THE Agent_Runtime SHALL call the Agent SDK query() function with the assembled System_Prompt via the systemPrompt option and the message payload as the prompt.
WHEN the Agent_Runtime processes a heartbeat or cron Event, THE Agent_Runtime SHALL call the Agent SDK query() function with the assembled System_Prompt and the event's instruction text as the prompt.
WHEN the Agent_Runtime finishes processing an Event, THE Agent_Runtime SHALL signal the Event_Queue to dispatch the next event.

Requirement 13: Markdown-Based Personality and Identity

User Story: As an operator, I want to define the agent's personality and identity in markdown files, so that I can customize the agent's behavior without code changes.

Acceptance Criteria

THE Agent_Runtime SHALL read Soul_Config from {CONFIG_DIR}/soul.md to obtain personality, tone, values, and behavior defaults.
THE Agent_Runtime SHALL read Identity_Config from {CONFIG_DIR}/identity.md to obtain the agent's name, role, and specialization.
THE Agent_Runtime SHALL read Agents_Config from {CONFIG_DIR}/agents.md to obtain operating rules, workflows, and safety/permission boundaries.
IF Soul_Config, Identity_Config, or Agents_Config is missing, THEN THE Agent_Runtime SHALL log a warning and use an empty string for the missing section in the System_Prompt.
WHEN any personality or identity markdown file is modified on disk, THE Agent_Runtime SHALL read the updated content on the next Event processing cycle without requiring a restart.

Requirement 14: Markdown-Based User Context

User Story: As a user, I want the agent to know about my identity, preferences, and context, so that responses are personalized and relevant.

Acceptance Criteria

THE Agent_Runtime SHALL read User_Config from {CONFIG_DIR}/user.md to obtain user identity, preferences, and life/work context.
WHEN User_Config content is included in the System_Prompt, THE Agent_Runtime SHALL place it in a clearly labeled section so the agent can distinguish user context from other configuration.
IF User_Config is missing, THEN THE Agent_Runtime SHALL log a warning and omit the user context section from the System_Prompt.
WHEN User_Config is modified on disk, THE Agent_Runtime SHALL read the updated content on the next Event processing cycle without requiring a restart.

Requirement 15: Markdown-Based Long-Term Memory

User Story: As a user, I want the agent to remember durable facts, lessons learned, and important information across sessions, so that I do not have to repeat context.

Acceptance Criteria

THE Agent_Runtime SHALL read Memory_Store from {CONFIG_DIR}/memory.md to obtain long-term durable facts, IDs, and lessons learned.
WHEN Memory_Store content is included in the System_Prompt, THE Agent_Runtime SHALL place it in a clearly labeled section titled "Long-Term Memory".
THE Agent_Runtime SHALL instruct the agent (via the System_Prompt) that it may append new facts to Memory_Store by writing to {CONFIG_DIR}/memory.md using the Write tool.
WHEN the agent writes to Memory_Store during event processing, THE Agent_Runtime SHALL read the updated Memory_Store content on the next Event processing cycle.
IF Memory_Store is missing, THEN THE Agent_Runtime SHALL create an empty {CONFIG_DIR}/memory.md file with a "# Memory" header.

Requirement 16: Markdown-Based Tool Configuration

User Story: As an operator, I want to document API configurations, tool usage limits, and gotchas in a markdown file, so that the agent has reference material for using external tools correctly.

Acceptance Criteria

THE Agent_Runtime SHALL read Tools_Config from {CONFIG_DIR}/tools.md to obtain API documentation, tool configurations, usage limits, and known gotchas.
WHEN Tools_Config content is included in the System_Prompt, THE Agent_Runtime SHALL place it in a clearly labeled section titled "Tool Configuration".
IF Tools_Config is missing, THEN THE Agent_Runtime SHALL log a warning and omit the tool configuration section from the System_Prompt.
WHEN Tools_Config is modified on disk, THE Agent_Runtime SHALL read the updated content on the next Event processing cycle without requiring a restart.

Requirement 17: Heartbeat System

User Story: As a user, I want the agent to proactively perform checks at regular intervals, so that it can monitor things like email, calendar, or other services without me asking.

Acceptance Criteria

THE Gateway SHALL read Heartbeat_Config from {CONFIG_DIR}/heartbeat.md to obtain a list of proactive check definitions, each with an instruction prompt and an interval in seconds.
WHEN the Gateway starts and Heartbeat_Config contains valid check definitions, THE Gateway SHALL start a recurring timer for each defined check at its configured interval.
WHEN a heartbeat timer fires, THE Gateway SHALL create a heartbeat Event with the check's instruction prompt as the payload and enqueue it in the Event_Queue.
WHEN the Agent_Runtime processes a heartbeat Event, THE Agent_Runtime SHALL use the heartbeat instruction as the prompt and deliver any response to the configured output channel.
IF Heartbeat_Config is missing, THEN THE Gateway SHALL log an informational message and operate without heartbeat events.
IF a heartbeat check definition has an interval less than 60 seconds, THEN THE Gateway SHALL reject the definition and log a warning indicating the minimum interval is 60 seconds.
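
The minimum-interval rule can be checked once the definitions have been parsed. This document does not specify the layout of heartbeat.md, so the sketch below starts from an already-parsed list; `HeartbeatCheck` and `validateHeartbeats` are hypothetical names.

```typescript
interface HeartbeatCheck {
  instruction: string; // prompt sent to the agent when the timer fires
  intervalSeconds: number;
}

const MIN_INTERVAL_SECONDS = 60;

// Splits parsed definitions into those that may be scheduled and those
// rejected for having an interval below the minimum (or a non-numeric one).
function validateHeartbeats(checks: HeartbeatCheck[]): {
  accepted: HeartbeatCheck[];
  rejected: HeartbeatCheck[];
} {
  const accepted: HeartbeatCheck[] = [];
  const rejected: HeartbeatCheck[] = [];
  for (const check of checks) {
    const valid = Number.isFinite(check.intervalSeconds) && check.intervalSeconds >= MIN_INTERVAL_SECONDS;
    if (valid) {
      accepted.push(check);
    } else {
      console.warn(`heartbeat rejected (minimum interval is ${MIN_INTERVAL_SECONDS} seconds): ${check.instruction}`);
      rejected.push(check);
    }
  }
  return { accepted, rejected };
}
```

Each accepted check would then get its own recurring timer at `intervalSeconds * 1000` milliseconds.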

Requirement 18: Cron Job System

User Story: As an operator, I want to schedule events at exact times using cron expressions, so that the agent can perform tasks on a precise schedule.

Acceptance Criteria

THE Gateway SHALL read cron job definitions from {CONFIG_DIR}/agents.md under a "Cron Jobs" section, each with a cron expression and an instruction prompt.
WHEN the Gateway starts and valid cron job definitions exist, THE Gateway SHALL schedule each cron job according to its cron expression.
WHEN a cron job fires at its scheduled time, THE Gateway SHALL create a cron Event with the job's instruction prompt as the payload and enqueue it in the Event_Queue.
WHEN the Agent_Runtime processes a cron Event, THE Agent_Runtime SHALL use the cron instruction as the prompt and deliver any response to the configured output channel.
IF a cron expression is syntactically invalid, THEN THE Gateway SHALL log a warning identifying the invalid cron job and skip scheduling for that job.
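
To illustrate the "skip invalid cron jobs" rule, here is a deliberately small syntax check. It assumes the common five-field format (minute, hour, day of month, month, day of week) with `*`, numbers, ranges, lists, and `/step`; this document does not say which cron dialect the Gateway uses, and a real implementation would more likely rely on its scheduling library's own validation. Names such as `MON` or `JAN` are not handled.

```typescript
// Allowed numeric range per field: minute, hour, day of month, month, day of week.
const FIELD_RANGES: ReadonlyArray<readonly [number, number]> = [
  [0, 59],
  [0, 23],
  [1, 31],
  [1, 12],
  [0, 7],
];

function isValidField(field: string, min: number, max: number): boolean {
  if (field.length === 0) return false;
  return field.split(",").every((item) => {
    const parts = item.split("/");
    if (parts.length > 2) return false;
    const range = parts[0] ?? "";
    const step = parts[1];
    if (step !== undefined && !(/^\d+$/.test(step) && Number(step) > 0)) return false;
    if (range === "*") return true;
    const match = /^(\d+)(?:-(\d+))?$/.exec(range);
    if (match === null) return false;
    const low = Number(match[1]);
    const high = match[2] !== undefined ? Number(match[2]) : low;
    return low >= min && high <= max && low <= high;
  });
}

function isValidCron(expression: string): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== FIELD_RANGES.length) return false;
  return fields.every((field, index) => {
    const range = FIELD_RANGES[index];
    return range !== undefined && isValidField(field, range[0], range[1]);
  });
}
```

Jobs for which `isValidCron` returns false would be logged and skipped, while the remaining jobs are scheduled normally.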

Requirement 19: Hook System

User Story: As an operator, I want internal state changes to trigger agent actions, so that the agent can respond to lifecycle events like startup, shutdown, or session changes.

Acceptance Criteria

THE Gateway SHALL support the following Hook types: startup, agent_begin, agent_stop, and shutdown.
WHEN the Gateway completes initialization (Discord_Bot ready and Agent_Runtime ready), THE Gateway SHALL fire a startup Hook event and enqueue it in the Event_Queue.
WHEN the Agent_Runtime begins processing any Event, THE Agent_Runtime SHALL fire an agent_begin Hook event (the agent_begin hook is processed inline, not re-enqueued).
WHEN the Agent_Runtime finishes processing any Event, THE Agent_Runtime SHALL fire an agent_stop Hook event (the agent_stop hook is processed inline, not re-enqueued).
WHEN the Gateway begins its shutdown sequence, THE Gateway SHALL fire a shutdown Hook event and enqueue it in the Event_Queue, waiting for it to be processed before completing shutdown.
THE Gateway SHALL read Hook instruction prompts from {CONFIG_DIR}/agents.md under a "Hooks" section, mapping each Hook type to an optional instruction prompt.

Requirement 20: Bootstrap and Boot System

User Story: As an operator, I want the agent to perform an initial setup sequence on first run, so that all markdown configuration files are validated and the runtime is properly initialized.

Acceptance Criteria

THE Gateway SHALL read Boot_Config from {CONFIG_DIR}/boot.md to obtain bootstrap parameters including required markdown files, default content templates, and initialization instructions.
WHEN the Bootstrap_Process runs, THE Gateway SHALL verify that all markdown configuration files referenced in Boot_Config exist in CONFIG_DIR.
IF a required markdown configuration file is missing during bootstrap, THEN THE Gateway SHALL create the file with default content as specified in Boot_Config.
WHEN the Bootstrap_Process completes successfully, THE Gateway SHALL log the list of configuration files loaded and any files that were created with defaults.
IF Boot_Config is missing, THEN THE Gateway SHALL use built-in defaults: require only soul.md and identity.md, and create any missing optional files with empty section headers.

Requirement 21: State Persistence

User Story: As a user, I want the agent's state to persist across restarts, so that memory, preferences, and context survive Gateway restarts.

Acceptance Criteria

THE Agent_Runtime SHALL persist all state exclusively in markdown files within CONFIG_DIR.
WHEN the agent modifies Memory_Store during event processing, THE Agent_Runtime SHALL ensure the changes are written to disk before signaling event completion.
WHEN the Gateway restarts, THE Agent_Runtime SHALL reconstruct its full context by reading all markdown configuration files from CONFIG_DIR, requiring no additional state recovery mechanism.
THE Gateway SHALL treat the CONFIG_DIR markdown files as the single source of truth for all agent state and configuration.

Requirement 22: System Prompt Assembly

User Story: As an operator, I want the system prompt to be dynamically assembled from all markdown configuration files, so that the agent always operates with the latest context.

Acceptance Criteria

WHEN assembling the System_Prompt, THE Agent_Runtime SHALL read each markdown file and wrap its content in a labeled section using the format: ## [Section Name]\n\n[file content].
THE Agent_Runtime SHALL assemble sections in the following order: Identity_Config, Soul_Config, Agents_Config, User_Config, Memory_Store, Tools_Config.
THE Agent_Runtime SHALL omit sections for markdown files that are missing or empty, rather than including empty sections.
THE Agent_Runtime SHALL include a preamble in the System_Prompt instructing the agent that it may update memory.md to persist new long-term facts using the Write tool.
FOR ALL valid markdown configuration file sets, assembling the System_Prompt then parsing the section headers SHALL produce the same set of section names as the input files (round-trip property).
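
The assembly rules and the round-trip property can be sketched as a pair of functions. The section titles "Long-Term Memory" and "Tool Configuration" come from Requirements 15 and 16; the other titles, the preamble wording, and the function names are assumptions made for illustration. File contents are passed in already read, with missing files simply absent.

```typescript
// Section order from this requirement; titles other than "Long-Term Memory"
// and "Tool Configuration" are hypothetical.
const SECTIONS: ReadonlyArray<{ title: string; file: string }> = [
  { title: "Identity", file: "identity.md" },
  { title: "Soul", file: "soul.md" },
  { title: "Operating Rules", file: "agents.md" },
  { title: "User Context", file: "user.md" },
  { title: "Long-Term Memory", file: "memory.md" },
  { title: "Tool Configuration", file: "tools.md" },
];

// Hypothetical preamble wording.
const PREAMBLE =
  "You may persist new long-term facts by updating memory.md with the Write tool.";

function assembleSystemPrompt(files: Record<string, string | undefined>): string {
  const parts: string[] = [PREAMBLE];
  for (const section of SECTIONS) {
    const content = files[section.file]?.trim();
    if (!content) continue; // omit missing or empty files entirely
    parts.push(`## ${section.title}\n\n${content}`);
  }
  return parts.join("\n\n");
}

// Recovers the section titles from an assembled prompt.
function parseSectionTitles(prompt: string): string[] {
  return prompt
    .split("\n")
    .filter((line) => line.startsWith("## "))
    .map((line) => line.slice(3));
}
```

Note that in this sketch the round-trip property only holds when the files themselves contain no level-2 (`## `) headings; a real implementation would need to demote or escape such headings, or use a delimiter that cannot appear in file content.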
