Implement BackendAdapter interface with four CLI backends:

- ClaudeCodeBackend (extracted from AgentRuntime)
- CodexBackend (OpenAI Codex CLI)
- GeminiBackend (Google Gemini CLI)
- OpenCodeBackend (OpenCode CLI)

Add BackendRegistry for resolution/creation via AGENT_BACKEND env var. Refactor AgentRuntime to delegate to BackendAdapter instead of hardcoding Claude CLI. Update GatewayConfig with new env vars (AGENT_BACKEND, BACKEND_CLI_PATH, BACKEND_MODEL, BACKEND_MAX_TURNS). Includes 10 property-based test files and unit tests for edge cases.

# Requirements Document

## Introduction

The gateway currently hardcodes Claude Code CLI as its sole agent backend. This feature introduces a pluggable CLI backend system that lets operators choose between Claude Code CLI, OpenCode CLI, Codex CLI, and Gemini CLI. Each backend has its own command-line interface, output format, and session management semantics. The system must abstract these differences behind a unified interface so that the rest of the gateway (event processing, session management, Discord integration) remains unchanged.

## Glossary

- **Gateway**: The Discord-to-agent bridge application (Aetheel) that receives prompts and dispatches them to a CLI backend
- **CLI_Backend**: A pluggable module that knows how to spawn a specific CLI tool, pass prompts and system prompts, parse output, and manage sessions
- **Backend_Registry**: The component that holds all available CLI_Backend implementations and resolves the active one from configuration
- **Agent_Runtime**: The existing `AgentRuntime` class that orchestrates event processing; it will delegate CLI execution to the active CLI_Backend
- **Backend_Adapter**: An interface that each CLI_Backend must implement, defining spawn, parse, and session operations
- **Session_ID**: An opaque string returned by a CLI backend that allows resuming a prior conversation
- **Event_Result**: The normalized response object returned by any CLI_Backend after processing a prompt

## Requirements

### Requirement 1: Backend Adapter Interface

**User Story:** As a developer, I want a common interface for all CLI backends, so that the gateway can interact with any backend without knowing its implementation details.

#### Acceptance Criteria

1. THE Backend_Adapter SHALL define a method to execute a prompt given a prompt string, a system prompt string, an optional Session_ID, and an optional streaming callback
2. THE Backend_Adapter SHALL return an Event_Result containing the response text, an optional Session_ID for continuation, and an error flag
3. THE Backend_Adapter SHALL define a method to return the backend name as a string identifier
4. THE Backend_Adapter SHALL define a method to validate that the CLI tool is reachable on the system (e.g., binary exists at configured path)

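
The criteria above could translate into a TypeScript interface along the following lines. This is a minimal sketch, not the project's actual definition: the method names `name`, `validate`, and `execute`, the `StreamCallback` type, and the `EchoBackend` stand-in are all assumptions made for illustration. Only the `responseText`, `sessionId`, and `isError` field names come from this document (Requirement 8).

```typescript
// Normalized result returned by every backend (field names from Requirement 8).
interface EventResult {
  responseText?: string;
  sessionId?: string;
  isError: boolean;
}

// Hypothetical name for the optional streaming callback.
type StreamCallback = (partialText: string) => void;

// Hypothetical method names; the requirements only describe their purpose.
interface BackendAdapter {
  // Short string identifier such as "claude" or "codex".
  name(): string;
  // Resolves to true when the CLI tool is reachable.
  validate(): Promise<boolean>;
  execute(
    prompt: string,
    systemPrompt: string,
    sessionId?: string,
    onStream?: StreamCallback,
  ): Promise<EventResult>;
}

// Stand-in backend that echoes the prompt, so callers can be exercised
// without spawning a real CLI process.
class EchoBackend implements BackendAdapter {
  name(): string {
    return "echo";
  }

  async validate(): Promise<boolean> {
    return true;
  }

  async execute(
    prompt: string,
    _systemPrompt: string,
    sessionId?: string,
    onStream?: StreamCallback,
  ): Promise<EventResult> {
    onStream?.(prompt);
    return { responseText: prompt, sessionId: sessionId ?? "echo-session", isError: false };
  }
}
```

A stand-in like `EchoBackend` is mainly useful in tests: anything written against `BackendAdapter` can be checked without a CLI tool installed.
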
### Requirement 2: Claude Code CLI Backend

**User Story:** As an operator, I want the existing Claude Code CLI integration preserved as a backend, so that current deployments continue working without changes.

#### Acceptance Criteria

1. THE Claude_Code_Backend SHALL implement the Backend_Adapter interface
2. THE Claude_Code_Backend SHALL spawn the Claude CLI with `-p`, `--output-format json`, `--dangerously-skip-permissions`, and `--append-system-prompt-file` flags
3. WHEN a Session_ID is provided, THE Claude_Code_Backend SHALL pass `--resume <Session_ID>` to the CLI process
4. THE Claude_Code_Backend SHALL parse the JSON array output to extract `session_id` from `system/init` objects and `result` from `result` objects
5. THE Claude_Code_Backend SHALL pass `--allowedTools` flags for each tool in the configured allowed tools list
6. THE Claude_Code_Backend SHALL pass `--max-turns <N>` to the CLI process, where N is the configured turn limit (25 by default)

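
As a rough illustration of these criteria, the sketch below builds the argument list and parses the JSON array output. The function names and the `ClaudeArgsInput` shape are hypothetical. Reading `system/init` as an object with `type: "system"` and `subtype: "init"` is an interpretation of the wording above, not something this document states outright.

```typescript
// Hypothetical input shape for building the Claude CLI argument list.
interface ClaudeArgsInput {
  prompt: string;
  systemPromptFile: string;
  sessionId?: string;
  allowedTools: string[];
  maxTurns: number;
}

function buildClaudeArgs(input: ClaudeArgsInput): string[] {
  const args = [
    "-p", input.prompt,
    "--output-format", "json",
    "--dangerously-skip-permissions",
    "--append-system-prompt-file", input.systemPromptFile,
    "--max-turns", String(input.maxTurns),
  ];
  if (input.sessionId) args.push("--resume", input.sessionId);
  // One --allowedTools flag per configured tool.
  for (const tool of input.allowedTools) args.push("--allowedTools", tool);
  return args;
}

interface ParsedClaudeOutput {
  sessionId?: string;
  responseText?: string;
}

// Assumes stdout is a JSON array of message objects. A single object is
// wrapped in an array so both shapes are handled.
function parseClaudeOutput(stdout: string): ParsedClaudeOutput {
  const parsed: unknown = JSON.parse(stdout);
  const messages: unknown[] = Array.isArray(parsed) ? parsed : [parsed];
  const out: ParsedClaudeOutput = {};
  for (const msg of messages) {
    if (typeof msg !== "object" || msg === null) continue;
    const m = msg as Record<string, unknown>;
    const sessionId = m["session_id"];
    const result = m["result"];
    // Assumed reading of "system/init": type "system" with subtype "init".
    if (m["type"] === "system" && m["subtype"] === "init" && typeof sessionId === "string") {
      out.sessionId = sessionId;
    }
    if (m["type"] === "result" && typeof result === "string") {
      out.responseText = result;
    }
  }
  return out;
}
```

Keeping argument construction and output parsing as pure functions means both can be tested without spawning the CLI.
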
### Requirement 3: Codex CLI Backend

**User Story:** As an operator, I want to use OpenAI Codex CLI as a backend, so that I can leverage OpenAI models through the gateway.

#### Acceptance Criteria

1. THE Codex_Backend SHALL implement the Backend_Adapter interface
2. THE Codex_Backend SHALL spawn the Codex CLI using the `codex exec` subcommand for non-interactive execution
3. THE Codex_Backend SHALL pass `--json` to receive newline-delimited JSON output
4. THE Codex_Backend SHALL pass `--dangerously-bypass-approvals-and-sandbox` to skip approval prompts
5. WHEN a working directory is configured, THE Codex_Backend SHALL pass `--cd <path>` to set the workspace root
6. THE Codex_Backend SHALL parse the newline-delimited JSON events to extract the final assistant message as the response text
7. WHEN a Session_ID is provided, THE Codex_Backend SHALL use `codex exec resume <Session_ID>` to continue a prior session

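
The newline-delimited JSON handling described above might look like the following sketch. Two things here are assumptions rather than facts from this document: the placement of flags before the `resume` subcommand, and the event shape (`type: "item.completed"` wrapping an `item` of type `agent_message` with a `text` field). Both should be checked against the installed Codex CLI version. All function names are hypothetical.

```typescript
function buildCodexArgs(
  prompt: string,
  opts: { sessionId?: string; workingDir?: string } = {},
): string[] {
  const args = ["exec", "--json", "--dangerously-bypass-approvals-and-sandbox"];
  if (opts.workingDir) args.push("--cd", opts.workingDir);
  // Assumption: flags precede the `resume` subcommand.
  if (opts.sessionId) args.push("resume", opts.sessionId);
  args.push(prompt);
  return args;
}

// Parses newline-delimited JSON, skipping blank lines and lines that are
// not valid JSON objects (for example stray log output).
function parseNdjson(stdout: string): Record<string, unknown>[] {
  const events: Record<string, unknown>[] = [];
  for (const line of stdout.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      if (typeof value === "object" && value !== null && !Array.isArray(value)) {
        events.push(value as Record<string, unknown>);
      }
    } catch {
      // Not JSON: ignore the line.
    }
  }
  return events;
}

// Returns the text of the last assistant message, under the ASSUMED event shape.
function finalAssistantMessage(events: Record<string, unknown>[]): string | undefined {
  let last: string | undefined;
  for (const ev of events) {
    const item = ev["item"];
    if (ev["type"] !== "item.completed" || typeof item !== "object" || item === null) continue;
    const fields = item as Record<string, unknown>;
    const text = fields["text"];
    if (fields["type"] === "agent_message" && typeof text === "string") last = text;
  }
  return last;
}
```

Because earlier assistant messages may be intermediate commentary, the sketch keeps overwriting `last` and returns only the final one, which matches criterion 6.
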
### Requirement 4: Gemini CLI Backend

**User Story:** As an operator, I want to use Google Gemini CLI as a backend, so that I can leverage Gemini models through the gateway.

#### Acceptance Criteria

1. THE Gemini_Backend SHALL implement the Backend_Adapter interface
2. THE Gemini_Backend SHALL spawn the Gemini CLI with the prompt as a positional argument for non-interactive one-shot mode
3. THE Gemini_Backend SHALL pass `--output-format json` to receive structured JSON output
4. THE Gemini_Backend SHALL pass `--approval-mode yolo` to auto-approve tool executions
5. WHEN a Session_ID is provided, THE Gemini_Backend SHALL pass `--resume <Session_ID>` to continue a prior session
6. THE Gemini_Backend SHALL parse the JSON output to extract the response text

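
A possible shape for the Gemini argument list and output parsing is sketched below. The function names are hypothetical, and the assumption that the response text lives in a top-level `response` field of the JSON output is not confirmed by this document.

```typescript
function buildGeminiArgs(prompt: string, sessionId?: string): string[] {
  const args = ["--output-format", "json", "--approval-mode", "yolo"];
  if (sessionId) args.push("--resume", sessionId);
  // The prompt goes last, as a positional argument.
  args.push(prompt);
  return args;
}

// Assumption: the JSON output is an object with a string `response` field.
// Returns undefined when the output is not JSON or lacks that field.
function parseGeminiOutput(stdout: string): string | undefined {
  try {
    const parsed: unknown = JSON.parse(stdout);
    if (typeof parsed !== "object" || parsed === null) return undefined;
    const response = (parsed as Record<string, unknown>)["response"];
    return typeof response === "string" ? response : undefined;
  } catch {
    return undefined;
  }
}
```

Returning `undefined` rather than throwing lets the caller decide how to report malformed output in the Event_Result.
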
### Requirement 5: OpenCode CLI Backend

**User Story:** As an operator, I want to use OpenCode CLI as a backend, so that I can leverage multiple model providers through OpenCode's provider system.

#### Acceptance Criteria

1. THE OpenCode_Backend SHALL implement the Backend_Adapter interface
2. THE OpenCode_Backend SHALL spawn the OpenCode CLI using the `opencode run` subcommand for non-interactive execution
3. THE OpenCode_Backend SHALL pass `--format json` to receive JSON event output
4. WHEN a Session_ID is provided, THE OpenCode_Backend SHALL pass `--session <Session_ID> --continue` to resume a prior session
5. WHEN a model is configured, THE OpenCode_Backend SHALL pass `--model <provider/model>` to select the model
6. THE OpenCode_Backend SHALL parse the JSON events to extract the final response text

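
The flag handling in criteria 2 through 5 can be sketched as a small pure function. The function name and options shape are hypothetical, and the ordering of flags relative to the prompt is an assumption.

```typescript
// Hypothetical options shape for the OpenCode argument builder.
interface OpenCodeOptions {
  sessionId?: string;
  model?: string; // expected in "provider/model" form
}

function buildOpenCodeArgs(prompt: string, opts: OpenCodeOptions = {}): string[] {
  const args = ["run", "--format", "json"];
  if (opts.sessionId) args.push("--session", opts.sessionId, "--continue");
  if (opts.model) args.push("--model", opts.model);
  // Assumption: the prompt is passed last, after all flags.
  args.push(prompt);
  return args;
}
```

For example, `buildOpenCodeArgs("hello", { model: "provider/model" })` yields `run --format json --model provider/model hello`.
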
### Requirement 6: Backend Selection via Configuration

**User Story:** As an operator, I want to select which CLI backend to use through environment variables, so that I can switch backends without code changes.

#### Acceptance Criteria

1. THE Gateway SHALL read an `AGENT_BACKEND` environment variable to determine which CLI_Backend to activate
2. THE Gateway SHALL accept values `claude`, `codex`, `gemini`, and `opencode` for the `AGENT_BACKEND` variable
3. WHEN `AGENT_BACKEND` is not set, THE Gateway SHALL default to `claude` for backward compatibility
4. THE Gateway SHALL read a `BACKEND_CLI_PATH` environment variable to override the default binary path for the selected backend
5. IF an unrecognized value is provided for `AGENT_BACKEND`, THEN THE Gateway SHALL fail at startup with a descriptive error message listing valid options

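
The selection rules above could be implemented roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, an empty `AGENT_BACKEND` is treated the same as an unset one, and the default binary names in `DEFAULT_BINARIES` are illustrative guesses rather than values taken from this document. The environment is passed in as a plain object so the logic can be tested without touching the real process environment.

```typescript
const VALID_BACKENDS = ["claude", "codex", "gemini", "opencode"] as const;
type BackendName = (typeof VALID_BACKENDS)[number];

type Env = Record<string, string | undefined>;

// Assumed default binary names, resolved via PATH when no override is given.
const DEFAULT_BINARIES: Record<BackendName, string> = {
  claude: "claude",
  codex: "codex",
  gemini: "gemini",
  opencode: "opencode",
};

function resolveBackendName(env: Env): BackendName {
  const raw = env["AGENT_BACKEND"];
  // Assumption: an empty value counts as "not set".
  if (raw === undefined || raw === "") return "claude";
  const match = VALID_BACKENDS.find((name) => name === raw);
  if (match === undefined) {
    throw new Error(
      `Unrecognized AGENT_BACKEND "${raw}". Valid options: ${VALID_BACKENDS.join(", ")}`,
    );
  }
  return match;
}

function resolveCliPath(env: Env, backend: BackendName): string {
  return env["BACKEND_CLI_PATH"] ?? DEFAULT_BINARIES[backend];
}
```

In the gateway itself, the caller would pass `process.env` and let the thrown error abort startup.
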
### Requirement 7: Backend-Specific Configuration

**User Story:** As an operator, I want to pass backend-specific settings through environment variables, so that I can tune each backend's behavior.

#### Acceptance Criteria

1. THE Gateway SHALL read a `BACKEND_MODEL` environment variable to pass a model override to the active CLI_Backend
2. THE Gateway SHALL read a `BACKEND_MAX_TURNS` environment variable to limit the number of agentic turns, defaulting to 25
3. WHEN the active backend does not support a configured option, THE Gateway SHALL log a warning and ignore the unsupported option
4. THE Gateway SHALL pass the existing `ALLOWED_TOOLS` configuration to backends that support tool filtering

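
One way to handle the default for `BACKEND_MAX_TURNS` and the warn-and-ignore rule is sketched below. The function names and the `OptionName` type are hypothetical. Rejecting a non-numeric value with an error is an assumption, since the criteria do not say what should happen in that case. Which options each backend supports is deliberately left to the caller, because this document does not specify it.

```typescript
// Parses BACKEND_MAX_TURNS, falling back to 25 when unset or blank.
// Assumption: an invalid value is a configuration error and throws.
function parseMaxTurns(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return 25;
  const value = Number(raw.trim());
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`BACKEND_MAX_TURNS must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Hypothetical names for the configurable options.
type OptionName = "model" | "maxTurns" | "allowedTools";

// Keeps only the options the backend supports, warning once per dropped option.
function dropUnsupported(
  backend: string,
  configured: OptionName[],
  supported: ReadonlySet<OptionName>,
  warn: (message: string) => void,
): OptionName[] {
  return configured.filter((option) => {
    if (supported.has(option)) return true;
    warn(`Backend "${backend}" does not support option "${option}"; ignoring it`);
    return false;
  });
}
```

Injecting `warn` keeps the function independent of any particular logger.
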
### Requirement 8: Unified Output Parsing

**User Story:** As a developer, I want each backend to normalize its output into a common format, so that downstream processing (Discord messaging, archiving) works identically regardless of backend.

#### Acceptance Criteria

1. THE Backend_Adapter SHALL return an Event_Result with fields: `responseText` (string or undefined), `sessionId` (string or undefined), and `isError` (boolean)
2. WHEN a CLI_Backend process exits with a non-zero exit code, THE Backend_Adapter SHALL set `isError` to true and include the stderr content in `responseText`
3. WHEN a CLI_Backend process exceeds the configured query timeout, THE Backend_Adapter SHALL terminate the process and return an Event_Result with `isError` set to true and `responseText` set to "Query timed out"
4. THE Backend_Adapter SHALL support an optional streaming callback that receives partial result text as the CLI process produces output

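
The error and timeout rules in criteria 2 and 3 can be captured in one normalization step shared by all backends. In the sketch below, `ProcessOutcome` and `normalizeOutcome` are hypothetical names, and treating a `null` exit code (a process killed by a signal) as an error is an assumption. Spawning the process and enforcing the timeout are left out, so only the mapping to an Event_Result is shown.

```typescript
interface EventResult {
  responseText?: string;
  sessionId?: string;
  isError: boolean;
}

// Hypothetical summary of a finished CLI process.
interface ProcessOutcome {
  exitCode: number | null; // null when the process was killed by a signal
  stdout: string;
  stderr: string;
  timedOut: boolean;
}

// Backend-specific parser that turns stdout into the non-error fields.
type OutputParser = (stdout: string) => { responseText?: string; sessionId?: string };

function normalizeOutcome(outcome: ProcessOutcome, parse: OutputParser): EventResult {
  // A timeout takes priority: the process was terminated, so its exit
  // code and output are not meaningful.
  if (outcome.timedOut) {
    return { responseText: "Query timed out", isError: true };
  }
  // Assumption: a null exit code is treated like a non-zero one.
  if (outcome.exitCode !== 0) {
    return { responseText: outcome.stderr, isError: true };
  }
  return { ...parse(outcome.stdout), isError: false };
}
```

With this split, each backend only supplies its own `parse` function, and the timeout and exit-code behavior stays identical across backends.
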
### Requirement 9: Backend Validation at Startup

**User Story:** As an operator, I want the gateway to verify the selected backend is available at startup, so that I get immediate feedback if the CLI tool is missing or misconfigured.

#### Acceptance Criteria

1. WHEN the Gateway starts, THE Backend_Registry SHALL invoke the active CLI_Backend's validation method
2. IF the validation fails, THEN THE Gateway SHALL log an error with the backend name and configured path, and exit with a non-zero exit code
3. THE validation method SHALL check that the configured CLI binary path is executable

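
The startup check could be wired up roughly as shown below. All names here are hypothetical, and the logger and exit function are injected so the flow can be exercised without terminating the process. In a Node.js gateway, a natural way to implement the `validate` method itself would be `fs.promises.access(path, fs.constants.X_OK)`, which rejects when the path is missing or not executable.

```typescript
// Minimal view of a backend needed for the startup check (hypothetical names).
interface ValidatableBackend {
  name(): string;
  validate(): Promise<boolean>;
}

async function validateAtStartup(
  backend: ValidatableBackend,
  cliPath: string,
  logError: (message: string) => void,
  exit: (code: number) => void,
): Promise<void> {
  // A validator that throws is treated the same as one that returns false.
  const ok = await backend.validate().catch(() => false);
  if (!ok) {
    logError(`Backend "${backend.name()}" is not available at "${cliPath}"`);
    exit(1);
  }
}
```

In production the caller would pass `console.error` and `process.exit`. A test can pass recording functions instead.
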
### Requirement 10: Agent Runtime Refactoring

**User Story:** As a developer, I want the AgentRuntime to delegate CLI execution to the Backend_Adapter, so that the runtime is decoupled from any specific CLI tool.

#### Acceptance Criteria

1. THE Agent_Runtime SHALL accept a Backend_Adapter instance through its constructor instead of directly referencing Claude CLI configuration
2. THE Agent_Runtime SHALL call the Backend_Adapter's execute method instead of spawning CLI processes directly
3. THE Agent_Runtime SHALL map the Backend_Adapter's Event_Result to the existing EventResult interface used by the rest of the gateway
4. WHEN the Backend_Adapter returns a Session_ID, THE Agent_Runtime SHALL store the Session_ID in the Session_Manager for the corresponding channel

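
The delegation described above might be structured like the following sketch. It is not the existing `AgentRuntime`: the `processPrompt` method, the `SessionManager` API, and the use of a channel ID as the session key are assumptions for illustration, and the mapping to the gateway's existing `EventResult` is reduced to returning the backend result as-is.

```typescript
interface EventResult {
  responseText?: string;
  sessionId?: string;
  isError: boolean;
}

// Only the method the runtime needs (hypothetical signature).
interface BackendAdapter {
  execute(prompt: string, systemPrompt: string, sessionId?: string): Promise<EventResult>;
}

// Hypothetical in-memory session store keyed by channel ID.
class SessionManager {
  private readonly sessions = new Map<string, string>();

  get(channelId: string): string | undefined {
    return this.sessions.get(channelId);
  }

  set(channelId: string, sessionId: string): void {
    this.sessions.set(channelId, sessionId);
  }
}

class AgentRuntime {
  private readonly backend: BackendAdapter;
  private readonly sessions: SessionManager;

  // The backend is injected, so the runtime has no knowledge of any specific CLI.
  constructor(backend: BackendAdapter, sessions: SessionManager) {
    this.backend = backend;
    this.sessions = sessions;
  }

  async processPrompt(channelId: string, prompt: string, systemPrompt: string): Promise<EventResult> {
    const priorSession = this.sessions.get(channelId);
    const result = await this.backend.execute(prompt, systemPrompt, priorSession);
    // Remember the session so the next prompt in this channel resumes it.
    if (result.sessionId) this.sessions.set(channelId, result.sessionId);
    return result;
  }
}
```

Because the runtime depends only on the interface, swapping backends becomes a matter of which adapter the registry hands to the constructor.
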