# Aetheel Security Audit

**Date:** February 17, 2026
**Scope:** Full codebase review of all modules

---

## CRITICAL

### 1. Path Traversal in `memory/manager.py` → `read_file()`

The method accepts absolute paths and resolves them with `os.path.realpath()` but never validates the result is within the workspace directory. An attacker (or the AI itself) could read arbitrary files:

```python
# Current code — no containment check
if os.path.isabs(raw):
    abs_path = os.path.realpath(raw)
```

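**Fix:** After resolving the path, verify that `abs_path` lies inside the resolved workspace root and raise `ValueError("path outside workspace")` otherwise. Compare whole path components, for example `os.path.commonpath([root, abs_path]) == root`. A plain `abs_path.startswith(self._workspace_dir)` check would also accept a sibling directory whose name merely begins with the workspace path, such as `<workspace>-backup`.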
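
A minimal sketch of such a containment check follows. The helper name `resolve_in_workspace` and its signature are hypothetical and not taken from `memory/manager.py`. It also assumes relative paths are meant to be resolved against the workspace root.

```python
import os


def resolve_in_workspace(raw, workspace_dir):
    """Resolve `raw` and refuse anything that lands outside `workspace_dir`."""
    root = os.path.realpath(workspace_dir)
    # Assumption: relative paths are interpreted relative to the workspace root.
    candidate = raw if os.path.isabs(raw) else os.path.join(root, raw)
    # realpath follows symlinks, so a link pointing outside is caught too.
    abs_path = os.path.realpath(candidate)
    # commonpath compares whole components, so "<root>-evil" is not
    # mistaken for a child of "<root>" the way startswith() would be.
    if os.path.commonpath([root, abs_path]) != root:
        raise ValueError("path outside workspace")
    return abs_path
```

Resolving both the root and the candidate with `os.path.realpath()` before comparing keeps the check consistent when the workspace itself sits behind a symlink.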

### 2. Arbitrary Code Execution via Hook `handler.py` Loading

`hooks/hooks.py` → `_load_handler` uses `importlib.util.spec_from_file_location` to dynamically load and execute arbitrary Python from `handler.py` files found in the workspace. If an attacker can write a file to `~/.aetheel/workspace/hooks/<name>/handler.py`, they get full code execution. There's no sandboxing, signature verification, or allowlisting.
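
One possible mitigation is an allowlist of known handler digests that is checked before the file is imported. The sketch below is an assumption about how that could look. `is_trusted_handler` and the `trusted_sha256` set are hypothetical names, and where the allowlist is stored is left open.

```python
import hashlib
from pathlib import Path


def is_trusted_handler(handler_path, trusted_sha256):
    """Return True only if the file's SHA-256 digest is on the allowlist."""
    digest = hashlib.sha256(Path(handler_path).read_bytes()).hexdigest()
    return digest in trusted_sha256
```

A check like this only helps if the allowlist lives somewhere the attacker cannot write, and if the loader executes the same bytes it hashed rather than re-reading the file afterwards.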

### 3. Webhook Auth Defaults to Open Access

`webhooks/receiver.py` → `_check_auth`:

```python
if not self._config.token:
    return True  # No token configured = open access
```

If the webhook receiver is enabled without a token, anyone on the network can trigger AI actions. The default config writes `"token": ""` which means open access.
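
A fail-closed variant could look like the sketch below. It is a standalone function rather than the actual `_check_auth` method, and the parameter names are assumptions.

```python
import hmac


def check_auth(presented, configured):
    """Fail closed: an empty configured token rejects every request."""
    if not configured:
        return False  # no token configured = no access
    if presented is None:
        return False
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(presented.encode(), configured.encode())
```

With this behaviour, an empty `"token": ""` in the default config disables the endpoint instead of opening it.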

### 4. AI-Controlled Action Tags Execute Without Validation

`main.py` → `_process_action_tags` parses the AI's response text for action tags like `[ACTION:cron|...]`, `[ACTION:spawn|...]`, and `[ACTION:remind|...]`. The AI can:

- Schedule arbitrary cron jobs with any expression
- Spawn unlimited subagent tasks
- Set reminders with any delay

There's no validation that the AI was asked to do this, no user confirmation, and no rate limiting. A prompt injection attack via any adapter could trigger these.
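
As a rough illustration of a policy layer, the sketch below filters parsed tags against an allowlist of kinds and caps how many are accepted per response. The regex assumes the `[ACTION:<kind>|<payload>]` shape shown above. The payload format, `filter_actions`, and the default cap are hypothetical.

```python
import re

# Assumed tag shape: [ACTION:<kind>|<payload>], with no "]" inside the payload.
ACTION_RE = re.compile(r"\[ACTION:(\w+)\|([^\]]*)\]")


def filter_actions(text, allowed_kinds, max_actions=3):
    """Return (kind, payload) pairs that pass the policy; drop the rest."""
    accepted = []
    for kind, payload in ACTION_RE.findall(text):
        if kind not in allowed_kinds:
            continue  # kind not enabled for this conversation
        if len(accepted) >= max_actions:
            break  # cap the number of actions taken per response
        accepted.append((kind, payload))
    return accepted
```

The set of allowed kinds could be derived from what the user actually asked for, and anything that survives the filter could still be held for user confirmation before it runs.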

---

## HIGH

### 5. No Input Validation on Webhook POST Bodies

`webhooks/receiver.py` — JSON payloads are parsed but never schema-validated. Fields like `channel_id`, `sender`, `channel` are passed through directly. The `body` dict is stored in `raw_event` and could contain arbitrarily large data.
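
A hand-rolled validator is enough to illustrate the idea. The sketch below assumes all three fields are required non-empty strings and uses an arbitrary length cap. Both assumptions need checking against what the adapters really send.

```python
MAX_FIELD_LEN = 256  # assumed limit, tune to the adapters in use


def validate_webhook_body(body):
    """Return a cleaned dict holding only the expected string fields."""
    if not isinstance(body, dict):
        raise ValueError("body must be a JSON object")
    cleaned = {}
    for field in ("channel", "channel_id", "sender"):
        value = body.get(field)
        if not isinstance(value, str) or not value:
            raise ValueError(f"{field} must be a non-empty string")
        if len(value) > MAX_FIELD_LEN:
            raise ValueError(f"{field} is too long")
        cleaned[field] = value
    return cleaned
```

Storing the cleaned dict instead of the raw `body` in `raw_event` also bounds how much attacker-supplied data is retained.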

### 6. No Request Size Limits on HTTP Endpoints

Neither the webhook receiver nor the WebChat adapter sets `client_max_size` on the aiohttp `Application`, so both rely on aiohttp's default of 1 MiB (`1024**2` bytes) rather than a limit chosen for the expected payloads. There is also no per-request timeout.

### 7. WebSocket Has No Authentication

`adapters/webchat_adapter.py` — Anyone who can reach the WebSocket endpoint at `/ws` can interact with the AI. No token, no session cookie, no origin check. If the host is changed from `127.0.0.1` to `0.0.0.0`, this becomes remotely exploitable.
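
An origin check on the WebSocket handshake can be reduced to a pure function like the sketch below. `origin_allowed` and the allowlist are hypothetical, and wiring it into the `/ws` handler is not shown.

```python
from urllib.parse import urlsplit


def origin_allowed(origin_header, allowed_origins):
    """Return True if the Origin header names an allowlisted origin."""
    if not origin_header:
        return False  # non-browser clients would need a token instead
    parts = urlsplit(origin_header)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False  # rejects "null" and other malformed values
    return f"{parts.scheme}://{parts.netloc}".lower() in allowed_origins
```

An origin check only constrains browsers, which set `Origin` themselves. Other clients can send any value, so it complements token authentication rather than replacing it.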

### 8. No Rate Limiting Anywhere

No rate limiting on:

- Webhook endpoints
- WebSocket messages
- Adapter message handlers
- Subagent spawning (only a concurrent limit of 3, but no cooldown)
- Scheduler job creation
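
A token bucket is one common way to add such limits. The sketch below is generic and not tied to any Aetheel module. The class name, the parameters, and the idea of keeping one bucket per client or channel are assumptions.

```python
import time


class TokenBucket:
    """Allow `rate` events per second with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        """Consume one token if available and report whether the event may run."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

The injectable `clock` exists only to make the class testable without sleeping. A low `rate` on the bucket guarding subagent spawning would act as the missing cooldown.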

### 9. Cron Expression Not Validated Before APScheduler

`scheduler/scheduler.py` → `_register_cron_job` only checks `len(parts) != 5`. Malformed values within fields (e.g., `999 999 999 999 999`) are passed directly to `CronTrigger`, which could cause unexpected behavior or exceptions.
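
A stricter pre-check could validate each field against its numeric range before the expression reaches `CronTrigger`. The sketch below is deliberately conservative. It accepts only `*`, numbers, ranges, lists, and `/step`, rejects month and weekday names, and assumes day-of-week values of 0-6. All names in it are hypothetical.

```python
# Assumed ranges: minute, hour, day of month, month, day of week.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def _is_int(text):
    return text.isascii() and text.isdigit()


def _valid_part(part, lo, hi):
    """Check one comma-separated item such as '5', '1-5', '*/15' or '0-30/5'."""
    base, slash, step = part.partition("/")
    if slash and not (_is_int(step) and int(step) > 0):
        return False
    if base == "*":
        return True
    start, dash, end = base.partition("-")
    bounds = [start, end] if dash else [start]
    if not all(_is_int(b) and lo <= int(b) <= hi for b in bounds):
        return False
    return not dash or int(start) <= int(end)


def is_valid_cron(expr):
    """Return True for a five-field expression whose values are all in range."""
    fields = expr.split()
    if len(fields) != 5:
        return False
    return all(
        _valid_part(part, lo, hi)
        for field, (lo, hi) in zip(fields, FIELD_RANGES)
        for part in field.split(",")
    )
```

Rejecting an expression here gives the caller a clear validation failure instead of whatever the scheduler does with it.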

### 10. Webhook Token in Query Parameter

`webhooks/receiver.py`:

```python
if request.query.get("token") == self._config.token:
    return True
```

Query parameters are logged in web server access logs, browser history, and proxy logs. This leaks the auth token.
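
Accepting the token only from an `Authorization` header avoids this. The helper below is a hypothetical sketch of the parsing step. It does not assume anything about whether the receiver already reads headers.

```python
def bearer_token(authorization_header):
    """Extract the token from an 'Authorization: Bearer <token>' value."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

The extracted value would then be compared against the configured token, ideally with a constant-time comparison such as `hmac.compare_digest`.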

---

## MEDIUM

### 11. SQLite Databases Created with Default Permissions

`sessions.db`, `scheduler.db`, and `memory.db` are all created under `~/.aetheel/` with default umask permissions. On multi-user systems, these could be world-readable.
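
On POSIX systems, one way to tighten this is to create the file with mode `0600` before SQLite opens it. The sketch below assumes a POSIX filesystem, and `open_private_db` is a hypothetical helper.

```python
import os
import sqlite3


def open_private_db(path):
    """Create the database file as 0600 before SQLite opens it."""
    # The mode passed to os.open applies only when the file is new.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.close(fd)
    # chmod also tightens a file left over from an earlier run.
    os.chmod(path, 0o600)
    return sqlite3.connect(path)
```

Restricting the `~/.aetheel/` directory itself to `0700` would cover any files created next to the databases as well.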

### 12. Webhook Token Stored in `config.json`

The `webhooks.token` field in `config.py` is read from and written to `config.json`, which is a plaintext file. Secrets should only live in `.env`.

### 13. No HTTPS on Any HTTP Endpoint

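Both WebChat (port 8080) and webhooks (port 8090) run plain HTTP. Loopback traffic never leaves the host, so while the services are bound to `127.0.0.1` the exposure is limited to local processes able to capture it. Once either service binds to a non-loopback interface, messages and the webhook token cross the network in cleartext.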

### 14. Full Environment Passed to Subprocesses

`_build_cli_env()` in both runtimes copies `os.environ` entirely to the subprocess, which may include sensitive variables beyond what the CLI needs.
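
An allowlist-based alternative might look like the sketch below. The default variable names are assumptions, since which variables the CLI really needs has to be confirmed, and `build_minimal_env` is a hypothetical replacement rather than the existing function.

```python
import os

# Assumed allowlist; confirm which variables the CLI actually needs.
DEFAULT_ALLOWED = ("PATH", "HOME", "LANG", "TERM")


def build_minimal_env(extra_allowed=(), source=None):
    """Copy only allowlisted variables instead of all of os.environ."""
    source = os.environ if source is None else source
    names = set(DEFAULT_ALLOWED) | set(extra_allowed)
    return {name: source[name] for name in names if name in source}
```

Each runtime can pass the credentials its own CLI requires through `extra_allowed`, so unrelated secrets in the parent environment are not inherited.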

### 15. Session Logs Contain Full Conversations in Plaintext

`memory/manager.py` → `log_session()` writes unencrypted markdown files to `~/.aetheel/workspace/daily/`. No access control, no encryption, no retention policy.

### 16. XSS Partially Mitigated in `chat.html` but Fragile

The `renderMarkdown()` function escapes `<`, `>`, `&` first, then applies regex-based markdown rendering. User messages use `textContent` (safe). AI messages use `innerHTML` with the escaped+rendered output. The escaping happens before markdown processing, which is the right order, but the regex-based approach is fragile — edge cases in the markdown regexes could potentially bypass the escaping.

### 17. No CORS Headers on WebChat

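The aiohttp app doesn't configure CORS. For the plain HTTP routes this is the restrictive default, because browsers block cross-origin reads when no CORS headers are sent. The WebSocket endpoint is the real gap: browsers do not apply CORS to WebSocket handshakes, so any web page the user visits can connect to `/ws` unless the server checks the `Origin` header (see item 7).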

---

## LOW

### 18. Loose Dependency Version Constraints

`pyproject.toml`:

- `python-telegram-bot>=21.0` — no upper bound
- `discord.py>=2.4.0` — no upper bound
- `fastembed>=0.7.4` — no upper bound

These could pull in breaking or vulnerable versions on fresh installs.

### 19. No Security Scanning in CI/Test Pipeline

No `bandit`, `safety`, `pip-audit`, or similar tools in the test suite or project config.

### 20. `config edit` Uses `$EDITOR` Without Sanitization

`cli.py`:

```python
editor = os.environ.get("EDITOR", "nano")
subprocess.run([editor, CONFIG_PATH], check=True)
```

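Because the value is passed as a single list element, `subprocess.run` treats the whole string as the executable name. This is safe from shell injection, but a common setting such as `EDITOR="code --wait"` fails with `FileNotFoundError` because no program has that exact name. Splitting the value with `shlex.split()` handles editors configured with arguments.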
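
A small sketch of building the argument list that way follows. `editor_command` is a hypothetical helper, and the `environ` parameter exists only so the function can be exercised without touching the real environment.

```python
import os
import shlex


def editor_command(config_path, environ=None):
    """Build the argv for the editor, honouring arguments in $EDITOR."""
    environ = os.environ if environ is None else environ
    # shlex.split turns "code --wait" into ["code", "--wait"]; an unset or
    # empty value falls back to nano, as in the current code.
    argv = shlex.split(environ.get("EDITOR", "")) or ["nano"]
    return argv + [config_path]
```

The result can be passed straight to `subprocess.run(..., check=True)`, which keeps the call free of shell interpretation.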

### 21. No Data Retention/Cleanup for Session Logs

Session logs accumulate indefinitely in `daily/`. No automatic pruning.
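
A pruning pass could be as small as the sketch below. It assumes the logs are `*.md` files directly inside `daily/` and that file modification time is an acceptable proxy for age. `prune_old_logs` and the retention window are hypothetical.

```python
import time
from pathlib import Path


def prune_old_logs(daily_dir, max_age_days, now=None):
    """Delete *.md files older than max_age_days; return the removed names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for path in sorted(Path(daily_dir).glob("*.md")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Running it once at startup or from a scheduled job would keep `daily/` bounded without any new dependency.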

### 22. `SubagentBus` Has No Authentication

The pub/sub bus allows any code in the process to publish/subscribe to any channel. No isolation between subagents.

---

## Recommended Priority Fixes

The most impactful changes to make first:

1. **Add path containment check in `read_file()`** — one-line fix, prevents file system escape
2. **Make webhook auth mandatory** when `webhooks.enabled = true` — refuse to start without a token
3. **Add input schema validation** on webhook POST bodies
4. **Validate cron expressions** more strictly before passing to APScheduler
5. **Add rate limiting** to webhook and WebSocket endpoints (e.g., aiohttp middleware)
6. **Move `webhooks.token` to `.env` only**, remove from `config.json`
7. **Add WebSocket origin checking or token auth** to WebChat
8. **Set explicit `client_max_size`** on aiohttp apps
9. **Pin dependency upper bounds** in `pyproject.toml`
10. **Add `bandit`** to the test pipeline