Restructure Tasks.md to match Instructions.md workflow format

This commit is contained in:
2026-03-15 11:14:55 +01:00
parent f62785aaf2
commit 57a0bbe36e

View File

@@ -8,176 +8,130 @@ This document breaks the entire BanGUI project into development stages, ordered
These instructions apply to every AI agent working in this repository. Read them fully before touching any file. These instructions apply to every AI agent working in this repository. Read them fully before touching any file.
### Repository Layout ### Before You Begin
``` 1. Read [Instructions.md](Instructions.md) in full — it defines the project context, coding standards, and workflow rules. Every rule there is authoritative and takes precedence over any assumption you make.
backend/app/ FastAPI application (Python 3.12+, async) 2. Read [Architekture.md](Architekture.md) to understand the system structure before touching any component.
models/ Pydantic v2 models 3. Read the development guide relevant to your task: [Backend-Development.md](Backend-Development.md) or [Web-Development.md](Web-Development.md) (or both).
repositories/ Database access (aiosqlite) 4. Read [Features.md](Features.md) so you understand what the product is supposed to do and do not accidentally break intended behaviour.
routers/ FastAPI routers — thin, delegate to services
services/ Business logic; all fail2ban interaction lives here
tasks/ APScheduler background jobs
utils/ Shared helpers (fail2ban_client.py, ip_utils.py, …)
backend/tests/ pytest-asyncio test suite mirroring app/ structure
Docker/ Compose files, Dockerfiles, dev config for fail2ban
fail2ban-dev-config/ Bind-mounted into the fail2ban container in debug mode
frontend/src/ React + TypeScript SPA (Vite, Fluent UI)
fail2ban-master/ Vendored fail2ban source — DO NOT EDIT
Docs/ Architecture, design notes, this file
```
### Coding Conventions ### How to Work Through This Document
- **Python**: async/await throughout. Use `structlog` for logging (`log.info/warning/error` with keyword args). Pydantic v2 models. Type-annotated functions. - Tasks are grouped by feature area. Each group is self-contained.
- **Error handling**: Raise domain-specific exceptions (`JailNotFoundError`, `JailOperationError`, `Fail2BanConnectionError`) from service functions; let routers map them to HTTP responses. Never swallow exceptions silently in routers. - Work through tasks in the order they appear within a group; earlier tasks establish foundations for later ones.
- **fail2ban socket**: All communication goes through `app.utils.fail2ban_client.Fail2BanClient`. Use `_safe_get` when a missing command is non-fatal; use `_ok()` when the response is required. - Mark a task **in-progress** before you start it and **completed** the moment it is done. Never batch completions.
- **Tests**: Mirror the `app/` tree under `tests/`. Use `pytest-asyncio` + `unittest.mock.patch` or `AsyncMock`. Keep fixtures in `conftest.py`. Run with `pytest backend/tests` from the repo root (venv at `.venv/`). - If a task depends on another task that is not yet complete, stop and complete the dependency first.
- **Docker dev config**: `Docker/fail2ban-dev-config/` is bind-mounted read-write into the fail2ban container. Changes here take effect after `fail2ban-client reload`. - If you are uncertain whether a change is correct, read the relevant documentation section again before proceeding. Do not guess.
- **Compose**: Use `Docker/compose.debug.yml` for local development; `Docker/compose.prod.yml` for production. Never commit secrets.
### How to Verify Changes ### Code Quality Rules (Summary)
1. Run the backend test suite: `cd /home/lukas/Volume/repo/BanGUI && .venv/bin/pytest backend/tests -x -q` - No TODOs, no placeholders, no half-finished functions.
2. Check for Python type errors: `.venv/bin/mypy backend/app` (if mypy is installed). - Full type annotations on every function (Python) and full TypeScript types on every symbol (no `any`).
3. For runtime verification, start the debug stack: `docker compose -f Docker/compose.debug.yml up`. - Layered architecture: routers → services → repositories. No layer may skip another.
4. Inspect fail2ban logs inside the container: `docker exec bangui-fail2ban-dev fail2ban-client status` and `docker logs bangui-fail2ban-dev`. - All backend errors are raised as typed HTTP exceptions; all unexpected errors are logged via structlog before re-raising.
- All frontend state lives in typed hooks; no raw `fetch` calls outside of the `api/` layer.
- After every code change, run the full test suite (`make test`) and ensure it is green.
### Agent Workflow ### Definition of Done
1. Read the task description fully. Read every source file mentioned before editing anything. A task is done when:
2. Make the minimal change that solves the stated problem — do not refactor surrounding code. - The code compiles and the test suite passes (`make test`).
3. After every code change run the test suite to confirm no regressions. - The feature works end-to-end in the dev stack (`make up`).
4. If a fix requires a config file change in `Docker/fail2ban-dev-config/`, also update `Docker/compose.prod.yml` or document the required change if it affects the production config volume. - No new lint errors are introduced.
5. Mark the task complete only after tests pass and the root-cause error no longer appears. - The change is consistent with all documentation rules.
--- ---
## Bug Fixes — fail2ban Runtime Errors (2026-03-14) ## Bug Fixes
The following three independent bugs were identified from fail2ban logs. They are ordered by severity. Resolve them in order; each is self-contained.
--- ---
### Task 1 — `UnknownJailException('airsonic-auth')` on reload ### BUG-001 — fail2ban: `bangui-sim` jail fails to start due to missing `banaction`
**Priority**: High **Status:** Open
**Files**: `Docker/fail2ban-dev-config/fail2ban/jail.d/airsonic-auth.conf`, `backend/app/services/config_file_service.py`
#### Observed error #### Error
``` ```
fail2ban.transmitter ERROR Command ['reload', '--all', [], [['start', 'airsonic-auth'], Failed during configuration: Bad value substitution: option 'action' in section 'bangui-sim'
['start', 'bangui-sim'], ['start', 'blocklist-import']]] has failed. contains an interpolation key 'banaction' which is not a valid option name.
Received UnknownJailException('airsonic-auth') Raw value: '%(action_)s'
``` ```
#### Root cause #### Root Cause
When a user activates the `airsonic-auth` jail through BanGUI, `config_file_service.py` writes a local override and then calls `jail_service.reload_all(socket_path, include_jails=['airsonic-auth'])`. The reload stream therefore contains `['start', 'airsonic-auth']`. fail2ban cannot create the jail object because its `logpath` (`/remotelogs/airsonic/airsonic.log`) does not exist in the dev environment — the airsonic service is not running and no log volume is mounted. fail2ban registers a config-level parse failure for the jail and the server falls back to `UnknownJailException` when the reload asks it to start that name. fail2ban's interpolation system resolves option values at configuration load time by
substituting `%(key)s` placeholders with values from the same section or from `[DEFAULT]`.
The current error handling in `config_file_service.py` catches the reload exception and rolls back the config file, but it does so with a generic `except Exception` and logs only a warning. The real error is never surfaced to the user in a way that explains why activation failed. The chain that fails is:
#### What to implement 1. Every jail inherits `action = %(action_)s` from `[DEFAULT]` (no override in `bangui-sim.conf`).
2. `action_` is defined in `[DEFAULT]` as `%(banaction)s[port="%(port)s", protocol="%(protocol)s", chain="%(chain)s"]`.
1. **Dev-config fix** (Docker/fail2ban-dev-config): The `airsonic-auth.conf` file is a sample jail showing how a remote-log jail is configured. Add a prominent comment block at the top explaining that this jail is intentionally `enabled = false` and that it will fail to start unless the `/remotelogs/airsonic/` log directory is mounted into the fail2ban container. This makes the intent explicit and prevents confusion. 3. `banaction` is **commented out** in `[DEFAULT]`:
2. **Logpath pre-validation in the activation service** (`config_file_service.py`): Before calling `reload_all` during jail activation, read the `logpath` values from the jail's config (parse the `.conf` and any `.local` override using the existing `conffile_parser`). For each logpath that is not `/dev/null` and does not contain a glob wildcard, check whether the path exists on the filesystem (the backend container shares the fail2ban config volume with read-write access). If any required logpath is missing, abort the activation immediately — do not write the local override, do not call reload — and return a `JailActivationResponse` with `active=False` and a clear `message` explaining which logpath is missing. Add a `validation_warnings` entry listing the missing paths.
3. **Specific exception detection** (`jail_service.py`): In `reload_all`, when `_ok()` raises `ValueError` and the message matches `unknownjail` / `unknown jail` (use the existing `_is_not_found_error` helper), re-raise a `JailNotFoundError` instead of the generic `JailOperationError`. Update callers in `config_file_service.py` to catch `JailNotFoundError` separately and include the jail name in the activation failure message.
#### Acceptance criteria
- Attempting to activate `airsonic-auth` (without mounting the log volume) returns a 422-class response with a message mentioning the missing logpath — no `UnknownJailException` appears in the fail2ban log.
- Activating `bangui-sim` (whose logpath exists in dev) continues to work correctly.
- All existing tests in `backend/tests/` pass without modification.
---
### Task 2 — `iptables-allports` action fails with exit code 4 (`Script error`) on `blocklist-import` jail
**Priority**: High
**Files**: `Docker/fail2ban-dev-config/fail2ban/jail.d/blocklist-import.conf`, `Docker/fail2ban-dev-config/fail2ban/jail.d/bangui-sim.conf`, `Docker/fail2ban-dev-config/fail2ban/jail.conf`
#### Observed error
```
fail2ban.utils ERROR 753c588a7860 -- returned 4
fail2ban.actions ERROR Failed to execute ban jail 'blocklist-import' action
'iptables-allports' … Error starting action
Jail('blocklist-import')/iptables-allports: 'Script error'
```
#### Root cause
Exit code 4 from an iptables invocation means the xtables advisory lock could not be obtained (another iptables call was in progress simultaneously). The `iptables-allports` `actionstart` script does not pass the `-w` (wait-for-lock) flag by default. In the Docker dev environment, fail2ban starts multiple jails in parallel during `reload --all`. Each jail's `actionstart` sub-process calls iptables concurrently. The xtables lock contention causes some of them to exit with code 4. fail2ban interprets any non-zero exit from an action script as `Script error` and aborts the action.
A secondary risk: if the host kernel uses `nf_tables` (iptables-nft) but the container runs `iptables-legacy` binaries, the same exit code 4 can also appear due to backend mismatch.
#### What to implement
1. **Add the xtables lock-wait flag globally** in `Docker/fail2ban-dev-config/fail2ban/jail.conf`, under the `[DEFAULT]` section. Set:
```ini ```ini
banaction = iptables-allports[lockingopt="-w 5"] # Docker/fail2ban-dev-config/fail2ban/jail.conf [DEFAULT]
#banaction = iptables-multiport ← this line is disabled
``` ```
The `lockingopt` parameter is appended to every iptables call inside `iptables-allports`, telling it to wait up to 5 seconds for the xtables lock instead of failing immediately. This is the canonical fix recommended in the fail2ban documentation for containerised deployments. 4. Because `banaction` is absent from the interpolation namespace, fail2ban cannot resolve
`action_`, which makes it unable to resolve `action`, and the jail fails to load.
2. **Remove the redundant `banaction` overrides** from `blocklist-import.conf` and `bangui-sim.conf` (both already specify `banaction = iptables-allports` which now comes from the global default). Removing the per-jail override keeps configuration DRY and ensures the lock-wait flag applies consistently. The same root cause affects every jail in `jail.d/` that does not define its own `banaction`,
including `blocklist-import.conf`.
3. **Production compose note**: Add a comment in `Docker/compose.prod.yml` near the fail2ban service block stating that the `fail2ban-config` volume must contain a `jail.d/` with `banaction = iptables-allports[lockingopt="-w 5"]` or equivalent, because the production volume is not pre-seeded by the repository. This is a documentation action only — do not auto-seed the production volume. #### Fix
#### Acceptance criteria **File:** `Docker/fail2ban-dev-config/fail2ban/jail.conf`
- After `docker compose -f Docker/compose.debug.yml restart fail2ban`, the fail2ban log shows no `returned 4` or `Script error` lines during startup. Uncomment the `banaction` line inside the `[DEFAULT]` section so the value is globally
- `docker exec bangui-fail2ban-dev fail2ban-client status blocklist-import` reports the jail as running. available to all jails:
- `docker exec bangui-fail2ban-dev fail2ban-client status bangui-sim` reports the jail as running.
```ini
--- banaction = iptables-multiport
banaction_allports = iptables-allports
### Task 3 — Suppress log noise from unsupported `get <jail> idle/backend` commands ```
**Priority**: Low This is safe: the dev compose (`Docker/compose.debug.yml`) already grants the fail2ban
**Files**: `backend/app/services/jail_service.py` container `NET_ADMIN` and `NET_RAW` capabilities, which are the prerequisites for
iptables-based banning.
#### Observed error
#### Tasks
```
fail2ban.transmitter ERROR Command ['get', 'bangui-sim', 'idle'] has failed. - [ ] **BUG-001-T1 — Uncomment `banaction` in `jail.conf` [DEFAULT]**
Received Exception('Invalid command (no get action or not yet implemented)')
fail2ban.transmitter ERROR Command ['get', 'bangui-sim', 'backend'] has failed. Open `Docker/fail2ban-dev-config/fail2ban/jail.conf`.
Received Exception('Invalid command (no get action or not yet implemented)') Find the two commented-out lines near the `action_` definition:
… (repeated ~15 times per polling cycle) ```ini
``` #banaction = iptables-multiport
#banaction_allports = iptables-allports
#### Root cause ```
Remove the leading `#` from both lines so they become active options.
`_fetch_jail_summary()` in `jail_service.py` sends `["get", name, "backend"]` and `["get", name, "idle"]` to the fail2ban daemon via the socket. The running fail2ban version (LinuxServer.io container image) does not implement these two sub-commands in its transmitter. fail2ban logs every unrecognised command as an `ERROR` on its side. BanGUI already handles this gracefully — `asyncio.gather(return_exceptions=True)` captures the exceptions and `_safe_bool` / `_safe_str` fall back to `False` / `"polling"` respectively. The BanGUI application does not malfunction, but the fail2ban log is flooded with spurious `ERROR` lines on every jail-list refresh (~15 errors per request across all jails, repeated on every dashboard poll). Do not change any other part of the file.
#### What to implement - [ ] **BUG-001-T2 — Restart the fail2ban container and verify clean startup**
The fix is a capability-detection probe executed once at startup (or at most once per polling cycle, re-used for all jails). The probe sends `["get", "<first_jail>", "backend"]` and caches the result in a module-level boolean flag. Bring the dev stack down and back up:
```bash
1. In `jail_service.py`, add a module-level `asyncio.Lock` and a nullable `bool` flag: make down && make up
```python ```
_backend_cmd_supported: bool | None = None Wait for the fail2ban container to reach `healthy`, then inspect its logs:
_backend_cmd_lock: asyncio.Lock = asyncio.Lock() ```bash
``` make logs # or: docker logs bangui-fail2ban-dev 2>&1 | grep -i error
```
2. Add an async helper `_check_backend_cmd_supported(client, jail_name) -> bool` that: Confirm that no `Bad value substitution` or `Failed during configuration` lines appear
- Returns the cached value immediately if already determined. and that both `bangui-sim` and `blocklist-import` jails show as **enabled** in the output.
- Acquires `_backend_cmd_lock`, checks again (double-check idiom), then sends `["get", jail_name, "backend"]`.
- Sets `_backend_cmd_supported = True` if the command returns without exception, `False` if it raises any `Exception`. - [ ] **BUG-001-T3 — Verify ban/unban cycle works end-to-end**
- Returns the flag value.
With the stack running, trigger the simulation script:
3. In `_fetch_jail_summary()`, call this helper (using the first available jail name) before the main `asyncio.gather`. If the flag is `False`, replace the `["get", name, "backend"]` and `["get", name, "idle"]` gather entries with `asyncio.coroutine` constants that immediately return the default values (`"polling"` and `False`) without sending any socket command. This eliminates the fail2ban-side log noise entirely without changing the public API or the fallback values. ```bash
bash Docker/simulate_failed_logins.sh
4. If `_backend_cmd_supported` becomes `True` (future fail2ban version), the commands are sent as before and real values are returned. ```
Then confirm fail2ban has recorded a ban:
#### Acceptance criteria ```bash
bash Docker/check_ban_status.sh
- After one polling cycle where `backend`/`idle` are detected as unsupported, the fail2ban log shows zero `Invalid command` lines for those commands on subsequent refreshes. ```
- The `GET /api/jails` response still returns `backend: "polling"` and `idle: false` for each jail (unchanged defaults). The script should report at least one banned IP in the `bangui-sim` jail.
- All existing tests in `backend/tests/` pass. Also verify that the BanGUI dashboard reflects the new ban entry.
- The detection result is reset if `_backend_cmd_supported` is `None` (i.e., on application restart), so a fail2ban upgrade that adds support is detected automatically within one polling cycle.
--- ---