363
Docs/Tasks.md
363
Docs/Tasks.md
@@ -0,0 +1,363 @@
|
|||||||
|
|
||||||
|
## [E2E-1] Set up Robot Framework E2E test infrastructure
|
||||||
|
|
||||||
|
**Where found:**
|
||||||
|
No E2E test suite exists in the repository. There are 87 backend pytest files and 78 frontend vitest files but zero integration/E2E tests that exercise the full running stack.
|
||||||
|
|
||||||
|
**Why this is needed:**
|
||||||
|
Unit and component tests cannot catch regressions that span the full system: frontend → backend → fail2ban → database. A running-stack test suite is the only safety net for deployment-breaking changes.
|
||||||
|
|
||||||
|
**Goal:**
|
||||||
|
Create the `e2e/` directory layout, install Robot Framework with the Browser library (Playwright-backed), configure shared keywords for startup/teardown and authentication, and add a `make e2e` target so the suite can be run with a single command.
|
||||||
|
|
||||||
|
**What to do:**
|
||||||
|
1. Create `e2e/requirements.txt`:
|
||||||
|
```
|
||||||
|
robotframework>=7
|
||||||
|
robotframework-browser>=18
|
||||||
|
```
|
||||||
|
2. Run `rfbrowser init` after install to download Playwright browsers.
|
||||||
|
3. Create the directory layout:
|
||||||
|
```
|
||||||
|
e2e/
|
||||||
|
├── requirements.txt
|
||||||
|
├── resources/
|
||||||
|
│ ├── common.resource # variables, shared setup/teardown
|
||||||
|
│ └── auth.resource # Login As Admin keyword
|
||||||
|
└── tests/
|
||||||
|
├── 01_page_loading.robot
|
||||||
|
├── 02_ban_records.robot
|
||||||
|
├── 03_blocklist_import.robot
|
||||||
|
└── 04_config_edit.robot
|
||||||
|
```
|
||||||
|
4. `common.resource` must define:
|
||||||
|
- `${FRONTEND_URL}` = `http://localhost:5173`
|
||||||
|
- `${BACKEND_URL}` = `http://localhost:8000`
|
||||||
|
- Suite Setup: wait for `GET ${BACKEND_URL}/api/health` to return 200 before any test runs (poll with timeout 120 s).
|
||||||
|
5. `auth.resource` must implement `Login As Admin`:
|
||||||
|
- Check `GET /api/setup/status`; if setup is not done, complete the setup wizard first.
|
||||||
|
- POST credentials to `/api/auth/login` or drive the login form at `/login`.
|
||||||
|
- Store the resulting session for subsequent page navigations.
|
||||||
|
6. Add to `Makefile`:
|
||||||
|
```makefile
|
||||||
|
e2e: up
|
||||||
|
@echo "Waiting for stack to be healthy…"
|
||||||
|
@sleep 60
|
||||||
|
robot --outputdir e2e/results e2e/tests/
|
||||||
|
```
|
||||||
|
7. Add `e2e/results/` to `.gitignore`.
|
||||||
|
|
||||||
|
**Possible traps and issues:**
|
||||||
|
- `BANGUI_SESSION_SECRET` env var is required; tests will fail with a startup error if it is not set. Document that `make e2e` requires the variable in the environment.
|
||||||
|
- `make up` uses `podman-compose` or `podman compose` auto-detected at Makefile eval time. If neither is installed the `e2e` target silently fails.
|
||||||
|
- The backend `start_period` in the healthcheck is 45 s; the frontend is 30 s on top of that. The 60 s sleep in the Makefile target may not be enough on a cold build — prefer polling `/api/health` until ready.
|
||||||
|
- `rfbrowser init` must be re-run whenever the `robotframework-browser` version changes.
|
||||||
|
- The Browser library uses Chromium headless by default. CI environments may need `--no-sandbox` flags passed via `New Browser chromium headless=true args=['--no-sandbox']`.
|
||||||
|
|
||||||
|
**Docs changes needed:**
|
||||||
|
- Add an "E2E Testing" section to [Testing-Requirements.md](Testing-Requirements.md) describing how to run `make e2e`, required env vars, and how to view the HTML report in `e2e/results/`.
|
||||||
|
- Add `e2e/results/` to the `.gitignore` list documented in [Backend-Development.md](Backend-Development.md).
|
||||||
|
|
||||||
|
**Doc references:**
|
||||||
|
- [Testing-Requirements.md](Testing-Requirements.md)
|
||||||
|
- [Backend-Development.md](Backend-Development.md)
|
||||||
|
- [Deployment.md](Deployment.md) — for env var documentation
|
||||||
|
- Robot Framework: https://robotframework.org/#getting-started
|
||||||
|
- Browser library: https://robotframework-browser.org/
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## [E2E-2] Page loading tests — all routes render without error
|
||||||
|
|
||||||
|
**Where found:**
|
||||||
|
The frontend has eight distinct routes (`/setup`, `/login`, `/`, `/map`, `/jails`, `/jails/:name`, `/config`, `/history`, `/blocklists`). Each is wrapped in a `PageErrorBoundary`. There is no test that verifies all of them load successfully against a live stack.
|
||||||
|
|
||||||
|
**Why this is needed:**
|
||||||
|
A broken import, a missing API field, or a bad runtime dependency can cause a page to show the error boundary fallback ("Something went wrong") instead of its content. Unit tests mock API responses, so they cannot catch this class of regression.
|
||||||
|
|
||||||
|
**Goal:**
|
||||||
|
Every protected page loads, shows its primary content element, and does **not** show the `PageErrorBoundary` fallback when the stack is running correctly.
|
||||||
|
|
||||||
|
**What to do:**
|
||||||
|
1. Create `e2e/tests/01_page_loading.robot`.
|
||||||
|
2. Suite Setup: call `Login As Admin` from `auth.resource`.
|
||||||
|
3. For each route, implement a test case with the pattern:
|
||||||
|
```robot
|
||||||
|
*** Test Cases ***
|
||||||
|
Login Page Loads Without Error
|
||||||
|
# Must run before Login As Admin — navigate while unauthenticated
|
||||||
|
Go To ${FRONTEND_URL}/login
|
||||||
|
Wait For Elements State css=form visible timeout=15s
|
||||||
|
Get Text body not contains Something went wrong
|
||||||
|
|
||||||
|
Dashboard Loads Without Error
|
||||||
|
Go To ${FRONTEND_URL}/
|
||||||
|
Wait For Elements State css=main visible timeout=15s
|
||||||
|
Get Text body not contains Something went wrong
|
||||||
|
|
||||||
|
Map Page Loads Without Error
|
||||||
|
Go To ${FRONTEND_URL}/map
|
||||||
|
Wait For Elements State css=canvas,svg visible timeout=15s
|
||||||
|
Get Text body not contains Something went wrong
|
||||||
|
```
|
||||||
|
4. Cover all routes:
|
||||||
|
- `/setup` — assert setup form OR redirect to `/login` (setup already done).
|
||||||
|
- `/login` — assert login form visible.
|
||||||
|
- `/` — assert dashboard stats/charts visible.
|
||||||
|
- `/map` — assert SVG or canvas element visible.
|
||||||
|
- `/jails` — assert a table or list visible.
|
||||||
|
- `/jails/:name` — navigate to `/jails/manual-Jail`; assert jail detail heading visible.
|
||||||
|
- `/config` — assert tab navigation visible.
|
||||||
|
- `/history` — assert history table visible.
|
||||||
|
- `/blocklists` — assert blocklists panel visible.
|
||||||
|
5. Assert HTTP status for each page via `${response}= GET ${FRONTEND_URL}/<path>` and `Should Be Equal As Integers ${response.status} 200`.
|
||||||
|
|
||||||
|
**Possible traps and issues:**
|
||||||
|
- The `/login` page test must run **before** `Login As Admin` is called, or the session cookie will cause an immediate redirect to `/`. Either make it the first test case with its own `[Setup] New Page` (no auth), or run it in a separate suite that has no Suite Setup.
|
||||||
|
- The frontend is a SPA; `GET /map` at the Vite dev server always returns 200 with `index.html`. HTTP status checks here are not meaningful — focus on DOM assertions after client-side routing.
|
||||||
|
- The `/jails/:name` test assumes `manual-Jail` exists. If fail2ban has not started or the jail is not active the page may render an empty or error state. Add a guard: check jail exists via `GET /api/jails` before navigating.
|
||||||
|
- `PageErrorBoundary` renders per-page; the text "Something went wrong" must not be matched against the window title or other benign text. Scope the assertion to the `<main>` element.
|
||||||
|
- Page elements have no `data-testid` attributes on the production components — only on test mocks. CSS selectors (`css=main`, `css=table`, `css=canvas`) are fragile. See [E2E-6] for the task to add `data-testid` attributes.
|
||||||
|
- The Vite dev server takes ~30 s to compile on first load. The first navigation may time out; increase the default timeout to 30 s for the first test only.
|
||||||
|
|
||||||
|
**Docs changes needed:**
|
||||||
|
- Document the expected selectors and page landmarks in [Web-Development.md](Web-Development.md) so frontend developers know which elements are load-tested.
|
||||||
|
|
||||||
|
**Doc references:**
|
||||||
|
- [Web-Development.md](Web-Development.md)
|
||||||
|
- [Web-Design.md](Web-Design.md)
|
||||||
|
- [Testing-Requirements.md](Testing-Requirements.md)
|
||||||
|
- `frontend/src/App.tsx` — canonical route definitions
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## [E2E-3] Ban records appear in UI after simulated failed logins
|
||||||
|
|
||||||
|
**Where found:**
|
||||||
|
`Docker/simulate_failed_logins.sh` exists and is used in the `make dev-ban-test` target, but there is no automated test that verifies the resulting bans are surfaced correctly in the BanGUI frontend (History page, Dashboard, or Jails detail page).
|
||||||
|
|
||||||
|
**Why this is needed:**
|
||||||
|
The ban pipeline is the core feature of the product. A regression anywhere in the chain (fail2ban log parsing → fail2ban banning → backend polling → database write → API response → frontend rendering) would go undetected until a user reports it.
|
||||||
|
|
||||||
|
**Goal:**
|
||||||
|
After running `simulate_failed_logins.sh` with a known IP, the ban record for that IP must appear in the BanGUI UI within a reasonable timeout.
|
||||||
|
|
||||||
|
**What to do:**
|
||||||
|
1. Create `e2e/tests/02_ban_records.robot`.
|
||||||
|
2. Suite Setup: `Login As Admin`.
|
||||||
|
3. Test teardown: unban the test IP using `Docker/check_ban_status.sh --unban 192.168.100.99` via `Run Process`.
|
||||||
|
4. Test case:
|
||||||
|
```robot
|
||||||
|
*** Test Cases ***
|
||||||
|
Simulated Failed Logins Appear As Ban Records
|
||||||
|
[Teardown] Run Process bash ${EXECDIR}/Docker/check_ban_status.sh
|
||||||
|
... --unban 192.168.100.99
|
||||||
|
# Step 1 — write failure lines
|
||||||
|
${result}= Run Process
|
||||||
|
... bash ${EXECDIR}/Docker/simulate_failed_logins.sh 5 192.168.100.99
|
||||||
|
... timeout=15s
|
||||||
|
Should Be Equal As Integers ${result.rc} 0
|
||||||
|
# Step 2 — wait for fail2ban to process and backend to pick up the ban
|
||||||
|
Sleep 15s
|
||||||
|
# Step 3 — check History page
|
||||||
|
Go To ${FRONTEND_URL}/history
|
||||||
|
Wait For Elements State css=table,tbody visible timeout=20s
|
||||||
|
Get Text body contains 192.168.100.99
|
||||||
|
# Step 4 — confirm jail name is shown
|
||||||
|
Get Text body contains manual-Jail
|
||||||
|
```
|
||||||
|
5. Optionally add a direct API assertion before the UI check to isolate UI vs. backend failures:
|
||||||
|
```robot
|
||||||
|
${resp}= GET ${BACKEND_URL}/api/history expected_status=200
|
||||||
|
Should Contain ${resp.text} 192.168.100.99
|
||||||
|
```
|
||||||
|
|
||||||
|
**Possible traps and issues:**
|
||||||
|
- The default `COUNT` for `simulate_failed_logins.sh` is 5 and the fail2ban `maxretry` for `manual-Jail` must be ≤ 5 for the ban to trigger. If the jail config has been changed locally the test may pass the script step but produce no ban.
|
||||||
|
- `simulate_failed_logins.sh` writes to `Docker/logs/auth.log`. The fail2ban container reads from `/remotelogs/bangui/auth.log` (mapped volume). If the file mapping differs the lines will never be detected.
|
||||||
|
- The backend polls fail2ban on a schedule (APScheduler). The 15 s sleep may not be enough if the polling interval is longer. Read the scheduler interval from the config before hardcoding the wait.
|
||||||
|
- `check_ban_status.sh` uses `docker exec` directly. In a Podman environment the container runtime may be `podman`, making the unban teardown fail and leaving the test IP permanently banned until manual cleanup.
|
||||||
|
- The History page paginates results. If the test IP is not on the first page the `contains` assertion will fail. Assert via the API or set a large page size query parameter in the URL.
|
||||||
|
- Running the test suite multiple times in the same session accumulates lines in `auth.log`. Old lines do not re-trigger bans, but the counter inside fail2ban resets on container restart. Add a `truncate -s 0 Docker/logs/auth.log` step before writing new lines if idempotency is needed.
|
||||||
|
|
||||||
|
**Docs changes needed:**
|
||||||
|
- Add a note in [Testing-Requirements.md](Testing-Requirements.md) explaining the ban pipeline timing expectations and why a sleep is needed in this test.
|
||||||
|
- Document the `manual-Jail` maxretry value and log path in [CONFIGURATION.md](CONFIGURATION.md) so it is clear what the E2E test depends on.
|
||||||
|
|
||||||
|
**Doc references:**
|
||||||
|
- [CONFIGURATION.md](CONFIGURATION.md)
|
||||||
|
- [Testing-Requirements.md](Testing-Requirements.md)
|
||||||
|
- `Docker/simulate_failed_logins.sh`
|
||||||
|
- `Docker/check_ban_status.sh`
|
||||||
|
- `Docker/fail2ban-dev-config/` — jail configuration
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## [E2E-4] Blocklist import executes and is reflected in the UI
|
||||||
|
|
||||||
|
**Where found:**
|
||||||
|
The blocklist import endpoint (`POST /api/blocklists/import`) is implemented in `backend/app/routers/blocklist.py` and has unit tests in `backend/tests/test_routers/test_blocklist.py` and `backend/tests/test_tasks/test_blocklist_import.py`. The `/blocklists` frontend page exists but there is no E2E test verifying the manual import button works end-to-end.
|
||||||
|
|
||||||
|
**Why this is needed:**
|
||||||
|
The import flow is asynchronous and involves DNS validation, an external HTTP fetch, and a database write. Unit tests mock all external calls. An E2E test is the only way to verify that the full import pipeline — including the network call to the external source — completes and updates the UI.
|
||||||
|
|
||||||
|
**Goal:**
|
||||||
|
Clicking the manual import button on the `/blocklists` page triggers an import run, the UI reflects completion (no error banner), and the import log table shows a new entry.
|
||||||
|
|
||||||
|
**What to do:**
|
||||||
|
1. Create `e2e/tests/03_blocklist_import.robot`.
|
||||||
|
2. Suite Setup: `Login As Admin`.
|
||||||
|
3. Pre-condition: at least one blocklist source must be configured. Add a keyword `Ensure Blocklist Source Exists` that:
|
||||||
|
- Calls `GET /api/blocklists` to check if any sources are defined.
|
||||||
|
- If none: calls `POST /api/blocklists` to add a known-good source (e.g., a small, stable public list).
|
||||||
|
4. Test case:
|
||||||
|
```robot
|
||||||
|
*** Test Cases ***
|
||||||
|
Manual Blocklist Import Completes Without Error
|
||||||
|
Ensure Blocklist Source Exists
|
||||||
|
Go To ${FRONTEND_URL}/blocklists
|
||||||
|
Wait For Elements State css=[aria-label*="Import"],button visible timeout=15s
|
||||||
|
# record the current log entry count
|
||||||
|
${resp_before}= GET ${BACKEND_URL}/api/blocklists/log expected_status=200
|
||||||
|
# trigger the import
|
||||||
|
Click css=[aria-label*="Import"],button
|
||||||
|
# wait for import to finish (spinner gone or success toast)
|
||||||
|
Wait For Elements State css=[aria-label*="Import"],button enabled timeout=45s
|
||||||
|
Get Text body not contains error
|
||||||
|
# verify the log has a new entry
|
||||||
|
${resp_after}= GET ${BACKEND_URL}/api/blocklists/log expected_status=200
|
||||||
|
Should Not Be Equal ${resp_before.text} ${resp_after.text}
|
||||||
|
```
|
||||||
|
5. If the environment has no internet access, mock the external fetch via a local HTTP server (use Python's `http.server` in a `Run Process` background task and point the source URL at `http://localhost:<port>/test.txt`).
|
||||||
|
|
||||||
|
**Possible traps and issues:**
|
||||||
|
- The import endpoint has a rate limit (`RATE_LIMIT_BLOCKLIST_IMPORT_REQUESTS`). Running the suite more than once in the same hour may result in a 429. Either reset the rate limiter between runs or use a unique client IP via a custom `X-Forwarded-For` header.
|
||||||
|
- The external network may be unavailable in CI. The test must either skip gracefully (`Skip If ${no_internet}`) or use a local mock server.
|
||||||
|
- The import button selector is unknown until inspected in a running browser. The selector `css=[aria-label*="Import"],button` is a best guess — it must be verified against the actual rendered DOM and updated if wrong.
|
||||||
|
- If a blocklist source URL returns an invalid or empty file, the import will complete but with 0 bans added. The test must distinguish "import ran successfully" from "bans were added" — these are separate assertions.
|
||||||
|
- The `POST /api/blocklists/import` can take several seconds per source. The 45 s timeout may be insufficient for large lists.
|
||||||
|
|
||||||
|
**Docs changes needed:**
|
||||||
|
- Add the rate limit value and reset mechanism to [CONFIGURATION.md](CONFIGURATION.md).
|
||||||
|
- Document the E2E dependency on network access (or local mock) in [Testing-Requirements.md](Testing-Requirements.md).
|
||||||
|
|
||||||
|
**Doc references:**
|
||||||
|
- [CONFIGURATION.md](CONFIGURATION.md)
|
||||||
|
- [Testing-Requirements.md](Testing-Requirements.md)
|
||||||
|
- [Features.md](Features.md) — blocklist feature description
|
||||||
|
- `backend/app/routers/blocklist.py` — endpoint and rate limit implementation
|
||||||
|
- `backend/tests/test_tasks/test_blocklist_import.py`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## [E2E-5] Config edit saves and persists after page reload
|
||||||
|
|
||||||
|
**Where found:**
|
||||||
|
The `/config` page allows editing jail settings, filter definitions, and server-level options. It is covered by unit tests (`frontend/src/pages/__tests__/ConfigPage.test.tsx`, `backend/tests/test_routers/test_config.py`) but no E2E test verifies that a change made in the UI actually persists through the backend, survives a page reload, and reflects the new value.
|
||||||
|
|
||||||
|
**Why this is needed:**
|
||||||
|
The config page uses an auto-save mechanism (`useAutoSave`) that debounces writes. A regression in the debounce logic, the PATCH endpoint, or the GET-on-mount rehydration would silently discard user edits. Only a full round-trip test can catch this.
|
||||||
|
|
||||||
|
**Goal:**
|
||||||
|
Change a config field value via the UI, wait for the auto-save indicator to confirm the save, reload the page, and assert the new value is still present.
|
||||||
|
|
||||||
|
**What to do:**
|
||||||
|
1. Create `e2e/tests/04_config_edit.robot`.
|
||||||
|
2. Suite Setup: `Login As Admin`.
|
||||||
|
3. Choose a safe, low-risk config field to edit — e.g., the `[DEFAULT]` `bantime` value, or a per-jail `maxretry` setting.
|
||||||
|
4. Record the original value before editing so it can be restored in teardown.
|
||||||
|
5. Test case:
|
||||||
|
```robot
|
||||||
|
*** Settings ***
|
||||||
|
Library Browser
|
||||||
|
Resource ../resources/auth.resource
|
||||||
|
Test Teardown Restore Original Config Value
|
||||||
|
|
||||||
|
*** Test Cases ***
|
||||||
|
Config Field Edit Persists After Reload
|
||||||
|
Login As Admin
|
||||||
|
Go To ${FRONTEND_URL}/config
|
||||||
|
Wait For Elements State css=[role="tablist"] visible timeout=15s
|
||||||
|
# Read current value for teardown
|
||||||
|
${original}= Get Text css=[data-field="bantime"]
|
||||||
|
Set Suite Variable ${ORIGINAL_BANTIME} ${original}
|
||||||
|
# Edit the field
|
||||||
|
Fill Text css=[data-field="bantime"] 7200
|
||||||
|
# Wait for auto-save indicator to show "Saved"
|
||||||
|
Wait For Elements State css=[data-autosave="saved"] visible timeout=15s
|
||||||
|
# Reload and verify persistence
|
||||||
|
Reload
|
||||||
|
Wait For Elements State css=[data-field="bantime"] visible timeout=15s
|
||||||
|
Get Text css=[data-field="bantime"] == 7200
|
||||||
|
|
||||||
|
*** Keywords ***
|
||||||
|
Restore Original Config Value
|
||||||
|
Go To ${FRONTEND_URL}/config
|
||||||
|
Fill Text css=[data-field="bantime"] ${ORIGINAL_BANTIME}
|
||||||
|
Wait For Elements State css=[data-autosave="saved"] visible timeout=15s
|
||||||
|
```
|
||||||
|
6. Also verify via the API that the value was actually written:
|
||||||
|
```robot
|
||||||
|
${resp}= GET ${BACKEND_URL}/api/config expected_status=200
|
||||||
|
Should Contain ${resp.text} 7200
|
||||||
|
```
|
||||||
|
|
||||||
|
**Possible traps and issues:**
|
||||||
|
- The config page auto-save uses a debounce delay. The test must wait for the "Saved" indicator rather than a fixed `Sleep`, otherwise the reload may happen before the PATCH request fires.
|
||||||
|
- The selectors `[data-field="bantime"]` and `[data-autosave="saved"]` do not exist in the current frontend components (no `data-*` attributes on production elements). These must be added to the components before the test can work. See [E2E-6] for the prerequisite task.
|
||||||
|
- Config fields are rendered inside a tab panel. The correct tab must be activated before the target field is interactable. The test must click the right tab first.
|
||||||
|
- If the backend validates the new value and rejects it (e.g., bantime must be a positive integer), the test will fail at the API assertion. Use a value that is guaranteed to be valid.
|
||||||
|
- Editing config files on disk via the API may restart the fail2ban service inside the container, causing a brief health-check failure and destabilising subsequent tests in the suite. Run config edit tests last or use a test-only jail that is isolated from the main config.
|
||||||
|
- Teardown must restore the original value even if the test fails mid-way. Ensure `Test Teardown` is set, not just a final keyword call.
|
||||||
|
|
||||||
|
**Docs changes needed:**
|
||||||
|
- Document the auto-save debounce behaviour and the "Saved" indicator semantics in [Web-Development.md](Web-Development.md) so E2E test authors know what to wait for.
|
||||||
|
- Note in [Testing-Requirements.md](Testing-Requirements.md) that config edit tests must restore state in teardown.
|
||||||
|
|
||||||
|
**Doc references:**
|
||||||
|
- [Web-Development.md](Web-Development.md)
|
||||||
|
- [CONFIGURATION.md](CONFIGURATION.md)
|
||||||
|
- [Testing-Requirements.md](Testing-Requirements.md)
|
||||||
|
- `frontend/src/components/config/__tests__/AutoSaveIndicator.test.tsx`
|
||||||
|
- `frontend/src/hooks/__tests__/useAutoSave.test.ts`
|
||||||
|
- `backend/tests/test_routers/test_config.py`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## [E2E-6] Add `data-testid` / `data-*` attributes to production frontend components
|
||||||
|
|
||||||
|
**Where found:**
|
||||||
|
Inspecting the frontend source, `data-testid` attributes appear only in test mock files (e.g., `MapPage.test.tsx` line 57: `<div data-testid="world-map" />`). The production components in `frontend/src/components/` and `frontend/src/pages/` have no `data-testid` or `data-*` attributes. E2E tests [E2E-2], [E2E-4], and [E2E-5] all require stable selectors that survive CSS and class-name refactors.
|
||||||
|
|
||||||
|
**Why this is needed:**
|
||||||
|
CSS class selectors and `aria-label` text are brittle — they break when styles change or text is translated. `data-testid` attributes are the idiomatic, refactor-safe way to locate elements in E2E tests. Without them, every UI change risks breaking the E2E suite for reasons unrelated to correctness.
|
||||||
|
|
||||||
|
**Goal:**
|
||||||
|
Key interactive and landmark elements across all pages and the config form have `data-testid` (or semantic `data-*`) attributes that the Robot Framework E2E suite can rely on.
|
||||||
|
|
||||||
|
**What to do:**
|
||||||
|
1. Identify the minimum set of elements needed by the four E2E suites:
|
||||||
|
- `data-testid="page-error-boundary"` on the `PageErrorBoundary` fallback render.
|
||||||
|
- `data-testid="dashboard"`, `data-testid="map-page"`, `data-testid="jails-page"`, `data-testid="history-page"`, `data-testid="blocklists-page"`, `data-testid="config-page"` on each page's root element.
|
||||||
|
- `data-testid="history-table"` on the History page table body.
|
||||||
|
- `data-testid="blocklist-import-button"` on the manual import trigger.
|
||||||
|
- `data-testid="autosave-status"` on the auto-save indicator, with `data-status="saved" | "saving" | "error"`.
|
||||||
|
- `data-field="<fieldname>"` on config input fields that E2E tests will edit.
|
||||||
|
2. Add the attributes directly to the JSX — no wrappers, no extra elements.
|
||||||
|
3. Do not add `data-testid` to elements that already have stable semantic roles (e.g., `<button type="submit">` with unique text, landmark `<main>`, `<nav>`).
|
||||||
|
4. Update the existing vitest component tests to use the new `data-testid` selectors instead of text queries where appropriate.
|
||||||
|
|
||||||
|
**Possible traps and issues:**
|
||||||
|
- `data-testid` attributes are visible in production HTML. This is standard practice and not a security concern, but some teams prefer to strip them in production builds via a Babel/Vite plugin. Decide on a policy before adding them.
|
||||||
|
- Adding attributes to components that are also tested by vitest may require updating those unit tests if they query by `data-testid`.
|
||||||
|
- The `PageErrorBoundary` component wraps lazy-loaded pages. The fallback element must carry the `data-testid` on the rendered fallback JSX, not on the wrapper itself.
|
||||||
|
- Config fields are rendered dynamically from API data. The `data-field` value must be derived from the field's API key name, not a hardcoded index.
|
||||||
|
|
||||||
|
**Docs changes needed:**
|
||||||
|
- Add a "Selector conventions" section to [Web-Development.md](Web-Development.md) documenting when to use `data-testid` vs. semantic selectors vs. ARIA roles, and listing all reserved `data-testid` values used by the E2E suite.
|
||||||
|
|
||||||
|
**Doc references:**
|
||||||
|
- [Web-Development.md](Web-Development.md)
|
||||||
|
- [Testing-Requirements.md](Testing-Requirements.md)
|
||||||
|
- `frontend/src/components/ErrorBoundary.tsx`
|
||||||
|
- `frontend/src/components/config/__tests__/AutoSaveIndicator.test.tsx`
|
||||||
|
|||||||
@@ -1,45 +0,0 @@
|
|||||||
# ADR-001: SQLite as the Application Database
|
|
||||||
|
|
||||||
## Status
|
|
||||||
Accepted
|
|
||||||
|
|
||||||
## Context
|
|
||||||
BanGUI needs a database to store application state: configuration, session records,
|
|
||||||
blocklist sources, import logs, and ban history archives.
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
Use **SQLite** (via `aiosqlite`) as BanGUI's application database, persisted to a
|
|
||||||
volume mounted from the host or a named Docker volume.
|
|
||||||
|
|
||||||
## Rationale
|
|
||||||
|
|
||||||
### Why SQLite over PostgreSQL?
|
|
||||||
- **Zero-infrastructure:** No separate DB server process, no connection pooling,
|
|
||||||
no credentials to manage. Ships in the Docker container with no additional
|
|
||||||
configuration.
|
|
||||||
- **Fail2ban-compatible:** The fail2ban database itself is SQLite. BanGUI already
|
|
||||||
depends on SQLite; adding a second database engine would increase operational
|
|
||||||
complexity for no benefit.
|
|
||||||
- **Single-instance deployment:** BanGUI runs as a single service with one
|
|
||||||
background scheduler (enforced by `BANGUI_WORKERS=1`). Horizontal scaling is not
|
|
||||||
a design goal.
|
|
||||||
- **Async I/O:** `aiosqlite` provides full async support, avoiding blocking I/O in
|
|
||||||
the FastAPI async request handlers.
|
|
||||||
|
|
||||||
### Why not PostgreSQL?
|
|
||||||
- Requires a separate service or sidecar container.
|
|
||||||
- Adds connection overhead (TCP, connection pools, auth).
|
|
||||||
- Over-engineered for a single-instance web app.
|
|
||||||
|
|
||||||
### Trade-offs
|
|
||||||
- **Not suitable for multi-worker deployments.** SQLite's file-level locking means
|
|
||||||
only one process can write at a time. This is explicitly enforced:
|
|
||||||
`BANGUI_WORKERS=1` is validated at startup, and `scheduler_lock` prevents
|
|
||||||
duplicate schedulers in restarts.
|
|
||||||
- For future multi-instance deployments, BanGUI would need to migrate to
|
|
||||||
PostgreSQL or add a distributed lock layer (Redis + job store).
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
- Application database is a single file (`bangui.db`) in the container's data volume.
|
|
||||||
- Backup is a file copy. No `pg_dump` equivalent needed.
|
|
||||||
- Schema migrations managed via `app/startup.py` startup DAG.
|
|
||||||
@@ -1,45 +0,0 @@
|
|||||||
# ADR-002: FastAPI over Django
|
|
||||||
|
|
||||||
## Status
|
|
||||||
Accepted
|
|
||||||
|
|
||||||
## Context
|
|
||||||
The backend requires a Python async web framework with strong typing, validation,
|
|
||||||
and OpenAPI support.
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
Use **FastAPI** as the backend framework.
|
|
||||||
|
|
||||||
## Rationale
|
|
||||||
|
|
||||||
### Why FastAPI over Django?
|
|
||||||
- **Async-first:** FastAPI is built on Starlette with native `async def` route
|
|
||||||
handlers. Django's ORM and request handling are synchronous, requiring thread
|
|
||||||
pools for I/O-bound work.
|
|
||||||
- **Modern Python 3.12+:** FastAPI embraces modern Python idioms — type annotations,
|
|
||||||
structural pattern matching, dataclasses. Django maintains broad Python 3.8+
|
|
||||||
compatibility and shows its age.
|
|
||||||
- **Pydantic v2 integration:** FastAPI natively uses Pydantic for request/response
|
|
||||||
validation. Automatic OpenAPI schema generation from Pydantic models is seamless.
|
|
||||||
- **Dependency injection:** FastAPI's `Depends()` system provides a lightweight,
|
|
||||||
explicit DI pattern without a separate container library.
|
|
||||||
- **Performance:** FastAPI + Uvicorn consistently benchmarks as one of the fastest
|
|
||||||
Python web frameworks, comparable to Node.js and Go for JSON APIs.
|
|
||||||
|
|
||||||
### Why not Django?
|
|
||||||
- Django's synchronous ORM creates thread-pool bottlenecks with SQLite.
|
|
||||||
- Django's "batteries-included" philosophy is overkill for BanGUI's scope.
|
|
||||||
We need REST endpoints and background tasks, not a full CMS.
|
|
||||||
- Less flexible dependency injection — Django's middleware and view system is
|
|
||||||
less composable than FastAPI's routing layers.
|
|
||||||
|
|
||||||
### Trade-offs
|
|
||||||
- **Smaller ecosystem:** Django has decades of third-party packages. FastAPI's
|
|
||||||
ecosystem is younger but covers BanGUI's needs (structlog, aiosqlite, APScheduler,
|
|
||||||
aiohttp) completely.
|
|
||||||
- **No built-in admin UI:** BanGUI is its own admin UI; Django's admin is irrelevant.
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
- FastAPI routes are defined in `app/routers/`.
|
|
||||||
- Request/response models live in `app/models/`.
|
|
||||||
- Dependency injection via `app/dependencies.py`.
|
|
||||||
@@ -1,37 +0,0 @@
|
|||||||
# ADR-003: React over Vue
|
|
||||||
|
|
||||||
## Status
|
|
||||||
Accepted
|
|
||||||
|
|
||||||
## Context
|
|
||||||
The frontend requires a component-based SPA framework with strong typing, a
|
|
||||||
battle-tested component library, and broad ecosystem support.
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
Use **React 18+ with TypeScript** as the frontend framework.
|
|
||||||
|
|
||||||
## Rationale
|
|
||||||
|
|
||||||
### Why React over Vue?
|
|
||||||
- **Ecosystem maturity:** React has the largest frontend ecosystem. Libraries
|
|
||||||
(date pickers, data grids, rich text editors) assume React availability first.
|
|
||||||
- **Fluent UI v9:** Microsoft's official React component library is built for React.
|
|
||||||
The Vue-compatible version (Fluent UI Vue) lags significantly in features and
|
|
||||||
maintenance.
|
|
||||||
- **Hiring and onboarding:** React is more widely known. New contributors are
|
|
||||||
more likely to arrive with React experience than Vue experience.
|
|
||||||
- **Concurrent features:** React 18's concurrent rendering (`useTransition`,
|
|
||||||
`useDeferredValue`) provides a foundation for performance improvements in
|
|
||||||
data-heavy views like the ban table.
|
|
||||||
|
|
||||||
### Why not Vue?
|
|
||||||
- Fluent UI v9 does not provide first-class Vue support.
|
|
||||||
- Vue's composition API is well-designed, but does not outweigh the Fluent UI
|
|
||||||
constraint.
|
|
||||||
- Ecosystem and hiring advantages strongly favor React for enterprise-adjacent
|
|
||||||
projects.
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
- Frontend is a React SPA in `frontend/src/`.
|
|
||||||
- All components are functional components using hooks.
|
|
||||||
- Global state via React context (`frontend/src/providers/`).
|
|
||||||
@@ -1,48 +0,0 @@
|
|||||||
# ADR-004: APScheduler over Celery
|
|
||||||
|
|
||||||
## Status
|
|
||||||
Accepted
|
|
||||||
|
|
||||||
## Context
|
|
||||||
BanGUI requires background task scheduling for periodic work: geo cache flush,
|
|
||||||
session cleanup, history sync, and blocklist imports.
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
Use **APScheduler 4.x (AsyncIOScheduler)** for background scheduling.
|
|
||||||
|
|
||||||
## Rationale
|
|
||||||
|
|
||||||
### Why APScheduler over Celery?
|
|
||||||
- **No infrastructure:** Celery requires a message broker (Redis or RabbitMQ).
|
|
||||||
APScheduler runs in-process with no broker. Given BanGUI's single-instance
|
|
||||||
constraint, a message queue adds unnecessary operational complexity.
|
|
||||||
- **Async-native:** `AsyncIOScheduler` integrates directly with the asyncio event
|
|
||||||
loop. All BanGUI's I/O (database, HTTP, fail2ban socket) is async. APScheduler
|
|
||||||
jobs are `async def` functions that `await` without blocking.
|
|
||||||
- **Simplicity:** BanGUI's job set is fixed and small. Celery's rich task routing,
|
|
||||||
retry policies, and distributed execution are overkill. APScheduler covers
|
|
||||||
cron-style scheduling with simpler semantics.
|
|
||||||
- **Single-instance enforcement:** APScheduler's in-memory job store is a natural
|
|
||||||
fit when there is only one scheduler. No distributed coordination needed.
|
|
||||||
|
|
||||||
### Why not Celery?
|
|
||||||
- Celery's architecture (broker + workers + result backend) is designed for
|
|
||||||
distributed systems. BanGUI is explicitly single-instance.
|
|
||||||
- Celery tasks are synchronous wrappers around async code without careful
|
|
||||||
handling. Native `async def` tasks require `async_task()` or explicit `run_sync`,
|
|
||||||
creating friction in an async-first codebase.
|
|
||||||
- Added operational burden: Redis or RabbitMQ must be available at startup.
|
|
||||||
|
|
||||||
### Trade-offs
|
|
||||||
- **No horizontal scaling of workers:** APScheduler jobs run in the single
|
|
||||||
uvicorn worker process. CPU-intensive jobs would block the event loop.
|
|
||||||
(This is not a concern for BanGUI's I/O-bound jobs.)
|
|
||||||
- **No built-in retry mechanism:** Failed jobs must re-raise exceptions or
|
|
||||||
implement retry logic manually. This is acceptable given BanGUI's job
|
|
||||||
idempotency guarantees.
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
- Scheduler is configured in `app/startup.py` using `AsyncIOScheduler`.
|
|
||||||
- Jobs live in `app/tasks/`.
|
|
||||||
- Single-worker constraint is enforced via `BANGUI_WORKERS=1` validation and
|
|
||||||
the `scheduler_lock` database table.
|
|
||||||
@@ -1,61 +0,0 @@
|
|||||||
# ADR-005: Single-Instance Scheduler Enforcement
|
|
||||||
|
|
||||||
## Status
|
|
||||||
Accepted
|
|
||||||
|
|
||||||
## Context
|
|
||||||
APScheduler's `AsyncIOScheduler` is bound to a single asyncio event loop.
|
|
||||||
Running multiple scheduler instances leads to duplicate jobs, database lock
|
|
||||||
contention, and undefined behaviour.
|
|
||||||
|
|
||||||
## Decision
|
|
||||||
Enforce exactly **one scheduler instance** across the entire application lifecycle,
|
|
||||||
using a database-level distributed lock.
|
|
||||||
|
|
||||||
## Mechanism
|
|
||||||
|
|
||||||
### 1. Startup gate: `BANGUI_WORKERS=1`
|
|
||||||
The Docker compose file is configured with `BANGUI_WORKERS=1` and the startup DAG
|
|
||||||
validates this variable. If the variable is not set to `1`, startup aborts with
|
|
||||||
a clear error message.
|
|
||||||
|
|
||||||
### 2. Runtime lock: `scheduler_lock` table
|
|
||||||
During startup, after opening the SQLite database, the application attempts:
|
|
||||||
|
|
||||||
```sql
|
|
||||||
INSERT INTO scheduler_lock (lock_name, heartbeat_at)
|
|
||||||
VALUES ('scheduler', unixepoch())
|
|
||||||
ON CONFLICT(lock_name) DO UPDATE SET heartbeat_at = unixepoch()
|
|
||||||
WHERE (unixepoch() - heartbeat_at) < 30;
|
|
||||||
```
|
|
||||||
|
|
||||||
- If the INSERT succeeds, this instance holds the lock and starts the scheduler.
|
|
||||||
- If the INSERT is a no-op (heartbeat is recent), another instance holds the lock
|
|
||||||
and startup continues without starting the scheduler.
|
|
||||||
- A background task (`scheduler_lock_heartbeat`) updates the heartbeat every 10
|
|
||||||
seconds. If the process crashes, the lock expires after 30 seconds, allowing
|
|
||||||
a restart to acquire it immediately.
|
|
||||||
|
|
||||||
### 3. Deployment topology
|
|
||||||
| Deployment | Behaviour |
|
|
||||||
|---|---|
|
|
||||||
| Single container | Scheduler runs normally |
|
|
||||||
| Single Pod (Kubernetes) | Scheduler runs normally |
|
|
||||||
| Accidental multi-process restart | Second process fails to start scheduler; first continues |
|
|
||||||
| Intentional multi-worker | Not supported; requires external job store (future) |
|
|
||||||
|
|
||||||
## Rationale
|
|
||||||
|
|
||||||
### Why this approach?
|
|
||||||
- **No external coordination service:** No ZooKeeper, etcd, or Redis needed.
|
|
||||||
The existing SQLite database is reused.
|
|
||||||
- **Atomic:** SQLite's INSERT with ON CONFLICT is atomic; no race condition.
|
|
||||||
- **Self-healing:** Lock expiry means a crashed instance automatically releases
|
|
||||||
its lock. No manual cleanup required.
|
|
||||||
- **Crash-safe:** A heartbeat-based TTL ensures stale locks are not held
|
|
||||||
indefinitely.
|
|
||||||
|
|
||||||
## Consequences
|
|
||||||
- `BANGUI_WORKERS` must always be `1`. This is documented and enforced.
|
|
||||||
- Future multi-worker deployments require migration to a persistent job store
|
|
||||||
(PostgreSQL + SQLAlchemy job store, or Redis).
|
|
||||||
Reference in New Issue
Block a user