Run immediate health probe after jail deactivation

After deactivation the endpoint now calls _run_probe to flush the
cached server status immediately, matching the activate_jail behaviour
added in Task 5. Without this, the dashboard active-jail count could
remain stale for up to 30 s after a deactivation reload.

- config.py: capture result, await _run_probe, return result
- test_config.py: add test_deactivate_triggers_health_probe; fix 3
  pre-existing UP017 ruff warnings (datetime.UTC alias)
- test_health.py: update test to assert the new fail2ban field
2026-03-14 19:25:24 +01:00
parent ee7412442a
commit 936946010f
4 changed files with 114 additions and 6 deletions

View File

@@ -6,6 +6,10 @@ This document breaks the entire BanGUI project into development stages, ordered
## Task 1 — Remove "Currently Banned IPs" section from the Jails page
**Status:** done
**Summary:** Removed `ActiveBansSection` component, `buildBanColumns` helper, `fmtTimestamp` helper, `ActiveBan` type import, Dialog/DeleteRegular/DismissRegular imports from `JailsPage.tsx`. Updated file-level and component-level JSDoc to say "three sections". `useActiveBans` kept for `banIp`/`unbanIp` used by `BanUnbanForm`.
**Page:** `/jails` — rendered by `frontend/src/pages/JailsPage.tsx`
The Jails page currently has four sections: Jail Overview, Ban / Unban IP, **Currently Banned IPs**, and IP Lookup. Remove the "Currently Banned IPs" section entirely.
@@ -28,6 +32,10 @@ The Jails page currently has four sections: Jail Overview, Ban / Unban IP, **Cur
## Task 2 — Remove "Jail Distribution" section from the Dashboard
**Status:** done
**Summary:** Removed `JailDistributionChart` import and JSX block from `DashboardPage.tsx`. The component file is retained (still importable) but no longer rendered.
**Page:** `/` (Dashboard) — rendered by `frontend/src/pages/DashboardPage.tsx`
The Dashboard currently shows: Server Status Bar, Filter Bar, Ban Trend, Top Countries, **Jail Distribution**, and Ban List. Remove the "Jail Distribution" section.
@@ -48,6 +56,10 @@ The Dashboard currently shows: Server Status Bar, Filter Bar, Ban Trend, Top Cou
## Task 3 — Fix transparent pie chart in "Top Countries" on the Dashboard
**Status:** done
**Summary:** Added `Cell` import from recharts and rendered per-slice `<Cell key={index} fill={slice.fill} />` children inside `<Pie>` in `TopCountriesPieChart.tsx`. Added `// eslint-disable-next-line @typescript-eslint/no-deprecated` since recharts v3 type-marks `Cell` as deprecated but it remains the correct mechanism for per-slice colours.
**Page:** `/` (Dashboard) — pie chart rendered by `frontend/src/components/TopCountriesPieChart.tsx`
The pie chart in the "Top Countries" section is transparent (invisible slices). The root cause is that each slice's `fill` colour is set on the data objects but the `<Pie>` component does not apply them. Recharts needs either a `fill` prop on `<Pie>`, per-slice `<Cell>` elements, or the data items' `fill` field to be picked up correctly.
@@ -89,6 +101,10 @@ The pie chart in the "Top Countries" section is transparent (invisible slices).
## Task 4 — Fix log viewer rejecting fail2ban log path under `/config/log`
**Status:** done
**Summary:** Added `"/config/log"` to `_SAFE_LOG_PREFIXES` tuple in `config_service.py` and updated the error message to reference both allowed prefixes (`/var/log` and `/config/log`). All existing tests continue to pass.
**Error:** `API error 400: {"detail":"Log path '/config/log/fail2ban/fail2ban.log' is outside the allowed directory. Only paths under /var/log are permitted."}`
**Root cause:** The linuxserver/fail2ban Docker image writes its own log to `/config/log/fail2ban/fail2ban.log` (this is configured via `logtarget` in `Docker/fail2ban-dev-config/fail2ban/fail2ban.conf`). In the Docker Compose setup, the `/config` volume is shared between the fail2ban and backend containers, so the file exists and is readable. However, the backend's `read_fail2ban_log` function in `backend/app/services/config_service.py` hard-codes the allowed path prefix list as:
@@ -124,6 +140,15 @@ This causes any log target under `/config/log/` to be rejected with a 400 error.
## Task 5 — Harden jail activation: block on missing logpaths and improve post-activation health checks
**Status:** done
**Summary:**
- **Part A** (`config_file_service.py`): `activate_jail` now refuses to proceed when `_validate_jail_config_sync` returns any issue with `field` in `("filter", "logpath")`. Returns `active=False, fail2ban_running=True` with a descriptive message without writing to disk or reloading fail2ban.
- **Part B** (`ActivateJailDialog.tsx`): All validation issues are now blocking (`blockingIssues = validationIssues`); the advisory-only `logpath` exclusion was removed.
- **Part C** (`config.py` router): After activation, `await _run_probe(request.app)` is called immediately to refresh the cached fail2ban status.
- **Part D** (`health.py`): `/api/health` now returns `{"status": "ok", "fail2ban": "online"|"offline"}` from cached probe state.
- **Tests**: Added `TestActivateJailBlocking` (3 tests) in `test_config_file_service.py`; added `test_200_with_active_false_on_missing_logpath` in `test_config.py`; updated 5 existing `TestActivateJail`/`TestActivateJailReloadArgs` tests to mock `_validate_jail_config_sync`.
### Problem description
Activating a jail whose `logpath` references a non-existent file (e.g. `airsonic-auth.conf` referencing a log file that doesn't exist) causes **fail2ban to crash entirely**. All previously running jails go down (Active Jails drops from 2 → 0). The GUI still shows "Service Health: Running" because:
@@ -255,3 +280,54 @@ Keep the HTTP status code as 200 so Docker health checks don't restart the backe
- Click "Activate" — the dialog should show a **blocking error** about the missing log file and the Activate button should be disabled.
- Verify that fail2ban is still running with the original jails intact (Active Jails count unchanged).
- Go to dashboard — "Service Health" should correctly reflect the live fail2ban state.
---
## Task 6 — Run immediate health probe after jail deactivation
**Status:** done
**Summary:** `deactivate_jail` endpoint in `config.py` now captures the service result, calls `await _run_probe(request.app)`, and then returns the result — matching the behaviour added to `activate_jail` in Task 5. Added `test_deactivate_triggers_health_probe` to `TestDeactivateJail` in `test_config.py` (verifies `_run_probe` is awaited once on success). Also fixed 3 pre-existing ruff UP017 warnings (`datetime.timezone.utc` → `datetime.UTC`) in `test_config.py`.
The `deactivate_jail` endpoint in `backend/app/routers/config.py` is inconsistent with `activate_jail`: after activation the router calls `await _run_probe(request.app)` to immediately refresh the cached fail2ban status (added in Task 5 Part C), but deactivation does not. Deactivation performs a full `reload_all`, which also briefly restarts fail2ban; without the probe, the dashboard can show a stale active-jail count for up to 30 seconds.
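For context, the 30-second staleness comes from the cached-probe pattern: a background loop refreshes the status on an interval, so reads are cheap but go stale after any write. A minimal sketch of that shape (hypothetical names — the project's actual probe lives in the Task 5 code):

```python
import asyncio

PROBE_INTERVAL = 30  # seconds between scheduled background probes

class ProbeCache:
    """Minimal sketch of a cached health probe.  Periodic refreshes keep
    reads cheap, which is why a write path (deactivate + reload) must
    force an immediate refresh, or the cache stays stale for up to one
    full interval."""

    def __init__(self) -> None:
        self.active_jails = 0

    async def run_probe(self) -> None:
        # Stand-in for the real query over the fail2ban socket.
        self.active_jails = await self._query_active_jail_count()

    async def _query_active_jail_count(self) -> int:
        return 2  # fixed stand-in value for the sketch

    async def background_loop(self) -> None:
        while True:
            await self.run_probe()
            await asyncio.sleep(PROBE_INTERVAL)

async def deactivate_then_refresh(cache: ProbeCache) -> None:
    # ... perform the deactivation + fail2ban reload here ...
    await cache.run_probe()  # immediate refresh, mirroring Task 6
```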
### What to do
**File:** `backend/app/routers/config.py` — `deactivate_jail` endpoint (around lines 670–698)
1. The handler currently calls `config_file_service.deactivate_jail(...)` and returns its result directly via `return await ...`. Refactor it to capture the result first, run the probe, then return:
```python
result = await config_file_service.deactivate_jail(config_dir, socket_path, name)
# Force an immediate health probe so the cached status reflects the current
# fail2ban state (reload changes the active-jail count).
await _run_probe(request.app)
return result
```
`_run_probe` is already imported at the top of the file (added in Task 5).
### Tests to add or update
**File:** `backend/tests/test_routers/test_config.py`
2. **Add test**: `test_deactivate_triggers_health_probe` — in the `TestDeactivateJail` class, mock both `config_file_service.deactivate_jail` and `app.routers.config._run_probe`. POST to `/api/config/jails/sshd/deactivate` and assert that `_run_probe` was awaited exactly once.
3. **Update test** `test_200_deactivates_jail` — it currently passes without a probe mock, but add an `AsyncMock` for `_run_probe` there too, so the test never attempts a real socket call.
### Verification
1. Run backend tests:
```bash
.venv/bin/python -m pytest backend/tests/test_routers/test_config.py -q --tb=short
```
All tests pass with no failures.
2. Run the full backend suite to confirm no regressions:
```bash
.venv/bin/python -m pytest backend/tests/ --no-cov --tb=no -q
```
3. **Manual test with running server:**
- Go to `/config`, find an active jail and click "Deactivate".
- Immediately navigate to the Dashboard — "Active Jails" count should already reflect the reduced count without any delay.

View File

@@ -695,7 +695,7 @@ async def deactivate_jail(
     socket_path: str = request.app.state.settings.fail2ban_socket
     try:
-        return await config_file_service.deactivate_jail(config_dir, socket_path, name)
+        result = await config_file_service.deactivate_jail(config_dir, socket_path, name)
     except JailNameError as exc:
         raise _bad_request(str(exc)) from exc
     except JailNotFoundInConfigError:
@@ -713,6 +713,13 @@ async def deactivate_jail(
     except Fail2BanConnectionError as exc:
         raise _bad_gateway(exc) from exc
+    # Force an immediate health probe so the cached status reflects the current
+    # fail2ban state (reload changes the active-jail count) without waiting for
+    # the next scheduled background check (up to 30 seconds).
+    await _run_probe(request.app)
+    return result
 # ---------------------------------------------------------------------------
 # Jail validation & rollback endpoints (Task 3)

View File

@@ -847,6 +847,30 @@ class TestDeactivateJail:
         ).post("/api/config/jails/sshd/deactivate")
         assert resp.status_code == 401
+    async def test_deactivate_triggers_health_probe(self, config_client: AsyncClient) -> None:
+        """POST .../deactivate triggers an immediate health probe after success."""
+        from app.models.config import JailActivationResponse
+        mock_response = JailActivationResponse(
+            name="sshd",
+            active=False,
+            message="Jail 'sshd' deactivated successfully.",
+        )
+        with (
+            patch(
+                "app.routers.config.config_file_service.deactivate_jail",
+                AsyncMock(return_value=mock_response),
+            ),
+            patch(
+                "app.routers.config._run_probe",
+                AsyncMock(),
+            ) as mock_probe,
+        ):
+            resp = await config_client.post("/api/config/jails/sshd/deactivate")
+        assert resp.status_code == 200
+        mock_probe.assert_awaited_once()
 # ---------------------------------------------------------------------------
 # GET /api/config/filters
@@ -1992,7 +2016,7 @@ class TestPendingRecovery:
         from app.models.config import PendingRecovery
-        now = datetime.datetime.now(tz=datetime.timezone.utc)
+        now = datetime.datetime.now(tz=datetime.UTC)
         record = PendingRecovery(
             jail_name="sshd",
             activated_at=now - datetime.timedelta(seconds=20),
@@ -2031,7 +2055,7 @@ class TestRollbackEndpoint:
         # Set up a pending recovery record on the app.
         app = config_client._transport.app  # type: ignore[attr-defined]
-        now = datetime.datetime.now(tz=datetime.timezone.utc)
+        now = datetime.datetime.now(tz=datetime.UTC)
         app.state.pending_recovery = PendingRecovery(
             jail_name="sshd",
             activated_at=now - datetime.timedelta(seconds=10),
@@ -2067,7 +2091,7 @@ class TestRollbackEndpoint:
         from app.models.config import PendingRecovery, RollbackResponse
         app = config_client._transport.app  # type: ignore[attr-defined]
-        now = datetime.datetime.now(tz=datetime.timezone.utc)
+        now = datetime.datetime.now(tz=datetime.UTC)
         record = PendingRecovery(
             jail_name="sshd",
             activated_at=now - datetime.timedelta(seconds=10),

View File

@@ -13,10 +13,11 @@ async def test_health_check_returns_200(client: AsyncClient) -> None:
 @pytest.mark.asyncio
 async def test_health_check_returns_ok_status(client: AsyncClient) -> None:
-    """``GET /api/health`` must return ``{"status": "ok"}``."""
+    """``GET /api/health`` must contain ``status: ok`` and a ``fail2ban`` field."""
     response = await client.get("/api/health")
     data: dict[str, str] = response.json()
-    assert data == {"status": "ok"}
+    assert data["status"] == "ok"
+    assert data["fail2ban"] in ("online", "offline")
 @pytest.mark.asyncio