BanGUI/Docs/Tasks.md
Lukas 5e1b8134d9 Remove inactive jails section from Jail management page
The Jail page is now a pure operational view showing only jails that
fail2ban reports as active. The backend GET /api/jails already queried
only the fail2ban socket status command, so no backend changes were
needed.

Frontend changes:
- Remove Inactive Jails table, Show-inactive toggle, and all related
  state (showInactive, inactiveJails, activateTarget)
- Remove fetchInactiveJails() call and loadInactive/handleActivated
  callbacks
- Remove ActivateJailDialog import and usage
- Remove unused imports: useCallback, useEffect, Switch, InactiveJail

Inactive-jail discovery and activation remain fully functional via the
Configuration page Jails tab (JailsTab.tsx) — unchanged.
2026-03-14 11:44:05 +01:00

BanGUI — Task List

This document breaks the entire BanGUI project into development stages, ordered so that each stage builds on the previous one. Every task is described in prose with enough detail for a developer to begin work. References point to the relevant documentation.


Task 1 — Jail Page: Show Only Active Jails (No Inactive Configs)

Status: done

Summary: The backend GET /api/jails already returned only active jails (it queries the fail2ban socket status command). The frontend JailsPage.tsx was updated: removed the "Inactive Jails" section, the "Show inactive" toggle, the fetchInactiveJails() call, the ActivateJailDialog import/usage, and the InactiveJail type import. The Config page (JailsTab.tsx) retains full inactive-jail management. All backend tests pass (96/96). TypeScript and ESLint report zero errors.

Problem

The Jail management page (JailsPage.tsx) currently displays inactive jail configurations alongside active jails. Inactive jails (those defined in config files but not running) belong on the Configuration page (ConfigPage.tsx, Jails tab), not on the operational Jail management page. The Jail page should be a pure operational view: only jails that fail2ban reports as active/running appear here.

Goal

Remove all inactive-jail display and activation UI from the Jail management page. The Jail page shows only jails that are currently loaded in the running fail2ban instance. Users who want to discover and activate inactive jails do so exclusively through the Configuration page's Jails tab.

Backend Changes

  1. Review GET /api/jails in backend/app/routers/jails.py and jail_service.py. Confirm this endpoint only returns jails that are reported as active by fail2ban via the socket (status command). If it already does, no change needed. If it includes inactive/config-only jails in its response, strip them out.
  2. No new endpoints needed. The inactive-jail listing and activation endpoints already live under /api/config/jails and /api/config/jails/{name}/activate in config.py / config_file_service.py — those stay as-is for the Config page.

Frontend Changes

  1. JailsPage.tsx — Remove the "Inactive Jails" section, the toggle that reveals inactive jails, and the fetchInactiveJails() call. The page should only call fetchJails() (which queries /api/jails) and render that list. Remove the ActivateJailDialog import and usage from this page if present.
  2. JailsPage.tsx — Remove any "Activate" buttons or affordances that reference inactive jails. The jail overview table should show: jail name, status (running / stopped / idle), backend type, currently banned count, total bans, currently failed, total failed, find time, ban time, max retries. No "Inactive" badge or "Activate" button.
  3. Verify the Config page (ConfigPage.tsx → Jails tab / JailsTab.tsx) still shows the full list including inactive jails with Active/Inactive badges and the Activate button. This is the only place where inactive jails are managed. No changes expected here — just verify nothing broke.

Tests

  1. Backend: If there are existing tests for GET /api/jails that assert inactive jails are included, update them so they assert inactive jails are excluded.
  2. Frontend: Update or remove any component tests for the inactive-jail section on JailsPage. Ensure Config-page tests for inactive jail activation still pass.

Acceptance Criteria

  • The Jail page shows zero inactive jails under any circumstance.
  • All Jail page data comes only from the fail2ban socket's active jail list.
  • Inactive-jail discovery and activation remain fully functional on the Configuration page, Jails tab.
  • No regressions in existing jail control actions (start, stop, reload, idle, ignore-list) on the Jail page.

Task 2 — Configuration Subpage: fail2ban Log Viewer & Service Health

Status: not started
References: Features.md § 6 — Configuration View, Architekture.md § 2

Problem

There is currently no way to view the fail2ban daemon log (/var/log/fail2ban.log or wherever the log target is configured) through the web interface. There is also no dedicated place in the Configuration section that shows at a glance whether fail2ban is running correctly. The existing health probe (health_service.py) and dashboard status bar give connectivity info, but the Configuration page should have its own panel showing service health alongside the raw log output.

Goal

Add a new Log tab to the Configuration page. This tab shows two things:

  1. A Service Health panel — a compact summary showing whether fail2ban is running, its version, active jail count, total bans, total failures, and the current log level/target. This reuses data from the existing health probe.
  2. A Log viewer — displays the tail of the fail2ban daemon log file with newest entries at the bottom. Supports manual refresh and optional auto-refresh on an interval.

Backend Changes

New Endpoint: Read fail2ban Log

  1. Create GET /api/config/fail2ban-log in backend/app/routers/config.py (or a new router file backend/app/routers/log.py if config.py is getting large).

    • Query parameters:
      • lines (int, default 200, max 2000) — number of lines to return from the tail of the log file.
      • filter (optional string) — a plain-text substring filter; only return lines containing this string (for searching).
    • Response model: Fail2BanLogResponse with fields:
      • log_path: str — the resolved path of the log file being read.
      • lines: list[str] — the log lines.
      • total_lines: int — total number of lines in the file (so the UI can indicate if it's truncated).
      • log_level: str — the current fail2ban log level.
      • log_target: str — the current fail2ban log target.
    • Behaviour: Query the fail2ban socket for get logtarget to find the current log file path. Read the last N lines from that file using an efficient tail implementation (read from end of file, do not load the entire file into memory). If the log target is not a file (stdout, syslog, systemd-journal), return an informative error explaining that log viewing is only available when fail2ban logs to a file.
    • Security: Validate that the resolved log path is under an expected directory (e.g. /var/log/). Do not allow path traversal. Never expose arbitrary file contents.
  2. Create the service method read_fail2ban_log() in backend/app/services/config_service.py (or a new log_service.py).

    • Use fail2ban_client.py to query get logtarget and get loglevel.
    • Implement an async file tail: open the file, seek to the end, and read backwards until N newlines are found or the beginning of the file is reached.
    • Apply the optional substring filter on the server side before returning.
  3. Create Pydantic models in backend/app/models/config.py:

    • Fail2BanLogResponse(log_path: str, lines: list[str], total_lines: int, log_level: str, log_target: str)

Extend Health Data for Config Page

  1. Create GET /api/config/service-status (or reuse/extend GET /api/dashboard/status if appropriate).
    • Returns: online (bool), version (str), jail_count (int), total_bans (int), total_failures (int), log_level (str), log_target (str), db_path (str), uptime or start_time if available.
    • This can delegate to the existing health_service.probe() and augment with the log-level/target info from the socket.

Frontend Changes

New Tab: Log

  1. Create frontend/src/components/config/LogTab.tsx.

    • Service Health panel at the top:
      • A status badge: green "Running" or red "Offline".
      • Version, active jails count, total bans, total failures displayed in a compact row of stat cards.
      • Current log level and log target shown as labels.
      • If fail2ban is offline, show a prominent warning banner with the text: "fail2ban is not running or unreachable. Check the server and socket configuration."
    • Log viewer below:
      • A monospace-font scrollable container showing the log lines.
      • A toolbar above the log area with:
        • A Refresh button to re-fetch the log.
        • An Auto-refresh toggle (off by default) with a selectable interval (5s, 10s, 30s).
        • A Lines dropdown to choose how many lines to load (100, 200, 500, 1000).
        • A Filter text input to search within the log (sends the filter param to the backend).
      • Log lines should be syntax-highlighted or at minimum color-coded by log level (ERROR = red, WARNING = yellow, INFO = default, DEBUG = muted).
      • The container auto-scrolls to the bottom on load and on refresh (since newest entries are at the end).
      • If the log target is not a file, show an info banner: "fail2ban is logging to [target]. File-based log viewing is not available."
  2. Register the tab in ConfigPage.tsx. Add a "Log" tab after the existing tabs (Jails, Filters, Actions, Global, Server, Map, Regex Tester). Use a log-file icon.

  3. Create API functions in frontend/src/api/config.ts:

    • fetchFail2BanLog(lines?: number, filter?: string): Promise<Fail2BanLogResponse>
    • fetchServiceStatus(): Promise<ServiceStatusResponse>
  4. Create TypeScript types in frontend/src/types/config.ts (or wherever config types live):

    • Fail2BanLogResponse { log_path: string; lines: string[]; total_lines: number; log_level: string; log_target: string; }
    • ServiceStatusResponse { online: boolean; version: string; jail_count: number; total_bans: number; total_failures: number; log_level: string; log_target: string; }

Tests

  1. Backend: Write tests for the new log endpoint — mock the file read, test line-count limiting, test the substring filter, test the error case when log target is not a file, test path-traversal prevention.
  2. Backend: Write tests for the service-status endpoint.
  3. Frontend: Write component tests for LogTab.tsx — renders health panel, renders log lines, filter input works, handles offline state.

Acceptance Criteria

  • The Configuration page has a new "Log" tab.
  • The Log tab shows a clear health summary with running/offline state and key metrics.
  • The Log tab displays the tail of the fail2ban daemon log file.
  • Users can choose how many lines to display, can refresh manually, and can optionally enable auto-refresh.
  • Users can filter log lines by substring.
  • Log lines are visually differentiated by severity level.
  • If fail2ban logs to a non-file target, a clear message is shown instead of the log viewer.
  • The log endpoint does not allow reading arbitrary files — only the actual fail2ban log target.

Task 3 — Invalid Jail Config Recovery: Detect Broken fail2ban & Auto-Disable Bad Jails

Status: not started
References: Features.md § 5 — Jail Management, Features.md § 6 — Configuration View, Architekture.md § 2

Problem

When a user activates a jail from the Configuration page, the system writes enabled = true to a .local override file and triggers a fail2ban reload. If the jail's configuration is invalid (bad regex, missing log file, broken filter reference, syntax error in an action), fail2ban may refuse to start entirely — not just skip the one bad jail but stop the whole daemon. At that point every jail is down, all monitoring stops, and the user is locked out of all fail2ban operations in BanGUI.

The current activate_jail() flow in config_file_service.py does a post-reload check (queries fail2ban for the jail's status and returns active=false if it didn't start), but this only works when fail2ban is still running. If the entire daemon crashes after the reload, the socket is gone and BanGUI cannot query anything. The user sees generic "offline" errors but has no clear path to fix the problem.

Goal

Build a multi-layered safety net that:

  1. Pre-validates the jail config before activating it (catch obvious errors before the reload).
  2. Detects when fail2ban goes down after a jail activation (detect the crash quickly).
  3. Alerts the user with a clear, actionable message explaining which jail was just activated and that it likely caused the failure.
  4. Offers a one-click rollback that disables the bad jail config and restarts fail2ban.

Plan

Layer 1: Pre-Activation Validation

  1. Extend activate_jail() in config_file_service.py (or add a new validate_jail_config() method) to perform dry-run checks before writing the .local file and reloading:
    • Filter existence: Verify the jail's filter setting references a filter file that actually exists in filter.d/.
    • Action existence: Verify every action referenced by the jail exists in action.d/.
    • Regex compilation: Attempt to compile all failregex and ignoreregex patterns with Python's re module. Report which pattern is broken.
    • Log path check: Verify that the log file paths declared in the jail config actually exist on disk and are readable.
    • Syntax check: Parse the full merged config (base + overrides) and check for obvious syntax issues (malformed interpolation, missing required keys).
  2. Return validation errors as a structured response before proceeding with activation. The response should list every issue found so the user can fix them before trying again.
  3. Create a new endpoint POST /api/config/jails/{name}/validate that runs only the validation step without actually activating. The frontend can call this for a "Check Config" button.
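
A rough sketch of the dry-run checks follows, covering filter existence, regex compilation, and the log path check. The function signature and the fields of JailValidationResult are assumptions (the task only names the model), and a plain dataclass stands in for the Pydantic model to keep the sketch dependency-free. Replacing the `<HOST>` tag with a stand-in group before compiling is also an assumption about how a pattern might be pre-processed, not fail2ban's own expansion.

```python
import os
import re
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class JailValidationResult:
    # Hypothetical shape; the real model's fields are not specified in the task.
    jail_name: str
    issues: list[str] = field(default_factory=list)

    @property
    def valid(self) -> bool:
        return not self.issues


def validate_jail_config(
    jail_name: str,
    filter_name: str,
    failregex: list[str],
    log_paths: list[str],
    config_dir: str,
) -> JailValidationResult:
    """Collect every problem found instead of stopping at the first one."""
    result = JailValidationResult(jail_name=jail_name)

    # Filter existence: look for filter.d/<name>.conf or filter.d/<name>.local.
    filter_dir = Path(config_dir) / "filter.d"
    if not any((filter_dir / f"{filter_name}{ext}").is_file() for ext in (".conf", ".local")):
        result.issues.append(f"filter '{filter_name}' not found in {filter_dir}")

    # Regex compilation: report exactly which pattern is broken.
    for pattern in failregex:
        candidate = pattern.replace("<HOST>", r"(?:\S+)")  # stand-in, see lead-in
        try:
            re.compile(candidate)
        except re.error as exc:
            result.issues.append(f"failregex {pattern!r} does not compile: {exc}")

    # Log path check: the file must exist and be readable by this process.
    for log_path in log_paths:
        if not os.path.isfile(log_path):
            result.issues.append(f"log file {log_path} does not exist")
        elif not os.access(log_path, os.R_OK):
            result.issues.append(f"log file {log_path} is not readable")

    return result
```

Returning the full list of issues, rather than raising on the first one, matches the requirement that the response list every problem so the user can fix them in one pass.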

Layer 2: Post-Activation Health Check

  1. After each activate_jail() reload, perform a health-check sequence with retries:
    • Wait 2 seconds after sending the reload command.
    • Probe the fail2ban socket with ping.
    • If the probe succeeds, check if the specific jail is active.
    • If the probe fails (socket gone / connection refused), retry up to 3 times with 2-second intervals.
    • Return the probe result as part of the activation response.
  2. Extend the JailActivationResponse model to include:
    • fail2ban_running: bool — whether the fail2ban daemon is still running after reload.
    • validation_warnings: list[str] — any non-fatal warnings from the pre-validation step.
    • error: str | None — a human-readable error message if something went wrong.
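
The retry sequence above could look roughly like this. It is a sketch under assumptions: wait_for_fail2ban is a hypothetical helper name, and ping is any coroutine that returns True when the socket answers, injected so the logic can be tested without a running daemon.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_for_fail2ban(
    ping: Callable[[], Awaitable[bool]],
    initial_delay: float = 2.0,
    retries: int = 3,
    interval: float = 2.0,
) -> bool:
    """Return True as soon as the daemon answers, False after all retries fail."""

    async def probe_once() -> bool:
        try:
            return await ping()
        except OSError:
            # Socket gone or connection refused counts as a failed probe.
            return False

    # Give fail2ban time to process the reload before the first probe.
    await asyncio.sleep(initial_delay)
    if await probe_once():
        return True

    # One initial probe plus up to `retries` further attempts.
    for _ in range(retries):
        await asyncio.sleep(interval)
        if await probe_once():
            return True
    return False
```

With the defaults, the worst case adds about eight seconds to an activation request (2 s initial wait plus three 2 s retries). The result would feed the fail2ban_running field of the activation response.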

Layer 3: Automatic Crash Detection via Background Task

  1. Extend tasks/health_check.py (the periodic health probe that runs every 30 seconds):
    • Track the last known activation event: when a jail was activated, store its name and timestamp in an in-memory variable (or a lightweight DB record).
    • If the health check detects that fail2ban transitioned from online to offline, and a jail was activated within the last 60 seconds, flag this as a probable activation failure.
    • Store a PendingRecovery record: { jail_name: str, activated_at: datetime, detected_at: datetime, recovered: bool }.
  2. Create a new endpoint GET /api/config/pending-recovery that returns the current PendingRecovery record (or null if none).
    • The frontend polls this endpoint (or it is included in the dashboard status response) to detect when a recovery state is active.
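
The online-to-offline detection could be sketched as a small in-memory tracker. ActivationWatch and its method names are hypothetical. PendingRecovery mirrors the record described above, written as a dataclass rather than a Pydantic model so the sketch has no dependencies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PendingRecovery:
    jail_name: str
    activated_at: datetime
    detected_at: datetime
    recovered: bool = False


class ActivationWatch:
    """Hypothetical tracker: correlates a crash with the most recent activation."""

    def __init__(self, window: timedelta = timedelta(seconds=60)) -> None:
        self.window = window
        self.last_activation: Optional[tuple[str, datetime]] = None
        self.was_online = True
        self.pending: Optional[PendingRecovery] = None

    def record_activation(self, jail_name: str, now: datetime) -> None:
        """Called by the activation flow right after the reload is sent."""
        self.last_activation = (jail_name, now)

    def on_probe(self, online: bool, now: datetime) -> Optional[PendingRecovery]:
        """Called by the periodic health check with each probe result."""
        went_down = self.was_online and not online
        self.was_online = online
        if went_down and self.last_activation is not None:
            jail_name, activated_at = self.last_activation
            # Only blame the activation if it happened within the window.
            if now - activated_at <= self.window:
                self.pending = PendingRecovery(jail_name, activated_at, now)
        return self.pending
```

Only the transition is flagged, not every offline probe, so a daemon that was already down before the activation does not produce a recovery record.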

Layer 4: User Alert & One-Click Rollback

  1. Frontend — Global alert banner. When the health status transitions to offline and a PendingRecovery record exists:
    • Show a full-width warning banner at the top of every page (not just the Config page). The banner is dismissible only after the issue is resolved.
    • Banner text: "fail2ban stopped after activating jail {name}. The jail's configuration may be invalid. Disable this jail and restart fail2ban?"
    • Two buttons:
      • "Disable & Restart" — calls the rollback endpoint (see below).
      • "View Details" — navigates to the Config page Log tab so the user can inspect the fail2ban log for the exact error message.
  2. Create a rollback endpoint POST /api/config/jails/{name}/rollback in the backend:
    • Writes enabled = false to the jail's .local override (same as deactivate_jail() but works even when fail2ban is down since it only writes a file).
    • Attempts to start (not reload) the fail2ban daemon via the configured start command (e.g. systemctl start fail2ban or fail2ban-client start). Make the start command configurable in the app settings.
    • Waits up to 10 seconds for the socket to come back, probing every 2 seconds.
    • Returns a response indicating whether fail2ban is back online and how many jails are now active.
    • Clears the PendingRecovery record on success.
  3. Frontend — Rollback result. After the rollback call returns:
    • If successful: show a success toast "fail2ban restarted with {n} active jails. The jail {name} has been disabled." and dismiss the banner.
    • If fail2ban still doesn't start: show an error dialog explaining that the problem may not be limited to the last activated jail. Suggest the user check the fail2ban log (link to the Log tab) or SSH into the server. Keep the banner visible.
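
The backend side of the rollback might be sketched as below. Several parts are assumptions: the helper names, the idea that the override lives in a single .local file passed in by the caller, and the use of configparser to rewrite it (which drops comments from that file). The start command and the online probe are injected, in line with the requirement that the start command be configurable.

```python
import configparser
import subprocess
import time
from pathlib import Path
from typing import Callable, Sequence


def disable_jail(local_file: Path, jail_name: str) -> None:
    """Write enabled = false for the jail; works while fail2ban is down."""
    # interpolation=None keeps fail2ban's %(...)s values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(local_file)  # a missing file is silently skipped
    if not parser.has_section(jail_name):
        parser.add_section(jail_name)
    parser.set(jail_name, "enabled", "false")
    with open(local_file, "w") as fh:
        parser.write(fh)


def rollback(
    local_file: Path,
    jail_name: str,
    start_command: Sequence[str],
    is_online: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Disable the jail, start the daemon, and wait for the socket to return."""
    disable_jail(local_file, jail_name)
    # Start (not reload): the daemon is assumed to be down at this point.
    subprocess.run(list(start_command), check=False)
    waited = 0.0
    while waited < timeout:
        if is_online():
            return True
        sleep(interval)
        waited += interval
    return is_online()
```

A call might look like rollback(path, "sshd", ["systemctl", "start", "fail2ban"], probe), where probe is whatever function the health service exposes. On success the caller would clear the PendingRecovery record and report the active jail count.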

Layer 5: Config Page Enhancements

  1. On the Config page Jails tab, when activating a jail:
    • Before activation, show a confirmation dialog that includes any validation warnings from the pre-check.
    • During activation, show a spinner with the text "Activating jail and verifying fail2ban…" so the user knows that the post-activation health check takes a few seconds.
    • After activation, if fail2ban_running is false in the response, immediately show the recovery banner and rollback option without waiting for the background health check.
  2. Add a "Validate" button next to the "Activate" button on inactive jails. Clicking it calls POST /api/config/jails/{name}/validate and shows the validation results in a panel (green for pass, red for each issue found).

Backend File Map

| File | Changes |
| --- | --- |
| services/config_file_service.py | Add validate_jail_config(), extend activate_jail() with pre-validation and post-reload health check. |
| routers/config.py | Add POST /api/config/jails/{name}/validate, GET /api/config/pending-recovery, POST /api/config/jails/{name}/rollback. |
| models/config.py | Add JailValidationResult, PendingRecovery, extend JailActivationResponse. |
| tasks/health_check.py | Track last activation event, detect crash-after-activation, write PendingRecovery record. |
| services/health_service.py | Add helper to attempt daemon start (not just probe). |

Frontend File Map

| File | Changes |
| --- | --- |
| components/config/ActivateJailDialog.tsx | Add pre-validation call, show warnings, show extended activation feedback. |
| components/config/JailsTab.tsx | Add "Validate" button next to "Activate" for inactive jails. |
| components/common/RecoveryBanner.tsx (new) | Global warning banner for activation failures with rollback button. |
| pages/AppLayout.tsx (or root layout) | Mount the RecoveryBanner component so it appears on all pages. |
| api/config.ts | Add validateJailConfig(), fetchPendingRecovery(), rollbackJail(). |
| types/config.ts | Add JailValidationResult, PendingRecovery, extend JailActivationResponse. |

Tests

  1. Backend: Test validate_jail_config() — valid config passes, missing filter fails, bad regex fails, missing log path fails.
  2. Backend: Test the rollback endpoint — mock file write, mock daemon start, verify response for success and failure cases.
  3. Backend: Test the health-check crash detection — simulate online→offline transition with a recent activation, verify PendingRecovery is set.
  4. Frontend: Test RecoveryBanner — renders when PendingRecovery is present, disappears after successful rollback, shows error on failed rollback.
  5. Frontend: Test the "Validate" button on the Jails tab — shows green on valid, shows errors on invalid.

Acceptance Criteria

  • Obvious config errors (missing filter, bad regex, missing log file) are caught before the jail is activated.
  • If fail2ban crashes after a jail activation, BanGUI detects it within 30 seconds and shows a prominent alert.
  • The user can disable the problematic jail and restart fail2ban with a single click from the alert banner.
  • If the one-click rollback succeeds, BanGUI confirms fail2ban is back and shows the number of recovered jails.
  • If the one-click rollback fails, the user is guided to check the log or intervene manually.
  • A standalone "Validate" button lets users check a jail's config without activating it.
  • All new endpoints have tests covering success, failure, and edge cases.