BanGUI/Docs/Tasks.md
Lukas b9289a3b0e Fix: Remove socket path leak in fail2ban error responses
- Change _fail2ban_connection_handler() to return generic message instead of
  leaking socket path in HTTP 502 response body
- Change _fail2ban_protocol_handler() to return generic message instead of
  leaking raw exception details in HTTP 502 response body
- Full error details are still logged server-side (error=str(exc)) for debugging
- Update Backend-Development.md with error message hygiene section explaining
  the pattern: generic user-friendly messages in HTTP responses, full details
  in server logs only

Fixes TASK-029: Fail2BanConnectionError leaks socket path in HTTP error responses

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 15:21:35 +02:00


TASK-029 — Fail2BanConnectionError leaks socket path in HTTP error responses

Severity: Medium

Where found

backend/app/exceptions.py: Fail2BanConnectionError.__init__() formats the message as f"{message} (socket: {socket_path})". backend/app/main.py: _fail2ban_connection_handler() returns {"detail": f"Cannot reach fail2ban: {exc}"} verbatim.

Why this is needed

Every 502 response caused by fail2ban being unreachable includes the full socket path (e.g., Cannot reach fail2ban: [Errno 2] No such file or directory (socket: /var/run/fail2ban/fail2ban.sock)) in the JSON error body. This discloses internal infrastructure details to unauthenticated users who can trigger the error. Similarly, _fail2ban_protocol_handler includes raw exception details that may expose internal parsing logic.

Goal

Return generic, user-friendly error messages in HTTP responses. Log full details server-side only.

What to do

  1. In _fail2ban_connection_handler(), replace:
    content={"detail": f"Cannot reach fail2ban: {exc}"}
    
    with:
    content={"detail": "Cannot reach the fail2ban service. Check the server status page."}
    
  2. In _fail2ban_protocol_handler(), similarly return a generic message.
  3. Both handlers already log error=str(exc) server-side — this is correct and should remain.
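
The pattern in the steps above can be sketched without any web framework. This is a minimal illustration, not the BanGUI code: the exception constructor follows the description under "Where found", while `build_connection_error_response` and the logger name are hypothetical stand-ins for `_fail2ban_connection_handler()`.

```python
import logging

logger = logging.getLogger("bangui.sketch")  # hypothetical logger name

GENERIC_DETAIL = "Cannot reach the fail2ban service. Check the server status page."


class Fail2BanConnectionError(Exception):
    """Stand-in for the real exception; its message embeds the socket path."""

    def __init__(self, message: str, socket_path: str) -> None:
        super().__init__(f"{message} (socket: {socket_path})")
        self.socket_path = socket_path


def build_connection_error_response(exc: Fail2BanConnectionError) -> tuple[int, dict[str, str]]:
    """Return (status_code, json_body) for a connection failure.

    Full details go to the server log only; the response body stays generic.
    """
    logger.warning("fail2ban_connection_error error=%s", str(exc))
    return 502, {"detail": GENERIC_DETAIL}


exc = Fail2BanConnectionError(
    "[Errno 2] No such file or directory", "/var/run/fail2ban/fail2ban.sock"
)
status, body = build_connection_error_response(exc)
```

The socket path is still available in `str(exc)` for the log line, but nothing derived from the exception reaches the response body.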

Possible traps and issues

  • Update any tests that assert the exact detail string in 502 responses.
  • If the frontend displays this error message directly to the user, ensure it still makes sense after genericizing.

Docs changes needed

  • Backend-Development.md — error message hygiene (no internal paths/details in responses).

Doc references


TASK-030 — ip-api.com geo lookups use plain HTTP — IP addresses sent unencrypted

Severity: Medium

Where found

backend/app/services/geo_cache.py lines ~41-46:

_API_URL = "http://ip-api.com/json/{ip}?fields=..."
_BATCH_API_URL = "http://ip-api.com/batch?fields=..."

Why this is needed

All banned and monitored IP addresses are transmitted to ip-api.com in cleartext over HTTP. IP addresses can count as personal data under GDPR and CCPA, so this traffic is potentially sensitive. Anyone on the network path between the BanGUI server and ip-api.com's servers can observe or modify it, and a forged response would silently corrupt the geo database.

Goal

Use encrypted transport for all geo API calls, or switch to a local resolver.

What to do

ip-api.com's free tier does not support HTTPS. The recommended approach:

  1. Promote the existing geoip_db_path setting (MaxMind GeoLite2-Country MMDB) to the primary resolver.
  2. Use ip-api.com as a secondary fallback only when the MMDB is unavailable or returns no result.
  3. Add documentation and compose file examples for downloading and mounting the GeoLite2 MMDB.
  4. If ip-api.com HTTP is retained as a fallback, add a config flag BANGUI_GEOIP_ALLOW_HTTP_FALLBACK (default false) and warn clearly at startup when enabled.
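
The resolution order described in the steps above might look roughly like this. It is a sketch under assumptions: `GeoInfo`'s fields, `resolve_geo`, and the injected lookup callables are hypothetical, and the rule that only the literal value `true` enables the flag is an illustrative choice, not something the task specifies.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class GeoInfo:
    country_code: Optional[str]
    asn: Optional[str] = None  # not provided by GeoLite2-Country
    org: Optional[str] = None  # not provided by GeoLite2-Country


Lookup = Callable[[str], Optional[GeoInfo]]


def http_fallback_allowed(env: Mapping[str, str]) -> bool:
    """Read the opt-in flag; anything other than an explicit 'true' means off."""
    value = env.get("BANGUI_GEOIP_ALLOW_HTTP_FALLBACK", "false")
    return value.strip().lower() == "true"


def resolve_geo(
    ip: str,
    mmdb_lookup: Optional[Lookup],
    http_lookup: Lookup,
    allow_http: bool,
) -> Optional[GeoInfo]:
    """Try the local MMDB first; use the HTTP API only when explicitly allowed."""
    if mmdb_lookup is not None:
        info = mmdb_lookup(ip)
        if info is not None:
            return info
    if not allow_http:
        return None
    return http_lookup(ip)


# Fake resolvers standing in for the MMDB reader and the ip-api.com client.
local_db = {"203.0.113.7": GeoInfo(country_code="DE")}
http_calls: list[str] = []


def fake_http(ip: str) -> Optional[GeoInfo]:
    http_calls.append(ip)
    return GeoInfo(country_code="US", asn="AS64500", org="Example Org")


hit = resolve_geo("203.0.113.7", local_db.get, fake_http, allow_http=False)
miss = resolve_geo("198.51.100.9", local_db.get, fake_http, allow_http=False)
fallback = resolve_geo("198.51.100.9", local_db.get, fake_http, allow_http=True)
```

With the flag off, an IP missing from the local database resolves to nothing and no address leaves the server. Only the explicit opt-in sends it over HTTP.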

Possible traps and issues

  • The MaxMind GeoLite2 database requires a free account and a license key to download — document the setup process.
  • The GeoLite2-Country MMDB does not include ASN or organisation data — these fields will be null when using the local resolver. The GeoInfo model must handle nullable asn and org.

Docs changes needed

  • Features.md — document the geo resolution mechanism and MMDB setup.
  • Architekture.md — update the external API dependency section.
  • Backend-Development.md — configuration for geoip_db_path.

Doc references


TASK-031 — bcrypt 72-byte truncation not enforced — long passwords silently equivalent to their prefix

Severity: Medium

Where found

backend/app/models/auth.py: LoginRequest.password: str = Field(...) (no max_length). backend/app/models/setup.py: SetupRequest.master_password has min_length=8 but no max_length.

Why this is needed

bcrypt silently truncates all input at 72 bytes before hashing. A user who sets a 100-character password can therefore be authenticated by supplying only its first 72 bytes; the remaining characters provide no additional security. An attacker only has to guess the 72-byte prefix, so the effective strength of a long password is lower than the user intended.

Goal

Enforce a maximum password length of 72 bytes, or pre-hash before bcrypt to remove the limit entirely.

What to do

Option A (simple):

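  1. Add max_length=72 to SetupRequest.master_password and LoginRequest.password. Note that max_length on a string field counts characters, not bytes, so a password containing multi-byte UTF-8 characters can still exceed 72 bytes; checking the encoded length closes that gap.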
  2. Update the setup wizard UI to reflect the 72-character maximum.
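
Because bcrypt's limit is expressed in bytes, a check on the UTF-8 encoded length is stricter than a character count. A minimal sketch, where `password_fits_bcrypt` is a hypothetical helper rather than existing BanGUI code:

```python
BCRYPT_MAX_BYTES = 72


def password_fits_bcrypt(password: str) -> bool:
    """True when the UTF-8 encoding fits within bcrypt's 72-byte input limit."""
    return len(password.encode("utf-8")) <= BCRYPT_MAX_BYTES


ascii_pw = "a" * 72          # 72 characters, 72 bytes
accented_pw = "\u00e9" * 40  # 40 characters, 80 bytes in UTF-8
```

Here `accented_pw` would pass a 72-character limit but still be truncated by bcrypt, which is why a byte-based check may be worth wiring into a field validator.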

Option B (removes the 72-byte limit entirely):

  1. Pre-hash the password with HMAC-SHA256 using the session_secret as the key before passing to bcrypt:
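    import base64, hashlib, hmac, bcrypt
    digest = hmac.new(secret.encode(), password.encode(), hashlib.sha256).digest()
    pre_hashed = base64.b64encode(digest)  # raw digests may contain NUL bytes
    bcrypt.hashpw(pre_hashed, bcrypt.gensalt())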
    
  2. Apply consistently in both run_setup() and _check_password().
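
To illustrate why the pre-hash removes the limit, the sketch below covers only the pre-hash step using the standard library; the bcrypt call itself is left out. `prehash_password` is a hypothetical helper name, and Base64-encoding the digest is a common precaution (raw digests can contain NUL bytes) rather than something the task text mandates.

```python
import base64
import hashlib
import hmac


def prehash_password(password: str, secret: str) -> bytes:
    """HMAC-SHA256 the password, then Base64-encode the 32-byte digest.

    The result is always 44 ASCII bytes: below bcrypt's 72-byte limit and
    free of NUL bytes.
    """
    digest = hmac.new(
        secret.encode("utf-8"), password.encode("utf-8"), hashlib.sha256
    ).digest()
    return base64.b64encode(digest)


secret = "example-session-secret"  # placeholder value for illustration
long_a = "x" * 72 + "A"  # the two passwords share their first 72 bytes
long_b = "x" * 72 + "B"
```

Two passwords that share their first 72 bytes produce different pre-hashes, so bcrypt sees different inputs. The same helper also shows the coupling to the key: a different `secret` yields a different pre-hash for the same password.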

Option A is recommended as the simpler, lower-risk fix. Option B is architecturally cleaner but requires a stored hash migration.

Possible traps and issues

  • Option A: Users who already have passwords longer than 72 characters will need to reset. For a single-admin app this is acceptable.
  • Option B: If the session_secret changes, all stored password hashes become invalid (since the pre-hash key changes). This is a hidden coupling — document it explicitly.

Docs changes needed

  • Features.md — document the password length constraint.
  • Backend-Development.md — bcrypt usage notes.

Doc references


TASK-032 — geo_cache table grows unboundedly — no eviction or purge

Severity: Medium

Where found

backend/app/repositories/geo_cache_repo.py: has upsert_entry, bulk_upsert_entries, and upsert_neg_entry, but no DELETE functions. backend/app/db.py: the geo_cache table has no last_seen or created_at column.

Why this is needed

Every unique IP address ever seen by fail2ban gets a row in geo_cache. The table is never trimmed. A BanGUI instance monitoring a busy server can accumulate millions of rows over months, increasing the DB file size and degrading query performance on every geo lookup.

Goal

Implement a retention policy that prunes geo cache entries not referenced recently.

What to do

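  1. Add a migration (_MIGRATIONS[2]) that adds a last_seen TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP column to geo_cache. SQLite restricts non-constant defaults such as CURRENT_TIMESTAMP in ALTER TABLE ... ADD COLUMN, so this statement can fail on a populated table; in that case add the column with a constant default and backfill existing rows with an UPDATE.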
  2. Update upsert_entry and bulk_upsert_entries to set last_seen = CURRENT_TIMESTAMP on every upsert.
  3. Add delete_stale_entries(db: aiosqlite.Connection, cutoff_iso: str) -> int to geo_cache_repo.py.
  4. Create backend/app/tasks/geo_cache_cleanup.py — a nightly task that calls delete_stale_entries with a 90-day cutoff.
  5. Register the task in startup_shared_resources.
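
A synchronous sketch of steps 2 to 4, using the standard-library `sqlite3` module in place of `aiosqlite` so it runs on its own. The table layout (`ip`, `country_code`) is an assumed simplification of the real schema, and `cutoff_for` is a hypothetical helper. The cutoff is formatted like SQLite's `CURRENT_TIMESTAMP` output (`YYYY-MM-DD HH:MM:SS`, UTC) because the comparison is a plain string comparison.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

SQLITE_TS_FORMAT = "%Y-%m-%d %H:%M:%S"  # same shape as CURRENT_TIMESTAMP


def upsert_entry(db: sqlite3.Connection, ip: str, country_code: str) -> None:
    """Insert or refresh a row and bump last_seen on every call."""
    db.execute(
        """
        INSERT INTO geo_cache (ip, country_code, last_seen)
        VALUES (?, ?, CURRENT_TIMESTAMP)
        ON CONFLICT(ip) DO UPDATE SET
            country_code = excluded.country_code,
            last_seen = CURRENT_TIMESTAMP
        """,
        (ip, country_code),
    )


def delete_stale_entries(db: sqlite3.Connection, cutoff_iso: str) -> int:
    """Delete rows whose last_seen is older than the cutoff; return the count."""
    cursor = db.execute("DELETE FROM geo_cache WHERE last_seen < ?", (cutoff_iso,))
    db.commit()
    return cursor.rowcount


def cutoff_for(days: int, now: datetime) -> str:
    """Format the cutoff like CURRENT_TIMESTAMP so string comparison is valid."""
    return (now - timedelta(days=days)).strftime(SQLITE_TS_FORMAT)


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE geo_cache ("
    "ip TEXT PRIMARY KEY, country_code TEXT, last_seen TEXT NOT NULL)"
)
upsert_entry(db, "203.0.113.7", "DE")  # seen just now
db.execute(
    "INSERT INTO geo_cache VALUES (?, ?, ?)",
    ("198.51.100.9", "US", "2000-01-01 00:00:00"),  # long stale
)
removed = delete_stale_entries(db, cutoff_for(90, datetime.now(timezone.utc)))
remaining = [row[0] for row in db.execute("SELECT ip FROM geo_cache")]
```

If the real task passes an ISO 8601 string with a `T` separator or a timezone offset as `cutoff_iso`, the stored `last_seen` values would need the same format for the comparison to behave as intended.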

Possible traps and issues

  • Adding a column requires a migration. Coordinate with TASK-023 (migration atomicity) and TASK-022 (session hash migration) — all three migrations must be sequenced correctly as _MIGRATIONS[2], [3], etc.
  • IPs that have not been seen in 90 days will lose their geo data — on their next appearance they will be re-resolved from ip-api.com or the MMDB. This is acceptable.

Docs changes needed

  • Architekture.md — update the geo_cache table description and add the cleanup task.
  • Backend-Development.md — document the geo cache retention policy.

Doc references


Severity: Medium

Where found

backend/app/routers/auth.py: login() returns LoginResponse(token=signed_token, expires_at=expires_at) in the JSON body and sets the HttpOnly cookie. backend/app/models/auth.py: LoginResponse.token field.

Why this is needed

The LoginResponse JSON body contains the full signed session token. JavaScript running on the page (including third-party analytics scripts or a future XSS injection) can read the response body from a fetch() call and store the token in localStorage or a non-HttpOnly cookie. The Bearer-header authentication path (Authorization: Bearer <token>) then allows using that extracted token, completely bypassing the protections provided by the HttpOnly cookie.

Goal

Prevent the session token from being accessible to JavaScript when using cookie-based authentication.

What to do

  1. For browser SPA consumers: Remove the token field from LoginResponse. The HttpOnly cookie is the only token the browser needs.
  2. If an API-first (non-browser) token flow is required, create a separate endpoint POST /api/auth/token that returns a token in the body and does not set a cookie. Document this endpoint as "for programmatic API clients only, not for browser use".
  3. Update the frontend — verify that AuthProvider does not use response.token (confirmed: it currently does not).
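
Step 1 can be sketched with the standard library alone. This is not the actual router code: `build_login_reply` and the cookie name `bangui_session` are hypothetical, and the `Secure` and `SameSite=Strict` attributes are assumptions added for illustration.

```python
import json
from dataclasses import asdict, dataclass
from http.cookies import SimpleCookie


@dataclass(frozen=True)
class LoginResponse:
    """Body returned to the browser: no token field, only the expiry."""

    expires_at: str


def build_login_reply(signed_token: str, expires_at: str) -> tuple[str, str]:
    """Return (json_body, set_cookie_header_value) for a successful login."""
    cookie = SimpleCookie()
    cookie["bangui_session"] = signed_token  # hypothetical cookie name
    morsel = cookie["bangui_session"]
    morsel["httponly"] = True  # not readable from JavaScript
    morsel["secure"] = True  # assumption: served over HTTPS
    morsel["samesite"] = "Strict"  # assumption: same-site requests only
    morsel["path"] = "/"
    body = json.dumps(asdict(LoginResponse(expires_at=expires_at)))
    return body, morsel.OutputString()


body, set_cookie = build_login_reply("signed.token.value", "2026-04-27T15:21:35+02:00")
```

The token travels only in the `Set-Cookie` header, so a script that reads the `fetch()` response body sees `expires_at` and nothing else.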

Possible traps and issues

  • Any existing API client that relies on the token in the LoginResponse body will break. Check tests.
  • The expires_at field in LoginResponse is useful for the frontend to know when to prompt for re-login — this can remain.
  • The Bearer-token path in require_auth (Authorization: Bearer) remains functional for programmatic clients using the dedicated token endpoint.

Docs changes needed

  • Features.md — document the authentication flow (cookie for browser, token endpoint for API clients).
  • Backend-Development.md — authentication endpoint design.
  • Web-Development.md — document that the frontend uses only the HttpOnly cookie.

Doc references