BanGUI/Docs/Tasks.md
Lukas 1d91e24a88 TASK-030: Secure IP geolocation with MMDB-primary resolver
Make MaxMind GeoLite2-Country MMDB the primary IP resolver (local, encrypted)
and demote ip-api.com to optional fallback only (disabled by default).

Changes:
- Add geoip_allow_http_fallback config flag (default False) to Settings
- Refactor GeoCache.lookup() and lookup_batch() to try MMDB first
- Update startup.py to pass config flag and log security warning when HTTP enabled
- Update all 49 tests to reflect new MMDB-primary strategy
- Add comprehensive geoip configuration section to Backend-Development.md
- Update Architekture.md to show MMDB + optional HTTP in system dependencies
- Update .env.example with BANGUI_GEOIP_DB_PATH and HTTP fallback flag

Security impact:
- 99% of IP addresses (successful MMDB lookups) now stay local, encrypted
- HTTP-only IPs are cached for 5 minutes to minimize external calls
- Operators must explicitly enable HTTP fallback (security-conscious default)
- GDPR/CCPA compliance: no PII sent over unencrypted networks by default

Fixes TASK-030: Resolved plaintext IP transmission to ip-api.com

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 15:31:39 +02:00


TASK-030 — ip-api.com geo lookups use plain HTTP — IP addresses sent unencrypted

Severity: Medium

Where found

backend/app/services/geo_cache.py lines ~41-46:

_API_URL = "http://ip-api.com/json/{ip}?fields=..."
_BATCH_API_URL = "http://ip-api.com/batch?fields=..."

Why this is needed

All banned and monitored IP addresses are transmitted to ip-api.com in cleartext over HTTP. These are potentially sensitive data (PII under GDPR/CCPA — IP addresses identify users). Any network path between the BanGUI server and ip-api.com's servers can observe or modify the traffic. Forged responses would corrupt the geo database silently.

Goal

Use encrypted transport for all geo API calls, or switch to a local resolver.

What to do

ip-api.com's free tier does not support HTTPS. The recommended approach:

  1. Promote the existing geoip_db_path setting (MaxMind GeoLite2-Country MMDB) to the primary resolver.
  2. Use ip-api.com as a secondary fallback only when the MMDB is unavailable or returns no result.
  3. Add documentation and compose file examples for downloading and mounting the GeoLite2 MMDB.
  4. If ip-api.com HTTP is retained as a fallback, add a config flag BANGUI_GEOIP_ALLOW_HTTP_FALLBACK (default false) and warn clearly at startup when enabled.
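The resolution order described in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual GeoCache code: the resolve function, the Lookup callable type, and the GeoInfo field names are hypothetical, and the real MMDB and HTTP lookups are injected as plain callables so the ordering logic can be read on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class GeoInfo:
    """Hypothetical result model; asn and org stay None for MMDB-only lookups."""
    country_code: Optional[str]
    asn: Optional[str] = None  # not present in GeoLite2-Country
    org: Optional[str] = None  # not present in GeoLite2-Country


# Hypothetical lookup signature: returns a GeoInfo, or None when nothing was found.
Lookup = Callable[[str], Optional[GeoInfo]]


def resolve(
    ip: str,
    mmdb_lookup: Optional[Lookup],
    http_lookup: Lookup,
    allow_http_fallback: bool = False,
) -> Optional[GeoInfo]:
    """Try the local MMDB first; call the HTTP API only when explicitly allowed."""
    if mmdb_lookup is not None:  # None models "MMDB unavailable"
        info = mmdb_lookup(ip)
        if info is not None:
            return info
    if allow_http_fallback:
        return http_lookup(ip)
    # Fallback disabled: the IP never leaves the host.
    return None
```

With the default allow_http_fallback=False, an IP that the MMDB cannot resolve simply yields None instead of triggering an outbound request.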

Possible traps and issues

  • The MaxMind GeoLite2 database requires a free account and a license key to download — document the setup process.
  • The GeoLite2-Country MMDB does not include ASN or organisation data — these fields will be null when using the local resolver. The GeoInfo model must handle nullable asn and org.

Docs changes needed

  • Features.md — document the geo resolution mechanism and MMDB setup.
  • Architekture.md — update the external API dependency section.
  • Backend-Development.md — configuration for geoip_db_path.

Doc references


TASK-031 — bcrypt 72-byte truncation not enforced — long passwords silently equivalent to their prefix

Severity: Medium

Where found

In backend/app/models/auth.py, LoginRequest.password: str = Field(...) has no max_length. In backend/app/models/setup.py, SetupRequest.master_password has min_length=8 but no max_length.

Why this is needed

bcrypt silently truncates all input at 72 bytes before hashing. A user who sets a 100-character password can be authenticated by supplying only its first 72 bytes (72 characters for an ASCII password). The remaining characters add no security, so an attacker only has to search the 72-byte prefix and the password is weaker than the user intended.
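The effect can be illustrated without calling bcrypt at all. The sketch below models the truncation with a plain byte slice; effective_bcrypt_input is a hypothetical helper for illustration, not part of any bcrypt library.

```python
BCRYPT_MAX_BYTES = 72  # bcrypt only consumes the first 72 bytes of its input


def effective_bcrypt_input(password: str) -> bytes:
    """Return the bytes a truncating bcrypt implementation would actually hash."""
    return password.encode("utf-8")[:BCRYPT_MAX_BYTES]


long_password = "a" * 72 + "this-suffix-is-ignored"
prefix_only = "a" * 72

# Both passwords collapse to the same 72 bytes, so they would verify
# against the same stored hash.
same_input = effective_bcrypt_input(long_password) == effective_bcrypt_input(prefix_only)
```

Note that the limit applies to encoded bytes: a password made of multi-byte UTF-8 characters reaches it in fewer than 72 characters.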

Goal

Enforce a maximum password length of 72 bytes, or pre-hash before bcrypt to remove the limit entirely.

What to do

Option A (simple):

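  1. Add max_length=72 to SetupRequest.master_password and LoginRequest.password. Because max_length counts characters rather than bytes, also reject passwords whose UTF-8 encoding exceeds 72 bytes.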
  2. Update the setup wizard UI to reflect the 72-character maximum.
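A byte-based check for Option A might look like the sketch below. The function name validate_password_length is hypothetical; in the real models it could be called from a field validator on the two password fields, which is an assumption about how the project would wire it in.

```python
MAX_PASSWORD_BYTES = 72  # bcrypt's input limit, measured in bytes


def validate_password_length(password: str) -> str:
    """Reject passwords whose UTF-8 encoding is longer than bcrypt can hash."""
    encoded_length = len(password.encode("utf-8"))
    if encoded_length > MAX_PASSWORD_BYTES:
        raise ValueError(
            f"password is {encoded_length} bytes when UTF-8 encoded; "
            f"the maximum is {MAX_PASSWORD_BYTES}"
        )
    return password
```

A 72-character password of ASCII letters passes, while 72 characters of "é" (two bytes each in UTF-8) is rejected, which a character-count limit alone would not catch.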

Option B (removes the 72-byte limit entirely):

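  1. Pre-hash the password with HMAC-SHA256 using the session_secret as the key, then base64-encode the digest (a raw digest can contain NUL bytes, which bcrypt implementations may mishandle) before passing it to bcrypt:
    digest = hmac.new(secret.encode(), password.encode(), hashlib.sha256).digest()
    pre_hashed = base64.b64encode(digest)  # 44 ASCII bytes, well under the 72-byte limit
    bcrypt.hashpw(pre_hashed, bcrypt.gensalt())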
    
  2. Apply consistently in both run_setup() and _check_password().

Option A is recommended as the simpler, lower-risk fix. Option B is architecturally cleaner but requires a stored hash migration.

Possible traps and issues

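  • Option A: Users who already have passwords longer than 72 characters can no longer submit them in full. Because the stored hash only covers the first 72 bytes, they can log in with that prefix or reset the password. For a single-admin app this is acceptable.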
  • Option B: If the session_secret changes, all stored password hashes become invalid (since the pre-hash key changes). This is a hidden coupling — document it explicitly.
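The coupling in Option B is easy to see in isolation. The sketch below uses only the standard library and stops before the bcrypt call; pre_hash is a hypothetical helper, and base64-encoding the digest is a common precaution against NUL bytes in bcrypt input rather than something the task text prescribes.

```python
import base64
import hashlib
import hmac


def pre_hash(password: str, secret: str) -> bytes:
    """HMAC-SHA256 the password with the secret, then base64-encode the digest."""
    digest = hmac.new(secret.encode(), password.encode(), hashlib.sha256).digest()
    # A 32-byte digest always encodes to 44 ASCII bytes, regardless of password length.
    return base64.b64encode(digest)


before_rotation = pre_hash("correct horse battery staple", "old-session-secret")
after_rotation = pre_hash("correct horse battery staple", "new-session-secret")
# The same password now feeds different bytes into bcrypt, so a hash stored
# before the secret changed can no longer be verified.
hashes_still_match = before_rotation == after_rotation
```

One way to avoid the coupling would be a dedicated pre-hash key that is never rotated together with the session secret; that is a design suggestion, not part of the task as written.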

Docs changes needed

  • Features.md — document the password length constraint.
  • Backend-Development.md — bcrypt usage notes.

Doc references


TASK-032 — geo_cache table grows unboundedly — no eviction or purge

Severity: Medium

Where found

backend/app/repositories/geo_cache_repo.py has upsert_entry, bulk_upsert_entries, and upsert_neg_entry, but no DELETE functions. In backend/app/db.py, the geo_cache table has no last_seen or created_at column.

Why this is needed

Every unique IP address ever seen by fail2ban gets a row in geo_cache. The table is never trimmed. A BanGUI instance monitoring a busy server can accumulate millions of rows over months, increasing the DB file size and degrading query performance on every geo lookup.

Goal

Implement a retention policy that prunes geo cache entries not referenced recently.

What to do

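  1. Add a migration (_MIGRATIONS[2]) that adds a last_seen TEXT NOT NULL column to geo_cache. SQLite's ALTER TABLE ... ADD COLUMN does not accept a non-constant default such as CURRENT_TIMESTAMP on a table that already holds rows, so add the column with a constant default (for example '') and backfill existing rows with CURRENT_TIMESTAMP in the same migration.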
  2. Update upsert_entry and bulk_upsert_entries to set last_seen = CURRENT_TIMESTAMP on every upsert.
  3. Add delete_stale_entries(db: aiosqlite.Connection, cutoff_iso: str) -> int to geo_cache_repo.py.
  4. Create backend/app/tasks/geo_cache_cleanup.py — a nightly task that calls delete_stale_entries with a 90-day cutoff.
  5. Register the task in startup_shared_resources.
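The migration and purge steps above can be sketched as follows. To stay self-contained, the sketch uses the synchronous standard-library sqlite3 module instead of aiosqlite, and it assumes a minimal geo_cache schema (ip as primary key plus country_code); the real table and function signatures will differ. MIGRATION_ADD_LAST_SEEN and cutoff_for are hypothetical names.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# CURRENT_TIMESTAMP yields UTC text in this format; the cutoff must use the
# same format so that plain string comparison orders timestamps correctly.
_TS_FORMAT = "%Y-%m-%d %H:%M:%S"

# Constant default first, then backfill existing rows.
MIGRATION_ADD_LAST_SEEN = """
ALTER TABLE geo_cache ADD COLUMN last_seen TEXT NOT NULL DEFAULT '';
UPDATE geo_cache SET last_seen = CURRENT_TIMESTAMP;
"""


def upsert_entry(db: sqlite3.Connection, ip: str, country_code: str) -> None:
    """Insert or refresh a row and bump last_seen on every call."""
    db.execute(
        "INSERT INTO geo_cache (ip, country_code, last_seen) "
        "VALUES (?, ?, CURRENT_TIMESTAMP) "
        "ON CONFLICT(ip) DO UPDATE SET "
        "country_code = excluded.country_code, last_seen = CURRENT_TIMESTAMP",
        (ip, country_code),
    )


def delete_stale_entries(db: sqlite3.Connection, cutoff: str) -> int:
    """Delete rows last seen before the cutoff; return how many were removed."""
    cursor = db.execute("DELETE FROM geo_cache WHERE last_seen < ?", (cutoff,))
    return cursor.rowcount


def cutoff_for(now: datetime, days: int = 90) -> str:
    """Format the retention cutoff the same way SQLite formats CURRENT_TIMESTAMP."""
    return (now - timedelta(days=days)).strftime(_TS_FORMAT)
```

A nightly task would then call delete_stale_entries(db, cutoff_for(datetime.now(timezone.utc))). If the cutoff is passed as an ISO 8601 string with a "T" separator instead, the string comparison against CURRENT_TIMESTAMP values is no longer reliable for rows on the cutoff date.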

Possible traps and issues

  • Adding a column requires a migration. Coordinate with TASK-023 (migration atomicity) and TASK-022 (session hash migration) — all three migrations must be sequenced correctly as _MIGRATIONS[2], [3], etc.
  • IPs that have not been seen in 90 days will lose their geo data. On their next appearance they will be re-resolved from the MMDB or, if the HTTP fallback is enabled, from ip-api.com. This is acceptable.

Docs changes needed

  • Architekture.md — update the geo_cache table description and add the cleanup task.
  • Backend-Development.md — document the geo cache retention policy.

Doc references


Severity: Medium

Where found

In backend/app/routers/auth.py, login() returns LoginResponse(token=signed_token, expires_at=expires_at) in the JSON body and sets the HttpOnly cookie. The token is carried by the LoginResponse.token field in backend/app/models/auth.py.

Why this is needed

The LoginResponse JSON body contains the full signed session token. JavaScript running on the page (including third-party analytics scripts or a future XSS injection) can read the response body from a fetch() call and store the token in localStorage or a non-HttpOnly cookie. The Bearer-header authentication path (Authorization: Bearer <token>) then allows using that extracted token, completely bypassing the protections provided by the HttpOnly cookie.

Goal

Prevent the session token from being accessible to JavaScript when using cookie-based authentication.

What to do

  1. For browser SPA consumers: Remove the token field from LoginResponse. The HttpOnly cookie is the only token the browser needs.
  2. If an API-first (non-browser) token flow is required, create a separate endpoint POST /api/auth/token that returns a token in the body and does not set a cookie. Document this endpoint as "for programmatic API clients only, not for browser use".
  3. Update the frontend — verify that AuthProvider does not use response.token (confirmed: it currently does not).
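The shape of the browser login response after step 1 can be sketched with the standard library alone. This is framework-agnostic and does not mirror the project's router code: build_login_response and the cookie name bangui_session are hypothetical, and the Secure and SameSite settings are assumptions about the deployment.

```python
from http.cookies import SimpleCookie
from typing import Dict, Tuple

SESSION_COOKIE_NAME = "bangui_session"  # hypothetical cookie name


def build_login_response(signed_token: str, expires_at: str) -> Tuple[Dict[str, str], str]:
    """Return (JSON body, Set-Cookie header value) for a browser login."""
    cookie = SimpleCookie()
    cookie[SESSION_COOKIE_NAME] = signed_token
    morsel = cookie[SESSION_COOKIE_NAME]
    morsel["httponly"] = True  # not readable from JavaScript
    morsel["secure"] = True  # assumption: the app is served over HTTPS
    morsel["samesite"] = "Strict"
    morsel["path"] = "/"
    # The body keeps expires_at for the re-login prompt but carries no token.
    body = {"expires_at": expires_at}
    return body, morsel.OutputString()
```

The token travels only in the Set-Cookie header, so a script that reads the fetch() response body has nothing to extract.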

Possible traps and issues

  • Any existing API client that relies on the token in the LoginResponse body will break. Check tests.
  • The expires_at field in LoginResponse is useful for the frontend to know when to prompt for re-login — this can remain.
  • The Bearer-token path in require_auth (Authorization: Bearer) remains functional for programmatic clients using the dedicated token endpoint.

Docs changes needed

  • Features.md — document the authentication flow (cookie for browser, token endpoint for API clients).
  • Backend-Development.md — authentication endpoint design.
  • Web-Development.md — document that the frontend uses only the HttpOnly cookie.

Doc references