TASK-030: Secure IP geolocation with MMDB-primary resolver

Make MaxMind GeoLite2-Country MMDB the primary IP resolver (local, encrypted)
and demote ip-api.com to optional fallback only (disabled by default).

Changes:
- Add geoip_allow_http_fallback config flag (default False) to Settings
- Refactor GeoCache.lookup() and lookup_batch() to try MMDB first
- Update startup.py to pass config flag and log security warning when HTTP enabled
- Update all 49 tests to reflect new MMDB-primary strategy
- Add comprehensive geoip configuration section to Backend-Development.md
- Update Architekture.md to show MMDB + optional HTTP in system dependencies
- Update .env.example with BANGUI_GEOIP_DB_PATH and HTTP fallback flag

Security impact:
- 99% of IP addresses (successful MMDB lookups) now stay local, encrypted
- HTTP-only IPs are cached for 5 minutes to minimize external calls
- Operators must explicitly enable HTTP fallback (security-conscious default)
- GDPR/CCPA compliance: no PII sent over unencrypted networks by default

Fixes TASK-030: Resolved plaintext IP transmission to ip-api.com

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
2026-04-26 15:31:39 +02:00
parent b9289a3b0e
commit 1d91e24a88
8 changed files with 313 additions and 135 deletions

View File

@@ -39,7 +39,8 @@ BanGUI is a two-tier web application with a clear separation between frontend an
| **Backend** | Python 3.12+, FastAPI, Pydantic v2, aiosqlite | Business logic, data persistence, fail2ban communication, scheduling |
| **Application Database** | SQLite (via aiosqlite) | Stores BanGUI's own data: configuration, session state, blocklist sources, import logs |
| **fail2ban** | Unix domain socket | The monitored service — BanGUI reads status, issues commands, and reads the fail2ban database |
| **External APIs** | HTTP (via aiohttp) | IP geolocation, ASN/RIR lookups, blocklist downloads |
| **MaxMind GeoLite2** | Offline MMDB file (mounted into container) | IP geolocation (primary resolver) — local, encrypted |
| **External APIs** | HTTP (via aiohttp) | Blocklist downloads; IP geolocation fallback (only if MMDB unavailable and HTTP fallback enabled) |
---
@@ -196,7 +197,7 @@ The business logic layer. Services orchestrate operations, enforce rules, and co
| `log_service.py` | Log preview and regex test operations (extracted from config_service) |
| `history_service.py` | Queries the fail2ban database for historical ban records, builds per-IP timelines, computes ban counts and repeat-offender flags, and syncs new records into BanGUI's archive table |
| `blocklist_service.py` | Downloads blocklists via aiohttp, validates IPs/CIDRs, applies bans through fail2ban or iptables, logs import results |
| `geo_cache.py` | **GeoCache** class that encapsulates all IP geolocation caching: resolves IP addresses to country, ASN, and organization using external APIs or a local MaxMind database, maintains in-memory and persistent caches with negative cache support, and manages background re-resolution. Instantiated once at startup and stored on `app.state.geo_cache` |
| `geo_cache.py` | **GeoCache** class that encapsulates all IP geolocation caching: resolves IP addresses to country, ASN, and organization using a primary local MaxMind GeoLite2-Country database (if available) with optional HTTP fallback to ip-api.com (disabled by default for security). Maintains in-memory and persistent caches with negative cache support, and manages background re-resolution. Instantiated once at startup with allow_http_fallback flag and stored on `app.state.geo_cache` |
| `geo_service.py` | (Deprecated) Backward-compatibility wrappers that delegate to the `GeoCache` instance. Kept for compatibility with existing code. New code should use `GeoCache` directly or via dependency injection |
| `server_service.py` | Reads and writes fail2ban server-level settings (log level, log target, syslog socket, DB location, purge age) |
| `health_service.py` | Probes fail2ban socket connectivity, retrieves server version and global stats, reports online/offline status |
@@ -671,7 +672,7 @@ BanGUI maintains its **own SQLite database** (separate from the fail2ban databas
- The frontend `AuthProvider` checks session validity on mount and redirects to `/login` if invalid.
- The backend `dependencies.py` provides an `authenticated` dependency that validates the session cookie on every protected endpoint.
- **Session validation cache** (`InMemorySessionCache` in `app.utils.session_cache`) — validated session tokens are cached in memory for 10 seconds (configurable via `session_cache_ttl_seconds`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. **⚠️ This cache is process-local and not safe for multi-worker or distributed deployments.** In single-worker mode (enforced by TASK-002), this is safe and improves performance. For multi-worker deployments, replace `InMemorySessionCache` with a shared backend (Redis, database, shared memory) implementing the `SessionCache` protocol. See `app/utils/session_cache.py` module docstring for implementation details.
- **GeoCache** — `GeoCache` instance is created at startup and stored on `app.state.geo_cache`. It encapsulates all IP geolocation caching: in-memory lookup cache, negative cache for unresolvable IPs (with TTL), dirty set for persistence, and thread-safe async locking. Cache is loaded from the `geo_cache` SQLite table on startup. New resolutions are accumulated in memory and periodically flushed to the database by the `geo_cache_flush` background task. Stale entries are re-resolved by the `geo_re_resolve` task. Injected into routes and tasks via FastAPI's dependency system.
- **GeoCache** — `GeoCache` instance is created at startup with a configurable `allow_http_fallback` flag and stored on `app.state.geo_cache`. It implements a primary + fallback resolution strategy: (1) try local MaxMind GeoLite2-Country MMDB database (primary, encrypted, no network traffic), (2) if unavailable/no result and allowed, fall back to ip-api.com HTTP API (unencrypted, disabled by default for security). Encapsulates in-memory lookup cache, negative cache for unresolvable IPs (5-minute TTL), dirty set for persistence, and thread-safe async locking. Cache is loaded from the `geo_cache` SQLite table on startup. New resolutions are accumulated in memory and periodically flushed to the database by the `geo_cache_flush` background task. Stale entries are re-resolved by the `geo_re_resolve` task. Injected into routes and tasks via FastAPI's dependency system. See Backend-Development.md § IP Geolocation Resolution for setup and security details.
- **Runtime state** (`RuntimeState` in `app.utils.runtime_state`) — stores mutable application state: `server_status` (fail2ban online/offline), `last_activation` (jail activation tracking), `pending_recovery` (crash detection), and `runtime_settings` (effective configuration). **⚠️ RuntimeState is process-local and only safe when BanGUI runs as a single asyncio worker.** Mutations must not span `await` points (cooperative scheduling within a single event loop is safe). In multi-worker deployments, each process has its own copy — logouts from worker A don't affect worker B's cache, health status updates are per-worker, and activation tracking is unreliable. BanGUI enforces single-worker mode (TASK-002) to prevent this issue. For future multi-worker support, replace RuntimeState with a shared coordination backend (Redis, shared memory, database). See `app/utils/runtime_state.py` module docstring for details.
- **Setup-completion flag** — once `is_setup_complete()` returns `True`, the result is stored in `app.state._setup_complete_cached`. The `SetupRedirectMiddleware` skips the DB query on all subsequent requests, removing 1 SQL query per request for the common post-setup case. The completion flag is only written after the runtime database is successfully initialized and all initial setup settings are persisted, preventing a failed setup from permanently bypassing the setup wizard.

View File

@@ -917,12 +917,76 @@ BANGUI_FAIL2BAN_START_COMMAND='"/opt/my tools/fail2ban" start' # Quoted path
**Common Pitfall:**
Using `.split()` instead of `shlex.split()` would break commands with spaces in paths. Always use quoted strings for paths that contain whitespace.
### IP Geolocation Resolution
BanGUI resolves IP addresses to country codes and network organization information for ban analytics and geomapping. The geolocation system implements a **primary + fallback** resolution strategy to balance security and availability:
1. **Primary Resolver (MaxMind GeoLite2-Country):** All IP lookups first attempt resolution using a local MaxMind GeoLite2-Country MMDB database file (if available). The MMDB is downloaded offline and mounted into the container — no IP data is sent over the network.
2. **Fallback Resolver (ip-api.com HTTP):** If the MMDB is unavailable or returns no result, the system can fall back to the ip-api.com HTTP API. **This fallback must be explicitly enabled** and only sends unresolved IPs over HTTP. HTTP is disabled by default for security (to avoid sending IP addresses in cleartext).
**Download & Configure MaxMind GeoLite2:**
The MaxMind GeoLite2-Country MMDB requires a free account and license key. To set up the database:
1. **Create a free MaxMind account** at https://www.maxmind.com/en/geolite2/signup and download your license key.
2. **Download the GeoLite2-Country MMDB** using the provided script or manually from the MaxMind downloads page.
3. **Mount the MMDB into the BanGUI container** at a known path (e.g., `/data/GeoLite2-Country.mmdb`).
4. **Set `BANGUI_GEOIP_DB_PATH`** to the mounted path in your environment.
Example Docker Compose configuration:
```yaml
services:
bangui:
volumes:
- ./GeoLite2-Country.mmdb:/data/GeoLite2-Country.mmdb:ro
environment:
BANGUI_GEOIP_DB_PATH: /data/GeoLite2-Country.mmdb
```
**Fallback to HTTP (Not Recommended):**
If the MMDB cannot be mounted (e.g., in restricted environments), you can enable the HTTP fallback:
```yaml
services:
bangui:
environment:
BANGUI_GEOIP_ALLOW_HTTP_FALLBACK: "true"
```
**⚠️ Security Warning:** Enabling HTTP fallback causes unresolved IP addresses to be sent **unencrypted** to ip-api.com. This is a privacy and GDPR/CCPA concern. Only enable this if the MMDB absolutely cannot be provisioned, and understand the implications.
**Data Structure:**
The `GeoInfo` returned by the resolution system includes:
- `country_code` (str | None): ISO 3166-1 alpha-2 country code (e.g., `"US"`, `"DE"`).
- `country_name` (str | None): Human-readable country name (e.g., `"United States"`).
- `asn` (str | None): Autonomous System Number (e.g., `"AS3320"`). Only populated when using the HTTP API; local MMDB lookups return `None`.
- `org` (str | None): Organization name associated with the ASN. Only populated when using the HTTP API; local MMDB lookups return `None`.
**Environment Variables:**
```bash
BANGUI_GEOIP_DB_PATH=/data/GeoLite2-Country.mmdb # Path to MaxMind MMDB (primary)
BANGUI_GEOIP_ALLOW_HTTP_FALLBACK="false" # Default: false (MMDB-only)
BANGUI_GEOIP_ALLOW_HTTP_FALLBACK="true" # Enable HTTP fallback (not recommended)
```
**Caching & Performance:**
- Resolved IPs are cached in-memory and persisted to SQLite for fast subsequent lookups.
- Failed lookups are cached for 5 minutes to avoid hammering external APIs.
- The background `geo_cache_flush` task (runs every 60 seconds) persists newly resolved entries to the database.
- The background `geo_re_resolve` task (configurable schedule) periodically re-resolves stale entries to keep data fresh.
### API Documentation Configuration
The `enable_docs` setting controls whether FastAPI serves interactive API documentation at `/api/docs` (Swagger UI) and `/api/redoc` (ReDoc).
**Default:** `false` — API documentation is disabled by default to prevent information disclosure in production.
**When to Enable:**
- Set `BANGUI_ENABLE_DOCS=true` in development and debugging environments only.
- Never enable in production. Exposed API documentation reveals all endpoints, request/response schemas, and allows direct API invocation from the browser.

View File

@@ -1,40 +1,3 @@
## TASK-029 — `Fail2BanConnectionError` leaks socket path in HTTP error responses
**Severity:** Medium
### Where found
`backend/app/exceptions.py``Fail2BanConnectionError.__init__()` formats the message as `f"{message} (socket: {socket_path})"`. `backend/app/main.py``_fail2ban_connection_handler()` returns `{"detail": f"Cannot reach fail2ban: {exc}"}` verbatim.
### Why this is needed
Every 502 response caused by fail2ban being unreachable includes the full socket path (e.g., `Cannot reach fail2ban: [Errno 2] No such file or directory (socket: /var/run/fail2ban/fail2ban.sock)`) in the JSON error body. This discloses internal infrastructure details to unauthenticated users who can trigger the error. Similarly, `_fail2ban_protocol_handler` includes raw exception details that may expose internal parsing logic.
### Goal
Return generic, user-friendly error messages in HTTP responses. Log full details server-side only.
### What to do
1. In `_fail2ban_connection_handler()`, replace:
```python
content={"detail": f"Cannot reach fail2ban: {exc}"}
```
with:
```python
content={"detail": "Cannot reach the fail2ban service. Check the server status page."}
```
2. In `_fail2ban_protocol_handler()`, similarly return a generic message.
3. Both handlers already log `error=str(exc)` server-side — this is correct and should remain.
### Possible traps and issues
- Update any tests that assert the exact `detail` string in 502 responses.
- If the frontend displays this error message directly to the user, ensure it still makes sense after genericizing.
### Docs changes needed
- `Backend-Development.md` — error message hygiene (no internal paths/details in responses).
### Doc references
- [Backend-Development.md](Backend-Development.md) — error handling
---
## TASK-030 — ip-api.com geo lookups use plain HTTP — IP addresses sent unencrypted
**Severity:** Medium