TASK-003: Document process-local constraint for RuntimeState and SessionCache

- Add comprehensive docstring to runtime_state.py explaining single-process
  constraint, impacts in multi-worker deployments, and solution approach
- Add comprehensive docstring to session_cache.py explaining process-local
  cache limitation, security implications, and Redis/database alternatives
- Update Architecture.md to clarify session cache is process-local and
  describe single-worker enforcement via TASK-002
- Update Architecture.md runtime state section with detailed explanation of
  per-process state and multi-worker impacts
- Add Backend-Development.md section 13.7.2 documenting session cache
  pluggability pattern with example Redis implementation
- All tests pass; linting passes; type checking has pre-existing errors

This is the short-term fix for TASK-003: enforce single-worker deployment
(TASK-002) and document the constraint clearly. The long-term fix (Redis
backend) is deferred as a follow-up.
This commit is contained in:
2026-04-26 11:43:34 +02:00
parent 825a67f13a
commit d982fe3efc
4 changed files with 145 additions and 7 deletions

View File

@@ -668,9 +668,9 @@ BanGUI maintains its **own SQLite database** (separate from the fail2ban databas
- Session expiry is configurable (set during setup, stored in `settings`).
- The frontend `AuthProvider` checks session validity on mount and redirects to `/login` if invalid.
- The backend `dependencies.py` provides an `authenticated` dependency that validates the session cookie on every protected endpoint.
- **Session validation cache** — validated session tokens are cached in memory for 10 seconds (`_session_cache` dict in `dependencies.py`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. This cache is process-local and not safe for multi-worker or distributed deployments. A clustered deployment should replace `_session_cache` with a shared cache or remove it entirely.
- **Session validation cache** (`InMemorySessionCache` in `app.utils.session_cache`) — validated session tokens are cached in memory for 10 seconds (configurable via `session_cache_ttl_seconds`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. **⚠️ This cache is process-local and not safe for multi-worker or distributed deployments.** In single-worker mode (enforced by TASK-002), this is safe and improves performance. For multi-worker deployments, replace `InMemorySessionCache` with a shared backend (Redis, database, shared memory) implementing the `SessionCache` protocol. See `app/utils/session_cache.py` module docstring for implementation details.
- **GeoCache** — `GeoCache` instance is created at startup and stored on `app.state.geo_cache`. It encapsulates all IP geolocation caching: in-memory lookup cache, negative cache for unresolvable IPs (with TTL), dirty set for persistence, and thread-safe async locking. Cache is loaded from the `geo_cache` SQLite table on startup. New resolutions are accumulated in memory and periodically flushed to the database by the `geo_cache_flush` background task. Stale entries are re-resolved by the `geo_re_resolve` task. Injected into routes and tasks via FastAPI's dependency system.
- **Runtime state** `RuntimeState` is process-local and only safe when BanGUI runs as a single asyncio worker. Mutating runtime state must not span `await` points because the current design relies on cooperative scheduling. Multi-worker or multi-process deployments must replace this runtime state with a shared coordination backend such as Redis, shared memory, or a database-backed store.
- **Runtime state** (`RuntimeState` in `app.utils.runtime_state`) — stores mutable application state: `server_status` (fail2ban online/offline), `last_activation` (jail activation tracking), `pending_recovery` (crash detection), and `runtime_settings` (effective configuration). **⚠️ RuntimeState is process-local and only safe when BanGUI runs as a single asyncio worker.** Mutations must not span `await` points (cooperative scheduling within a single event loop is safe). In multi-worker deployments, each process has its own copy — logouts from worker A don't affect worker B's cache, health status updates are per-worker, and activation tracking is unreliable. BanGUI enforces single-worker mode (TASK-002) to prevent this issue. For future multi-worker support, replace RuntimeState with a shared coordination backend (Redis, shared memory, database). See `app/utils/runtime_state.py` module docstring for details.
- **Setup-completion flag** — once `is_setup_complete()` returns `True`, the result is stored in `app.state._setup_complete_cached`. The `SetupRedirectMiddleware` skips the DB query on all subsequent requests, removing 1 SQL query per request for the common post-setup case. The completion flag is only written after the runtime database is successfully initialized and all initial setup settings are persisted, preventing a failed setup from permanently bypassing the setup wizard.
---