TASK-003: Document process-local constraint for RuntimeState and SessionCache

- Add comprehensive docstring to runtime_state.py explaining single-process
  constraint, impacts in multi-worker deployments, and solution approach
- Add comprehensive docstring to session_cache.py explaining process-local
  cache limitation, security implications, and Redis/database alternatives
- Update Architecture.md to clarify session cache is process-local and
  describe single-worker enforcement via TASK-002
- Update Architecture.md runtime state section with detailed explanation of
  per-process state and multi-worker impacts
- Add Backend-Development.md section 13.7.2 documenting session cache
  pluggability pattern with example Redis implementation
- All tests pass; linting passes; type checking has pre-existing errors

This is the short-term fix for TASK-003: enforce single-worker deployment
(TASK-002) and document the constraint clearly. The long-term fix (Redis
backend) is deferred as a follow-up.
This commit is contained in:
2026-04-26 11:43:34 +02:00
parent 825a67f13a
commit d982fe3efc
4 changed files with 145 additions and 7 deletions

View File

@@ -668,9 +668,9 @@ BanGUI maintains its **own SQLite database** (separate from the fail2ban databas
- Session expiry is configurable (set during setup, stored in `settings`).
- The frontend `AuthProvider` checks session validity on mount and redirects to `/login` if invalid.
- The backend `dependencies.py` provides an `authenticated` dependency that validates the session cookie on every protected endpoint.
- **Session validation cache** — validated session tokens are cached in memory for 10 seconds (`_session_cache` dict in `dependencies.py`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. This cache is process-local and not safe for multi-worker or distributed deployments. A clustered deployment should replace `_session_cache` with a shared cache or remove it entirely.
- **Session validation cache** (`InMemorySessionCache` in `app.utils.session_cache`) — validated session tokens are cached in memory for 10 seconds (configurable via `session_cache_ttl_seconds`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. **⚠️ This cache is process-local and not safe for multi-worker or distributed deployments.** In single-worker mode (enforced by TASK-002), this is safe and improves performance. For multi-worker deployments, replace `InMemorySessionCache` with a shared backend (Redis, database, shared memory) implementing the `SessionCache` protocol. See `app/utils/session_cache.py` module docstring for implementation details.
- **GeoCache** — `GeoCache` instance is created at startup and stored on `app.state.geo_cache`. It encapsulates all IP geolocation caching: in-memory lookup cache, negative cache for unresolvable IPs (with TTL), dirty set for persistence, and thread-safe async locking. Cache is loaded from the `geo_cache` SQLite table on startup. New resolutions are accumulated in memory and periodically flushed to the database by the `geo_cache_flush` background task. Stale entries are re-resolved by the `geo_re_resolve` task. Injected into routes and tasks via FastAPI's dependency system.
- **Runtime state** `RuntimeState` is process-local and only safe when BanGUI runs as a single asyncio worker. Mutating runtime state must not span `await` points because the current design relies on cooperative scheduling. Multi-worker or multi-process deployments must replace this runtime state with a shared coordination backend such as Redis, shared memory, or a database-backed store.
- **Runtime state** (`RuntimeState` in `app.utils.runtime_state`) — stores mutable application state: `server_status` (fail2ban online/offline), `last_activation` (jail activation tracking), `pending_recovery` (crash detection), and `runtime_settings` (effective configuration). **⚠️ RuntimeState is process-local and only safe when BanGUI runs as a single asyncio worker.** Mutations must not span `await` points (cooperative scheduling within a single event loop is safe). In multi-worker deployments, each process has its own copy — logouts from worker A don't affect worker B's cache, health status updates are per-worker, and activation tracking is unreliable. BanGUI enforces single-worker mode (TASK-002) to prevent this issue. For future multi-worker support, replace RuntimeState with a shared coordination backend (Redis, shared memory, database). See `app/utils/runtime_state.py` module docstring for details.
- **Setup-completion flag** — once `is_setup_complete()` returns `True`, the result is stored in `app.state._setup_complete_cached`. The `SetupRedirectMiddleware` skips the DB query on all subsequent requests, removing 1 SQL query per request for the common post-setup case. The completion flag is only written after the runtime database is successfully initialized and all initial setup settings are persisted, preventing a failed setup from permanently bypassing the setup wizard.
---

View File

@@ -645,6 +645,75 @@ async def get_session_repo() -> SessionRepository:
- Before each deployment, run `mypy --strict` to ensure all dependency providers return values compatible with their Protocol types.
- The `cast()` calls in `dependencies.py` are a documented signal that structural compatibility is being verified externally, not via explicit class inheritance.
#### 13.7.2 Session Cache Pluggability — Process-Local vs. Shared Backends
Session validation is expensive (SQLite lookup + password verification). To improve performance, **validated session tokens are cached** using the `SessionCache` interface (`app.utils.session_cache`). The default implementation, `InMemorySessionCache`, stores cached sessions in process-local memory.
**Current implementation (single-worker):**
```python
from app.utils.session_cache import SessionCache, InMemorySessionCache, NoOpSessionCache
class SessionCache(Protocol):
"""Interface for session token validation cache backends."""
def get(self, token: str) -> Session | None: ...
def set(self, token: str, session: Session, ttl_seconds: float) -> None: ...
def invalidate(self, token: str) -> None: ...
def clear(self) -> None: ...
# Default in-memory implementation — PROCESS-LOCAL
class InMemorySessionCache:
def __init__(self) -> None:
self._entries: dict[str, tuple[Session, float]] = {}
```
**Single-worker constraint:**
`InMemorySessionCache` is **process-local** — each worker process has its own dict. In single-worker mode (enforced by TASK-002), this is safe and improves performance. In multi-worker deployments:
- A logout by worker A clears the session from A's cache, but worker B still has it → logout doesn't work.
- Enabling/disabling the cache requires restarting all workers to take effect.
**Multi-worker solution:**
To support multiple workers (future enhancement), implement a shared backend behind the same `SessionCache` Protocol:
```python
# Example Redis implementation (not yet in codebase)
class RedisSessionCache:
"""Session cache backed by Redis."""
def __init__(self, redis_url: str) -> None:
self.client = aioredis.from_url(redis_url)
async def get(self, token: str) -> Session | None:
data = await self.client.get(f"session:{token}")
return Session.model_validate_json(data) if data else None
async def set(self, token: str, session: Session, ttl_seconds: float) -> None:
await self.client.setex(
f"session:{token}",
int(ttl_seconds),
session.model_dump_json()
)
async def invalidate(self, token: str) -> None:
await self.client.delete(f"session:{token}")
async def clear(self) -> None:
await self.client.flushdb()
```
To adopt a Redis backend:
1. Create `RedisSessionCache` in `app.utils.session_cache`.
2. Update `app.utils.runtime_state.set_runtime_settings()` to instantiate `RedisSessionCache` when `REDIS_URL` env var is set.
3. Update `app.config.Settings` to accept optional `REDIS_URL`.
4. Tests continue to use `InMemorySessionCache` (no Redis dependency in dev).
**Implementation rules:**
- All cache methods must be `async` (even if the backend is sync).
- Never log session tokens or session data.
- TTL must be respected — expired entries must be removed on access.
- See `app/utils/session_cache.py` for the full Protocol definition and current implementations.
### 14.8 Composition over Inheritance
- Favour **composing** small, focused objects over deep inheritance hierarchies.