TASK-003: Document process-local constraint for RuntimeState and SessionCache

- Add comprehensive docstring to runtime_state.py explaining single-process
  constraint, impacts in multi-worker deployments, and solution approach
- Add comprehensive docstring to session_cache.py explaining process-local
  cache limitation, security implications, and Redis/database alternatives
- Update Architecture.md to clarify session cache is process-local and
  describe single-worker enforcement via TASK-002
- Update Architecture.md runtime state section with detailed explanation of
  per-process state and multi-worker impacts
- Add Backend-Development.md section 13.7.2 documenting session cache
  pluggability pattern with example Redis implementation
- All tests pass; linting passes; type checking has pre-existing errors

This is the short-term fix for TASK-003: enforce single-worker deployment
(TASK-002) and document the constraint clearly. The long-term fix (Redis
backend) is deferred as a follow-up.
This commit is contained in:
2026-04-26 11:43:34 +02:00
parent 825a67f13a
commit d982fe3efc
4 changed files with 145 additions and 7 deletions

View File

@@ -668,9 +668,9 @@ BanGUI maintains its **own SQLite database** (separate from the fail2ban databas
- Session expiry is configurable (set during setup, stored in `settings`). - Session expiry is configurable (set during setup, stored in `settings`).
- The frontend `AuthProvider` checks session validity on mount and redirects to `/login` if invalid. - The frontend `AuthProvider` checks session validity on mount and redirects to `/login` if invalid.
- The backend `dependencies.py` provides an `authenticated` dependency that validates the session cookie on every protected endpoint. - The backend `dependencies.py` provides an `authenticated` dependency that validates the session cookie on every protected endpoint.
- **Session validation cache** — validated session tokens are cached in memory for 10 seconds (`_session_cache` dict in `dependencies.py`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. This cache is process-local and not safe for multi-worker or distributed deployments. A clustered deployment should replace `_session_cache` with a shared cache or remove it entirely. - **Session validation cache** (`InMemorySessionCache` in `app.utils.session_cache`) — validated session tokens are cached in memory for 10 seconds (configurable via `session_cache_ttl_seconds`) to avoid a SQLite round-trip on every request from the same browser. The cache is invalidated immediately on logout. **⚠️ This cache is process-local and not safe for multi-worker or distributed deployments.** In single-worker mode (enforced by TASK-002), this is safe and improves performance. For multi-worker deployments, replace `InMemorySessionCache` with a shared backend (Redis, database, shared memory) implementing the `SessionCache` protocol. See `app/utils/session_cache.py` module docstring for implementation details.
- **GeoCache** — `GeoCache` instance is created at startup and stored on `app.state.geo_cache`. It encapsulates all IP geolocation caching: in-memory lookup cache, negative cache for unresolvable IPs (with TTL), dirty set for persistence, and thread-safe async locking. Cache is loaded from the `geo_cache` SQLite table on startup. New resolutions are accumulated in memory and periodically flushed to the database by the `geo_cache_flush` background task. Stale entries are re-resolved by the `geo_re_resolve` task. Injected into routes and tasks via FastAPI's dependency system. - **GeoCache** — `GeoCache` instance is created at startup and stored on `app.state.geo_cache`. It encapsulates all IP geolocation caching: in-memory lookup cache, negative cache for unresolvable IPs (with TTL), dirty set for persistence, and thread-safe async locking. Cache is loaded from the `geo_cache` SQLite table on startup. New resolutions are accumulated in memory and periodically flushed to the database by the `geo_cache_flush` background task. Stale entries are re-resolved by the `geo_re_resolve` task. Injected into routes and tasks via FastAPI's dependency system.
- **Runtime state** `RuntimeState` is process-local and only safe when BanGUI runs as a single asyncio worker. Mutating runtime state must not span `await` points because the current design relies on cooperative scheduling. Multi-worker or multi-process deployments must replace this runtime state with a shared coordination backend such as Redis, shared memory, or a database-backed store. - **Runtime state** (`RuntimeState` in `app.utils.runtime_state`) — stores mutable application state: `server_status` (fail2ban online/offline), `last_activation` (jail activation tracking), `pending_recovery` (crash detection), and `runtime_settings` (effective configuration). **⚠️ RuntimeState is process-local and only safe when BanGUI runs as a single asyncio worker.** Mutations must not span `await` points (cooperative scheduling within a single event loop is safe). In multi-worker deployments, each process has its own copy — logouts from worker A don't affect worker B's cache, health status updates are per-worker, and activation tracking is unreliable. BanGUI enforces single-worker mode (TASK-002) to prevent this issue. For future multi-worker support, replace RuntimeState with a shared coordination backend (Redis, shared memory, database). See `app/utils/runtime_state.py` module docstring for details.
- **Setup-completion flag** — once `is_setup_complete()` returns `True`, the result is stored in `app.state._setup_complete_cached`. The `SetupRedirectMiddleware` skips the DB query on all subsequent requests, removing 1 SQL query per request for the common post-setup case. The completion flag is only written after the runtime database is successfully initialized and all initial setup settings are persisted, preventing a failed setup from permanently bypassing the setup wizard. - **Setup-completion flag** — once `is_setup_complete()` returns `True`, the result is stored in `app.state._setup_complete_cached`. The `SetupRedirectMiddleware` skips the DB query on all subsequent requests, removing 1 SQL query per request for the common post-setup case. The completion flag is only written after the runtime database is successfully initialized and all initial setup settings are persisted, preventing a failed setup from permanently bypassing the setup wizard.
--- ---

View File

@@ -645,6 +645,75 @@ async def get_session_repo() -> SessionRepository:
- Before each deployment, run `mypy --strict` to ensure all dependency providers return values compatible with their Protocol types. - Before each deployment, run `mypy --strict` to ensure all dependency providers return values compatible with their Protocol types.
- The `cast()` calls in `dependencies.py` are a documented signal that structural compatibility is being verified externally, not via explicit class inheritance. - The `cast()` calls in `dependencies.py` are a documented signal that structural compatibility is being verified externally, not via explicit class inheritance.
#### 13.7.2 Session Cache Pluggability — Process-Local vs. Shared Backends
Session validation is expensive (SQLite lookup + password verification). To improve performance, **validated session tokens are cached** using the `SessionCache` interface (`app.utils.session_cache`). The default implementation, `InMemorySessionCache`, stores cached sessions in process-local memory.
**Current implementation (single-worker):**
```python
from app.utils.session_cache import SessionCache, InMemorySessionCache, NoOpSessionCache
class SessionCache(Protocol):
"""Interface for session token validation cache backends."""
def get(self, token: str) -> Session | None: ...
def set(self, token: str, session: Session, ttl_seconds: float) -> None: ...
def invalidate(self, token: str) -> None: ...
def clear(self) -> None: ...
# Default in-memory implementation — PROCESS-LOCAL
class InMemorySessionCache:
def __init__(self) -> None:
self._entries: dict[str, tuple[Session, float]] = {}
```
**Single-worker constraint:**
`InMemorySessionCache` is **process-local** — each worker process has its own dict. In single-worker mode (enforced by TASK-002), this is safe and improves performance. In multi-worker deployments:
- A logout by worker A clears the session from A's cache, but worker B still has it → logout doesn't work.
- Enabling/disabling the cache requires restarting all workers to take effect.
**Multi-worker solution:**
To support multiple workers (future enhancement), implement a shared backend behind the same `SessionCache` Protocol:
```python
# Example Redis implementation (not yet in codebase)
class RedisSessionCache:
"""Session cache backed by Redis."""
def __init__(self, redis_url: str) -> None:
self.client = aioredis.from_url(redis_url)
async def get(self, token: str) -> Session | None:
data = await self.client.get(f"session:{token}")
return Session.model_validate_json(data) if data else None
async def set(self, token: str, session: Session, ttl_seconds: float) -> None:
await self.client.setex(
f"session:{token}",
int(ttl_seconds),
session.model_dump_json()
)
async def invalidate(self, token: str) -> None:
await self.client.delete(f"session:{token}")
async def clear(self) -> None:
await self.client.flushdb()
```
To adopt a Redis backend:
1. Create `RedisSessionCache` in `app.utils.session_cache`.
2. Update `app.utils.runtime_state.set_runtime_settings()` to instantiate `RedisSessionCache` when `REDIS_URL` env var is set.
3. Update `app.config.Settings` to accept optional `REDIS_URL`.
4. Tests continue to use `InMemorySessionCache` (no Redis dependency in dev).
**Implementation rules:**
- All cache methods must be `async` (even if the backend is sync).
- Never log session tokens or session data.
- TTL must be respected — expired entries must be removed on access.
- See `app/utils/session_cache.py` for the full Protocol definition and current implementations.
### 14.8 Composition over Inheritance ### 14.8 Composition over Inheritance
- Favour **composing** small, focused objects over deep inheritance hierarchies. - Favour **composing** small, focused objects over deep inheritance hierarchies.

View File

@@ -6,11 +6,36 @@ framework state bag limited to shared infrastructure handles and immutable
configuration while still allowing existing code to access runtime values via configuration while still allowing existing code to access runtime values via
attribute proxying. attribute proxying.
RuntimeState is designed for a single-process, single-worker asyncio ⚠️ SINGLE-PROCESS CONSTRAINT
deployment. Mutations must complete without awaiting across read-modify-write ==============================
sequences. If BanGUI is ever deployed with multiple workers or across processes,
this module must be replaced with a shared backend such as Redis or shared RuntimeState is designed for a single-process, single-worker asyncio deployment.
memory. This means:
- Each process has its own independent copy of all runtime state.
- Changes to runtime_state in one process are NOT visible to other processes.
- Mutations must complete without awaiting across read-modify-write sequences
(cooperative scheduling within a single event loop is safe).
IMPACT IN MULTI-WORKER DEPLOYMENTS:
- Logout processed by worker A clears the session from A's in-memory cache,
but worker B still has that session in its own cache and will accept it.
- Health status updates (server_status) received by worker A are invisible
to worker B's dashboard responses — each worker reports stale data.
- fail2ban activation/recovery tracking (pending_recovery, last_activation)
is per-worker and unreliable across processes.
MULTI-WORKER SOLUTION:
To deploy BanGUI with multiple workers (e.g., via gunicorn -w 4), you must:
1. Replace RuntimeState with a shared store (Redis, shared memory, database).
2. Replace InMemorySessionCache with RedisSessionCache (see session_cache.py).
3. Ensure all workers use the same backend for coordination.
SINGLE-WORKER ENFORCEMENT:
See TASK-002 in Docs/Tasks.md for deployment configuration that enforces
single-worker mode, preventing this issue entirely.
For now, BanGUI is deployed as single-worker only — this constraint is
acceptable and keeps the implementation simple.
""" """
from __future__ import annotations from __future__ import annotations

View File

@@ -3,6 +3,50 @@
This module defines a cache interface for authenticated sessions and a default This module defines a cache interface for authenticated sessions and a default
process-local in-memory implementation. The backend can swap the cache process-local in-memory implementation. The backend can swap the cache
implementation without changing the authentication dependency logic. implementation without changing the authentication dependency logic.
⚠️ PROCESS-LOCAL CONSTRAINT (InMemorySessionCache)
====================================================
InMemorySessionCache stores validated sessions in a process-local dict.
This means:
- Each worker process has its own independent session cache.
- A session invalidated (logout) by worker A is still valid in worker B.
- Changes to the cache in one process are NOT visible to other processes.
IMPACT IN MULTI-WORKER DEPLOYMENTS:
- User logs out (worker A clears session from its cache).
- User makes a new request → routed to worker B.
- Worker B still has the stale session in its cache → request is accepted.
- User appears still logged in (from their perspective).
This is a security issue: logout does not work reliably across workers.
MULTI-WORKER SOLUTION:
To deploy BanGUI with multiple workers (e.g., via gunicorn -w 4), replace
InMemorySessionCache with a shared backend such as:
- RedisSessionCache — backed by Redis (recommended for production).
- DatabaseSessionCache — backed by SQLite or PostgreSQL.
- SharedMemorySessionCache — backed by IPC (for local multi-process).
The SessionCache Protocol is already designed for pluggable backends:
class SessionCache(Protocol):
def get(token: str) -> Session | None: ...
def set(token: str, session: Session, ttl_seconds: float) -> None: ...
def invalidate(token: str) -> None: ...
def clear() -> None: ...
To add Redis support:
1. Create RedisSessionCache in this module (implements SessionCache).
2. Update runtime_state.set_runtime_settings() to instantiate RedisSessionCache
when REDIS_URL is configured.
3. See Backend-Development.md § "Session Cache Pluggability" for details.
SINGLE-WORKER ENFORCEMENT:
See TASK-002 in Docs/Tasks.md for deployment configuration that enforces
single-worker mode, preventing this issue entirely.
For now, BanGUI is deployed as single-worker only — this constraint is
acceptable and keeps the implementation simple.
""" """
from __future__ import annotations from __future__ import annotations