BanGUI/backend/app/utils/session_cache.py
Lukas c4ede71fa6 Fix: Enforce single-worker deployment for session cache cluster safety
Addresses: Backend session cache not cluster-safe (multi-worker issue)

Problem:
- Session cache is process-local (InMemorySessionCache)
- Multi-worker deployments (uvicorn --workers N) create separate processes
- Each process has its own independent session cache
- Sessions cached in Worker A are invisible to Workers B, C, D
- Users randomly logged out when requests land on different workers
- Also affects RuntimeState, rate limiter, and background jobs

Solution (Option A - Strict single-worker enforcement):
- Enhance startup validation with clearer error messages
- Update error messages to explain the problem and how to fix it
- Document single-worker requirement prominently in Docker configs
- Update module docstrings to clarify constraints

Changes:
1. app/startup.py:
   - Enhanced _check_single_worker_mode() error message with troubleshooting
   - Enhanced _stage_check_worker_mode_and_acquire_lock() error message
   - Removed unused import

2. app/utils/session_cache.py:
   - Updated module docstring to explain constraints more clearly
   - Added references to deployment documentation
   - Clarified multi-worker solution for future implementation

3. app/utils/runtime_state.py:
   - Updated module docstring with deployment constraint references
   - Aligned messaging with session_cache.py

4. Docker/Dockerfile.backend:
   - Added comprehensive comments about single-worker requirement
   - Explained impact in multi-worker deployments
   - Referenced deployment constraints documentation

5. Docker/docker-compose.yml, compose.prod.yml, compose.debug.yml:
   - Added documentation comments about BANGUI_WORKERS constraint
   - Explained why single-worker is required

6. backend/tests/test_startup_integration.py:
   - Fixed test unpacking to match function return signature (3 values, not 2)

This ensures multi-worker deployments fail loudly at startup with clear
guidance on what went wrong and how to fix it. The database-backed scheduler
lock provides defense-in-depth for container orchestration scenarios.
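
The startup check described above can be sketched roughly as follows. The real `_check_single_worker_mode()` in app/startup.py is not shown here, so the function name, the exception class, and the message wording below are hypothetical; only the `BANGUI_WORKERS` variable and the "must be 1 or unset" rule come from this commit. Treating an empty value as unset is an additional assumption.

```python
import os
from collections.abc import Mapping


class MultiWorkerConfigError(RuntimeError):
    """Hypothetical error raised when the worker count is not exactly one."""


def check_single_worker_mode(environ: Mapping[str, str] = os.environ) -> None:
    """Raise unless BANGUI_WORKERS is unset, empty, or equal to 1."""
    raw = environ.get("BANGUI_WORKERS", "").strip()
    if not raw:
        return  # unset (or empty, by assumption): single-worker default

    try:
        workers = int(raw)
    except ValueError:
        raise MultiWorkerConfigError(
            f"BANGUI_WORKERS={raw!r} is not an integer. "
            "Set it to 1 or leave it unset."
        ) from None

    if workers != 1:
        raise MultiWorkerConfigError(
            f"BANGUI_WORKERS={workers} is not supported. The session cache is "
            "process-local, so each worker would keep its own copy and a "
            "logout handled by one worker would not reach the others. "
            "Set BANGUI_WORKERS=1 or leave it unset."
        )
```

Passing the environment as a parameter keeps the check easy to exercise in tests without patching `os.environ`.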

For future multi-worker support, implement:
- Redis or database-backed session cache
- Shared RuntimeState coordination
- Distributed APScheduler backend

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 20:54:24 +02:00


"""Pluggable session cache abstraction.

This module defines a cache interface for authenticated sessions and a default
process-local in-memory implementation. The backend can swap the cache
implementation without changing the authentication dependency logic.

⚠️ PROCESS-LOCAL CONSTRAINT (InMemorySessionCache)
====================================================

InMemorySessionCache stores validated sessions in a process-local dict.
This means:

- Each worker process has its own independent session cache.
- A session invalidated (logout) by worker A is still valid in worker B.
- Changes to the cache in one process are NOT visible to other processes.

IMPACT IN MULTI-WORKER DEPLOYMENTS:

- User logs out (worker A clears session from its cache).
- User makes a new request → routed to worker B.
- Worker B still has the stale session in its cache → request is accepted.
- User appears still logged in (from their perspective).

This is a CRITICAL SECURITY ISSUE: logout does not work reliably across workers.

SINGLE-WORKER ENFORCEMENT:

BanGUI enforces single-worker mode to prevent this issue:

1. Environment variable check: BANGUI_WORKERS must be 1 or unset
2. Database lock: Only one instance can run the scheduler at a time
3. Startup validation: Fails loudly if multi-worker scenario is detected

See Docs/Architekture.md § Deployment Constraints for full details.

MULTI-WORKER SOLUTION (Future):

If multi-worker support is needed in the future, replace InMemorySessionCache
with a shared backend such as:

- RedisSessionCache: backed by Redis (recommended for production)
- DatabaseSessionCache: backed by SQLite or PostgreSQL
- SharedMemorySessionCache: backed by IPC (for local multi-process)

The SessionCache Protocol is designed for pluggable backends:

    class SessionCache(Protocol):
        def get(self, token: str) -> Session | None: ...
        def set(self, token: str, session: Session, ttl_seconds: float) -> None: ...
        def invalidate(self, token: str) -> None: ...
        def clear(self) -> None: ...

To add Redis support:

1. Create RedisSessionCache in this module (implements SessionCache)
2. Update app/main.py _update_session_cache() to instantiate RedisSessionCache
   when BANGUI_REDIS_URL is configured
3. Update Backend-Development.md with multi-worker deployment guidelines

CURRENT STATUS:

For now, BanGUI is deployed as single-worker only. This constraint is
acceptable and keeps the implementation simple. The database-backed scheduler
lock ensures only one instance runs background jobs, even in container
orchestration scenarios where multiple instances may start.
"""

from __future__ import annotations

import time
from typing import TYPE_CHECKING, Protocol

if TYPE_CHECKING:  # pragma: no cover
    from app.models.auth import Session


class SessionCache(Protocol):
    """Interface for session token validation cache backends."""

    def get(self, token: str) -> Session | None:
        """Return the cached session for *token*, or ``None`` if missing."""

    def set(self, token: str, session: Session, ttl_seconds: float) -> None:
        """Cache the validated *session* for *token* for *ttl_seconds*."""

    def invalidate(self, token: str) -> None:
        """Remove *token* from the cache if it exists."""

    def clear(self) -> None:
        """Remove all entries from the cache."""


class InMemorySessionCache:
    """A process-local session cache implementation."""

    def __init__(self) -> None:
        # Maps token -> (session, monotonic expiry timestamp).
        self._entries: dict[str, tuple[Session, float]] = {}

    def get(self, token: str) -> Session | None:
        entry = self._entries.get(token)
        if entry is None:
            return None

        session, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired entries are evicted lazily on lookup.
            self._entries.pop(token, None)
            return None

        return session

    def set(self, token: str, session: Session, ttl_seconds: float) -> None:
        self._entries[token] = (session, time.monotonic() + ttl_seconds)

    def invalidate(self, token: str) -> None:
        self._entries.pop(token, None)

    def clear(self) -> None:
        self._entries.clear()


class NoOpSessionCache:
    """A no-op session cache used when caching is disabled."""

    def get(self, token: str) -> Session | None:
        return None

    def set(self, token: str, session: Session, ttl_seconds: float) -> None:
        return None

    def invalidate(self, token: str) -> None:
        return None

    def clear(self) -> None:
        return None
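
As an illustration of the shared-backend idea mentioned in the module docstring, the following is a minimal sketch of a SQLite-backed cache with the same four methods. It is not part of this module: the class name `SqliteSessionCache`, the table layout, and the use of a JSON-serializable dict as a stand-in for `Session` are all assumptions made for the example. It uses `time.time()` instead of `time.monotonic()` because monotonic timestamps are not guaranteed to be comparable across processes. A real implementation would probably store a hash of the token instead of the raw value.

```python
from __future__ import annotations

import json
import os
import sqlite3
import tempfile
import time


class SqliteSessionCache:
    """Hypothetical session cache backed by a SQLite file shared by workers."""

    def __init__(self, path: str) -> None:
        # isolation_level=None puts the connection in autocommit mode, so
        # every write is visible to other connections immediately.
        self._conn = sqlite3.connect(path, isolation_level=None)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS session_cache ("
            "token TEXT PRIMARY KEY, "
            "payload TEXT NOT NULL, "
            "expires_at REAL NOT NULL)"
        )

    def get(self, token: str) -> dict | None:
        row = self._conn.execute(
            "SELECT payload, expires_at FROM session_cache WHERE token = ?",
            (token,),
        ).fetchone()
        if row is None:
            return None
        payload, expires_at = row
        if time.time() >= expires_at:
            self.invalidate(token)  # evict expired entries lazily
            return None
        return json.loads(payload)

    def set(self, token: str, session: dict, ttl_seconds: float) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO session_cache (token, payload, expires_at) "
            "VALUES (?, ?, ?)",
            (token, json.dumps(session), time.time() + ttl_seconds),
        )

    def invalidate(self, token: str) -> None:
        self._conn.execute("DELETE FROM session_cache WHERE token = ?", (token,))

    def clear(self) -> None:
        self._conn.execute("DELETE FROM session_cache")

    def close(self) -> None:
        self._conn.close()


def demo_cross_worker_logout() -> tuple[dict | None, dict | None]:
    """Simulate two workers sharing one cache file; return worker B's views."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "session_cache.db")
        worker_a = SqliteSessionCache(path)
        worker_b = SqliteSessionCache(path)
        try:
            worker_a.set("token-123", {"user": "admin"}, ttl_seconds=60.0)
            before_logout = worker_b.get("token-123")
            worker_a.invalidate("token-123")  # logout handled by worker A
            after_logout = worker_b.get("token-123")
        finally:
            worker_a.close()
            worker_b.close()
    return before_logout, after_logout
```

In this sketch the two instances stand in for two worker processes. Because both read the same file, the invalidation performed through `worker_a` is visible to `worker_b` on its next lookup, which is the property the process-local dict cannot provide.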