Commit Graph

5 Commits

Author SHA1 Message Date
b631c1c546 feat(backend): implement graceful shutdown for container stop
Graceful shutdown ensures in-flight operations complete before process exits:
- Lifespan shutdown handler drains pending tasks with 25s timeout
- Scheduler stops accepting new jobs immediately
- HTTP session, external logging, scheduler lock, DB conn closed cleanly
- 25s Python timeout leaves 5s margin before Docker's 30s SIGKILL

Files changed:
- backend/app/main.py: enhanced _lifespan shutdown with task drain
- Docker/Dockerfile.backend: documented signal handling in header
- Docker/docker-compose.yml: added stop_grace_period: 30s
- Docker/compose.prod.yml: added stop_grace_period: 30s
- Docs/Deployment.md: new Graceful Shutdown section with sequence table
- Docs/TROUBLESHOOTING.md: new Graceful Shutdown Issues section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-02 22:47:10 +02:00
90f4c6239c Add resource limits to all Docker containers
- fail2ban: 0.5 CPU / 128M memory limit, 0.1 CPU / 64M reserved
- backend: 2.0 CPU / 512M memory limit, 1.0 CPU / 256M reserved
- frontend: 0.5 CPU / 128M memory limit, 0.25 CPU / 64M reserved

Prevents 'noisy neighbor' scenarios where one container exhausts
host resources (CPU, memory, disk). Limits are hard caps; reservations
guarantee minimum allocation to prevent OOM kills and ensure
responsive service even under load.

Fixes resource contention issue in production and staging environments.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 21:03:56 +02:00
c4ede71fa6 Fix: Enforce single-worker deployment for session cache cluster safety
Addresses: Backend session cache not cluster-safe (multi-worker issue)

Problem:
- Session cache is process-local (InMemorySessionCache)
- Multi-worker deployments (uvicorn --workers N) create separate processes
- Each process has its own independent session cache
- Sessions cached in Worker A are invisible to Workers B, C, D
- Users randomly logged out when requests land on different workers
- Also affects RuntimeState, rate limiter, and background jobs

Solution (Option A - Strict single-worker enforcement):
- Enhance startup validation with clearer error messages
- Update error messages to explain the problem and how to fix it
- Document single-worker requirement prominently in Docker configs
- Update module docstrings to clarify constraints

Changes:
1. app/startup.py:
   - Enhanced _check_single_worker_mode() error message with troubleshooting
   - Enhanced _stage_check_worker_mode_and_acquire_lock() error message
   - Removed unused import

2. app/utils/session_cache.py:
   - Updated module docstring to explain constraints more clearly
   - Added references to deployment documentation
   - Clarified multi-worker solution for future implementation

3. app/utils/runtime_state.py:
   - Updated module docstring with deployment constraint references
   - Aligned messaging with session_cache.py

4. Docker/Dockerfile.backend:
   - Added comprehensive comments about single-worker requirement
   - Explained impact in multi-worker deployments
   - Referenced deployment constraints documentation

5. Docker/docker-compose.yml, compose.prod.yml, compose.debug.yml:
   - Added documentation comments about BANGUI_WORKERS constraint
   - Explained why single-worker is required

6. backend/tests/test_startup_integration.py:
   - Fixed test unpacking to match function return signature (3 values, not 2)

This ensures multi-worker deployments fail loudly at startup with clear
guidance on what went wrong and how to fix it. The database-backed scheduler
lock provides defense-in-depth for container orchestration scenarios.

For future multi-worker support, implement:
- Redis or database-backed session cache
- Shared RuntimeState coordination
- Distributed APScheduler backend

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 20:54:24 +02:00
abb224e01b Docker: add PUID/PGID env vars, fix env format, add release script and VERSION 2026-03-16 19:22:16 +01:00
cdf73e2d65 docker files 2026-03-15 18:10:25 +01:00