Add multi-worker detection for APScheduler safety

- Add _check_single_worker_mode() to startup.py that detects and rejects
  multi-worker configurations, raising a clear RuntimeError with instructions
- Set BANGUI_WORKERS=1 as default in Dockerfile.backend
- Document single-worker requirement in compose.prod.yml
- Add 'Deployment Constraints' section to Architekture.md explaining why
  single-worker mode is required and detailing future multi-worker support
- Add '9.1 Background Tasks and Scheduler Architecture' section to
  Backend-Development.md documenting task structure and single-worker requirement
- Add comprehensive test suite (test_startup.py) covering all scenarios:
  allows single worker, rejects multi-worker, validates config format,
  and verifies informative error messages

This fix addresses TASK-002, which identified that in-process APScheduler
is unsafe in multi-worker deployments: each worker creates its own
scheduler instance, so every background job runs once per worker.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 11:39:51 +02:00
parent def412797a
commit 825a67f13a
6 changed files with 212 additions and 18 deletions


@@ -345,6 +345,71 @@ async def test_list_jails_returns_200(client: AsyncClient) -> None:
---
## 9.1 Background Tasks and Scheduler Architecture
BanGUI uses **APScheduler 4.x** (async mode) to manage background jobs that execute on a schedule without user interaction. This section documents how to write and register background tasks.
### Task Location and Structure
All background tasks live in `backend/app/tasks/` as separate modules. Each task:
- Exports a `register(app: FastAPI) -> None` function, written as either a plain `def` or an `async def`.
- Opens its own database connection using `app.db.open_db()` or the `task_db()` helper.
- Closes connections when work completes (use the async context manager pattern).
- Runs independently of the FastAPI request/response cycle.
### Example Task
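```python
# backend/app/tasks/my_task.py
import structlog
# Import path and add_job() signature below follow the APScheduler 3.x API.
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from fastapi import FastAPI

log = structlog.get_logger()


async def my_background_job(app: FastAPI) -> None:
    """Do important work on a schedule."""
    log.info("my_background_job_started")
    try:
        # Tasks have no request scope, so open a dedicated connection
        # through app state (a FastAPI instance has no `db` attribute).
        db = await app.state.db_factory.open_db(app.state.settings.database_path)
        try:
            # Do work...
            pass
        finally:
            # Always release the connection, even if the work fails.
            await db.close()
    except Exception:
        # Log the failure with its traceback instead of propagating it.
        log.error("my_background_job_failed", exc_info=True)


def register(app: FastAPI) -> None:
    """Register the job with the scheduler."""
    scheduler: AsyncIOScheduler = app.state.scheduler
    scheduler.add_job(
        my_background_job,
        args=(app,),        # positional arguments passed to the job
        trigger="interval",
        seconds=60,         # run once per minute
        id="my_task",       # unique job id
        name="My Background Job",
    )
```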
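Because `register()` may be a plain function or a coroutine function, the code that wires tasks in at startup has to handle both forms. The following is a minimal sketch of one way to do that. The `register_all()` helper and the `SimpleNamespace` stand-in for the FastAPI app are hypothetical and are not taken from the BanGUI codebase.

```python
import asyncio
import inspect
from types import SimpleNamespace
from typing import Any, Callable, Iterable


async def register_all(app: Any, register_funcs: Iterable[Callable[[Any], Any]]) -> None:
    """Call each task module's register() hook, awaiting it when it is async.

    Hypothetical helper for illustration; not part of BanGUI.
    """
    for register in register_funcs:
        result = register(app)
        # An async register() returns an awaitable that still has to be awaited.
        if inspect.isawaitable(result):
            await result


# Two stand-in register() hooks, one sync and one async.
def register_sync(app: Any) -> None:
    app.registered.append("sync_task")


async def register_async(app: Any) -> None:
    app.registered.append("async_task")


# A SimpleNamespace stands in for the FastAPI app so the sketch runs on its own.
fake_app = SimpleNamespace(registered=[])
asyncio.run(register_all(fake_app, [register_sync, register_async]))
```

Checking the return value with `inspect.isawaitable()` keeps the caller indifferent to how each task module chose to define its hook.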
### Accessing Shared Resources in Tasks
Since tasks do not have access to `Depends(get_db)` (no request scope), they must:
1. **Open their own DB connection** via `app.state.db_factory.open_db(path)`.
2. **Access app-level state** — `app.state.http_session`, `app.state.geo_cache`, `app.state.settings`, etc.
3. **Use structlog** for all logging (never `print()`).
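The open-then-always-close rule for task connections maps naturally onto an async context manager. The sketch below shows the general shape such a helper could take. The signature of `task_db()` shown here, the `FakeConnection` class, and the `bangui.db` path are assumptions made for illustration, and the real helper may look different.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Awaitable, Callable


@asynccontextmanager
async def task_db(
    open_db: Callable[[str], Awaitable[Any]], path: str
) -> AsyncIterator[Any]:
    """Open a connection for a background task and always close it.

    Hypothetical signature; the real task_db() helper may differ.
    """
    db = await open_db(path)
    try:
        yield db
    finally:
        # Runs on normal exit and when the task body raises.
        await db.close()


class FakeConnection:
    """Stand-in connection so the sketch runs without a real database."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def fake_open_db(path: str) -> FakeConnection:
    return FakeConnection()


async def demo() -> FakeConnection:
    async with task_db(fake_open_db, "bangui.db") as db:
        assert not db.closed  # connection is open while the job runs
    return db


connection = asyncio.run(demo())
```

With a helper of this kind, a task body shrinks to a single `async with` block and cannot forget the `close()` call.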
### Single-Worker Requirement
**The scheduler is bound to a single asyncio event loop and cannot be shared across multiple worker processes.** Each worker process would create its own scheduler instance and run every job independently, so BanGUI enforces single-worker mode to prevent duplicate task execution.
- **Deployment constraint:** Set `BANGUI_WORKERS=1` (default).
- **Startup validation:** `startup_shared_resources()` raises `RuntimeError` if `BANGUI_WORKERS > 1`.
- See [Architekture.md § 9.2](Architekture.md) for full details.
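The startup validation described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the actual `startup.py` code: the function name, the `env` parameter, the handling of non-integer values, and the error wording are all illustrative. Only the `BANGUI_WORKERS` variable and the `RuntimeError` come from this document.

```python
import os
from typing import Mapping, Optional


def check_single_worker_mode(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise RuntimeError unless BANGUI_WORKERS resolves to exactly one worker.

    Illustrative sketch only; the real check in startup.py may differ.
    """
    env = os.environ if env is None else env
    raw = env.get("BANGUI_WORKERS", "1")  # unset means the default of 1
    try:
        workers = int(raw)
    except ValueError:
        raise RuntimeError(
            f"BANGUI_WORKERS must be a positive integer, got {raw!r}."
        ) from None
    if workers < 1:
        raise RuntimeError(
            f"BANGUI_WORKERS must be a positive integer, got {workers}."
        )
    if workers > 1:
        raise RuntimeError(
            f"BANGUI_WORKERS={workers} is not supported: each worker would start "
            "its own scheduler and run every background job again. "
            "Set BANGUI_WORKERS=1."
        )


check_single_worker_mode({"BANGUI_WORKERS": "1"})  # passes silently
check_single_worker_mode({})                       # unset falls back to 1

try:
    check_single_worker_mode({"BANGUI_WORKERS": "4"})
except RuntimeError as exc:
    error_message = str(exc)
```

Accepting the environment as a parameter keeps the check easy to test without touching the real process environment.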
---
## 10. Code Style & Tooling
| Tool | Purpose |
@@ -443,7 +508,7 @@ class Settings(BaseSettings):
---
## 12. Git & Workflow
## 13. Git & Workflow
- **Branch naming:** `feature/<short-description>`, `fix/<short-description>`, `chore/<short-description>`.
- **Commit messages:** imperative tense, max 72 chars first line (`Add jail reload endpoint`, `Fix ban history query`).
@@ -453,11 +518,11 @@ class Settings(BaseSettings):
---
## 13. Coding Principles
## 14. Coding Principles
These principles are **non-negotiable**. Every backend contributor must internalise and apply them daily.
### 13.1 Clean Code
### 14.1 Clean Code
- Write code that **reads like well-written prose** — a new developer should understand intent without asking.
- **Meaningful names** — variables, functions, and classes must reveal their purpose. Avoid abbreviations (`cnt`, `mgr`, `tmp`) unless universally understood.
@@ -488,7 +553,7 @@ async def check(ip, j):
raise Exception("not found")
```
### 13.2 Separation of Concerns (SoC)
### 14.2 Separation of Concerns (SoC)
- Each module, class, and function must have a **single, well-defined responsibility**.
- **Routers** → HTTP layer only (parse requests, return responses).
@@ -498,29 +563,29 @@ async def check(ip, j):
- **Tasks** → scheduled background jobs.
- Never mix layers — a router must not execute SQL, and a repository must not raise `HTTPException`.
### 13.3 Single Responsibility Principle (SRP)
### 14.3 Single Responsibility Principle (SRP)
- A class or module should have **one and only one reason to change**.
- If a service handles both ban management *and* email notifications, split it into `BanService` and `NotificationService`.
### 13.4 Don't Repeat Yourself (DRY)
### 14.4 Don't Repeat Yourself (DRY)
- Extract shared logic into utility functions, base classes, or dependency providers.
- If the same block of code appears in more than one place, **refactor it** into a single source of truth.
- But don't over-abstract — premature DRY that couples unrelated features is worse than a little duplication (see **Rule of Three**: refactor when something appears a third time).
### 13.5 KISS — Keep It Simple, Stupid
### 14.5 KISS — Keep It Simple, Stupid
- Choose the simplest solution that works correctly.
- Avoid clever tricks, premature optimisation, and over-engineering.
- If a standard library function does the job, prefer it over a custom implementation.
### 13.6 YAGNI — You Aren't Gonna Need It
### 14.6 YAGNI — You Aren't Gonna Need It
- Do **not** build features, abstractions, or config options "just in case".
- Implement what is required **now**. Extend later when a real need emerges.
### 13.7 Dependency Inversion Principle (DIP)
### 14.7 Dependency Inversion Principle (DIP)
- High-level modules (services) must not depend on low-level modules (repositories) directly. Both should depend on **abstractions** (protocols / interfaces).
- Use FastAPI's `Depends()` to inject implementations — this makes swapping and testing trivial.
@@ -580,17 +645,17 @@ async def get_session_repo() -> SessionRepository:
- Before each deployment, run `mypy --strict` to ensure all dependency providers return values compatible with their Protocol types.
- The `cast()` calls in `dependencies.py` are a documented signal that structural compatibility is being verified externally, not via explicit class inheritance.
### 13.8 Composition over Inheritance
### 14.8 Composition over Inheritance
- Favour **composing** small, focused objects over deep inheritance hierarchies.
- Use mixins or protocols only when a clear "is-a" relationship exists; otherwise, pass collaborators as constructor arguments.
### 13.9 Fail Fast
### 14.9 Fail Fast
- Validate inputs as early as possible — at the API boundary with Pydantic, at service entry with assertions or domain checks.
- Raise specific exceptions immediately rather than letting bad data propagate silently.
### 13.10 Law of Demeter (Principle of Least Knowledge)
### 14.10 Law of Demeter (Principle of Least Knowledge)
- A function should only call methods on:
1. Its own object (`self`).
@@ -598,7 +663,7 @@ async def get_session_repo() -> SessionRepository:
3. Objects it creates.
- Avoid long accessor chains like `request.state.db.cursor().execute(...)` — wrap them in a meaningful method.
### 13.11 Defensive Programming
### 14.11 Defensive Programming
- Never trust external input — validate and sanitise everything that crosses a boundary (HTTP request, file, socket, environment variable).
- Handle edge cases explicitly: empty lists, `None` values, negative numbers, empty strings.
@@ -606,7 +671,7 @@ async def get_session_repo() -> SessionRepository:
---
## 14. Quick Reference — Do / Don't
## 15. Quick Reference — Do / Don't
| Do | Don't |
|---|---|