refactor(backend): clean up models setup, improve ip utils, add adr docs

- Extract ADR documents for architectural decisions (SQLite, FastAPI, React, APScheduler, Scheduler)
- Refactor setup.py: improve code structure and readability
- Add IP validation utilities with test coverage
- Update frontend components (BanTable, HistoryPage)
- Add pre-commit hooks and CONTRIBUTING.md
- Add .editorconfig for consistent coding standards
2026-05-03 18:04:45 +02:00
parent 2f9fc8076d
commit 5f0ab40816
17 changed files with 517 additions and 48 deletions
--- a/Docs/adr/ADR-005-Single-Instance-Scheduler.md
+++ b/Docs/adr/ADR-005-Single-Instance-Scheduler.md
@@ -0,0 +1,61 @@
+# ADR-005: Single-Instance Scheduler Enforcement
+
+## Status
+Accepted
+
+## Context
+APScheduler's `AsyncIOScheduler` is bound to a single asyncio event loop.
+Running multiple scheduler instances leads to duplicate jobs, database lock
+contention, and undefined behaviour.
+
+## Decision
+Enforce exactly **one scheduler instance** across the entire application lifecycle,
+using a heartbeat-based lock row in the existing SQLite database.
+
+## Mechanism
+
+### 1. Startup gate: `BANGUI_WORKERS=1`
+The Docker compose file is configured with `BANGUI_WORKERS=1` and the startup DAG
+validates this variable. If the variable is not set to `1`, startup aborts with
+a clear error message.
+
+### 2. Runtime lock: `scheduler_lock` table
+During startup, after opening the SQLite database, the application attempts:
+
+```sql
+INSERT INTO scheduler_lock (lock_name, heartbeat_at)
+VALUES ('scheduler', unixepoch())
+ON CONFLICT(lock_name) DO UPDATE SET heartbeat_at = unixepoch()
+WHERE (unixepoch() - heartbeat_at) >= 30;
+```
+
+- If the statement inserts the row or takes over a stale one (heartbeat at
+  least 30 seconds old), this instance holds the lock and starts the scheduler.
+- If the statement is a no-op (heartbeat is recent), another instance holds the
+  lock and startup continues without starting the scheduler.
+- A background task (`scheduler_lock_heartbeat`) updates the heartbeat every 10
+  seconds. After a crash, the lock can be reacquired once it is 30 seconds stale.
+
+### 3. Deployment topology
+| Deployment | Behaviour |
+|---|---|
+| Single container | Scheduler runs normally |
+| Single Pod (Kubernetes) | Scheduler runs normally |
+| Accidental multi-process restart | Second process fails to start scheduler; first continues |
+| Intentional multi-worker | Not supported; requires external job store (future) |
+
+## Rationale
+
+### Why this approach?
+- **No external coordination service:** No ZooKeeper, etcd, or Redis needed.
+  The existing SQLite database is reused.
+- **Atomic:** SQLite's INSERT with ON CONFLICT is atomic; no race condition.
+- **Self-healing:** Lock expiry means a crashed instance automatically releases
+  its lock. No manual cleanup required.
+- **Crash-safe:** A heartbeat-based TTL ensures stale locks are not held
+  indefinitely.
+
+## Consequences
+- `BANGUI_WORKERS` must always be `1`. This is documented and enforced.
+- Future multi-worker deployments require migration to a persistent job store
+  (PostgreSQL + SQLAlchemy job store, or Redis).
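
The two startup steps in ADR-005's Mechanism section (the `BANGUI_WORKERS` gate and the `scheduler_lock` acquisition) can be sketched in Python as follows. This is a minimal illustration under assumptions, not the code in this commit: the function names, the column types in the schema, and the use of a Python-supplied timestamp in place of `unixepoch()` are all hypothetical. It follows the ADR's prose, in which a recent heartbeat means the lock is held and a heartbeat at least 30 seconds old means it is stale.

```python
import os
import sqlite3
import time

LOCK_TTL_SECONDS = 30  # lock expiry stated in the ADR

# Assumed schema: the ADR names the table and columns but not their types.
ASSUMED_SCHEMA = (
    "CREATE TABLE IF NOT EXISTS scheduler_lock ("
    "lock_name TEXT PRIMARY KEY, heartbeat_at INTEGER NOT NULL)"
)

# Same shape as the ADR's statement, with the timestamp bound as a parameter
# so the sketch does not depend on unixepoch() being available.
ACQUIRE_SQL = (
    "INSERT INTO scheduler_lock (lock_name, heartbeat_at) "
    "VALUES ('scheduler', :now) "
    "ON CONFLICT(lock_name) DO UPDATE SET heartbeat_at = :now "
    "WHERE (:now - heartbeat_at) >= :ttl"
)


def validate_worker_count(env=None):
    """Step 1: abort startup unless BANGUI_WORKERS is exactly 1."""
    env = os.environ if env is None else env
    raw = env.get("BANGUI_WORKERS")
    if raw is None or raw.strip() != "1":
        raise RuntimeError(f"BANGUI_WORKERS must be 1, got {raw!r}")


def try_acquire_scheduler_lock(conn, now=None):
    """Step 2: return True if this process inserted or took over the lock row."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(ACQUIRE_SQL, {"now": now, "ttl": LOCK_TTL_SECONDS})
    conn.commit()
    # rowcount is 1 for a fresh insert or a stale-lock takeover, 0 for a no-op.
    return cur.rowcount == 1
```

Passing `now` explicitly makes the expiry logic easy to exercise against an in-memory database: a first call acquires the lock, a call 10 seconds later is refused, and a call 30 seconds after the first takes the lock over. A real heartbeat task would additionally need to refresh the row unconditionally while the process holds the lock, which this sketch leaves out.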