Enforce single-executor safety regardless of process launcher through a
robust database-backed lock mechanism that works reliably in container
orchestration environments.

Key changes:

1. Add scheduler_lock table to database schema (migration 4)
   - Singleton row (id=1) prevents concurrent execution
   - Stores PID, hostname, creation timestamp, heartbeat timestamp
   - Atomic transaction prevents race conditions

2. Create scheduler lock utility (app/utils/scheduler_lock.py)
   - acquire_scheduler_lock(): Atomically acquire or fail
   - release_scheduler_lock(): Clean up on shutdown
   - update_scheduler_lock_heartbeat(): Keep lock alive (every 10 seconds)
   - get_scheduler_lock_info(): Debug/inspect lock status
   - Stale lock detection: TTL-based (60 second expiry)

3. Reorder startup DAG stages
   - DATABASE now comes first (required for lock acquisition)
   - WORKER_MODE depends on DATABASE (performs lock check after initialization)
   - Maintains all other stage dependencies intact

4. Update startup process (app/startup.py)
   - Replace _check_single_worker_mode() with two-tier check:
     * Fast check: BANGUI_WORKERS env var (if explicitly set to >1)
     * Authoritative check: Database lock (catches misconfiguration)
   - Return startup_db from startup_shared_resources() for lock management

5. Register scheduler lock heartbeat task
   - New task: scheduler_lock_heartbeat (app/tasks/scheduler_lock_heartbeat.py)
   - Updates lock heartbeat every 10 seconds (keeps lock alive)
   - Prevents false positives from temporary load spikes

6. Add lock release to lifespan shutdown (app/main.py)
   - Release lock before closing database
   - Allows other instances to acquire during rolling deployments
   - Graceful handoff between instances

7. Comprehensive test coverage (backend/tests/test_scheduler_lock.py)
   - Lock acquisition success and failure cases
   - Stale lock cleanup on startup
   - Lock release and heartbeat updates
   - Full lifecycle: acquire → heartbeat → release

8. Update documentation (Docs/Architekture.md § 9.3)
   - Explain single-executor requirement
   - Document database-backed locking mechanism
   - Compare with alternative approaches (filesystem, env var)
   - Include troubleshooting guide
   - Container orchestration examples (Docker, Kubernetes, systemd)

Why database-backed instead of filesystem?
- Atomicity: SQLite transactions prevent TOCTOU race windows
- Container-safe: Works across containers with shared DB volumes
- No NFS/SMB edge cases
- Timestamp-based stale detection (PID reuse is unreliable)
- More reliable in rolling deployments

Benefits:
- Works with any process manager (uvicorn, gunicorn, etc.)
- Handles simultaneous startup attempts correctly
- Automatic failover on instance crash (stale lock cleanup)
- Clear error messages with troubleshooting steps
- No environment variable required (lock is authoritative)
- Scales to multi-worker deployments if combined with external job store

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
328 lines
11 KiB
Python
"""Application database schema definition and initialisation.

BanGUI maintains its own SQLite database that stores configuration, session
state, blocklist source definitions, and import run logs.  This module is
the single source of truth for the schema: all ``CREATE TABLE`` statements
live here and are applied on first run via :func:`init_db`.

The fail2ban database is separate and is accessed read-only by the history
and ban services.
"""

import aiosqlite
import structlog

log: structlog.stdlib.BoundLogger = structlog.get_logger()

# ---------------------------------------------------------------------------
# DDL statements
# ---------------------------------------------------------------------------

_CREATE_SETTINGS: str = """
CREATE TABLE IF NOT EXISTS settings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    key TEXT NOT NULL UNIQUE,
    value TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
"""

_CREATE_SESSIONS: str = """
CREATE TABLE IF NOT EXISTS sessions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT NOT NULL UNIQUE,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    expires_at TEXT NOT NULL
);
"""

_CREATE_SESSIONS_TOKEN_INDEX: str = """
CREATE UNIQUE INDEX IF NOT EXISTS idx_sessions_token_hash ON sessions (token_hash);
"""

_CREATE_BLOCKLIST_SOURCES: str = """
CREATE TABLE IF NOT EXISTS blocklist_sources (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    url TEXT NOT NULL UNIQUE,
    enabled INTEGER NOT NULL DEFAULT 1,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    updated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
"""

_CREATE_IMPORT_LOG: str = """
CREATE TABLE IF NOT EXISTS import_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_id INTEGER REFERENCES blocklist_sources(id) ON DELETE SET NULL,
    source_url TEXT NOT NULL,
    timestamp TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    ips_imported INTEGER NOT NULL DEFAULT 0,
    ips_skipped INTEGER NOT NULL DEFAULT 0,
    errors TEXT
);
"""

_CREATE_GEO_CACHE: str = """
CREATE TABLE IF NOT EXISTS geo_cache (
    ip TEXT PRIMARY KEY,
    country_code TEXT,
    country_name TEXT,
    asn TEXT,
    org TEXT,
    cached_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
"""

_CREATE_HISTORY_ARCHIVE: str = """
CREATE TABLE IF NOT EXISTS history_archive (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    jail TEXT NOT NULL,
    ip TEXT NOT NULL,
    timeofban INTEGER NOT NULL,
    bancount INTEGER NOT NULL,
    data TEXT NOT NULL,
    action TEXT NOT NULL CHECK(action IN ('ban', 'unban')),
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
    UNIQUE(ip, jail, action, timeofban)
);
"""

_CREATE_SCHEMA_MIGRATIONS: str = """
CREATE TABLE IF NOT EXISTS schema_migrations (
    version INTEGER PRIMARY KEY,
    migrated_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
"""

# Ordered list of DDL statements to execute on initialisation.
_SCHEMA_STATEMENTS: list[str] = [
    _CREATE_SETTINGS,
    _CREATE_SESSIONS,
    _CREATE_SESSIONS_TOKEN_INDEX,
    _CREATE_BLOCKLIST_SOURCES,
    _CREATE_IMPORT_LOG,
    _CREATE_GEO_CACHE,
    _CREATE_HISTORY_ARCHIVE,
]

_CURRENT_SCHEMA_VERSION: int = 4

_MIGRATIONS: dict[int, str] = {
    1: "\n".join(_SCHEMA_STATEMENTS),
    2: """
    -- Migration 2: Hash session tokens for security.
    -- Drop the old sessions table and recreate with token_hash column.
    -- This invalidates all existing sessions, which is acceptable as the DB
    -- contents were exposed in plaintext.
    DROP TABLE IF EXISTS sessions;
    CREATE TABLE sessions (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        token_hash TEXT NOT NULL UNIQUE,
        created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')),
        expires_at TEXT NOT NULL
    );
    CREATE UNIQUE INDEX idx_sessions_token_hash ON sessions (token_hash);
    """,
    3: """
    -- Migration 3: Add last_seen timestamp to geo_cache for retention policy.
    -- Tracks when each IP was last referenced to enable purging of stale entries.
    -- Default to current timestamp for existing rows.
    ALTER TABLE geo_cache ADD COLUMN last_seen TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'));
    """,
    4: """
    -- Migration 4: Add scheduler_lock table for multi-worker safety.
    -- Implements database-backed locking to ensure only one worker runs the scheduler.
    -- Uses atomic transactions to prevent race conditions in container orchestration.
    -- Lock is held by the process that successfully inserts the singleton row (id=1).
    CREATE TABLE scheduler_lock (
        id INTEGER PRIMARY KEY CHECK (id = 1),
        pid INTEGER NOT NULL,
        hostname TEXT NOT NULL,
        created_at REAL NOT NULL,
        heartbeat_at REAL NOT NULL
    );
    """,
}


# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------


async def _configure_connection(db: aiosqlite.Connection) -> None:
    """Apply hardening pragmas to a newly-opened SQLite connection."""
    await db.execute("PRAGMA journal_mode=WAL;")
    await db.execute("PRAGMA foreign_keys=ON;")
    await db.execute("PRAGMA busy_timeout=5000;")


async def _get_current_schema_version(db: aiosqlite.Connection) -> int:
    """Return the highest applied schema version for the given database."""
    await db.execute(_CREATE_SCHEMA_MIGRATIONS)
    async with db.execute("SELECT MAX(version) FROM schema_migrations;") as cursor:
        row = await cursor.fetchone()
        if row is None or row[0] is None:
            return 0
        return int(row[0])


async def _parse_migration_statements(script: str) -> list[str]:
    """Parse a migration script into individual SQL statements.

    Splits on semicolons but ignores semicolons inside string literals and
    comments.  Handles both line (``-- ...``) and block (``/* ... */``)
    comments.

    Args:
        script: The raw migration script.

    Returns:
        List of SQL statements (stripped of whitespace and comments).
    """
    statements: list[str] = []
    current_stmt: list[str] = []
    i = 0

    while i < len(script):
        char = script[i]

        # Skip line comments (-- ... up to and including the newline)
        if i < len(script) - 1 and script[i:i+2] == "--":
            while i < len(script) and script[i] != "\n":
                i += 1
            i += 1
            continue

        # Skip block comments (/* ... */)
        if i < len(script) - 1 and script[i:i+2] == "/*":
            i += 2
            while i < len(script) - 1:
                if script[i:i+2] == "*/":
                    i += 2
                    break
                i += 1
            continue

        # Handle string literals (single or double quotes)
        if char in ("'", '"'):
            quote = char
            current_stmt.append(char)
            i += 1
            while i < len(script):
                if script[i] == quote:
                    if i + 1 < len(script) and script[i + 1] == quote:
                        # Escaped quote (doubled quote character)
                        current_stmt.append(quote)
                        current_stmt.append(quote)
                        i += 2
                    else:
                        # End of string
                        current_stmt.append(quote)
                        i += 1
                        break
                else:
                    current_stmt.append(script[i])
                    i += 1
            continue

        # Statement separator
        if char == ";":
            stmt = "".join(current_stmt).strip()
            if stmt:
                statements.append(stmt)
            current_stmt = []
            i += 1
            continue

        current_stmt.append(char)
        i += 1

    # Add any remaining statement
    stmt = "".join(current_stmt).strip()
    if stmt:
        statements.append(stmt)

    return statements


async def _apply_migration(db: aiosqlite.Connection, version: int) -> None:
    """Apply a single migration step and record its completion atomically.

    Wraps all DDL statements and the schema_migrations insert in a single
    BEGIN IMMEDIATE ... COMMIT transaction to ensure atomicity.  If any
    statement fails, the entire migration is rolled back.

    Args:
        db: An open aiosqlite.Connection.
        version: The migration version number.

    Raises:
        Any exception from executing the migration statements or inserting
        the schema migration record will cause a rollback.
    """
    migration_script = _MIGRATIONS[version]
    statements = await _parse_migration_statements(migration_script)

    try:
        await db.execute("BEGIN IMMEDIATE;")

        for statement in statements:
            await db.execute(statement)

        await db.execute("INSERT INTO schema_migrations (version) VALUES (?);", (version,))

        await db.commit()
    except Exception:
        await db.rollback()
        raise


async def _migrate_schema(db: aiosqlite.Connection) -> None:
    """Migrate the database schema to the latest supported version."""
    current_version = await _get_current_schema_version(db)
    if current_version == _CURRENT_SCHEMA_VERSION:
        return

    if current_version > _CURRENT_SCHEMA_VERSION:
        raise RuntimeError(
            f"database schema version {current_version} is newer than supported "
            f"version {_CURRENT_SCHEMA_VERSION}"
        )

    log.info("migrating_database_schema", from_version=current_version, to_version=_CURRENT_SCHEMA_VERSION)
    for next_version in range(current_version + 1, _CURRENT_SCHEMA_VERSION + 1):
        await _apply_migration(db, next_version)
    log.info("database_schema_ready", schema_version=_CURRENT_SCHEMA_VERSION)


async def init_db(db: aiosqlite.Connection) -> None:
    """Create or migrate the BanGUI application database schema.

    This function is idempotent: calling it on an already-initialised
    database has no effect.  It should be called once during application
    startup inside the FastAPI lifespan handler.

    Args:
        db: An open :class:`aiosqlite.Connection` to the application database.
    """
    log.info("initialising_database_schema")
    await _configure_connection(db)
    await _migrate_schema(db)


async def open_db(database_path: str) -> aiosqlite.Connection:
    """Open a new application SQLite connection with the standard settings.

    Args:
        database_path: Path to the BanGUI SQLite database.

    Returns:
        A configured :class:`aiosqlite.Connection` instance.
    """
    db = await aiosqlite.connect(database_path)
    db.row_factory = aiosqlite.Row
    await _configure_connection(db)
    return db