# Database Schema Documentation

BanGUI uses two SQLite databases:

| Database | Purpose | Location |
|---|---|---|
| **BanGUI app DB** | BanGUI's own configuration, sessions, blocklist sources, import logs, geo cache | `bangui.db` |
| **fail2ban DB** | fail2ban's internal ban/jail data (read-only) | Configured via `FAIL2BAN_DB` env var |

---

## 1. BanGUI Application Schema

Single source of truth: `backend/app/db.py`.

### 1.1 `settings`

Key-value store for application configuration.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT |
| `key` | TEXT | NOT NULL UNIQUE |
| `value` | TEXT | NOT NULL |
| `created_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |
| `updated_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Indexes:** PK only; the `UNIQUE` constraint on `key` also creates an implicit index.

**Purpose:** Stores app-wide settings (e.g., timezone, UI preferences). All settings access goes through `settings_repo` / `settings_service`.

---

### 1.2 `sessions`

Session tokens for web authentication.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT |
| `token_hash` | TEXT | NOT NULL UNIQUE |
| `created_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |
| `expires_at` | TEXT | NOT NULL |

**Indexes:** `idx_sessions_token_hash` (UNIQUE) on `token_hash`.

**Purpose:** Web session management. Tokens are hashed with SHA-256 before storage, so the raw token never reaches the database. Expired sessions are removed by the `session_cleanup` task. See `auth_service.py`.

---

### 1.3 `blocklist_sources`

Blocklist source definitions for the import pipeline.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT |
| `name` | TEXT | NOT NULL |
| `url` | TEXT | NOT NULL UNIQUE |
| `enabled` | INTEGER | NOT NULL DEFAULT 1 (boolean) |
| `created_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |
| `updated_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Indexes:** PK only; the `UNIQUE` constraint on `url` also creates an implicit index.

**Purpose:** Defines sources for blocklist imports. See `blocklist_repo`, `blocklist_service`, `blocklist_import_workflow`.

---
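
The token handling described for `sessions` (section 1.2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `auth_service.py`: the helper names `hash_token`, `create_session`, and `find_session`, the timestamp format, the `DEFAULT` expression, and the in-memory database are all hypothetical. Only the SHA-256 digest is written to `token_hash`, and a lookup hashes the presented token before comparing.

```python
import hashlib
import secrets
import sqlite3
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed ISO 8601 layout (UTC)

# Assumed DDL: column names follow section 1.2, the DEFAULT expression is a guess.
SCHEMA = """
CREATE TABLE sessions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    token_hash TEXT NOT NULL UNIQUE,
    created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    expires_at TEXT NOT NULL
);
"""


def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest that is stored instead of the raw token."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def create_session(conn: sqlite3.Connection, ttl: timedelta = timedelta(hours=1)) -> str:
    """Create a session row and return the raw token for the client."""
    token = secrets.token_urlsafe(32)
    expires_at = (datetime.now(timezone.utc) + ttl).strftime(TS_FORMAT)
    conn.execute(
        "INSERT INTO sessions (token_hash, expires_at) VALUES (?, ?)",
        (hash_token(token), expires_at),
    )
    conn.commit()
    return token


def find_session(conn: sqlite3.Connection, token: str):
    """Return (id, expires_at) for a valid, unexpired token, else None."""
    now = datetime.now(timezone.utc).strftime(TS_FORMAT)
    return conn.execute(
        "SELECT id, expires_at FROM sessions WHERE token_hash = ? AND expires_at > ?",
        (hash_token(token), now),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
token = create_session(conn)
stored_hash = conn.execute("SELECT token_hash FROM sessions").fetchone()[0]
```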
### 1.4 `import_log`

Audit log of individual blocklist import operations.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT |
| `source_id` | INTEGER | REFERENCES `blocklist_sources(id)` ON DELETE RESTRICT |
| `source_url` | TEXT | NOT NULL |
| `timestamp` | INTEGER | NOT NULL (UNIX epoch) |
| `ips_imported` | INTEGER | NOT NULL DEFAULT 0 |
| `ips_skipped` | INTEGER | NOT NULL DEFAULT 0 |
| `errors` | TEXT | |

**Indexes:**

- `idx_import_log_id_desc` on `(id DESC)`: cursor pagination
- `idx_import_log_source_id_desc` on `(source_id, id DESC)`: filtered pagination

**Purpose:** Audit trail for imports. `ON DELETE RESTRICT` on `source_id` blocks deletion of a source while log rows still reference it. See migration 9.

**Migration 8:** `timestamp` migrated from TEXT ISO 8601 to INTEGER UNIX epoch.

---

### 1.5 `geo_cache`

Geo-IP lookup cache for ban IP metadata.

| Column | Type | Constraints |
|---|---|---|
| `ip` | TEXT | PRIMARY KEY |
| `country_code` | TEXT | |
| `country_name` | TEXT | |
| `asn` | TEXT | |
| `org` | TEXT | |
| `cached_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Added in migration 3:**

| Column | Type | Constraints |
|---|---|---|
| `last_seen` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Indexes:** PK only.

**Purpose:** Caches GeoIP results to reduce third-party API calls. TTL managed by the `geo_cache_cleanup` task. See `geo_cache_repo`, `geo_service`.

---

### 1.6 `history_archive`

Archived ban/unban history mirrored from the fail2ban DB.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT |
| `jail` | TEXT | NOT NULL |
| `ip` | TEXT | NOT NULL |
| `timeofban` | INTEGER | NOT NULL (UNIX epoch) |
| `bancount` | INTEGER | NOT NULL |
| `data` | TEXT | NOT NULL (JSON) |
| `action` | TEXT | NOT NULL CHECK (`action` IN ('ban', 'unban')) |
| `created_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Constraints:** `UNIQUE(ip, jail, action, timeofban)` prevents duplicate archive rows.
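
Because of this constraint, a sync that re-reads rows it has already archived can be made idempotent with `INSERT OR IGNORE`. The sketch below only illustrates that idea; whether the `history_sync` task works this way is an assumption. The DDL is reconstructed from the column table above (`created_at` omitted for brevity), and `archive_rows` is a hypothetical helper.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed DDL, reconstructed from the column table; not copied from db.py.
conn.execute(
    """
    CREATE TABLE history_archive (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        jail TEXT NOT NULL,
        ip TEXT NOT NULL,
        timeofban INTEGER NOT NULL,
        bancount INTEGER NOT NULL,
        data TEXT NOT NULL,
        action TEXT NOT NULL CHECK (action IN ('ban', 'unban')),
        UNIQUE (ip, jail, action, timeofban)
    )
    """
)


def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM history_archive").fetchone()[0]


def archive_rows(conn: sqlite3.Connection, rows) -> int:
    """Insert rows, silently skipping duplicates; return how many were added."""
    before = row_count(conn)
    conn.executemany(
        "INSERT OR IGNORE INTO history_archive "
        "(jail, ip, timeofban, bancount, data, action) VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return row_count(conn) - before


rows = [
    ("sshd", "203.0.113.7", 1700000000, 1, json.dumps({"matches": []}), "ban"),
    ("sshd", "203.0.113.7", 1700000600, 1, json.dumps({}), "unban"),
]
first_pass = archive_rows(conn, rows)   # both rows are new
second_pass = archive_rows(conn, rows)  # same rows again: all ignored
```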
**Indexes:**

- `idx_history_archive_jail_timeofban` on `(jail, timeofban DESC)`: dashboard filter by jail plus time ordering
- `idx_history_archive_timeofban_jail_action` on `(timeofban DESC, jail, action)`: timeline filters
- `idx_history_archive_ip` on `(ip)`: IP prefix/exact searches
- `idx_history_archive_action` on `(action)`: ban/unban filtering

**Purpose:** Long-term ban history. Synced from the fail2ban DB by the `history_sync` task. See `history_archive_repo`, `history_service`.

---

### 1.7 `scheduler_lock`

Database-backed mutex for multi-worker scheduler safety.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY CHECK (id = 1) (singleton row) |
| `pid` | INTEGER | NOT NULL |
| `hostname` | TEXT | NOT NULL |
| `created_at` | REAL | NOT NULL (UNIX epoch) |
| `heartbeat_at` | REAL | NOT NULL (UNIX epoch) |

**Indexes:** PK only (singleton constraint).

**Purpose:** Only one worker process holds the scheduler lock at a time. The lock is renewed by the `scheduler_lock_heartbeat` task and acquired atomically inside a `BEGIN IMMEDIATE` transaction. See `scheduler_lock.py`.

---

### 1.8 `import_runs`

Tracks unique blocklist imports for idempotent retries.

| Column | Type | Constraints |
|---|---|---|
| `id` | INTEGER | PRIMARY KEY AUTOINCREMENT |
| `source_id` | INTEGER | NOT NULL REFERENCES `blocklist_sources(id)` ON DELETE CASCADE |
| `content_hash` | TEXT | NOT NULL |
| `status` | TEXT | NOT NULL CHECK (`status` IN ('pending', 'completed', 'failed')) |
| `imported_count` | INTEGER | NOT NULL DEFAULT 0 |
| `skipped_count` | INTEGER | NOT NULL DEFAULT 0 |
| `error_message` | TEXT | |
| `created_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |
| `updated_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Constraints:** `UNIQUE(source_id, content_hash)`: the same source with the same content maps to the same import run.

**Indexes:** `idx_import_runs_source_status` on `(source_id, status)`: lookup of completed imports by source.

**Purpose:** Prevents duplicate IP bans when an import crashes and is retried.
See migration 6 and `blocklist_import_workflow`.

---

### 1.9 `schema_migrations`

Tracks applied schema versions.

| Column | Type | Constraints |
|---|---|---|
| `version` | INTEGER | PRIMARY KEY |
| `migrated_at` | TEXT | NOT NULL DEFAULT current time (ISO 8601) |

**Indexes:** PK only.

**Purpose:** Idempotent schema migration tracker. Records each applied version number. See `init_db()` and `_migrate_schema()`.

---

## 2. Fail2ban Database Schema

Read-only access via `fail2ban_db_repo`. fail2ban manages this DB; BanGUI mirrors data into `history_archive`.

### 2.1 `fail2banDb`

| Column | Type | Constraints |
|---|---|---|
| `version` | INTEGER | |

Single row tracking the DB schema version.

---

### 2.2 `jails`

| Column | Type | Constraints |
|---|---|---|
| `name` | TEXT | NOT NULL UNIQUE |
| `enabled` | INTEGER | NOT NULL DEFAULT 1 |

**Indexes:** `jails_name` on `(name)`.

---

### 2.3 `logs`

| Column | Type | Constraints |
|---|---|---|
| `jail` | TEXT | NOT NULL FK → `jails(name)` ON DELETE CASCADE |
| `path` | TEXT | |
| `firstlinemd5` | TEXT | |
| `lastfilepos` | INTEGER | DEFAULT 0 |

**Constraints:** `UNIQUE(jail, path)` and `UNIQUE(jail, path, firstlinemd5)`.

**Indexes:** `logs_path` on `(path)`, `logs_jail_path` on `(jail, path)`.

---

### 2.4 `bans`

| Column | Type | Constraints |
|---|---|---|
| `jail` | TEXT | NOT NULL FK → `jails(name)` |
| `ip` | TEXT | |
| `timeofban` | INTEGER | NOT NULL |
| `bantime` | INTEGER | NOT NULL |
| `bancount` | INTEGER | NOT NULL DEFAULT 1 |
| `data` | JSON | |

**Indexes:**

- `bans_jail_timeofban_ip` on `(jail, timeofban)`
- `bans_jail_ip` on `(jail, ip)`
- `bans_ip` on `(ip)`

---

### 2.5 `bips`

Backup IPs table (ban backup).

| Column | Type | Constraints |
|---|---|---|
| `ip` | TEXT | NOT NULL |
| `jail` | TEXT | NOT NULL FK → `jails(name)` |
| `timeofban` | INTEGER | NOT NULL |
| `bantime` | INTEGER | NOT NULL |
| `bancount` | INTEGER | NOT NULL DEFAULT 1 |
| `data` | JSON | |

**Constraints:** `PRIMARY KEY (ip, jail)`.

**Indexes:** `bips_timeofban` on `(timeofban)`, `bips_ip` on `(ip)`.

---

## 3. Relationships and Constraints

```
blocklist_sources (1) ──(id)──→ import_log.source_id   [RESTRICT on delete]
                       └──────→ import_runs.source_id  [CASCADE on delete]

settings:          standalone (key-value, no FK)
sessions:          standalone (token hash, no FK)
geo_cache:         standalone (IP → geo data, no FK)
history_archive:   standalone (archived ban history, no FK)
scheduler_lock:    singleton row (id=1), no FK
schema_migrations: standalone (migration tracking, no FK)
```

Fail2ban tables are separate and read-only from BanGUI's perspective.

---

## 4. Indexes Summary

| Table | Index | Columns |
|---|---|---|
| `sessions` | `idx_sessions_token_hash` | `token_hash` UNIQUE |
| `import_log` | `idx_import_log_id_desc` | `id DESC` |
| `import_log` | `idx_import_log_source_id_desc` | `source_id, id DESC` |
| `import_runs` | `idx_import_runs_source_status` | `source_id, status` |
| `history_archive` | `idx_history_archive_jail_timeofban` | `jail, timeofban DESC` |
| `history_archive` | `idx_history_archive_timeofban_jail_action` | `timeofban DESC, jail, action` |
| `history_archive` | `idx_history_archive_ip` | `ip` |
| `history_archive` | `idx_history_archive_action` | `action` |
| `jails` | `jails_name` | `name` |
| `logs` | `logs_path` | `path` |
| `logs` | `logs_jail_path` | `jail, path` |
| `bans` | `bans_jail_timeofban_ip` | `jail, timeofban` |
| `bans` | `bans_jail_ip` | `jail, ip` |
| `bans` | `bans_ip` | `ip` |
| `bips` | `bips_timeofban` | `timeofban` |
| `bips` | `bips_ip` | `ip` |

---

## 5. Migration History

| Version | Description |
|---|---|
| 1 | Initial schema: `settings`, `sessions`, `blocklist_sources`, `import_log`, `geo_cache`, `history_archive`, `schema_migrations` |
| 2 | Hash session tokens (`token_hash` column). Invalidates all existing sessions. |
| 3 | Add `last_seen` to `geo_cache` for retention policy. |
| 4 | Add `scheduler_lock` table for multi-worker scheduler mutex. |
| 5 | Add indexes to `history_archive` for query performance (4 indexes). |
| 6 | Add `import_runs` table for idempotent import tracking. |
| 7 | Add indexes to `import_log` for cursor-based pagination. |
| 8 | Migrate `import_log.timestamp` from TEXT ISO 8601 → INTEGER UNIX epoch. |
| 9 | Change `import_log.source_id` FK to `ON DELETE RESTRICT` (a source with log rows can no longer be deleted, so logs are never orphaned). Recreates the table with the new FK semantics. |

**Current schema version:** 9 (`_CURRENT_SCHEMA_VERSION` in `db.py`).

---

## 6. Performance Notes

- **WAL mode** (`PRAGMA journal_mode=WAL`): readers and the single writer do not block each other, which improves throughput under concurrent access.
- **Foreign keys enforced** (`PRAGMA foreign_keys=ON`): data integrity at DB level. SQLite does not enforce foreign keys by default, so the pragma is set per connection.
- **Busy timeout** 5000 ms: a connection waits up to 5 s for a lock before failing with "database is locked", which absorbs short bursts of contention.
- **`history_archive` indexes**: tuned for dashboard filter + time ordering + pagination. See migration 5 and `Docs/PERFORMANCE.md`.
- **`import_log` indexes**: tuned for cursor-based pagination (newest-first by id). See migration 7.
- **`geo_cache` PK on `ip`**: indexed B-tree lookup (O(log n)) for geo enrichment on ban events.
- **`scheduler_lock` singleton** (`CHECK (id = 1)`): trivial lock existence check.

For detailed query patterns and benchmarks, see `Docs/PERFORMANCE.md`.
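
The connection settings listed above could be applied as in the following sketch. It is an illustration under stated assumptions rather than the code in `db.py`: `open_connection` is a hypothetical helper, the database lives in a temporary file, and the two tables are cut down to the columns needed to show `ON DELETE RESTRICT` taking effect once foreign keys are enabled.

```python
import os
import sqlite3
import tempfile


def open_connection(path: str) -> sqlite3.Connection:
    """Open a connection with the pragmas from the performance notes."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")    # persisted in the database file
    conn.execute("PRAGMA foreign_keys=ON")     # per connection, off by default
    conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5 s for a lock
    return conn


# WAL needs a file-backed database; an in-memory one stays in "memory" mode.
db_path = os.path.join(tempfile.mkdtemp(), "bangui_demo.db")
conn = open_connection(db_path)

journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]

# Simplified, assumed versions of two tables from section 1.
conn.executescript(
    """
    CREATE TABLE blocklist_sources (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        url TEXT NOT NULL UNIQUE
    );
    CREATE TABLE import_log (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        source_id INTEGER REFERENCES blocklist_sources(id) ON DELETE RESTRICT
    );
    INSERT INTO blocklist_sources (url) VALUES ('https://example.com/list.txt');
    INSERT INTO import_log (source_id) VALUES (1);
    """
)

# With foreign keys enforced, deleting a source that still has log rows fails.
try:
    conn.execute("DELETE FROM blocklist_sources WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
```

`journal_mode=WAL` is stored in the database file, while `foreign_keys` and `busy_timeout` apply only to the connection that sets them, so a helper like this would need to run for every new connection.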