BanGUI

Author	SHA1	Message	Date
Lukas	0a3f9c6c16	refactor(backend): external logging metrics, required mode, health checks - Add external_logging_init_failures counter - Add external_log_required flag, raise if init fails and required - Health endpoint: add external_logging status check - Blocklist service: enrich with metadata fields, update import logic - Health check task: add runtime_state dependency, fix return typing - Metrics: add Histogram for request latencies - Frontend: align BlocklistImportLogSection props - Docs: update deployment guide, remove stale tasks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-04 03:45:13 +02:00
Lukas	eb339efcfd	Add Kubernetes liveness/readiness probes and middleware order validation - Split /health into /health/live (liveness) and /health/ready (readiness) following Kubernetes conventions. Combined /health retained for backward compatibility with existing Docker HEALTHCHECK definitions. - Add ReadyCheck and ReadyResponse models for structured readiness output. - Add _assert_middleware_order() startup check enforcing: RateLimit → Csrf → CorrelationId middleware chain. - Register CorrelationIdMiddleware, CsrfMiddleware, RateLimitMiddleware in create_app() with documented required order (reverse of processing). - Add correlation.py, csrf.py, rate_limit.py middleware modules. - Add health probe tests in test_health_probes.py. - Update test_main.py with middleware order assertion tests. - Update frontend useFetchData hook tests. - Docs: update Deployment.md with Kubernetes probe config examples.	2026-05-04 02:42:09 +02:00
Lukas	1c3dff31e8	feat(rate-limiting): add per-bucket limits and startup validation - Add per-bucket rate limit config (ban, unban, import, config, jail, filter, action) - Add process-local warning at startup for multi-worker deployments - Document Redis migration path for shared state across workers - Remove Issue #42 from Tasks.md (resolved)	2026-05-03 20:53:21 +02:00
Lukas	ae9313568e	feat: enforce single-worker at startup Fail with RuntimeError when WEB_CONCURRENCY or BANGUI_WORKERS > 1. In-memory session cache, rate-limit windows, and runtime state are process-local. Multi-worker silently causes stale limits, ghost sessions, inconsistent status. Skipped when TESTING=1. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-03 20:33:23 +02:00
Lukas	4d09d2538d	docs: Add security best practices to Deployment.md - Secrets management via environment variables - Container security hardening (non-root user, filesystem permissions, capabilities) - Network security and TLS termination guidance - Prune obsolete task tracking from Tasks.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-03 19:48:52 +02:00
Lukas	5058a50143	Refactor backend: fix geo cache cleanup, scheduler heartbeat, correlation middleware; update docs	2026-05-03 16:02:40 +02:00
Lukas	b631c1c546	feat(backend): implement graceful shutdown for container stop Graceful shutdown ensures in-flight operations complete before process exits: - Lifespan shutdown handler drains pending tasks with 25s timeout - Scheduler stops accepting new jobs immediately - HTTP session, external logging, scheduler lock, DB conn closed cleanly - 25s Python timeout leaves 5s margin before Docker's 30s SIGKILL Files changed: - backend/app/main.py: enhanced _lifespan shutdown with task drain - Docker/Dockerfile.backend: documented signal handling in header - Docker/docker-compose.yml: added stop_grace_period: 30s - Docker/compose.prod.yml: added stop_grace_period: 30s - Docs/Deployment.md: new Graceful Shutdown section with sequence table - Docs/TROUBLESHOOTING.md: new Graceful Shutdown Issues section Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-02 22:47:10 +02:00
Lukas	0d5882b32f	Fix HIGH priority issues: unbounded queries, rate limiting, health checks Issue #3 - Unbounded Query Results (OOM): - get_all_archived_history() now uses keyset pagination with bounded max_rows (50k default) - Added 'id' field to records from get_archived_history() and get_archived_history_keyset() - Protocol signature updated with page_size, max_rows, last_ban_id params Issue #7 - Docker Health Check Fails: - Added curl to Dockerfile.backend runtime image - HEALTHCHECK now uses 'curl -f http://localhost:8000/api/health' - compose.prod.yml: increased start_period to 40s, timeout to 10s - Frontend healthcheck proxies to backend /api/health Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-01 21:47:36 +02:00
Lukas	445c2c5418	Update configuration and documentation - Update .env.example with latest environment variables - Update deployment and task documentation - Update backend configuration settings Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-05-01 18:10:03 +02:00
Lukas	05c3b564ae	Refactor scheduler lock implementation with heartbeat mechanism - Add heartbeat-based lock renewal in scheduler_lock_heartbeat.py - Update scheduler_lock.py with improved lock management - Add comprehensive tests for scheduler lock functionality - Update deployment and task documentation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-30 22:10:38 +02:00
Lukas	94d6352d1d	Fix health check endpoint to return 503 when fail2ban is offline The health check endpoint now properly indicates service unavailability: - Returns HTTP 200 when fail2ban is online - Returns HTTP 503 when fail2ban is offline This allows Docker and other orchestration tools to correctly detect when fail2ban is unreachable and automatically restart the backend container, preventing the situation where Docker treats the container as healthy despite fail2ban being down. Changes: - Update GET /api/health to return 503 on fail2ban offline - Return appropriate JSON response bodies for each state - Update tests to verify both online (200) and offline (503) scenarios - Update Dockerfile HEALTHCHECK documentation - Add Health Checks section to Deployment.md documentation All tests pass with 100% coverage on health.py. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-30 21:56:42 +02:00
Lukas	90f4c6239c	Add resource limits to all Docker containers - fail2ban: 0.5 CPU / 128M memory limit, 0.1 CPU / 64M reserved - backend: 2.0 CPU / 512M memory limit, 1.0 CPU / 256M reserved - frontend: 0.5 CPU / 128M memory limit, 0.25 CPU / 64M reserved Prevents 'noisy neighbor' scenarios where one container exhausts host resources (CPU, memory, disk). Limits are hard caps; reservations guarantee minimum allocation to prevent OOM kills and ensure responsive service even under load. Fixes resource contention issue in production and staging environments. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-30 21:03:56 +02:00

12 Commits
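
Note on commit ae9313568e: the single-worker startup check it describes could look roughly like the sketch below. Only the environment variable names (WEB_CONCURRENCY, BANGUI_WORKERS, TESTING), the RuntimeError, and the "> 1" condition come from the commit message. The function name assert_single_worker, the error text, and the choice to ignore non-numeric values are assumptions made for illustration, not BanGUI's actual code.

```python
import os
from typing import Mapping

# Environment variables named in commit ae9313568e.
_WORKER_VARS = ("WEB_CONCURRENCY", "BANGUI_WORKERS")


def assert_single_worker(env: Mapping[str, str] = os.environ) -> None:
    """Raise RuntimeError if the environment requests more than one worker.

    Hypothetical helper: the name, the message wording, and the handling of
    non-numeric values are assumptions, not taken from the repository.
    """
    # The commit message says the check is skipped when TESTING=1.
    if env.get("TESTING") == "1":
        return

    for name in _WORKER_VARS:
        raw = env.get(name, "").strip()
        if not raw:
            continue  # Variable unset or empty: nothing to validate.
        try:
            workers = int(raw)
        except ValueError:
            # Assumption: skip unparseable values instead of guessing intent.
            continue
        if workers > 1:
            raise RuntimeError(
                f"{name}={workers}: session cache, rate-limit windows and "
                "runtime state are process-local; run a single worker."
            )
```

Passing the environment as a mapping keeps the check testable without touching the real process environment; at application startup it would presumably be called once with no arguments before any in-memory state is created.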