Commit Graph

348 Commits

Author SHA1 Message Date
408eb900eb Remove Tasks.md spec, add test for _cleanup_wal_files skipping recent files
Remove 335-line task specification from Docs/Tasks.md.
Add test confirming _cleanup_wal_files skips recently-modified WAL/SHM files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 23:04:04 +02:00
407ca83850 Add tests for since timestamp accuracy in ban_service
- test_since_unix_returns_utc_epoch: validates since_unix('24h') returns UTC epoch
- test_ban_trend_since_is_within_expected_range: validates 23h-ago ban falls in 24h+slack window

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 23:00:51 +02:00
72273ca945 Add logging duplication tests
- test_logging_configuration_no_duplicate_handlers: verify create_app() twice leaves ≤1 StreamHandler
- test_uvicorn_access_logs_go_through_root_handler: verify uvicorn.access can emit JSON via JSONFormatter
- test_external_logging_processor_queues_record: verify _external_logging_processor queues to handler
- test_plain_text_logs_not_emitted_after_startup: verify app.db emits JSON not plain text

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 22:42:52 +02:00
9e59fc8bae Add granular DB error types with retry logic
New exceptions: DatabaseBusyError, DatabasePermissionDeniedError,
DatabasePathInvalidError, DatabaseCorruptedError, DatabaseUnavailableError.

open_db creates parent directory if missing. Catches all aiosqlite errors
and maps to specific exception types.

get_db retries up to 3x on locked database with backoff.
Propagates specific exceptions instead of generic HTTPException.

Tests for all new error types and retry behavior.
2026-05-23 22:21:42 +02:00
ef8feba4b2 docs: add comprehensive task backlog and bump version to rc.5
- Document database error handling, logging duplication, ban service
timestamp, and orphaned SQLite file issues in Tasks.md
- Bump backend version from 0.9.19-rc.4 to 0.9.19-rc.5
2026-05-23 22:09:06 +02:00
aebe0d0236 chore(release): bump version to 0.9.19-rc.4
- Add production Docker Compose configuration

- Add check_auth.py diagnostic script for session 401 debugging
2026-05-23 21:27:52 +02:00
9fe52755a5 fix(db): fix migration failures when upgrading from 0.8.0 schema
Migration 1: remove idx_sessions_token_hash from _SCHEMA_STATEMENTS.
The legacy schema has sessions.token (not token_hash). The IF NOT EXISTS
guard only prevents duplicate index names — it still requires the column
to exist. Migration 2 drops and rebuilds sessions with token_hash anyway,
so creating the index in migration 1 was redundant.

Migration 3: replace ALTER TABLE ADD COLUMN with a table rebuild.
SQLite rejects ALTER TABLE ADD COLUMN NOT NULL DEFAULT <expression> when
the table already contains rows. The old DB has ~181k geo_cache rows, so
the ALTER always failed. Rebuild copies existing rows with last_seen set
to cached_at as a reasonable approximation of last-seen time.
2026-05-22 21:47:32 +02:00
12fe70d768 chore: bump to v0.9.19-rc.1 and add local OpenAPI build support
- Add release candidate (rc) support to release.sh with latestRC tagging
- Bump VERSION, backend pyproject.toml, and frontend package.json to 0.9.19-rc.1
- Add local frontend/openapi.json so build no longer needs running backend
- Update generate:types and validate-types.sh to use local openapi.json
- Fix frontend tests: remove unused imports/variables and update mock data
2026-05-22 20:36:14 +02:00
83b2cb67b1 backup
Some checks failed
CI / Backend Tests (pull_request) Has been cancelled
CI / Lint (pull_request) Has been cancelled
CI / Type Check (pull_request) Has been cancelled
CI / Import Boundary (pull_request) Has been cancelled
CI / OpenAPI Breaking Changes (pull_request) Has been cancelled
CI / OpenAPI Baseline Commit (pull_request) Has been cancelled
2026-05-20 20:18:58 +02:00
7308ff88d6 fix(rate-limit): stop double-counting requests in middleware
Multiple RateLimitMiddleware instances were each calling
check_allowed() on every request, halving the effective global
limit (200 req/min became ~100). Added path_prefixes and skip_paths
so each instance only checks the paths it owns.

- Auth middleware scoped to /api/v1/auth/login and /api/v1/setup
- History middleware scoped to /api/v1/history
- Global middleware skips auth and history paths
- Updated tests to match single-count behavior
2026-05-15 23:04:02 +02:00
77df5d5d65 fixed tests 2026-05-15 20:41:05 +02:00
96ce516ecf fix(logging): resolve logging_compat keyword arg conflicts
- Fix logging_compat._log() to handle extra keyword arguments properly
- Update config.py, main.py, and test_bans.py for compatibility
- Update Tasks.md and runner.csx
2026-05-10 15:54:00 +02:00
7ec80fdeec refactor(logging): replace structlog with stdlib logging compat layer
- Remove structlog dependency from backend/pyproject.toml
- Add app.utils.logging_compat shim for keyword-arg logging API
- Add app.utils.json_formatter for JSON log output with extra fields
- Update all backend modules to use logging_compat.get_logger()
- Update docstrings in log_sanitizer.py and json_formatter.py
- Update test comment in test_async_utils.py
- Record 406 failing tests in Docs/Tasks.md for tracking
2026-05-10 13:37:54 +02:00
7790736918 feat(jail-config): add banaction and banaction_allports to blocklist config
Adds iptables-multiport and iptables-allports ban actions to the blocklist-import jail configuration and updates the corresponding test assertions.
2026-05-10 09:35:33 +02:00
79df1aa493 backup 2026-05-10 08:48:42 +02:00
e4c3ae718c fix(backend): relax SSRF validation for loopback in dev, graceful metrics/regexploit fallback
- ip_utils: allow loopback (127.0.0.1) in dev mode (BANGUI_LOG_LEVEL=debug)
  so e2e tests can reach a mock HTTP server on the host
- metrics: make all operations no-ops when prometheus_client not installed
- regex_validator: graceful fallback when regexploit not installed
- geo_cache: use attribute access instead of dict subscript for typed rows
- rate_limit: support bucket_override parameter for per-endpoint rate limits
- ban_service: construct DomainActiveBan explicitly instead of model_copy

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-08 08:07:13 +02:00
481f32bb85 backup 2026-05-05 18:47:56 +02:00
d25b56e7e1 backup 2026-05-04 13:13:01 +02:00
744275d17f backup 2026-05-04 07:20:20 +02:00
58173bd6a9 backup 2026-05-04 07:20:16 +02:00
69e1726045 Refactor data fetching hooks, add page size lint test
- Simplify useFetchData: remove unused URL building logic
- Add usePolledData initial implementation
- Add router page_size param validation test
- Update API reference docs
- Clean up tasks doc
2026-05-04 06:48:24 +02:00
0a3f9c6c16 refactor(backend): external logging metrics, required mode, health checks
- Add external_logging_init_failures counter
- Add external_log_required flag, raise if init fails and required
- Health endpoint: add external_logging status check
- Blocklist service: enrich with metadata fields, update import logic
- Health check task: add runtime_state dependency, fix return typing
- Metrics: add Histogram for request latencies
- Frontend: align BlocklistImportLogSection props
- Docs: update deployment guide, remove stale tasks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-04 03:45:13 +02:00
eb339efcfd Add Kubernetes liveness/readiness probes and middleware order validation
- Split /health into /health/live (liveness) and /health/ready (readiness)
  following Kubernetes conventions. Combined /health retained for backward
  compatibility with existing Docker HEALTHCHECK definitions.
- Add ReadyCheck and ReadyResponse models for structured readiness output.
- Add _assert_middleware_order() startup check enforcing:
  RateLimit → Csrf → CorrelationId middleware chain.
- Register CorrelationIdMiddleware, CsrfMiddleware, RateLimitMiddleware
  in create_app() with documented required order (reverse of processing).
- Add correlation.py, csrf.py, rate_limit.py middleware modules.
- Add health probe tests in test_health_probes.py.
- Update test_main.py with middleware order assertion tests.
- Update frontend useFetchData hook tests.
- Docs: update Deployment.md with Kubernetes probe config examples.
2026-05-04 02:42:09 +02:00
65fe747cba feat(backend): add deprecation middleware and API versioning support
- Add deprecation middleware for warning headers on sunset endpoints
- Add jails_v2 router for API v2 migration path
- Update CI workflow with new test coverage
- Update API versioning documentation
- Remove completed tasks from Tasks.md
2026-05-04 00:03:52 +02:00
fc57c83f79 refactor: split pagination logic from response models
- Extract pagination logic to separate util module
- Update response models to use new pagination util
- Fix pagination calculation edge cases

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 22:57:21 +02:00
edebf1a339 feat(services): add ErrorContract enum and PartialResult type
Add typed wrappers for error handling patterns in error_handling.py:

- ErrorContract(enum): machine-checkable pattern selector with
  from_value() helper and string constants matching the existing
  ABORT_ON_ERROR/RETURN_DEFAULT/PARTIAL_RESULT module-level values
- ErrorEntry: typed error container for PARTIAL_RESULT (context + cause)
- PartialResult[T]: typed result wrapper for PARTIAL_RESULT operations

Existing string constants preserved for backward compat.
Updated module docstring with type annotation table and examples.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 22:46:47 +02:00
52a70c3eea Add import-linter boundary to forbid routers importing app.dependencies
Issue #51: Enforce repository boundary at CI level using import-linter.
Contract 'forbid_router_db_import' checks that app.routers never imports
app.dependencies directly, keeping the DB access path through service
contexts only.

- Add import-linter>=2.0.0 to dev dependencies (backend/pyproject.toml)
- Configure [tool.importlinter] with package_root and root_packages
- Add [[tool.importlinter.contracts]] with type='forbidden', source
  app.routers, forbidden app.dependencies
- Add 'Import Boundary' CI job (import-linter)
- Document import-linter in CONTRIBUTING.md code quality table

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 22:42:46 +02:00
dafe8d61e2 feat(security): add CSRF header constants and security-headers endpoint
Move X-BanGUI-Request header name/value to backend/app/utils/constants.py as single source of truth. Add GET /api/v1/config/security-headers endpoint. Update csrf middleware, frontend api client, and docs to use shared constants.
2026-05-03 22:06:43 +02:00
cee3daffc1 fix: enforce PRAGMA query_only on fail2ban DB and refactor CSRF cookie name
- Add _acquire_readonly_connection() that applies PRAGMA query_only=ON after connect
- Verify PRAGMA value back to catch URI flag bypasses
- Wrap in async context manager _readonly_connection() used by all repo methods
- Replace hardcoded '_SESSION_COOKIE_NAME' in CSRF middleware with import from
  app.utils.constants
- Remove completed Issues #45 and #46 from Docs/Tasks.md (Issue #46 now fixed,
  #45 cache invalidation deferred to auth refactor branch)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 21:47:42 +02:00
1c3dff31e8 feat(rate-limiting): add per-bucket limits and startup validation
- Add per-bucket rate limit config (ban, unban, import, config, jail, filter, action)
- Add process-local warning at startup for multi-worker deployments
- Document Redis migration path for shared state across workers
- Remove Issue #42 from Tasks.md (resolved)
2026-05-03 20:53:21 +02:00
c3cd1574dc fix(auth): invalidate session cache on login
Stale sessions from a stolen device could be reused up to the cache
TTL after a legitimate user re-logs in, because login never cleared
the existing cache entry.

Changes:
- Add invalidate_by_user(user_id) to SessionCache protocol
- InMemorySessionCache maintains a user_id -> set[token] index to
  support O(1) invalidation of all sessions for a given user
- NoOpSessionCache stub updated for API compatibility
- auth_service.login() now returns the Session object alongside
  signed_token and expires_at
- login router calls session_cache.invalidate_by_user(session.id)
  immediately after successful authentication

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 20:51:51 +02:00
ae9313568e feat: enforce single-worker at startup
Fail with RuntimeError when WEB_CONCURRENCY or BANGUI_WORKERS > 1.

In-memory session cache, rate-limit windows, and runtime state are
process-local. Multi-worker silently causes stale limits, ghost sessions,
inconsistent status.

Skipped when TESTING=1.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 20:33:23 +02:00
c96b87ee8b feat: reject common passwords in SetupRequest
- Add ~75 common plaintext passwords to setup.py validator
- Check case-insensitively; passes complexity but blocked
- Add tests: reject common, accept unique, short common fail on length
- Update Security.md docs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 18:25:17 +02:00
96525573fa Normalise IP addresses across backend
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 18:19:41 +02:00
5f0ab40816 refactor(backend): clean up models setup, improve ip utils, add adr docs
- Extract ADR documents for architectural decisions (SQLite, FastAPI, React, APScheduler, Scheduler)
- Refactor setup.py: improve code structure and readability
- Add IP validation utilities with test coverage
- Update frontend components (BanTable, HistoryPage)
- Add pre-commit hooks and CONTRIBUTING.md
- Add .editorconfig for consistent coding standards
2026-05-03 18:04:45 +02:00
2f9fc8076d refactor(backend): clean up jail service, add error handling service
- Extract jail status/processing to helper functions
- Add error_handling.py service for centralized error handling
- Update config.py with validation and defaults
- Update .env.example with all config options
- Remove obsolete Tasks.md, add Service-Development.md
- Minor fixes across routers and services

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 17:40:37 +02:00
2df029f7e8 refactor(ban_service): extract _bans_by_country_load_data helper
Break up long function into focused helper. Load data logic separate from aggregation.
2026-05-03 17:00:34 +02:00
5058a50143 Refactor backend: fix geo cache cleanup, scheduler heartbeat, correlation middleware; update docs 2026-05-03 16:02:40 +02:00
896751ada9 fix: handle socket close errors properly in PapertrailLogHandler
- Replace contextlib.suppress with try/except + warning log
- Add test for fail2ban client
- Remove stale Issue #21 from Tasks.md (indexes)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 12:25:14 +02:00
Copilot
22db607875 Add fail2ban DB index management and socket-based path resolution
- New get_fail2ban_db_path() in setup_service resolves DB path from configured socket path
- New ensure_fail2ban_indexes() creates missing performance indexes on bans table
- Call ensure_fail2ban_indexes on every startup before first ban query
- Remove completed tasks from Docs/Tasks.md
- Update Docs/PERFORMANCE.md with index findings
2026-05-03 12:17:31 +02:00
0133489920 Update observability docs and task utilities
- Add Observability.md documentation
- Standardize task logging with correlation_id support
- Add log_sanitizer utility for PII masking
- Update Tasks.md tracking
- Update geo_cache tasks and other task modules with correlation_id

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 11:52:09 +02:00
7b93499551 Refactor config loading and add status code docs
- Move config loading to dedicated ConfigLoader class with validation
- Add DATABASE_MIGRATIONS.md content to TROUBLESHOOTING.md
- Add API_STATUS_CODES.md documenting all API response codes
- Update runner.csx to use new config structure
- Add check_responses.py validation script
- Update config tests for new structure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 11:52:01 +02:00
8f26776bb3 docs: add OpenAPI responses={} to all router endpoints
Add explicit HTTP status code documentation to every endpoint
across 15 router files. Each endpoint now declares all possible
response codes (200/201/204/400/401/404/409/429/502/503) with
descriptions so frontend can distinguish error types.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 01:12:08 +02:00
7ad885d276 refactor: separate config service from jail config service
- Split config_service.py into config_service.py and jail_config_service.py
- Update Docs/Tasks.md, Security.md, TROUBLESHOOTING.md
2026-05-03 01:05:18 +02:00
881cfbdd71 fix: replace broad except Exception with specific exception types
- jail_service: catch ValueError (fail2ban protocol error) instead of Exception
- health.py: catch AttributeError (not OSError/TypeError) for defensive checks
- ban_service: re-raise programming errors in geo lookup handlers
- server_service: catch Fail2BanConnectionError, Fail2BanProtocolError, ValueError
- config_writer: catch OSError instead of Exception

Programming errors now bubble to global handler instead of being silently caught.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 00:54:44 +02:00
bd6170722a feat(geo): add cache hit/miss metrics and prewarm support
- Add _hits/_misses counters to GeoCache for cache hit/miss ratio tracking
- Reset counters on clear()
- Count hits before misses in lookup_batch() to avoid interleaving
- Add synchronous prewarm() using asyncio.create_task for fire-and-forget
- Add hits/misses fields to GeoCacheStatsResponse model
- Add TestCacheMetrics (5 tests), TestPrewarm (3 tests), TestLargeBanList (2 tests)
- Fix _make_async_db() mock: db.execute is not async, returns ctx manager
- Move collections.abc to TYPE_CHECKING block (TC003)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 00:35:47 +02:00
0817a4cb47 fix(regex_validator): add ReDoS detection via regexploit
Detect catastrophic backtracking patterns before regex compilation
using regexploit library. Add ReDoSDetectedError exception and
_MINIMUM_STARRINESS threshold (>=3) to catch dangerous patterns
like (a+)+b. Update pyproject.toml deps, add tests for detection.
2026-05-03 00:05:33 +02:00
e436727942 fix: atomic upsert for import runs (Issue #12)
Replace check-then-insert race condition with INSERT ON CONFLICT.
- upsert_pending uses RETURNING id for atomic upsert
- UNIQUE(source_id, content_hash) constraint from migration 6
- blocklist_import_workflow updated to use upsert_pending
- test_import_source_success fixed for async mock patterns

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-02 23:39:43 +02:00
1285bc8571 feat: comprehensive health check with DB, scheduler, cache
- Add /api/v1/health endpoint with component-level checks
- Verify DB connectivity, fail2ban socket, scheduler, session cache
- Add SQLite WAL cleanup on startup (orphan crash files)
- Migration 8: import_log.timestamp → INTEGER UNIX epoch
- Align import_log timestamps with history_archive (already UNIX int)
- Add unit tests for DB cleanup and health router

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-02 23:03:57 +02:00
b631c1c546 feat(backend): implement graceful shutdown for container stop
Graceful shutdown ensures in-flight operations complete before process exits:
- Lifespan shutdown handler drains pending tasks with 25s timeout
- Scheduler stops accepting new jobs immediately
- HTTP session, external logging, scheduler lock, DB conn closed cleanly
- 25s Python timeout leaves 5s margin before Docker's 30s SIGKILL

Files changed:
- backend/app/main.py: enhanced _lifespan shutdown with task drain
- Docker/Dockerfile.backend: documented signal handling in header
- Docker/docker-compose.yml: added stop_grace_period: 30s
- Docker/compose.prod.yml: added stop_grace_period: 30s
- Docs/Deployment.md: new Graceful Shutdown section with sequence table
- Docs/TROUBLESHOOTING.md: new Graceful Shutdown Issues section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-02 22:47:10 +02:00