Commit Graph

26 Commits

Author SHA1 Message Date
0a3f9c6c16 refactor(backend): external logging metrics, required mode, health checks
- Add external_logging_init_failures counter
- Add external_log_required flag, raise if init fails and required
- Health endpoint: add external_logging status check
- Blocklist service: enrich with metadata fields, update import logic
- Health check task: add runtime_state dependency, fix return typing
- Metrics: add Histogram for request latencies
- Frontend: align BlocklistImportLogSection props
- Docs: update deployment guide, remove stale tasks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-04 03:45:13 +02:00
2f9fc8076d refactor(backend): clean up jail service, add error handling service
- Extract jail status/processing to helper functions
- Add error_handling.py service for centralized error handling
- Update config.py with validation and defaults
- Update .env.example with all config options
- Remove obsolete Tasks.md, add Service-Development.md
- Minor fixes across routers and services

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 17:40:37 +02:00
7b93499551 Refactor config loading and add status code docs
- Move config loading to dedicated ConfigLoader class with validation
- Add DATABASE_MIGRATIONS.md content to TROUBLESHOOTING.md
- Add API_STATUS_CODES.md documenting all API response codes
- Update runner.csx to use new config structure
- Add check_responses.py validation script
- Update config tests for new structure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 11:52:01 +02:00
cc6dbcf3f0 feat: implement API versioning /api/v1/
- All backend routers moved to /api/v1/ prefix
- Frontend BASE_URL updated to /api/v1
- Setup redirect middleware updated to redirect to /api/v1/setup
- Health router path fixed: prefix=/api/v1/health, @router.get('')
- conftest.py: set server_status=online for test fixture
- Created Docs/API_VERSIONING.md with deprecation policy
- Updated Docs/Backend-Development.md with versioning section
- Updated Instructions.md curl examples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-02 21:29:30 +02:00
37078b742b Implement structured logging to centralized platforms (Datadog, Papertrail, ELK)
This commit adds support for shipping logs to external centralized logging platforms, addressing the MEDIUM priority task for structured logging infrastructure.

## Key Changes:

### 1. New Documentation: Docs/Observability.md
- Comprehensive guide to logging architecture and configuration
- Covers all three supported platforms (Datadog, Papertrail, Elasticsearch)
- Includes best practices, security considerations, and troubleshooting
- Documents sensitive data handling and compliance requirements

### 2. Core Implementation: app/utils/external_logging.py
- ExternalLogHandler: Abstract base class for non-blocking log delivery
- DatadogLogHandler: HTTP API integration with JSON payloads
- PapertrailLogHandler: Syslog protocol over TCP
- ElasticsearchLogHandler: Bulk API integration with NDJSON format
- Features:
  - Async buffering with configurable batch size and flush interval
  - Exponential backoff retry logic
  - Non-blocking delivery (never blocks application logic)
  - Proper error handling and internal logging
  - Lifecycle management (start/shutdown)

### 3. Configuration: app/config.py
- New Settings fields for external logging:
  - external_logging_enabled (default: False)
  - external_logging_provider (datadog/papertrail/elasticsearch)
  - external_logging_buffer_size (default: 1000)
  - external_logging_flush_interval_seconds (default: 5.0)
  - Provider-specific configuration (API keys, hosts, batch sizes)
- All fields have sensible defaults
- Full field validation and normalization

### 4. Integration: app/main.py
- Global _external_log_handler for application lifecycle
- _external_logging_processor: structlog processor for handler integration
- Updated _configure_logging(): Add handler to processor chain when enabled
- Updated _lifespan(): Initialize handler before startup, shutdown on termination

### 5. Tests: backend/tests/test_external_logging.py
- 20 comprehensive tests covering all handlers and factory
- Configuration validation tests
- All tests passing

## Design Decisions:

1. **Non-blocking Delivery**: External logging never blocks request handling.
   Failures are logged locally but don't impact application.

2. **Buffering Strategy**: In-memory buffer with configurable size prevents
   unbounded memory growth. When buffer fills, oldest logs are dropped with
   a warning.

3. **Retry Logic**: Transient failures (timeouts, 5xx errors) are retried
   with exponential backoff. Permanent failures (bad credentials) are logged
   and skipped.

4. **Disabled by Default**: External logging is opt-in via environment
   variables, maintaining backward compatibility with existing deployments.

5. **Provider Flexibility**: Support for multiple platforms allows users to
   choose based on their infrastructure (cloud-native, on-premise, etc).

## Backward Compatibility:

- All new configuration fields have defaults
- External logging disabled by default
- No changes to existing logging behavior unless explicitly configured
- No new required dependencies

## Testing:

- All 20 new tests passing
- Existing tests unaffected (same count of passing tests)
- Configuration validation tested
- Handler creation and lifecycle management tested

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-01 18:25:26 +02:00
445c2c5418 Update configuration and documentation
- Update .env.example with latest environment variables
- Update deployment and task documentation
- Update backend configuration settings

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-01 18:10:03 +02:00
8138857ee1 feat: Implement session secret rotation support
Adds support for gradual session secret rotation without forcing logout:

- Add BANGUI_SESSION_SECRET_PREVIOUS config field for rotation window
- Implement unwrap_session_token_with_rotation() to accept tokens signed with
  either current or previous secret
- Update validate_session() to transparently accept old tokens during rotation
- Update logout() to accept tokens from both secrets
- Add comprehensive logging for rotation events and metrics
- Add 8 new tests covering all rotation scenarios
- Update documentation with step-by-step rotation strategy
- Update .env.example with previous secret field

Key features:
- No forced logout: old tokens continue working during rotation window
- Transparent validation: old tokens are automatically logged for monitoring
- Production-safe: can rotate secrets without service interruption
- Metrics-ready: logs track token rotation for observability

Rotation workflow:
1. Generate new secret and set BANGUI_SESSION_SECRET
2. Set BANGUI_SESSION_SECRET_PREVIOUS to old secret
3. Wait for old tokens to expire (≥ session_duration_minutes)
4. Unset BANGUI_SESSION_SECRET_PREVIOUS to complete rotation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-01 18:01:11 +02:00
6bc440dce4 Refactor backend configuration and authentication
- Add comprehensive documentation for backend development
- Improve client IP detection with utility functions and tests
- Update auth router with better error handling
- Refactor config module with environment-based settings

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-29 19:39:55 +02:00
1d91e24a88 TASK-030: Secure IP geolocation with MMDB-primary resolver
Make MaxMind GeoLite2-Country MMDB the primary IP resolver (local, encrypted)
and demote ip-api.com to optional fallback only (disabled by default).

Changes:
- Add geoip_allow_http_fallback config flag (default False) to Settings
- Refactor GeoCache.lookup() and lookup_batch() to try MMDB first
- Update startup.py to pass config flag and log security warning when HTTP enabled
- Update all 49 tests to reflect new MMDB-primary strategy
- Add comprehensive geoip configuration section to Backend-Development.md
- Update Architekture.md to show MMDB + optional HTTP in system dependencies
- Update .env.example with BANGUI_GEOIP_DB_PATH and HTTP fallback flag

Security impact:
- 99% of IP addresses (successful MMDB lookups) now stay local, encrypted
- HTTP-only IPs are cached for 5 minutes to minimize external calls
- Operators must explicitly enable HTTP fallback (security-conscious default)
- GDPR/CCPA compliance: no PII sent over unencrypted networks by default

Fixes TASK-030: Resolved plaintext IP transmission to ip-api.com

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 15:31:39 +02:00
df841c21e4 TASK-026: Disable API docs in production, protect with BANGUI_ENABLE_DOCS setting
Addresses security concern where FastAPI's default behavior exposes interactive
API documentation (/docs, /redoc) without authentication, allowing attackers to
enumerate endpoints and understand API schemas.

Changes:
- Add BANGUI_ENABLE_DOCS boolean setting (default: false) to Settings
- Modify create_app() to conditionally set docs_url, redoc_url, openapi_url
- Add docs endpoints to SetupRedirectMiddleware allowlist (/api/docs, /api/redoc, /api/openapi.json)
- Set BANGUI_ENABLE_DOCS=true in Docker/compose.debug.yml for development
- Production compose files leave it unset (defaults to false, docs disabled)
- Add comprehensive tests for docs configuration
- Document the new setting in Backend-Development.md

Security Impact:
- API documentation is now disabled by default in production
- Development environments can enable docs by setting BANGUI_ENABLE_DOCS=true
- Docs endpoints are inaccessible in production without manual configuration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 15:09:51 +02:00
d9022b9d06 Refactor config and add comprehensive tests
- Updated config.py to support environment-based configuration
- Added test_config.py with full test coverage
- Updated Backend-Development.md with configuration documentation
- Removed outdated tasks from Tasks.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 14:14:35 +02:00
308cf680a7 TASK-014: Add log path validation to prevent arbitrary file access
Restrict monitored log paths to a configurable allowlist of safe directories
to prevent authenticated users from instructing fail2ban to monitor arbitrary
files on the system, which could leak contents via fail2ban logging.

Changes:
- Add 'allowed_log_dirs' setting to Settings (defaults to /var/log, /config/log)
- Add @field_validator to AddLogPathRequest to validate log paths at request time
- Validator resolves paths to canonical form and checks against allowed prefixes
- Use Path.is_relative_to() to prevent prefix bypass attacks like /var/log_evil
- Add comprehensive tests for valid/invalid paths and symlink handling
- Update Features.md and Backend-Development.md with security documentation

Security improvements:
- Blocks access to sensitive files (/etc/shadow, /etc/passwd, etc.)
- Resolves symlinks before validation to prevent escape routes
- Uses proper path comparison instead of string prefix matching
- Configurable via BANGUI_ALLOWED_LOG_DIRS environment variable

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 13:49:04 +02:00
8698b89f6a TASK-010: Replace .split() with shlex.split() for fail2ban_start_command
- Add @field_validator for fail2ban_start_command to validate with shlex.split()
  at startup, catching misconfigured commands with mismatched quotes
- Replace .split() with shlex.split() in jail_config.py line 450
- Replace .split() with shlex.split() in config_misc.py line 154
- Update Backend-Development.md with configuration documentation explaining
  quoted path handling and common pitfalls
- Add comprehensive test suite (8 tests) covering valid commands, quoted paths,
  and mismatched quote errors

This fix ensures commands like '/opt/my tools/fail2ban-client' start are
correctly parsed as two tokens instead of three, preventing execution failures
when the path contains spaces.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 13:04:14 +02:00
f55c317f87 Backend refactoring updates
- Update Docker compose debug configuration
- Update backend documentation
- Update tasks documentation
- Update backend config module

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 12:05:01 +02:00
148756fb79 Finish external HTTP client resilience: add shared aiohttp config, retry support, and update task status 2026-04-09 22:01:11 +02:00
e1d741956e Disable session cache by default and make it opt-in for single-process deployments 2026-04-09 21:52:57 +02:00
4043cdfa3c Harden session cookie security with configurable cookie flags 2026-04-09 21:43:32 +02:00
c2982116a8 Add deployment-safe backend config and production-safe CORS defaults 2026-04-06 20:47:31 +02:00
1a7096b276 Add environment-driven CORS settings and backend regression tests 2026-04-06 20:42:33 +02:00
1c0bac1353 refactor: improve backend type safety and import organization
- Add TYPE_CHECKING guards for runtime-expensive imports (aiohttp, aiosqlite)
- Reorganize imports to follow PEP 8 conventions
- Convert TypeAlias to modern PEP 695 type syntax (where appropriate)
- Use Sequence/Mapping from collections.abc for type hints (covariant)
- Replace string literals with cast() for improved type inference
- Fix casting of Fail2BanResponse and TypedDict patterns
- Add IpLookupResult TypedDict for precise return type annotation
- Reformat overlong lines for readability (120 char limit)
- Add asyncio_mode and filterwarnings to pytest config
- Update test fixtures with improved type hints

This improves mypy type checking and makes type relationships explicit.
2026-03-22 14:24:24 +01:00
21753c4f06 Fix Stage 0 bootstrap and startup regression
Task 0.1: Create database parent directory before connecting
- main.py _lifespan now calls Path(database_path).parent.mkdir(parents=True,
  exist_ok=True) before aiosqlite.connect() so the app starts cleanly on
  a fresh Docker volume with a nested database path.

Task 0.2: SetupRedirectMiddleware redirects when db is None
- Guard now reads: if db is None or not is_setup_complete(db)
  A missing database (startup still in progress) is treated as setup not
  complete instead of silently allowing all API routes through.

Task 0.3: SetupGuard redirects to /setup on API failure
- .catch() handler now sets status to 'pending' instead of 'done'.
  A crashed backend cannot serve protected routes; conservative fallback
  is to redirect to /setup.

Task 0.4: SetupPage shows spinner while checking setup status
- Added 'checking' boolean state; full-screen Spinner is rendered until
  getSetupStatus() resolves, preventing form flash before redirect.
- Added console.warn in catch block; cleanup return added to useEffect.

Also: remove unused type: ignore[call-arg] from config.py.

Tests: 18 backend tests pass; 117 frontend tests pass.
2026-03-15 18:05:53 +01:00
4b6e118a88 Fix ActivateJailDialog blocking logic and mypy false positive
Two frontend bugs and one mypy false positive fixed:

- ActivateJailDialog: Activate button was never disabled when
  blockingIssues.length > 0 (missing condition in disabled prop).
- ActivateJailDialog: handleConfirm called onActivated() even when
  the backend returned active=false (blocked activation). Dialog now
  stays open and shows result.message instead.
- config.py: Settings() call flagged by mypy --strict because
  pydantic-settings loads required fields from env vars at runtime;
  suppressed with a targeted type: ignore[call-arg] comment.

Tests: added ActivateJailDialog.test.tsx (5 tests covering button state,
backend-rejection handling, success path, and crash detection callback).
2026-03-14 19:50:55 +01:00
0966f347c4 feat: Task 3 — invalid jail config recovery (pre-validation, crash detection, rollback)
- Backend: extend activate_jail() with pre-validation and 4-attempt post-reload
  health probe; add validate_jail_config() and rollback_jail() service functions
- Backend: new endpoints POST /api/config/jails/{name}/validate,
  GET /api/config/pending-recovery, POST /api/config/jails/{name}/rollback
- Backend: extend JailActivationResponse with fail2ban_running + validation_warnings;
  add JailValidationIssue, JailValidationResult, PendingRecovery, RollbackResponse models
- Backend: health_check task tracks last_activation and creates PendingRecovery
  record when fail2ban goes offline within 60 s of an activation
- Backend: add fail2ban_start_command setting (configurable start cmd for rollback)
- Frontend: ActivateJailDialog — pre-validation on open, crash-detected callback,
  extended spinner text during activation+verify
- Frontend: JailsTab — Validate Config button for inactive jails, validation
  result panels (blocking errors + advisory warnings)
- Frontend: RecoveryBanner component — polls pending-recovery, shows full-width
  alert with Disable & Restart / View Logs buttons
- Frontend: MainLayout — mount RecoveryBanner at layout level
- Tests: 19 new backend service tests (validate, rollback, filter/action parsing)
  + 6 health_check crash-detection tests + 11 router tests; 5 RecoveryBanner
  frontend tests; fix mock setup in existing activate_jail tests
2026-03-14 14:13:07 +01:00
ea35695221 Add better jail configuration: file CRUD, enable/disable, log paths
Task 4 (Better Jail Configuration) implementation:
- Add fail2ban_config_dir setting to app/config.py
- New file_config_service: list/view/edit/create jail.d, filter.d, action.d files
  with path-traversal prevention and 512 KB content size limit
- New file_config router: GET/PUT/POST endpoints for jail files, filter files,
  and action files; PUT .../enabled for toggle on/off
- Extend config_service with delete_log_path() and add_log_path()
- Add DELETE /api/config/jails/{name}/logpath and POST /api/config/jails/{name}/logpath
- Extend geo router with re-resolve endpoint; add geo_re_resolve background task
- Update blocklist_service with revised scheduling helpers
- Update Docker compose files with BANGUI_FAIL2BAN_CONFIG_DIR env var and
  rw volume mount for the fail2ban config directory
- Frontend: new Jail Files, Filters, Actions tabs in ConfigPage; file editor
  with accordion-per-file, editable textarea, save/create; add/delete log paths
- Frontend: types in types/config.ts; API calls in api/config.ts and api/endpoints.ts
- 63 new backend tests (test_file_config_service, test_file_config, test_geo_re_resolve)
- 6 new frontend tests in ConfigPageLogPath.test.tsx
- ruff, mypy --strict, tsc --noEmit, eslint: all clean; 617 backend tests pass
2026-03-12 20:08:33 +01:00
12a859061c Fix missing country: neg cache, geoip2 fallback, re-resolve endpoint
- Add 5-min negative cache (_neg_cache) so failing IPs are throttled
  rather than hammering the API on every request
- Add MaxMind GeoLite2 fallback (init_geoip / _geoip_lookup) that fires
  when ip-api fails; controlled by BANGUI_GEOIP_DB_PATH env var
- Fix lookup_batch bug: failed API results were stored in positive cache
- Add _persist_neg_entry: INSERT OR IGNORE into geo_cache with NULL
  country_code so re-resolve can find historically failed IPs
- Add POST /api/geo/re-resolve: clears neg cache, batch-retries all
  geo_cache rows with country_code IS NULL, returns resolved/total count
- BanTable + MapPage: wrap the country — placeholder in a Fluent UI
  Tooltip explaining the retry behaviour
- Add geoip2>=4.8.0 dependency; geoip_db_path config setting
- Tests: add TestNegativeCache (4), TestGeoipFallback (4), TestReResolve (4)
2026-03-07 20:42:34 +01:00
7392c930d6 feat: Stage 1 — backend and frontend scaffolding
Backend (tasks 1.1, 1.5–1.8):
- pyproject.toml with FastAPI, Pydantic v2, aiosqlite, APScheduler 3.x,
  structlog, bcrypt; ruff + mypy strict configured
- Pydantic Settings (BANGUI_ prefix env vars, fail-fast validation)
- SQLite schema: settings, sessions, blocklist_sources, import_log;
  WAL mode + foreign keys; idempotent init_db()
- FastAPI app factory with lifespan (DB, aiohttp session, scheduler),
  CORS, unhandled-exception handler, GET /api/health
- Fail2BanClient: async Unix-socket wrapper using run_in_executor,
  custom error types, async context manager
- Utility modules: ip_utils, time_utils, constants
- 47 tests; ruff 0 errors; mypy --strict 0 errors

Frontend (tasks 1.2–1.4):
- Vite + React 18 + TypeScript strict; Fluent UI v9; ESLint + Prettier
- Custom brand theme (#0F6CBD, WCAG AA contrast) with light/dark variants
- Typed fetch API client (ApiError, get/post/put/del) + endpoints constants
- tsc --noEmit 0 errors
2026-02-28 21:15:01 +01:00