Multiple RateLimitMiddleware instances were each calling
check_allowed() on every request, halving the effective global
limit (200 req/min became ~100). Added path_prefixes and skip_paths
so each instance only checks the paths it owns.
- Auth middleware scoped to /api/v1/auth/login and /api/v1/setup
- History middleware scoped to /api/v1/history
- Global middleware skips auth and history paths
- Updated tests to match single-count behavior
- Fix logging_compat._log() to handle extra keyword arguments properly
- Update config.py, main.py, and test_bans.py for compatibility
- Update Tasks.md and runner.csx
- Remove structlog dependency from backend/pyproject.toml
- Add app.utils.logging_compat shim for keyword-arg logging API
- Add app.utils.json_formatter for JSON log output with extra fields
- Update all backend modules to use logging_compat.get_logger()
- Update docstrings in log_sanitizer.py and json_formatter.py
- Update test comment in test_async_utils.py
- Record 406 failing tests in Docs/Tasks.md for tracking
- ip_utils: allow loopback (127.0.0.1) in dev mode (BANGUI_LOG_LEVEL=debug)
so e2e tests can reach a mock HTTP server on the host
- metrics: make all operations no-ops when prometheus_client not installed
- regex_validator: graceful fallback when regexploit not installed
- geo_cache: use attribute access instead of dict subscript for typed rows
- rate_limit: support bucket_override parameter for per-endpoint rate limits
- ban_service: construct DomainActiveBan explicitly instead of model_copy
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add deprecation middleware for warning headers on sunset endpoints
- Add jails_v2 router for API v2 migration path
- Update CI workflow with new test coverage
- Update API versioning documentation
- Remove completed tasks from Tasks.md
Fail with RuntimeError when WEB_CONCURRENCY or BANGUI_WORKERS > 1.
In-memory session cache, rate-limit windows, and runtime state are
process-local. Multi-worker silently causes stale limits, ghost sessions,
inconsistent status.
Skipped when TESTING=1.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- All backend routers moved to /api/v1/ prefix
- Frontend BASE_URL updated to /api/v1
- Setup redirect middleware updated to redirect to /api/v1/setup
- Health router path fixed: prefix=/api/v1/health, @router.get('')
- conftest.py: set server_status=online for test fixture
- Created Docs/API_VERSIONING.md with deprecation policy
- Updated Docs/Backend-Development.md with versioning section
- Updated Instructions.md curl examples
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit adds support for shipping logs to external centralized logging platforms, addressing the MEDIUM priority task for structured logging infrastructure.
## Key Changes:
### 1. New Documentation: Docs/Observability.md
- Comprehensive guide to logging architecture and configuration
- Covers all three supported platforms (Datadog, Papertrail, Elasticsearch)
- Includes best practices, security considerations, and troubleshooting
- Documents sensitive data handling and compliance requirements
### 2. Core Implementation: app/utils/external_logging.py
- ExternalLogHandler: Abstract base class for non-blocking log delivery
- DatadogLogHandler: HTTP API integration with JSON payloads
- PapertrailLogHandler: Syslog protocol over TCP
- ElasticsearchLogHandler: Bulk API integration with NDJSON format
- Features:
- Async buffering with configurable batch size and flush interval
- Exponential backoff retry logic
- Non-blocking delivery (never blocks application logic)
- Proper error handling and internal logging
- Lifecycle management (start/shutdown)
### 3. Configuration: app/config.py
- New Settings fields for external logging:
- external_logging_enabled (default: False)
- external_logging_provider (datadog/papertrail/elasticsearch)
- external_logging_buffer_size (default: 1000)
- external_logging_flush_interval_seconds (default: 5.0)
- Provider-specific configuration (API keys, hosts, batch sizes)
- All fields have sensible defaults
- Full field validation and normalization
### 4. Integration: app/main.py
- Global _external_log_handler for application lifecycle
- _external_logging_processor: structlog processor for handler integration
- Updated _configure_logging(): Add handler to processor chain when enabled
- Updated _lifespan(): Initialize handler before startup, shutdown on termination
### 5. Tests: backend/tests/test_external_logging.py
- 20 comprehensive tests covering all handlers and factory
- Configuration validation tests
- All tests passing
## Design Decisions:
1. **Non-blocking Delivery**: External logging never blocks request handling.
Failures are logged locally but don't impact application.
2. **Buffering Strategy**: In-memory buffer with configurable size prevents
unbounded memory growth. When buffer fills, oldest logs are dropped with
a warning.
3. **Retry Logic**: Transient failures (timeouts, 5xx errors) are retried
with exponential backoff. Permanent failures (bad credentials) are logged
and skipped.
4. **Disabled by Default**: External logging is opt-in via environment
variables, maintaining backward compatibility with existing deployments.
5. **Provider Flexibility**: Support for multiple platforms allows users to
choose based on their infrastructure (cloud-native, on-premise, etc).
## Backward Compatibility:
- All new configuration fields have defaults
- External logging disabled by default
- No changes to existing logging behavior unless explicitly configured
- No new required dependencies
## Testing:
- All 20 new tests passing
- Existing tests unaffected (same count of passing tests)
- Configuration validation tested
- Handler creation and lifecycle management tested
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add SecurityHeadersMiddleware to backend/app/main.py
- Implements Content-Security-Policy: default-src 'self'
- Implements X-Frame-Options: DENY (clickjacking protection)
- Implements X-Content-Type-Options: nosniff (MIME-sniffing protection)
- Implements X-XSS-Protection: 1; mode=block (browser XSS filters)
- Add CSP meta tag to frontend/index.html for defense-in-depth
- Create Docs/Security.md with comprehensive security headers documentation
- Add test suite (backend/tests/test_security_headers_middleware.py) with 5 tests
- Tests verify headers are present on success and error responses
- Tests ensure all four security headers are correctly set
- All existing tests continue to pass
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add global rate limiter utility with configurable limits and cleanup
- Move rate limiting logic to middleware for consistent application
- Update auth routes to use new rate limiter
- Add comprehensive tests for rate limiter functionality
- Update documentation with backend development guidelines and tasks
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
**Problem:** Broad exception handlers created fragility where adding a new
DomainError subclass without explicit registration would silently fall through
to the generic exception handler, losing the specific error_code and metadata.
**Solution:**
1. Import DomainError in main.py for explicit handler registration
2. Fix type hints in exception handlers from 'Exception' to specific types
- NotFoundError handler now typed as 'NotFoundError'
- BadRequestError handler now typed as 'BadRequestError'
- ConflictError handler now typed as 'ConflictError'
- DomainError handler now typed as 'DomainError'
- ServiceUnavailableError handler now typed as 'ServiceUnavailableError'
3. Add DomainError as an explicit catch-all handler in the registration chain
- Positioned after specific handlers, before HTTPException
- Any unregistered DomainError subclass now gets correct error_code + metadata
4. Document the exception handler hierarchy with detailed comments
5. Update Backend-Development.md with handler hierarchy documentation
6. Update Architekture.md section 2.2 with exception handler details
7. Fix test expectations in test_main.py to verify ErrorResponse format
**Impact:** Any new DomainError subclass now automatically gets correct HTTP 500
status, error_code, and metadata - even if developer forgets explicit handler.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align frontend and backend error observability with correlation IDs and
structured telemetry for distributed tracing across systems.
Backend changes:
- Add CorrelationIdMiddleware to generate/extract correlation IDs
- Include correlation_id in all ErrorResponse objects
- Store correlation ID in structlog contextvars for automatic inclusion in logs
- Add correlation ID to response headers (X-Correlation-ID)
Frontend changes:
- API client automatically generates session-scoped UUID4 and includes
X-Correlation-ID header in all requests
- Extract correlation ID from API error responses
- Update error handlers to use telemetry with correlation IDs
- Add telemetry logging to ErrorBoundary, PageErrorBoundary, SectionErrorBoundary
- Implement redaction utilities for privacy-safe logging of sensitive data
Documentation:
- Add observability guidelines to Web-Development.md
* Correlation ID usage patterns
* Privacy & security best practices
* Telemetry event structure
* Redaction utilities for sensitive data
- Add distributed tracing architecture section to Architecture.md
* Correlation ID flow across frontend/backend
* Example troubleshooting scenario
* Implementation details for future enhancements
Testing:
- Add comprehensive tests for correlation middleware
- Update error boundary tests to verify telemetry integration
- Verify TypeScript and ESLint pass with no warnings
Fixes: Issue #40 - Frontend and backend observability are not aligned
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Change _fail2ban_connection_handler() to return generic message instead of
leaking socket path in HTTP 502 response body
- Change _fail2ban_protocol_handler() to return generic message instead of
leaking raw exception details in HTTP 502 response body
- Full error details are still logged server-side (error=str(exc)) for debugging
- Update Backend-Development.md with error message hygiene section explaining
the pattern: generic user-friendly messages in HTTP responses, full details
in server logs only
Fixes TASK-029: Fail2BanConnectionError leaks socket path in HTTP error responses
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses security concern where FastAPI's default behavior exposes interactive
API documentation (/docs, /redoc) without authentication, allowing attackers to
enumerate endpoints and understand API schemas.
Changes:
- Add BANGUI_ENABLE_DOCS boolean setting (default: false) to Settings
- Modify create_app() to conditionally set docs_url, redoc_url, openapi_url
- Add docs endpoints to SetupRedirectMiddleware allowlist (/api/docs, /api/redoc, /api/openapi.json)
- Set BANGUI_ENABLE_DOCS=true in Docker/compose.debug.yml for development
- Production compose files leave it unset (defaults to false, docs disabled)
- Add comprehensive tests for docs configuration
- Document the new setting in Backend-Development.md
Security Impact:
- API documentation is now disabled by default in production
- Development environments can enable docs by setting BANGUI_ENABLE_DOCS=true
- Docs endpoints are inaccessible in production without manual configuration
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add in-memory rate limiter with per-IP deque tracking of attempt timestamps
- Limit login attempts to 5 per 60 seconds per IP, return 429 on excess
- Add Retry-After header to rate limit responses
- Implement IP extraction utility with proxy trust validation (prevent X-Forwarded-For spoofing)
- Integrate rate limiter into auth router and dependencies
- Add 10-second asyncio.sleep on failed login attempts to further slow brute-force
- Add comprehensive tests for rate limiting (9 new tests, all passing)
- Update Features.md to document login rate limiting
- Update Backend-Development.md with rate limiting conventions and design patterns
- Fix test infrastructure issues: update password to meet complexity requirements
- Fix TestValidateSession tests to use Bearer token authentication
- All tests passing: 23 auth tests + full test suite coverage
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move session cache initialization from per-request _build_app_context to
startup lifespan handler. The session cache type is now decided once at app
startup based on settings, making _build_app_context pure (read-only).
Changes:
- Move cache initialization logic to new _update_session_cache() in main.py
- Call _update_session_cache() during lifespan startup to initialize cache
- Remove three if/elif/elif branches mutating state.session_cache from _build_app_context
- Add cache swap logic to set_runtime_settings() in runtime_state.py to handle
runtime settings changes (e.g., setup wizard updates)
- Keep app.state.session_cache initialization in create_app() for test compatibility
This ensures:
- _build_app_context is pure and doesn't mutate app state on each request
- Session cache configuration decisions are centralized at startup
- Settings changes during runtime (via setup wizard) also trigger cache swap
- Cache initialization logic is isolated in one place
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use NoOpSessionCache in backend/app/main.py and dynamically switch cache implementation in backend/app/dependencies.py so disabled cache mode remains safe while get_session_cache always returns a valid object.
Add source=archive option for dashboard endpoints and history service; update Docs/Tasks.md; include archive branch for list_bans, bans_by_country, ban_trend, bans_by_jail; tests for archive paths.
On startup BanGUI now verifies that the four fail2ban jail config files
required by its two custom jails (manual-Jail and blocklist-import) are
present in `$fail2ban_config_dir/jail.d`. Any missing file is created
with the correct default content; existing files are never overwritten.
Files managed:
- manual-Jail.conf (enabled=false template)
- manual-Jail.local (enabled=true override)
- blocklist-import.conf (enabled=false template)
- blocklist-import.local (enabled=true override)
The check runs in the lifespan hook immediately after logging is
configured, before the database is opened.
Task 0.1: Create database parent directory before connecting
- main.py _lifespan now calls Path(database_path).parent.mkdir(parents=True,
exist_ok=True) before aiosqlite.connect() so the app starts cleanly on
a fresh Docker volume with a nested database path.
Task 0.2: SetupRedirectMiddleware redirects when db is None
- Guard now reads: if db is None or not is_setup_complete(db)
A missing database (startup still in progress) is treated as setup not
complete instead of silently allowing all API routes through.
Task 0.3: SetupGuard redirects to /setup on API failure
- .catch() handler now sets status to 'pending' instead of 'done'.
A crashed backend cannot serve protected routes; conservative fallback
is to redirect to /setup.
Task 0.4: SetupPage shows spinner while checking setup status
- Added 'checking' boolean state; full-screen Spinner is rendered until
getSetupStatus() resolves, preventing form flash before redirect.
- Added console.warn in catch block; cleanup return added to useEffect.
Also: remove unused type: ignore[call-arg] from config.py.
Tests: 18 backend tests pass; 117 frontend tests pass.