- Add SecurityHeadersMiddleware to backend/app/main.py
- Implements Content-Security-Policy: default-src 'self'
- Implements X-Frame-Options: DENY (clickjacking protection)
- Implements X-Content-Type-Options: nosniff (MIME-sniffing protection)
- Implements X-XSS-Protection: 1; mode=block (browser XSS filters)
- Add CSP meta tag to frontend/index.html for defense-in-depth
- Create Docs/Security.md with comprehensive security headers documentation
- Add test suite (backend/tests/test_security_headers_middleware.py) with 5 tests
- Tests verify headers are present on success and error responses
- Tests ensure all four security headers are correctly set
- All existing tests continue to pass
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add global rate limiter utility with configurable limits and cleanup
- Move rate limiting logic to middleware for consistent application
- Update auth routes to use new rate limiter
- Add comprehensive tests for rate limiter functionality
- Update documentation with backend development guidelines and tasks
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses: Backend session cache not cluster-safe (multi-worker issue)
Problem:
- Session cache is process-local (InMemorySessionCache)
- Multi-worker deployments (uvicorn --workers N) create separate processes
- Each process has its own independent session cache
- Sessions cached in Worker A are invisible to Workers B, C, D
- Users randomly logged out when requests land on different workers
- Also affects RuntimeState, rate limiter, and background jobs
Solution (Option A - Strict single-worker enforcement):
- Enhance startup validation with clearer error messages
- Update error messages to explain the problem and how to fix it
- Document single-worker requirement prominently in Docker configs
- Update module docstrings to clarify constraints
Changes:
1. app/startup.py:
- Enhanced _check_single_worker_mode() error message with troubleshooting
- Enhanced _stage_check_worker_mode_and_acquire_lock() error message
- Removed unused import
2. app/utils/session_cache.py:
- Updated module docstring to explain constraints more clearly
- Added references to deployment documentation
- Clarified multi-worker solution for future implementation
3. app/utils/runtime_state.py:
- Updated module docstring with deployment constraint references
- Aligned messaging with session_cache.py
4. Docker/Dockerfile.backend:
- Added comprehensive comments about single-worker requirement
- Explained impact in multi-worker deployments
- Referenced deployment constraints documentation
5. Docker/docker-compose.yml, compose.prod.yml, compose.debug.yml:
- Added documentation comments about BANGUI_WORKERS constraint
- Explained why single-worker is required
6. backend/tests/test_startup_integration.py:
- Fixed test unpacking to match function return signature (3 values, not 2)
This ensures multi-worker deployments fail loudly at startup with clear
guidance on what went wrong and how to fix it. The database-backed scheduler
lock provides defense-in-depth for container orchestration scenarios.
For future multi-worker support, implement:
- Redis or database-backed session cache
- Shared RuntimeState coordination
- Distributed APScheduler backend
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move all repository imports (session_repo, blocklist_repo, import_log_repo,
settings_repo, history_archive_repo, geo_cache_repo, fail2ban_db_repo) and
service imports (auth_service, health_service, default_fail2ban_metadata_service)
to module level in app/dependencies.py.
This eliminates the pattern of local imports inside provider functions,
providing consistency and reducing import overhead. The from app.db import
open_db remains a local import since it's only used within get_db().
- Verified no circular dependencies exist
- All repository and service provider functions simplified to return modules
- Updated Architekture.md § 2.3 to document the module-level import pattern
- All tests pass (28 dependency + auth tests)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Update rate limiter to use exponential backoff instead of fixed limit
- Implement progressive delays for failed login attempts (0.5s, 1s, 2s, 4s, 5s max)
- Update auth router documentation and endpoint docs
- Refactor test suite to match new rate limiting behavior
- Update backend development documentation
- Clean up unused tasks documentation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
**Problem:** Broad exception handlers created fragility where adding a new
DomainError subclass without explicit registration would silently fall through
to the generic exception handler, losing the specific error_code and metadata.
**Solution:**
1. Import DomainError in main.py for explicit handler registration
2. Fix type hints in exception handlers from 'Exception' to specific types
- NotFoundError handler now typed as 'NotFoundError'
- BadRequestError handler now typed as 'BadRequestError'
- ConflictError handler now typed as 'ConflictError'
- DomainError handler now typed as 'DomainError'
- ServiceUnavailableError handler now typed as 'ServiceUnavailableError'
3. Add DomainError as an explicit catch-all handler in the registration chain
- Positioned after specific handlers, before HTTPException
- Any unregistered DomainError subclass now gets correct error_code + metadata
4. Document the exception handler hierarchy with detailed comments
5. Update Backend-Development.md with handler hierarchy documentation
6. Update Architekture.md section 2.2 with exception handler details
7. Fix test expectations in test_main.py to verify ErrorResponse format
**Impact:** Any new DomainError subclass now automatically gets correct HTTP 500
status, error_code, and metadata - even if developer forgets explicit handler.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Models in app/models/ are now pure data classes with no cross-layer dependencies.
This ensures the models layer remains a true leaf node in the dependency graph.
Changes:
- Create app/models/_common.py with shared types (TimeRange, bucket_count, constants)
- Move TimeRange and time-range constants from ban.py to _common.py
- Update history.py, routers, and services to import from _common.py
- Remove imports from app.config and app.utils from config.py models
- Move field validators from models to router layer:
- Add log_target validation in config_misc router
- Add log_path validation in jail_config router
- Update test_models.py to reflect validators moved to router layer
- Update documentation (Architekture.md, Backend-Development.md) with model layering rules
- Fix import ordering and type annotations in affected files
Model layering rule: Models may only import from:
✓ Standard library and third-party packages (Pydantic, typing)
✓ Other models in app/models/ (sibling models)
✓ app.models.response (response envelopes)
✗ app.services, app.config, app.utils, or any application layer
Validation requiring app-level state (settings, allowed directories) now happens
at the router or service layer, not in model validators.
Fixes: Models were not true leaf nodes due to circular imports and app-layer dependencies
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align frontend and backend error observability with correlation IDs and
structured telemetry for distributed tracing across systems.
Backend changes:
- Add CorrelationIdMiddleware to generate/extract correlation IDs
- Include correlation_id in all ErrorResponse objects
- Store correlation ID in structlog contextvars for automatic inclusion in logs
- Add correlation ID to response headers (X-Correlation-ID)
Frontend changes:
- API client automatically generates session-scoped UUID4 and includes
X-Correlation-ID header in all requests
- Extract correlation ID from API error responses
- Update error handlers to use telemetry with correlation IDs
- Add telemetry logging to ErrorBoundary, PageErrorBoundary, SectionErrorBoundary
- Implement redaction utilities for privacy-safe logging of sensitive data
Documentation:
- Add observability guidelines to Web-Development.md
* Correlation ID usage patterns
* Privacy & security best practices
* Telemetry event structure
* Redaction utilities for sensitive data
- Add distributed tracing architecture section to Architecture.md
* Correlation ID flow across frontend/backend
* Example troubleshooting scenario
* Implementation details for future enhancements
Testing:
- Add comprehensive tests for correlation middleware
- Update error boundary tests to verify telemetry integration
- Verify TypeScript and ESLint pass with no warnings
Fixes: Issue #40 - Frontend and backend observability are not aligned
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add comprehensive 'Dependency Wiring and Service Composition' section to
Architekture.md (§ 2.3) documenting:
* The lightweight FastAPI Depends() pattern used as composition root
* Service composition through explicit parameter passing
* Service context dependencies pattern (SessionServiceContext, etc.)
* Repository boundary enforcement
* Lifecycle and scope management
* Checklist for adding new services
- Update Backend-Development.md to reference the new Architecture section
from the 'Dependency Layering' section
- Enhance dependencies.py module docstring with clear explanation of:
* Composition root pattern
* Explicit over implicit principles
* Service context dependencies
* Repository boundary enforcement
This resolves issue #39 by providing clear guidance on dependency wiring
without over-engineering. The pattern uses FastAPI's built-in Depends()
framework and avoids heavyweight container libraries, keeping the solution
lightweight and maintainable.
Fixes: #39
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add composite index on (jail, timeofban DESC) for dashboard filtering
- Add composite index on (timeofban DESC, jail, action) for time-range queries
- Add single-column indexes on ip and action for targeted filtering
- Update schema version to 5 and document in Backend-Development.md
Indexes optimize common dashboard and API query patterns with pagination.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add comprehensive documentation for backend development
- Improve client IP detection with utility functions and tests
- Update auth router with better error handling
- Refactor config module with environment-based settings
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement periodic cleanup of expired rate-limiter entries to prevent
unbounded memory growth during long runtimes.
Changes:
- Create rate_limiter_cleanup task that calls cleanup_expired() every 30 minutes
- Register the task in the startup DAG alongside other background jobs
- Update rate_limiter module documentation with operational notes about the
cleanup lifecycle and memory management strategy
The cleanup is conservative and only removes IPs with no recent attempts
(all timestamps outside the rate-limit window), so active IPs are preserved.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace fire-and-forget reschedule pattern with proper async/await:
- Changed reschedule() from fire-and-forget to awaitable async function
- Errors are now properly propagated instead of silently failing
- Added structured logging for reschedule start and completion
- Schedule updates are now deterministic and observable to callers
Changes:
- app/tasks/blocklist_import.py: Convert reschedule to async, remove asyncio.ensure_future
- tests/test_tasks/test_blocklist_import.py: Add tests for error propagation and logging
- Docs/Features.md: Document scheduling reliability guarantees
All 15 blocklist_import tests pass with 100% coverage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement transactional setup with explicit state machine and crash-safety
to prevent partial commits from leaving inconsistent state.
## Changes
### Core Implementation
1. **settings_repo.py**: Add atomic batch settings write
- New set_settings_batch() method: writes multiple settings in single
transaction (BEGIN IMMEDIATE ... COMMIT). Either all settings persist
or none do, preventing partial state if crash occurs mid-batch.
2. **setup_service.py**: Refactor run_setup() with transactional phases
- Phase 0: Compute password hash early (before any DB writes) to ensure
idempotency. Same hash is used throughout retries, preventing divergent
hashes from bcrypt's random salt.
- Phase 1 (Bootstrap DB transaction): Set setup_state=in_progress and
database_path, then commit. First checkpoint for crash detection.
- Phase 2 (Filesystem): Initialize runtime database (idempotent)
- Phase 3 (Runtime DB transaction): Batch-write all settings atomically
- Phase 4 (Bootstrap DB transaction): Set setup_state=complete and
setup_completed=1. Final commit point.
3. **protocols.py**: Add set_settings_batch to SettingsRepository protocol
### Testing
- Added 6 new transactionality tests covering:
- State machine transitions (None → in_progress → complete)
- Password hash idempotency across retries
- Atomic batch writes (all-or-nothing persistence)
- Bootstrap DB state tracking
- Database path propagation to both DBs
- Recovery on partial failure
- All 18 tests pass (12 existing + 6 new)
### Documentation
- Updated Docs/Architekture.md with new section 6:
- Setup state machine with state transitions
- Transaction boundary documentation
- Password hash idempotency rationale
- Backward compatibility notes
## Design Decisions
### Why This Approach
- Current code already idempotent via INSERT OR REPLACE, but password
hash non-idempotency created silent inconsistency risk
- Simpler than multi-state machine: 2 states sufficient for detection
- Maintains backward compatibility (setup_completed key still written)
- Explicit transactions make crash-safety obvious to future maintainers
### Crash Scenarios Now Handled
1. Crash after Phase 1 → detected by setup_state=in_progress on retry
2. Crash after Phase 2 → runtime DB may be partial, safe to retry
3. Crash after Phase 3 → runtime DB rolls back on next connection
4. Crash after Phase 4 → setup_completed detected, skipped
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Problem
The blocklist URL validation at create/update time has a TOCTOU (time-of-check-to-time-of-use) window.
An attacker can perform a DNS-rebinding attack where:
1. User adds blocklist URL pointing to attacker.com
2. At create time, attacker.com resolves to a public IP → validation passes
3. Later, when fetching, attacker.com resolves to 192.168.1.1 (internal network)
4. HTTP client connects to the private IP, potentially accessing internal services
## Solution
Add runtime destination IP validation at connection time via a custom socket factory:
- Created 'dns_validated_connector.py' with create_dns_validated_socket_factory() that validates
all resolved IPs before socket creation
- HTTP session now uses the validated socket factory, protecting all blocklist imports globally
- Rejects connections to RFC 1918 private ranges, loopback, link-local, ULA, multicast, and
reserved addresses (IPv4 and IPv6)
- Added comprehensive test coverage with 13 test cases
## Changes
- backend/app/services/dns_validated_connector.py: Custom socket factory with IP validation
- backend/app/startup.py: Use DNS-validated socket factory in HTTP session creation
- backend/app/utils/ip_utils.py: Updated docstring explaining runtime validation
- backend/app/services/blocklist_downloader.py: Updated module docstring
- backend/app/services/blocklist_service.py: Updated docstrings explaining two-layer protection
- backend/tests/test_services/test_dns_validated_connector.py: Test suite for socket factory
- Docs/Architekture.md: Added detailed section on DNS-rebinding protection
## Testing
- All 13 DNS validation tests pass
- All blocklist downloader tests pass (unaffected by changes)
- Linting: ruff, mypy pass with --strict
- Test coverage: 90% line coverage on dns_validated_connector.py
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit standardizes how API responses are wrapped, solving issue #24.
Problem:
- Inconsistent response envelopes (jails vs items vs bans vs no wrapper)
- Frontend required multiple field name variants
- Integration bugs from branching logic
- No clear pattern for different response types
Solution:
- Created response.py with base classes: PaginatedListResponse,
CollectionResponse, CommandResponse
- Standardized all list/collection responses to use 'items' field
- Domain-specific field names for detail and aggregation responses
- Updated all backends routers and mappers
- Updated frontend types and hooks to match
Changes:
Backend:
- backend/app/models/response.py (new): Base response models
- backend/app/models/ban.py: Updated responses to inherit from bases
- backend/app/models/jail.py: Updated JailListResponse, JailCommandResponse
- backend/app/models/config.py: Updated collection responses
- backend/app/services/jail_service.py: Updated return statements
- backend/app/mappers/ban_mappers.py: Updated 'bans' to 'items'
- backend/tests/test_mappers/test_ban_mappers.py: Updated tests
Frontend:
- frontend/src/types/jail.ts: Updated response interfaces
- frontend/src/types/config.ts: Updated response interfaces
- frontend/src/hooks/useActiveBans.ts: Updated selector
- frontend/src/hooks/useJailList.ts: Updated selector
- frontend/src/hooks/useJailConfigs.ts: Updated selector
- frontend/src/hooks/useConfigActiveStatus.ts: Updated field access
- frontend/src/hooks/useJailAdmin.ts: Updated field access
Documentation:
- Docs/Backend-Development.md: Added § 4.1 API Response Envelope Policy
The policy defines:
1. Paginated lists use PaginatedListResponse (items, total, page, page_size)
2. Non-paginated collections use CollectionResponse (items, total)
3. Detail responses use entity-specific field names (jail, status, settings)
4. Command responses use CommandResponse (message, success, optional target)
5. Aggregations use domain-specific fields (jails, countries, buckets, bans)
All responses now follow one of these patterns, reducing frontend complexity.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Created StartupDAG class to orchestrate startup stages with explicit dependencies
- Defined 6 startup stages: WORKER_MODE → DATABASE → GEO_CACHE → HTTP_SESSION → SCHEDULER → TASKS
- Each stage has prerequisites, error handling, and rollback support
- Refactored startup_shared_resources() to use the DAG
- Added StartupContext for resource tracking and failure management
- Partial failures automatically roll back all completed resources in reverse order
- Added health checks to verify all resources initialized successfully
- Comprehensive test coverage: 15 DAG unit tests + 3 integration tests + 6 existing tests
- Documented startup DAG in Architekture.md with detailed stage descriptions and failure modes
This replaces implicit ordering with explicit dependency tracking, making lifecycle
changes safe and failure modes predictable. Hidden order dependencies no longer exist.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Convert inconsistent modeling style to standardized Pydantic models for all
external-facing data structures while maintaining TypedDict compatibility where
appropriate for internal layer-private structures.
Changes:
- Converted IpLookupResult TypedDict to use IpLookupResponse Pydantic model
in jail_service.lookup_ip() for consistency with routers
- Added GeoCacheEntry Pydantic model for geo cache repository rows
- Converted GeoCacheRow TypedDict to use GeoCacheEntry alias
- Converted ImportLogRow TypedDict to use ImportLogEntry alias
- Updated routers and services to work with Pydantic models
- Updated all tests to use Pydantic model field access (attributes)
instead of dict subscripting
Documentation:
- Added 'Model Type Usage by Layer' section to Backend-Development.md
- Defines when TypedDict is allowed (internal structures) vs Pydantic
(external-facing, cross-boundary data)
- Provides clear guidance on modeling conventions per layer
Benefits:
- Consistent validation and serialization behavior
- Better IDE support and type checking
- Clearer separation of concerns by layer
- Reduced maintenance cost from mixed validation approaches
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add ban domain model for core business logic separation
- Implement mapper pattern for DTO/domain conversions
- Update ban service with new domain-driven approach
- Refactor router endpoints to use new architecture
- Add comprehensive mapper tests
- Update documentation with architecture changes
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit enforces the repository boundary by eliminating direct database connection
dependencies (DbDep) from all routers. Routers now depend on service context dependencies
that combine the database connection with the related repositories.
Changes:
- Add 5 service context dependencies in dependencies.py:
* SessionServiceContext: db + session_repo
* BlocklistServiceContext: db + blocklist_repo + import_log_repo + settings_repo
* SettingsServiceContext: db + settings_repo
* BanServiceContext: db + fail2ban_db_repo
* HistoryServiceContext: db + fail2ban_db_repo + history_archive_repo
- Refactor all 9 routers (auth, bans, blocklist, config_misc, dashboard, geo,
history, jails, setup) to use service contexts instead of DbDep.
- Update Backend-Development.md with clear examples of the new pattern and
documentation of available service contexts.
Rationale:
- Enforces the repository boundary through the dependency system
- Makes database operations explicit and auditable
- Improves testability by allowing service contexts to be mocked
- Prevents accidental direct database access from routers
The deprecated DbDep remains available for backward compatibility with
services that have not yet been refactored, but routers can no longer import it.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Hide raw database connections (DbDep) from routers by removing from public exports
- Maintain DbDep as deprecated export for backward compatibility
- Add _DbDep internal dependency for use by other dependencies like require_auth
- Update module docstring to explain dependency layering rules
- Add comprehensive documentation section on dependency layering to Backend-Development.md
This enforces the architectural boundary where:
- Routers depend on repository dependencies (SessionRepoDep, BlocklistRepositoryDep, etc)
- Services orchestrate operations through repositories
- Only repositories execute SQL queries
The repository boundary is now technically enforced through the dependency injection
system, making it impossible for routers to accidentally bypass repositories and
access the database directly.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TASK-004: Replace module-level mutable runtime flags in service layer with
injected state holder, eliminating hidden global state and improving testability
and synchronization boundaries.
Changes:
- Create JailServiceState dataclass in app/utils/runtime_state.py to hold
backend capability cache and synchronization lock
- Add JailServiceState as a field in RuntimeState (with default_factory)
- Remove module-level _backend_cmd_supported and _backend_cmd_lock from
jail_service.py
- Refactor _check_backend_cmd_supported() to accept state parameter
- Inject JailServiceState into list_jails() and _fetch_jail_summary() via
parameters
- Add get_jail_service_state() dependency provider in app/dependencies.py
- Add JailServiceStateDep type alias for router injection
- Update jails router to receive and pass state to service functions
- Update all tests to use jail_service_state fixture and pass state to functions
- Remove duplicate _MAX_PAGE_SIZE constant definition
- Document mutable state management in Backend-Development.md
- Update Architecture.md to describe JailServiceState and state nesting pattern
Benefits:
- Eliminates global mutable state and associated race conditions
- Makes state visible to callers (not hidden in module scope)
- Enables test isolation (each test gets fresh state)
- Prepares codebase for multi-worker deployments (state can be extracted to
shared backend)
- Synchronization boundaries are now explicit (state.get_backend_cmd_lock())
Compliance:
- All tests pass (17 passed in TestListJails, TestGetJail, TestLockInitialization)
- No ruff linting errors
- Type-safe: JailServiceState properly typed with asyncio.Lock, bool | None
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extracted the monolithic import_source() function (776 lines) into focused,
testable components with clear single responsibilities:
- BlocklistDownloader: HTTP download with exponential backoff retry logic
* Handles transient failures (429, 5xx errors, timeouts)
* Configurable retry attempts and backoff strategy
* 93% test coverage
- BlocklistParser: Parse and validate IP addresses
* Extract valid IPv4/IPv6 addresses from text
* Skip CIDRs and malformed entries gracefully
* Separate parsing from validation concerns
* 100% test coverage
- BanExecutor: Ban execution with error handling
* Ban IPs via fail2ban socket
* Stop on JailNotFoundError (jail doesn't exist)
* Continue on JailOperationError (individual ban failures)
* 100% test coverage
- BlocklistImportWorkflow: Thin orchestrator
* Coordinates the download → parse → ban → log flow
* Pre-warms geo cache with newly banned IPs
* 96% test coverage
- blocklist_service.py: Maintains public API
* Source CRUD (create, read, update, delete)
* URL validation and preview functionality
* Scheduling configuration and import triggers
* 92% test coverage
Benefits:
* Each component is independently testable with mock dependencies
* Error handling is explicit and localized
* Components can evolve independently
* Logging is contextual and clear
* Retry and transient error handling are isolated
Testing:
* All 36 existing blocklist_service tests pass
* All 13 blocklist import task tests pass
* Added 17 comprehensive component unit tests
* Combined 96%+ coverage on new modules
* Zero type errors in new code
Documentation:
* Updated Refactoring.md with detailed architecture notes
* Added component architecture diagram to Architekture.md
* Documented ownership and responsibilities of each component
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove hidden cross-service coupling by making dependencies explicit through
dependency injection while maintaining backward compatibility via lazy imports.
Key changes:
- history_service and ban_service: Removed direct module-level imports of
fail2ban_metadata_service, added optional service parameters to functions
- Added get_fail2ban_metadata_service() provider to dependencies.py
- Updated history router to inject Fail2BanMetadataService dependency
- history_service functions now use lazy imports in fallback paths for
backward compatibility when service is not explicitly injected
- All test patches updated to use internal _get_fail2ban_db_path() helper
- jail_config_service and jail_service already follow best practices
This pattern prevents circular imports, makes services testable via explicit
mocking, and documents service dependencies clearly.
Fixes: Instructions.md #2 - Hidden cross-service coupling
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes a critical security vulnerability where the session token was
being returned in the JSON response body of POST /api/auth/login.
This exposed the token to JavaScript, allowing malicious scripts to
steal it and bypass the HttpOnly cookie protection.
Changes:
- Backend: Remove 'token' field from LoginResponse model (auth.py)
- Backend: Update login() endpoint to return only 'expires_at'
- Frontend: Update LoginResponse type to exclude 'token' field
- Backend: Update test helper _login() to extract token from cookie
- Backend: Update test cases to verify token is NOT in response body
- Documentation: Add section 'Authentication Endpoints' in Backend-Development.md
- Documentation: Update Web-Development.md to explain HttpOnly cookie benefits
Security benefit: Session tokens are now only accessible via HttpOnly
cookies, protected from JavaScript access, XSS attacks, and malicious
third-party scripts. The frontend continues to use only the cookie for
authentication.
All auth tests pass (23 tests). Type checking and linting pass with
zero errors.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add automatic cleanup of stale geolocation cache entries to prevent
unbounded database growth. Resolves the issue where unique IP addresses
accumulated indefinitely in the geo_cache table, degrading query performance.
## Changes
### Database Schema (Migration 3)
- Add 'last_seen' column to geo_cache table tracking last reference time
- Existing entries default to current timestamp
### Repository Layer (geo_cache_repo.py)
- Update upsert_entry() to set/refresh last_seen on insert/update
- Update upsert_neg_entry() to set/refresh last_seen on negative cache hits
- Update bulk_upsert_entries() to set/refresh last_seen in batch operations
- Add delete_stale_entries(db, cutoff_iso) -> int for purging old entries
### Background Task (geo_cache_cleanup.py)
- New APScheduler task that runs nightly (24-hour interval)
- Calculates cutoff as 90 days ago from current time (UTC)
- Deletes all entries with last_seen older than cutoff
- Logs operation results (info when deleted > 0, debug when 0 deleted)
- Configurable retention period via GEO_CACHE_RETENTION_DAYS constant
### Application Startup (startup.py)
- Register geo_cache_cleanup task in scheduler during app startup
- Placed after geo_cache_flush in task registration order
### Tests
- Add delete_stale_entries test cases covering:
* Removal of old entries beyond cutoff
* No deletion when all entries are recent
* Empty table edge case
- Update existing test fixtures to include last_seen column
- Add full test suite for cleanup task registration and execution
### Documentation
- Architekture.md: Document cleanup task, update schema/diagram
- Backend-Development.md: Add retention policy documentation
## Behavior
When an IP is accessed, its last_seen is refreshed. After 90 days of no
access, an IP is purged by the nightly cleanup. On next encounter, the IP
is re-resolved from MaxMind MMDB or ip-api.com (if configured).
This is acceptable because:
1. Stale geolocation data may become inaccurate over time
2. Re-resolution cost is minimal compared to unbounded storage growth
3. Active IPs maintain fresh data through their last_seen updates
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bcrypt silently truncates passwords at 72 bytes, so passwords longer than 72
characters provide no additional security. This commit enforces the 72-byte
maximum across the authentication and setup flows.
Changes:
- Add max_length=72 to LoginRequest.password and SetupRequest.master_password
- Update field validator in SetupRequest to explicitly check max_length
- Add comprehensive tests for password length validation (6 new test cases)
- Document the 72-byte limitation in Features.md (master password options)
- Add new section 12 'Password Hashing' in Backend-Development.md explaining:
- The bcrypt truncation behavior
- Why the limit is enforced
- The validation flow from frontend to backend
- What happens when passwords exceed the limit
All existing tests pass, no regressions introduced.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Make MaxMind GeoLite2-Country MMDB the primary IP resolver (local, encrypted)
and demote ip-api.com to optional fallback only (disabled by default).
Changes:
- Add geoip_allow_http_fallback config flag (default False) to Settings
- Refactor GeoCache.lookup() and lookup_batch() to try MMDB first
- Update startup.py to pass config flag and log security warning when HTTP enabled
- Update all 49 tests to reflect new MMDB-primary strategy
- Add comprehensive geoip configuration section to Backend-Development.md
- Update Architekture.md to show MMDB + optional HTTP in system dependencies
- Update .env.example with BANGUI_GEOIP_DB_PATH and HTTP fallback flag
Security impact:
- 99% of IP addresses (successful MMDB lookups) now stay local, encrypted
- HTTP-only IPs are cached for 5 minutes to minimize external calls
- Operators must explicitly enable HTTP fallback (security-conscious default)
- GDPR/CCPA compliance: no PII sent over unencrypted networks by default
Fixes TASK-030: Resolved plaintext IP transmission to ip-api.com
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Change _fail2ban_connection_handler() to return generic message instead of
leaking socket path in HTTP 502 response body
- Change _fail2ban_protocol_handler() to return generic message instead of
leaking raw exception details in HTTP 502 response body
- Full error details are still logged server-side (error=str(exc)) for debugging
- Update Backend-Development.md with error message hygiene section explaining
the pattern: generic user-friendly messages in HTTP responses, full details
in server logs only
Fixes TASK-029: Fail2BanConnectionError leaks socket path in HTTP error responses
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create logged_task() helper in backend/app/utils/async_utils.py to wrap
fire-and-forget coroutines with exception logging
- Ensures unhandled task exceptions are always logged to structlog instead of
silently discarded (Python 3.11+ RuntimeWarning)
- Update ban_service.py to use logged_task() for geo_cache.lookup_batch()
background resolution
- Add comprehensive tests for logged_task() in test_async_utils.py
- Document fire-and-forget task conventions in Backend-Development.md
The logged_task() wrapper catches any exception raised in a background task,
logs it with full traceback context and task name, and never re-raises.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses security concern where FastAPI's default behavior exposes interactive
API documentation (/docs, /redoc) without authentication, allowing attackers to
enumerate endpoints and understand API schemas.
Changes:
- Add BANGUI_ENABLE_DOCS boolean setting (default: false) to Settings
- Modify create_app() to conditionally set docs_url, redoc_url, openapi_url
- Add docs endpoints to SetupRedirectMiddleware allowlist (/api/docs, /api/redoc, /api/openapi.json)
- Set BANGUI_ENABLE_DOCS=true in Docker/compose.debug.yml for development
- Production compose files leave it unset (defaults to false, docs disabled)
- Add comprehensive tests for docs configuration
- Document the new setting in Backend-Development.md
Security Impact:
- API documentation is now disabled by default in production
- Development environments can enable docs by setting BANGUI_ENABLE_DOCS=true
- Docs endpoints are inaccessible in production without manual configuration
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove the early-return branch that skipped HMAC verification for unsigned tokens
- Raise ValueError if the signature separator is absent
- Update unwrap_session_token docstring to reflect mandatory signing requirement
- Add comprehensive session token signing documentation to Backend-Development.md
- Document the session token format, signing/verification pattern, and security rationale
All tokens must now carry a valid HMAC-SHA256 signature. Tokens without a
signature are rejected immediately. This removes the vulnerability where an
attacker with database access could bypass the HMAC layer by using raw tokens.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace non-atomic db.executescript() with explicit transaction control.
Wrap each migration's DDL statements and schema_migrations insert in a
single BEGIN IMMEDIATE ... COMMIT transaction to ensure atomicity.
Changes:
- Add _parse_migration_statements() to split migration scripts into
individual statements while handling comments and string literals
- Update _apply_migration() to wrap all statements in a single explicit
transaction with rollback on error
- Ensure _get_current_schema_version() uses execute() instead of
executescript()
- Add 9 new tests for migration atomicity and statement parsing
- Update Backend-Development.md with migration authoring guidelines
If a crash occurs between DDL execution and schema_migrations insert,
the next startup will re-apply the entire migration atomically,
preventing partial migrations and data corruption.
Test coverage: 98% on db.py (up from 55%)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Store session tokens as one-way SHA256 hashes instead of plaintext
- Hash tokens on write (create_session) and on read (get_session, delete_session)
- Add migration to drop plaintext sessions table and recreate with token_hash column
- Update Session model: token field still contains raw token for signing
- Add test to verify tokens are hashed in database, not plaintext
- Update Architekture.md to document session token hashing
- Update Backend-Development.md with implementation pattern and best practices
Prevents direct session token hijacking if database file is exposed to attacker.
If plaintext DB was readable, sessions are invalidated by the migration anyway.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
**Issue:**
- log_target accepted arbitrary paths, allowing authenticated users to write
files as root via fail2ban (e.g., /etc/cron.d/bangui-pwned)
- fail2ban runs as root and opens files specified in log_target
**Solution:**
1. **Model layer validation:** Already existed in GlobalConfigUpdate, prevents
invalid paths before reaching service
2. **Service layer validation:** Added defensive check in update_global_config()
that validates log_target even if model validation is bypassed
3. **New validation helper:** Added validate_log_target() utility that accepts
special values (STDOUT, STDERR, SYSLOG) or paths within allowed directories
**Changes:**
- app/utils/path_utils.py: Added validate_log_target() helper
- app/services/config_service.py: Added service-layer validation before
sending command to fail2ban
- backend/tests: Fixed session_secret length issues in fixtures (min 32 chars)
- backend/tests: Added tests for valid special log targets
- Docs/Backend-Development.md: Documented log_target security requirements
**Test Coverage:**
- Model validation rejects /etc/passwd (existing test)
- Model validation accepts STDOUT, STDERR, SYSLOG special values
- Model validation accepts paths in allowed directories
- Service layer validation tested with special values
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Updated config.py to support environment-based configuration
- Added test_config.py with full test coverage
- Updated Backend-Development.md with configuration documentation
- Removed outdated tasks from Tasks.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>