- Add ban domain model for core business logic separation
- Implement mapper pattern for DTO/domain conversions
- Update ban service with new domain-driven approach
- Refactor router endpoints to use new architecture
- Add comprehensive mapper tests
- Update documentation with architecture changes
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TASK-004: Replace module-level mutable runtime flags in service layer with
injected state holder, eliminating hidden global state and improving testability
and synchronization boundaries.
Changes:
- Create JailServiceState dataclass in app/utils/runtime_state.py to hold
backend capability cache and synchronization lock
- Add JailServiceState as a field in RuntimeState (with default_factory)
- Remove module-level _backend_cmd_supported and _backend_cmd_lock from
jail_service.py
- Refactor _check_backend_cmd_supported() to accept state parameter
- Inject JailServiceState into list_jails() and _fetch_jail_summary() via
parameters
- Add get_jail_service_state() dependency provider in app/dependencies.py
- Add JailServiceStateDep type alias for router injection
- Update jails router to receive and pass state to service functions
- Update all tests to use jail_service_state fixture and pass state to functions
- Remove duplicate _MAX_PAGE_SIZE constant definition
- Document mutable state management in Backend-Development.md
- Update Architecture.md to describe JailServiceState and state nesting pattern
Benefits:
- Eliminates global mutable state and associated race conditions
- Makes state visible to callers (not hidden in module scope)
- Enables test isolation (each test gets fresh state)
- Prepares codebase for multi-worker deployments (state can be extracted to
shared backend)
- Synchronization boundaries are now explicit (state.get_backend_cmd_lock())
Compliance:
- All tests pass (17 passed in TestListJails, TestGetJail, TestLockInitialization)
- No ruff linting errors
- Type-safe: JailServiceState properly typed with asyncio.Lock, bool | None
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extracted the monolithic import_source() function (776 lines) into focused,
testable components with clear single responsibilities:
- BlocklistDownloader: HTTP download with exponential backoff retry logic
* Handles transient failures (429, 5xx errors, timeouts)
* Configurable retry attempts and backoff strategy
* 93% test coverage
- BlocklistParser: Parse and validate IP addresses
* Extract valid IPv4/IPv6 addresses from text
* Skip CIDRs and malformed entries gracefully
* Separate parsing from validation concerns
* 100% test coverage
- BanExecutor: Ban execution with error handling
* Ban IPs via fail2ban socket
* Stop on JailNotFoundError (jail doesn't exist)
* Continue on JailOperationError (individual ban failures)
* 100% test coverage
- BlocklistImportWorkflow: Thin orchestrator
* Coordinates the download → parse → ban → log flow
* Pre-warms geo cache with newly banned IPs
* 96% test coverage
- blocklist_service.py: Maintains public API
* Source CRUD (create, read, update, delete)
* URL validation and preview functionality
* Scheduling configuration and import triggers
* 92% test coverage
Benefits:
* Each component is independently testable with mock dependencies
* Error handling is explicit and localized
* Components can evolve independently
* Logging is contextual and clear
* Retry and transient error handling are isolated
Testing:
* All 36 existing blocklist_service tests pass
* All 13 blocklist import task tests pass
* Added 17 comprehensive component unit tests
* Combined 96%+ coverage on new modules
* Zero type errors in new code
Documentation:
* Updated Refactoring.md with detailed architecture notes
* Added component architecture diagram to Architekture.md
* Documented ownership and responsibilities of each component
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add automatic cleanup of stale geolocation cache entries to prevent
unbounded database growth. Resolves the issue where unique IP addresses
accumulated indefinitely in the geo_cache table, degrading query performance.
## Changes
### Database Schema (Migration 3)
- Add 'last_seen' column to geo_cache table tracking last reference time
- Existing entries default to current timestamp
### Repository Layer (geo_cache_repo.py)
- Update upsert_entry() to set/refresh last_seen on insert/update
- Update upsert_neg_entry() to set/refresh last_seen on negative cache hits
- Update bulk_upsert_entries() to set/refresh last_seen in batch operations
- Add delete_stale_entries(db, cutoff_iso) -> int for purging old entries
### Background Task (geo_cache_cleanup.py)
- New APScheduler task that runs nightly (24-hour interval)
- Calculates cutoff as 90 days ago from current time (UTC)
- Deletes all entries with last_seen older than cutoff
- Logs operation results (info when deleted > 0, debug when 0 deleted)
- Configurable retention period via GEO_CACHE_RETENTION_DAYS constant
### Application Startup (startup.py)
- Register geo_cache_cleanup task in scheduler during app startup
- Placed after geo_cache_flush in task registration order
### Tests
- Add delete_stale_entries test cases covering:
* Removal of old entries beyond cutoff
* No deletion when all entries are recent
* Empty table edge case
- Update existing test fixtures to include last_seen column
- Add full test suite for cleanup task registration and execution
### Documentation
- Architekture.md: Document cleanup task, update schema/diagram
- Backend-Development.md: Add retention policy documentation
## Behavior
When an IP is accessed, its last_seen is refreshed. After 90 days of no
access, an IP is purged by the nightly cleanup. On next encounter, the IP
is re-resolved from MaxMind MMDB or ip-api.com (if configured).
This is acceptable because:
1. Stale geolocation data may become inaccurate over time
2. Re-resolution cost is minimal compared to unbounded storage growth
3. Active IPs maintain fresh data through their last_seen updates
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Make MaxMind GeoLite2-Country MMDB the primary IP resolver (local, encrypted)
and demote ip-api.com to optional fallback only (disabled by default).
Changes:
- Add geoip_allow_http_fallback config flag (default False) to Settings
- Refactor GeoCache.lookup() and lookup_batch() to try MMDB first
- Update startup.py to pass config flag and log security warning when HTTP enabled
- Update all 49 tests to reflect new MMDB-primary strategy
- Add comprehensive geoip configuration section to Backend-Development.md
- Update Architekture.md to show MMDB + optional HTTP in system dependencies
- Update .env.example with BANGUI_GEOIP_DB_PATH and HTTP fallback flag
Security impact:
- 99% of IP addresses (successful MMDB lookups) now stay local, encrypted
- HTTP-only IPs are cached for 5 minutes to minimize external calls
- Operators must explicitly enable HTTP fallback (security-conscious default)
- GDPR/CCPA compliance: no PII sent over unencrypted networks by default
Fixes TASK-030: Resolved plaintext IP transmission to ip-api.com
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Store session tokens as one-way SHA256 hashes instead of plaintext
- Hash tokens on write (create_session) and on read (get_session, delete_session)
- Add migration to drop plaintext sessions table and recreate with token_hash column
- Update Session model: token field still contains raw token for signing
- Add test to verify tokens are hashed in database, not plaintext
- Update Architekture.md to document session token hashing
- Update Backend-Development.md with implementation pattern and best practices
Prevents direct session token hijacking if database file is exposed to attacker.
If plaintext DB was readable, sessions are invalidated by the migration anyway.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TASK-006: Document the nginx routing configuration that ensures API requests
returning 404 from FastAPI are not intercepted by the SPA wildcard fallback
rule. This prevents development bugs from being masked by 200 responses
containing HTML instead of 404 errors.
Added section 9.2 in Architekture.md covering:
- nginx location block priority (longest-prefix matching)
- Routing configuration for /api/, /assets/, and /
- Detailed routing behavior diagrams
- Critical implementation notes to prevent regressions
The current nginx.conf is already correct:
- /api/ location has no try_files and proxies directly to backend
- /assets/ location uses try_files with =404
- / catch-all uses SPA fallback to index.html
This ensures:
✓ API typos like /api/jailss return 404, not SPA HTML
✓ Frontend routes serve SPA HTML for client-side routing
✓ Static assets properly return 404 when missing
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add comprehensive docstring to runtime_state.py explaining single-process
constraint, impacts in multi-worker deployments, and solution approach
- Add comprehensive docstring to session_cache.py explaining process-local
cache limitation, security implications, and Redis/database alternatives
- Update Architecture.md to clarify session cache is process-local and
describe single-worker enforcement via TASK-002
- Update Architecture.md runtime state section with detailed explanation of
per-process state and multi-worker impacts
- Add Backend-Development.md section 13.7.2 documenting session cache
pluggability pattern with example Redis implementation
- All tests pass; linting passes; type checking has pre-existing errors
This is the short-term fix for TASK-003: enforce single-worker deployment
(TASK-002) and document the constraint clearly. The long-term fix (Redis
backend) is deferred as a follow-up.
- Add _check_single_worker_mode() to startup.py that detects and rejects
multi-worker configurations, raising a clear RuntimeError with instructions
- Set BANGUI_WORKERS=1 as default in Dockerfile.backend
- Document single-worker requirement in compose.prod.yml
- Add 'Deployment Constraints' section to Architekture.md explaining why
single-worker mode is required and detailing future multi-worker support
- Add '9.1 Background Tasks and Scheduler Architecture' section to
Backend-Development.md documenting task structure and single-worker requirement
- Add comprehensive test suite (test_startup.py) covering all scenarios:
allows single worker, rejects multi-worker, validates config format,
and verifies informative error messages
This fix addresses TASK-002 which identified that in-process APScheduler is
unsafe in multi-worker deployments due to each worker creating independent
scheduler instances, causing duplicate background job execution.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Refactor action_config_service, filter_config_service, jail_config_service, and jail_service
- Add jail_socket utility module for socket communication
- Update test_jail_service with new test cases
- Update architecture and task documentation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create GeoCache class with all mutable state as instance attributes:
- _cache, _neg_cache, _dirty, _geoip_reader, _geoip_initialized, _cache_lock
- All public methods: lookup(), lookup_batch(), lookup_cached_only(), flush_dirty(), load_from_db(), clear(), etc.
Initialization & Dependency Injection:
- Instantiate GeoCache in startup.py and store on app.state.geo_cache
- Add get_geo_cache() dependency function in dependencies.py
- Inject into routes and tasks via FastAPI's dependency system
Backward Compatibility:
- Maintain module-level functions in geo_service.py as deprecated wrappers
- All old callers continue to work through _default_geo_cache instance
- Remove test-escape-hatch functions (clear_cache, clear_neg_cache moved to methods)
Background Tasks:
- Update geo_cache_flush.py and geo_re_resolve.py to receive GeoCache instance
- Tasks now operate on injected instance rather than module globals
Tests:
- Refactor test_geo_service.py with geo_cache fixture providing fresh instances
- Update patch paths to target GeoCache methods correctly
- Fix internal state assertions to access instance attributes
Documentation:
- Update Architekture.md to document GeoCache as managed stateful service
- Describe cache lifecycle (load on startup, flush periodically, re-resolve stale)
- Note process-local limitations for multi-worker deployments
Fixes violation of Single Responsibility Principle: module no longer owns both
lookup logic and cache lifecycle management. Cache is now a first-class
injectable service with transparent lifecycle.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Consolidate duplicate _ok(), _to_dict(), ensure_list(), and is_not_found_error()
functions from 6 service modules into a single canonical implementation at
backend/app/utils/fail2ban_response.py.
Changes:
- Create fail2ban_response.py with canonical implementations
- Remove local duplicates from: ban_service, jail_service, config_service,
health_service, server_service, config_file_utils
- Update all imports to use shared module
- Add comprehensive docstrings and examples
- Update Architecture.md and Backend-Development.md documentation
Benefits:
- Single source of truth for response parsing logic
- Eliminates code duplication across service layer
- Improves maintainability and consistency
- Enables centralized bug fixes and improvements
Tests: All 228 service tests passing, no regressions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move all shared domain exception classes to backend/app/exceptions.py and update services/routers to import the canonical exceptions. Update docs to reflect the shared exceptions source.
Extract jail, filter, and action configuration management into separate
domain-focused service modules:
- jail_config_service.py: Jail activation, deactivation, validation, rollback
- filter_config_service.py: Filter discovery, CRUD, assignment to jails
- action_config_service.py: Action discovery, CRUD, assignment to jails
Benefits:
- Reduces monolithic 3100-line module into three focused modules
- Improves readability and maintainability per domain
- Clearer separation of concerns following single responsibility principle
- Easier to test domain-specific functionality in isolation
- Reduces coupling - each service only depends on its needed utilities
Changes:
- Create three new service modules under backend/app/services/
- Update backend/app/routers/config.py to import from new modules
- Update exception and function imports to source from appropriate service
- Update Architecture.md to reflect new service organization
- All existing tests continue to pass with new module structure
Relates to Task 4 of refactoring backlog in Docs/Tasks.md
Task 2: adds a new Log tab to the Configuration page.
Backend:
- New Pydantic models: Fail2BanLogResponse, ServiceStatusResponse
(backend/app/models/config.py)
- New service methods in config_service.py:
read_fail2ban_log() — queries socket for log target/level, validates the
resolved path against a safe-prefix allowlist (/var/log) to prevent
path traversal, then reads the tail of the file via the existing
_read_tail_lines() helper; optional substring filter applied server-side.
get_service_status() — delegates to health_service.probe() and appends
log level/target from the socket.
- New endpoints in routers/config.py:
GET /api/config/fail2ban-log?lines=200&filter=...
GET /api/config/service-status
Both require authentication; log endpoint returns 400 for non-file log
targets or path-traversal attempts, 502 when fail2ban is unreachable.
Frontend:
- New LogTab.tsx component:
Service Health panel (Running/Offline badge, version, jail count, bans,
failures, log level/target, offline warning banner).
Log viewer with color-coded lines (error=red, warning=yellow,
debug=grey), toolbar (filter input + debounce, lines selector, manual
refresh, auto-refresh with interval selector), truncation notice, and
auto-scroll to bottom on data updates.
fetchData uses Promise.allSettled so a log-read failure never hides the
service-health panel.
- Types: Fail2BanLogResponse, ServiceStatusResponse (types/config.ts)
- API functions: fetchFail2BanLog, fetchServiceStatus (api/config.ts)
- Endpoint constants (api/endpoints.ts)
- ConfigPage.tsx: Log tab added after existing tabs
Tests:
- Backend service tests: TestReadFail2BanLog (6), TestGetServiceStatus (2)
- Backend router tests: TestGetFail2BanLog (8), TestGetServiceStatus (3)
- Frontend: LogTab.test.tsx (8 tests)
Docs:
- Features.md: Log section added under Configuration View
- Architekture.md: config.py router and config_service.py descriptions updated
- Tasks.md: Task 2 marked done
- Backend: config_file_service.py parses jail.conf/jail.local/jail.d/*
following fail2ban merge order; discovers jails not running in fail2ban
- Backend: 3 new API endpoints (GET /jails/inactive, POST /jails/{name}/activate,
POST /jails/{name}/deactivate); moved /jails/inactive before /jails/{name}
to fix route-ordering conflict
- Frontend: ActivateJailDialog component with optional parameter overrides
- Frontend: JailsTab extended with inactive jail list and InactiveJailDetail pane
- Frontend: JailsPage JailOverviewSection shows inactive jails with toggle
- Tests: 57 service tests + 16 router tests for all new endpoints (all pass)
- Docs: Features.md, Architekture.md, Tasks.md updated; Tasks 1.1-1.5 marked done
- backend: GET /api/dashboard/bans/by-jail endpoint
- JailBanCount + BansByJailResponse Pydantic models in ban.py
- bans_by_jail() service function with origin filter support
- Route added to dashboard router
- 17 new tests (7 service, 10 router); full suite 497 passed, 83% coverage
- frontend: JailDistributionChart component
- JailBanCount / BansByJailResponse types in types/ban.ts
- dashboardBansByJail endpoint constant in api/endpoints.ts
- fetchBansByJail() in api/dashboard.ts
- useJailDistribution hook in hooks/useJailDistribution.ts
- JailDistributionChart component (horizontal bar chart, Recharts)
- DashboardPage: full-width Jail Distribution section below Top Countries
- Implement ban model, service, and router endpoints in backend
- Add ban table component and dashboard integration in frontend
- Update ban-related types and API endpoints
- Add comprehensive tests for ban service and dashboard router
- Update documentation (Features, Tasks, Architecture, Web-Design)
- Clean up old fail2ban configuration files
- Update Makefile with new commands
- Add SetupGuard component: redirects to /setup if setup not complete,
shown as spinner while loading. All routes except /setup now wrapped.
- SetupPage redirects to /login on mount when setup already done.
- Fix async blocking: offload bcrypt.hashpw and bcrypt.checkpw to
run_in_executor so they never stall the asyncio event loop.
- Hash password with SHA-256 (SubtleCrypto) before transmission; added
src/utils/crypto.ts with sha256Hex(). Backend stores bcrypt(sha256).
- Add Makefile with make up/down/restart/logs/clean targets.
- Add tests: _check_password async, concurrent bcrypt, expired session,
login-without-setup, run_setup event-loop interleaving.
- Update Architekture.md and Features.md to reflect all changes.