- Change _fail2ban_connection_handler() to return generic message instead of
leaking socket path in HTTP 502 response body
- Change _fail2ban_protocol_handler() to return generic message instead of
leaking raw exception details in HTTP 502 response body
- Full error details are still logged server-side (error=str(exc)) for debugging
- Update Backend-Development.md with error message hygiene section explaining
the pattern: generic user-friendly messages in HTTP responses, full details
in server logs only
Fixes TASK-029: Fail2BanConnectionError leaks socket path in HTTP error responses
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create logged_task() helper in backend/app/utils/async_utils.py to wrap
fire-and-forget coroutines with exception logging
- Ensures unhandled task exceptions are always logged to structlog instead of
silently discarded (Python 3.11+ RuntimeWarning)
- Update ban_service.py to use logged_task() for geo_cache.lookup_batch()
background resolution
- Add comprehensive tests for logged_task() in test_async_utils.py
- Document fire-and-forget task conventions in Backend-Development.md
The logged_task() wrapper catches any exception raised in a background task,
logs it with full traceback context and task name, and never re-raises.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Addresses security concern where FastAPI's default behavior exposes interactive
API documentation (/docs, /redoc) without authentication, allowing attackers to
enumerate endpoints and understand API schemas.
Changes:
- Add BANGUI_ENABLE_DOCS boolean setting (default: false) to Settings
- Modify create_app() to conditionally set docs_url, redoc_url, openapi_url
- Add docs endpoints to SetupRedirectMiddleware allowlist (/api/docs, /api/redoc, /api/openapi.json)
- Set BANGUI_ENABLE_DOCS=true in Docker/compose.debug.yml for development
- Production compose files leave it unset (defaults to false, docs disabled)
- Add comprehensive tests for docs configuration
- Document the new setting in Backend-Development.md
Security Impact:
- API documentation is now disabled by default in production
- Development environments can enable docs by setting BANGUI_ENABLE_DOCS=true
- Docs endpoints are inaccessible in production without manual configuration
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove the early-return branch that skipped HMAC verification for unsigned tokens
- Raise ValueError if the signature separator is absent
- Update unwrap_session_token docstring to reflect mandatory signing requirement
- Add comprehensive session token signing documentation to Backend-Development.md
- Document the session token format, signing/verification pattern, and security rationale
All tokens must now carry a valid HMAC-SHA256 signature. Tokens without a
signature are rejected immediately. This removes the vulnerability where an
attacker with database access could bypass the HMAC layer by using raw tokens.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace non-atomic db.executescript() with explicit transaction control.
Wrap each migration's DDL statements and schema_migrations insert in a
single BEGIN IMMEDIATE ... COMMIT transaction to ensure atomicity.
Changes:
- Add _parse_migration_statements() to split migration scripts into
individual statements while handling comments and string literals
- Update _apply_migration() to wrap all statements in a single explicit
transaction with rollback on error
- Ensure _get_current_schema_version() uses execute() instead of
executescript()
- Add 9 new tests for migration atomicity and statement parsing
- Update Backend-Development.md with migration authoring guidelines
If a crash occurs between DDL execution and schema_migrations insert,
the next startup will re-apply the entire migration atomically,
preventing partial migrations and data corruption.
Test coverage: 98% on db.py (up from 55%)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Store session tokens as one-way SHA256 hashes instead of plaintext
- Hash tokens on write (create_session) and on read (get_session, delete_session)
- Add migration to drop plaintext sessions table and recreate with token_hash column
- Update Session model: token field still contains raw token for signing
- Add test to verify tokens are hashed in database, not plaintext
- Update Architekture.md to document session token hashing
- Update Backend-Development.md with implementation pattern and best practices
Prevents direct session token hijacking if database file is exposed to attacker.
If plaintext DB was readable, sessions are invalidated by the migration anyway.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
**Issue:**
- log_target accepted arbitrary paths, allowing authenticated users to write
files as root via fail2ban (e.g., /etc/cron.d/bangui-pwned)
- fail2ban runs as root and opens files specified in log_target
**Solution:**
1. **Model layer validation:** Already existed in GlobalConfigUpdate, prevents
invalid paths before reaching service
2. **Service layer validation:** Added defensive check in update_global_config()
that validates log_target even if model validation is bypassed
3. **New validation helper:** Added validate_log_target() utility that accepts
special values (STDOUT, STDERR, SYSLOG) or paths within allowed directories
**Changes:**
- app/utils/path_utils.py: Added validate_log_target() helper
- app/services/config_service.py: Added service-layer validation before
sending command to fail2ban
- backend/tests: Fixed session_secret length issues in fixtures (min 32 chars)
- backend/tests: Added tests for valid special log targets
- Docs/Backend-Development.md: Documented log_target security requirements
**Test Coverage:**
- Model validation rejects /etc/passwd (existing test)
- Model validation accepts STDOUT, STDERR, SYSLOG special values
- Model validation accepts paths in allowed directories
- Service layer validation tested with special values
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Updated config.py to support environment-based configuration
- Added test_config.py with full test coverage
- Updated Backend-Development.md with configuration documentation
- Removed outdated tasks from Tasks.md
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replace Path.write_text() with tempfile.NamedTemporaryFile + os.replace()
in _write_conf_file() and _create_conf_file()
- Ensures atomic operations on same filesystem (temp file in target.parent)
- Prevents config corruption if process is killed mid-write
- Follows existing pattern in jail_config_service.py
- Add proper cleanup of temp files on error with contextlib.suppress()
- Document atomic file write conventions in Backend-Development.md
This prevents fail2ban config files (especially jail.d/*.conf) from being
left in a truncated or corrupt state, which could disable active protection.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add escape_like() helper to escape % and _ wildcards in LIKE queries
- Update fail2ban_db_repo.get_history_page() to use escaping
- Update history_archive_repo.get_archived_history() to use escaping
- Add ESCAPE clause to all LIKE queries
- Add comprehensive unit tests for escape_like function
- Add integration tests for LIKE wildcard handling
- Document LIKE escaping best practices in Backend-Development.md
Fixes TASK-017: Prevent unintended LIKE matches when IP filter contains
special characters like underscore or percent sign.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Extract path validation logic into shared helper function in
backend/app/utils/path_utils.py (validate_log_path)
- Refactor AddLogPathRequest to use the helper function
- Apply the same validation to DELETE /api/config/jails/{name}/logpath
endpoint by validating the log_path query parameter
- Return HTTP 422 with descriptive error if validation fails
- Add comprehensive unit tests for path validation
- Update Backend-Development.md with usage examples
This prevents path-traversal attacks on the delete_log_path endpoint
by ensuring all log paths are within allowlisted directories.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add LogLevel Literal type: CRITICAL, ERROR, WARNING, NOTICE, INFO, DEBUG
- Add log_target validation to accept special values (STDOUT, STDERR, SYSLOG)
or validated file paths within allowed directories
- Update GlobalConfigResponse to use LogLevel type
- Add field_validator for log_target in both GlobalConfigUpdate and
GlobalConfigResponse following the same pattern as AddLogPathRequest
- Add @autouse fixture to test_config_service.py to mock get_settings
- Update existing tests to use uppercase log level values
- Add 12 comprehensive tests for new validation in test_models.py
- Update Features.md to document valid log_target and log_level values
- Add section to Backend-Development.md documenting Literal types and
field_validator patterns with examples
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restrict monitored log paths to a configurable allowlist of safe directories
to prevent authenticated users from instructing fail2ban to monitor arbitrary
files on the system, which could leak contents via fail2ban logging.
Changes:
- Add 'allowed_log_dirs' setting to Settings (defaults to /var/log, /config/log)
- Add @field_validator to AddLogPathRequest to validate log paths at request time
- Validator resolves paths to canonical form and checks against allowed prefixes
- Use Path.is_relative_to() to prevent prefix bypass attacks like /var/log_evil
- Add comprehensive tests for valid/invalid paths and symlink handling
- Update Features.md and Backend-Development.md with security documentation
Security improvements:
- Blocks access to sensitive files (/etc/shadow, /etc/passwd, etc.)
- Resolves symlinks before validation to prevent escape routes
- Uses proper path comparison instead of string prefix matching
- Configurable via BANGUI_ALLOWED_LOG_DIRS environment variable
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace sensitive token fragments in structured logs with:
- login(): Use session_id=session.id (database row ID) instead of token_prefix
- logout(): Use token_hash (SHA256 one-way hash, first 12 chars) instead of token_prefix
This prevents partial token material leakage into log aggregation systems while
maintaining useful session correlation via hashed tokens or database IDs.
Also updated Backend-Development.md to clarify logging conventions for
sensitive data handling.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add @field_validator for fail2ban_start_command to validate with shlex.split()
at startup, catching misconfigured commands with mismatched quotes
- Replace .split() with shlex.split() in jail_config.py line 450
- Replace .split() with shlex.split() in config_misc.py line 154
- Update Backend-Development.md with configuration documentation explaining
quoted path handling and common pitfalls
- Add comprehensive test suite (8 tests) covering valid commands, quoted paths,
and mismatched quote errors
This fix ensures commands like '/opt/my tools/fail2ban-client' start are
correctly parsed as two tokens instead of three, preventing execution failures
when the path contains spaces.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add in-memory rate limiter with per-IP deque tracking of attempt timestamps
- Limit login attempts to 5 per 60 seconds per IP, return 429 on excess
- Add Retry-After header to rate limit responses
- Implement IP extraction utility with proxy trust validation (prevent X-Forwarded-For spoofing)
- Integrate rate limiter into auth router and dependencies
- Add 10-second asyncio.sleep on failed login attempts to further slow brute-force
- Add comprehensive tests for rate limiting (9 new tests, all passing)
- Update Features.md to document login rate limiting
- Update Backend-Development.md with rate limiting conventions and design patterns
- Fix test infrastructure issues: update password to meet complexity requirements
- Fix TestValidateSession tests to use Bearer token authentication
- All tests passing: 23 auth tests + full test suite coverage
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Validates session on app mount by calling GET /api/auth/session instead of relying
solely on cached sessionStorage. This ensures the UI state always reflects server
reality — expired or revoked sessions are detected immediately.
Changes:
- Backend: Add GET /api/auth/session endpoint (requires valid session, returns 200/401)
- Frontend: Add useSessionValidation hook for mount-time validation
- Frontend: Add SessionValidationLoading component for validation spinner
- Frontend: Update AuthProvider to call validation on mount with loading state
- Frontend: Add validateSession API function
- Docs: Update Features.md with session validation behavior
- Docs: Update Web-Development.md with session validation pattern
Handles three outcomes:
1. Valid session (200): Proceed with cached state
2. Invalid session (401): Clear sessionStorage and redirect to login
3. Network error: Don't logout (backend may be temporarily unreachable)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add comprehensive docstring to runtime_state.py explaining single-process
constraint, impacts in multi-worker deployments, and solution approach
- Add comprehensive docstring to session_cache.py explaining process-local
cache limitation, security implications, and Redis/database alternatives
- Update Architecture.md to clarify session cache is process-local and
describe single-worker enforcement via TASK-002
- Update Architecture.md runtime state section with detailed explanation of
per-process state and multi-worker impacts
- Add Backend-Development.md section 13.7.2 documenting session cache
pluggability pattern with example Redis implementation
- All tests pass; linting passes; type checking has pre-existing errors
This is the short-term fix for TASK-003: enforce single-worker deployment
(TASK-002) and document the constraint clearly. The long-term fix (Redis
backend) is deferred as a follow-up.
- Add _check_single_worker_mode() to startup.py that detects and rejects
multi-worker configurations, raising a clear RuntimeError with instructions
- Set BANGUI_WORKERS=1 as default in Dockerfile.backend
- Document single-worker requirement in compose.prod.yml
- Add 'Deployment Constraints' section to Architekture.md explaining why
single-worker mode is required and detailing future multi-worker support
- Add '9.1 Background Tasks and Scheduler Architecture' section to
Backend-Development.md documenting task structure and single-worker requirement
- Add comprehensive test suite (test_startup.py) covering all scenarios:
allows single worker, rejects multi-worker, validates config format,
and verifies informative error messages
This fix addresses TASK-002 which identified that in-process APScheduler is
unsafe in multi-worker deployments due to each worker creating independent
scheduler instances, causing duplicate background job execution.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem: Repository modules use structural typing to satisfy Protocol interfaces via
cast(). A function rename, parameter change, or signature mismatch would silently pass
mypy but fail at runtime.
Solution (Option B — minimal):
1. Aligned Protocol signatures in protocols.py with actual implementations:
- BlocklistRepository: dict[str, object] → dict[str, Any] (matches implementation)
- ImportLogRepository: dict[str, object] → ImportLogRow (typed model)
- GeoCacheRepository: dict[str, object] → GeoCacheRow; Iterable → Sequence
- HistoryArchiveRepository: dict[str, object] → dict[str, Any]
- ImportLogRepository: async compute_total_pages → sync (matches implementation)
2. Created CI validation script (backend/scripts/validate_repository_protocols.py)
that runs at build time to ensure all repository modules satisfy their Protocol
interfaces. Exit 0 if valid, 1 if any mismatch. Detects:
- Missing functions
- Parameter count mismatches
- Type annotation mismatches
- Return type mismatches
3. Updated backend/app/dependencies.py with explicit docstrings linking each
get_*_repo() provider to Backend-Development.md § 13.7.1, explaining the
module-as-Protocol pattern and that it is intentional and validated.
4. Documented the pattern in Backend-Development.md § 13.7.1:
'Repository Module Pattern — Module-as-Protocol Structural Compatibility'
explaining why the pattern works, risks (silent breakage), and how the
validation mitigates it.
5. Fixed type annotation in history_archive_repo.py:
- get_all_archived_history returns list[dict] → list[dict[str, Any]]
- Imported Any type
Benefits:
- Prevents silent breakage of repository interfaces
- Formalizes the module-as-Protocol pattern as intentional
- CI validation prevents regressions without refactoring cost
- All repository tests pass (53/53)
- mypy --strict passes on modified files
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of returning a bound method (geo_cache.lookup_batch), now inject
the GeoCache instance directly into routers and services. This provides
proper runtime isolation since T-04 made GeoCache a proper object.
Changes:
- Remove get_geo_batch_lookup() dependency provider
- Add GeoCacheDep type alias for injecting GeoCache instances
- Update all routers (bans, blocklist, dashboard, jails) to use GeoCacheDep
- Update ban_service, blocklist_service, jail_service to accept GeoCache
- Update service protocols to match new signatures
- Update docstrings to reference GeoCache methods instead of module functions
All callers now call geo_cache.lookup_batch(...) directly instead of
geo_batch_lookup(...), providing real dependency injection with proper
testing isolation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Consolidate the two divergent implementations of _since_unix from ban_service.py
and history_service.py into a single shared utility function in time_utils.py.
Changes:
- Move _since_unix to app/utils/time_utils.py with consistent time.time() approach
- Move TIME_RANGE_SLACK_SECONDS constant to app/utils/constants.py
- Update ban_service.py to import since_unix from time_utils
- Update history_service.py to import since_unix from time_utils
- Both services now use the same window boundary calculation with 60-second slack
- Add comprehensive tests for the shared since_unix function
- Document timestamp handling rationale in Backend-Development.md
This ensures dashboard and history queries return consistent row counts for the
same time range by using the same timestamp calculation and slack window across
all services.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Refactor action_config_service, filter_config_service, jail_config_service, and jail_service
- Add jail_socket utility module for socket communication
- Update test_jail_service with new test cases
- Update architecture and task documentation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The AppState Protocol (lines 42-54) and ApplicationContext dataclass
(lines 57-69) described identical fields, creating maintenance burden.
The Protocol was only used for a single cast() in _build_app_context.
Changes:
- Remove AppState Protocol class
- Import ApplicationState from runtime_state.py
- Replace cast('AppState', request.app.state) with
cast(ApplicationState, request.app.state)
- Remove unused Protocol import
This eliminates the redundancy while maintaining the same typing
guarantees. request.app.state is set to ApplicationState instances
during app initialization in main.py.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move session cache initialization from per-request _build_app_context to
startup lifespan handler. The session cache type is now decided once at app
startup based on settings, making _build_app_context pure (read-only).
Changes:
- Move cache initialization logic to new _update_session_cache() in main.py
- Call _update_session_cache() during lifespan startup to initialize cache
- Remove three if/elif/elif branches mutating state.session_cache from _build_app_context
- Add cache swap logic to set_runtime_settings() in runtime_state.py to handle
runtime settings changes (e.g., setup wizard updates)
- Keep app.state.session_cache initialization in create_app() for test compatibility
This ensures:
- _build_app_context is pure and doesn't mutate app state on each request
- Session cache configuration decisions are centralized at startup
- Settings changes during runtime (via setup wizard) also trigger cache swap
- Cache initialization logic is isolated in one place
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Create GeoCache class with all mutable state as instance attributes:
- _cache, _neg_cache, _dirty, _geoip_reader, _geoip_initialized, _cache_lock
- All public methods: lookup(), lookup_batch(), lookup_cached_only(), flush_dirty(), load_from_db(), clear(), etc.
Initialization & Dependency Injection:
- Instantiate GeoCache in startup.py and store on app.state.geo_cache
- Add get_geo_cache() dependency function in dependencies.py
- Inject into routes and tasks via FastAPI's dependency system
Backward Compatibility:
- Maintain module-level functions in geo_service.py as deprecated wrappers
- All old callers continue to work through _default_geo_cache instance
- Remove test-escape-hatch functions (clear_cache, clear_neg_cache moved to methods)
Background Tasks:
- Update geo_cache_flush.py and geo_re_resolve.py to receive GeoCache instance
- Tasks now operate on injected instance rather than module globals
Tests:
- Refactor test_geo_service.py with geo_cache fixture providing fresh instances
- Update patch paths to target GeoCache methods correctly
- Fix internal state assertions to access instance attributes
Documentation:
- Update Architekture.md to document GeoCache as managed stateful service
- Describe cache lifecycle (load on startup, flush periodically, re-resolve stale)
- Note process-local limitations for multi-worker deployments
Fixes violation of Single Responsibility Principle: module no longer owns both
lookup logic and cache lifecycle management. Cache is now a first-class
injectable service with transparent lifecycle.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Reorganized dashboard router with improved structure
- Enhanced ban_service with better separation of concerns
- Updated history service with cleaner logic
- Improved constants and configuration handling
- Updated documentation of completed tasks
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After removing all try/except blocks that used HTTPException for domain
exception conversion, these imports are no longer needed in the routers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Consolidate duplicate _ok(), _to_dict(), ensure_list(), and is_not_found_error()
functions from 6 service modules into a single canonical implementation at
backend/app/utils/fail2ban_response.py.
Changes:
- Create fail2ban_response.py with canonical implementations
- Remove local duplicates from: ban_service, jail_service, config_service,
health_service, server_service, config_file_utils
- Update all imports to use shared module
- Add comprehensive docstrings and examples
- Update Architecture.md and Backend-Development.md documentation
Benefits:
- Single source of truth for response parsing logic
- Eliminates code duplication across service layer
- Improves maintainability and consistency
- Enables centralized bug fixes and improvements
Tests: All 228 service tests passing, no regressions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>