BanGUI/Docs/Tasks.md

# BanGUI — Task List

This document breaks the entire BanGUI project into development stages, ordered so that each stage builds on the previous one. Every task is described in prose with enough detail for a developer to begin work. References point to the relevant documentation.

Reference: `Docs/Refactoring.md` for full analysis of each issue.

---

## Open Issues

1. Move runtime application state out of `app.state`
   - Goal: Remove process-local mutable business state from FastAPI application state and centralise it in a cluster-safe, testable runtime state abstraction.
   - Issue: `app.state` is currently used to store runtime values such as `setup_complete_cached`, `server_status`, `pending_recovery`, and `last_activation`, which are process-specific and leak operational state into the framework layer.
   - Propose: Introduce a dedicated runtime state manager or cache service that owns these values and exposes them through explicit dependencies. Keep `app.state` limited to shared immutable resources and infrastructure handles.
   - Test: Verify the backend no longer stores business state in `app.state` and that runtime state is provided by the new abstraction only.
   - Status: completed

2. Eliminate direct `app.state` access from routers
   - Goal: Keep routers strictly in the HTTP translation layer by using FastAPI `Depends()` and injected services only.
   - Issue: Several routers still read or mutate `app.state` directly, including `setup.py` and `config.py`, which breaks the intended separation of concerns and makes handlers harder to test.
   - Propose: Refactor routers to consume explicit dependencies for required values and actions, and move all direct state mutations into services or dedicated state managers.
   - Test: Ensure all router handlers compile without requiring direct `app.state` access and existing endpoint tests still pass.
   - Status: completed

3. Replace process-local session cache with a pluggable abstraction
   - Goal: Make session validation consistent across multi-worker deployments by replacing the module-level in-memory cache with an injectable cache backend.
   - Issue: `_session_cache` in `app/dependencies.py` is a process-local dict, so logout invalidation and session revocation only work within a single process.
   - Propose: Define a cache interface and provide a default in-memory implementation, with the option to swap in shared cache storage (Redis, Memcached) for clustered production deployments.
   - Test: Add unit tests for the cache abstraction and verify logout/invalidation behaves correctly through the configured cache implementation.
   - Status: completed

4. Harden SQLite connection configuration and lifecycle
   - Goal: Make application database access robust under concurrent requests and background task load.
   - Issue: `get_db` opens a fresh `aiosqlite` connection per request without explicit database pragmas such as `busy_timeout` or WAL mode.
   - Propose: Configure SQLite connections in `open_db` with safe defaults (e.g. WAL, busy timeout, journal mode) and consider a centralized request-scoped access wrapper to keep connection setup consistent.
   - Test: Confirm `open_db` applies the expected pragmas and add simulated concurrency tests that surface lock / timeout failures.
   - Status: completed

5. Separate startup settings from runtime configuration mutation
   - Goal: Keep startup configuration immutable after bootstrap and handle runtime overrides through a dedicated manager.
   - Issue: `POST /api/setup` mutates `app.state.settings`, and startup logic also replaces `app.state.settings` with persisted runtime overrides, mixing immutable startup settings with mutable runtime config.
   - Propose: Introduce a runtime settings manager or configuration service that resolves effective settings from persisted overrides without mutating the startup settings object on `app.state`.
   - Test: Validate the setup flow persists runtime settings in the database and that runtime settings are resolved through the new manager instead of direct `app.state.settings` mutation.
   - Status: completed

6. Introduce explicit service and repository abstractions for swapability and testing
   - Goal: Reduce tight coupling and improve replacement/testing of core backend components.
   - Issue: Routers and higher-level services currently import concrete service modules directly, which prevents clean substitution and dependency override testing.
   - Propose: Define protocols or abstract base classes for major services and repositories, then wire concrete implementations through FastAPI dependency providers.
   - Test: Add tests that override a service dependency with a fake implementation and verify the router behavior remains correct.
   - Status: completed
   - Completed: Added service/repository protocols, injected auth/jail services via FastAPI dependencies, and added router tests for dependency overrides.

7. Move operational orchestration out of routers and into service/task layer
   - Goal: Keep routers thin and move operational control flow into service or task components.
   - Issue: `app/routers/config.py` contains operational orchestration such as activation crash tracking, pending recovery state updates, and forced health probes, which belongs in service/task code not the HTTP layer.
   - Propose: Refactor jail activation/deactivation/recovery coordination into services or task managers that manage state updates and health probe triggers on behalf of the router.
   - Test: Confirm router tests only cover HTTP translation while unit tests for the new service/task components cover the orchestration logic.
   - Status: completed

8. Decouple periodic background jobs from FastAPI application state
   - Goal: Make scheduled task runners explicit and testable by removing direct `app.state` dependency from background task code.
   - Issue: `history_sync`, `geo_re_resolve`, `geo_cache_flush`, and `blocklist_import` read shared resources directly from `app.state` and register jobs against `app.state.scheduler`, which couples task implementation to FastAPI internals.
   - Propose: Introduce a scheduler/task bootstrap abstraction that accepts injected settings, database providers, HTTP sessions, and scheduler handles, and move task resource access out of the `app.state` internals.
   - Test: Add unit tests for task registration and execution with fake resource providers and a mock scheduler, without needing a real FastAPI instance.

9. Remove brittle scheduler bootstrap logic from blocklist job registration
   - Goal: Ensure job registration is deterministic and compatible with both synchronous and asynchronous startup contexts.
   - Issue: `blocklist_import.register` uses `asyncio.get_event_loop().run_until_complete` then falls back to `asyncio.ensure_future`, which is fragile and can fail if the application is already running inside an event loop.
   - Propose: Refactor blocklist job registration to use a pure async bootstrap flow invoked from the lifespan handler, or resolve schedule configuration before starting the scheduler instead of forcing synchronous loop execution.
   - Test: Add tests validating blocklist job registration in both non-running and running event loop contexts, and verify the registration path does not raise loop state errors.

10. Standardize async offloading and thread-backed sync work behind a shared executor abstraction
   - Goal: Remove scattered `asyncio.get_event_loop().run_in_executor` / `asyncio.ensure_future` patterns and give the backend a single, consistent way to run blocking file, CPU-bound, or compatibility work.
   - Issue: Multiple services (`raw_config_io_service`, `log_service`, `jail_config_service`, `setup_service`, and others) manually call `asyncio.get_event_loop()` or `run_in_executor`, creating inconsistent async control flow and making event-loop compatibility harder to maintain.
   - Propose: Introduce a shared async offload abstraction or utility that uses `asyncio.to_thread` / dedicated thread pool executors, and refactor blocking I/O and CPU-bound helpers to consume that shared provider.
   - Test: Verify existing async service tests still pass after refactoring and add coverage for the new offload utility to ensure blocking work is properly delegated without event-loop blocking.

11. Eliminate duplicate setup persistence across bootstrap and runtime databases
   - Goal: Simplify setup persistence by establishing a single source of truth for runtime configuration data and reduce failure surface in first-run setup.
   - Issue: `setup_service.run_setup` writes the same master password and runtime settings to both the bootstrap configuration database and the runtime database, making setup logic brittle and difficult to reason about.
   - Propose: Consolidate setup persistence so initial configuration is stored in one clearly defined location, with bootstrap metadata in the startup DB and runtime configuration pulled from the single persisted source.
   - Test: Add tests that confirm setup writes each value exactly once and that startup uses a single persisted store to resolve runtime settings, removing duplicate write paths.

12. Replace custom fail2ban socket protocol duplication with a dedicated adapter over the vendored client
   - Goal: Reduce maintenance risk by aligning fail2ban integration with the bundled `fail2ban-master` codebase and isolating protocol implementation behind a swapable adapter.
   - Issue: `app.utils.fail2ban_client` reimplements low-level socket framing, command encoding, and protocol parsing rather than using the vendored fail2ban client classes, creating duplicated protocol logic and an unclear source of truth.
   - Propose: Introduce a fail2ban adapter interface and either wrap the vendored `fail2ban-client` implementation or refactor the custom client so it is the single canonical integration point. Ensure all services depend on the adapter abstraction rather than raw socket details.
   - Test: Add adapter-level unit tests and service tests that can swap in a fake fail2ban adapter, proving the backend no longer couples business logic directly to low-level socket protocol code.

13. Introduce explicit schema migration/versioning for the runtime database
   - Goal: Allow BanGUI to evolve its application database schema safely across releases and prevent startup failures caused by schema drift.
   - Issue: `app/db.py` only creates missing tables on startup and has no schema version marker or migration strategy. This leaves upgrades brittle and makes it difficult to evolve the database without manual intervention.
   - Propose: Add a lightweight migration system or schema version table, define incremental migration scripts, and run upgrades during startup before the application begins serving requests.
   - Test: Add tests that simulate an older schema version and verify startup migrates it to the latest schema, and assert that `init_db`/migration code leaves the database in the expected current state.

14. Split the monolithic config router into focused sub-routers
   - Goal: Improve maintainability, reduce cognitive load, and make HTTP routing responsibilities easier to isolate and test.
   - Issue: `app/routers/config.py` and `app/routers/file_config.py` each contain many unrelated endpoints with mixed concerns, making the HTTP layer large, hard to navigate, and harder to evolve safely.
   - Propose: Refactor configuration endpoints into smaller routers or feature-specific modules such as `jail_config`, `filter_config`, `action_config`, and `raw_config`. Keep each router focused on one vertical slice of functionality and delegate business logic to a single service.
   - Test: Verify the refactor preserves endpoint behavior and that router tests are split along the new modules, ensuring each HTTP layer unit test covers a narrow, well-defined feature.

15. Centralize fail2ban metadata resolution and avoid repeated socket discovery
   - Goal: Reduce coupling between history APIs and fail2ban socket metadata discovery, improving performance and reliability.
   - Issue: `app/utils/fail2ban_db_utils.get_fail2ban_db_path` queries fail2ban for the database path on every history request. This ties read-only history endpoints to socket availability and duplicates discovery logic across the codebase.
   - Propose: Introduce a dedicated fail2ban metadata service that resolves and caches runtime metadata such as the fail2ban DB path, refreshing it only when needed instead of every request.
   - Test: Add tests that confirm history requests can reuse cached metadata, and that the metadata service falls back cleanly when the socket is temporarily unavailable after initial resolution.

16. Introduce an explicit application lifecycle context to replace raw `app.state` access
   - Goal: Make shared infrastructure handles and runtime configuration available through a typed, injectable context instead of a loosely-typed `app.state` bag.
   - Issue: The backend currently relies on raw `app.state` access in routers, tasks, and dependencies, which hides contract boundaries and allows arbitrary mutable state to leak into the framework layer.
   - Propose: Define a dedicated `ApplicationContext` / runtime context object for shared resources such as `settings`, `http_session`, `scheduler`, and runtime caches. Provide it through explicit FastAPI dependencies and remove direct `request.app.state` reads outside a small bootstrap boundary.
   - Test: Verify the new context can satisfy existing dependencies, and add a static check or test ensuring no backend module directly accesses `request.app.state` except through the new context provider.

17. Resolve runtime configuration before creating shared startup resources
   - Goal: Ensure startup resources are initialized using the effective runtime settings, not stale bootstrap defaults.
   - Issue: `startup_shared_resources` creates shared resources like `aiohttp.ClientSession` and geo cache initialization from the initial environment-loaded settings, then later applies persisted runtime overrides to `app.state.settings`, producing a fragile startup ordering.
   - Propose: Split startup into phases that first resolve bootstrap and runtime persisted configuration, then construct shared resources and register scheduled jobs using those effective settings.
   - Test: Add startup tests asserting that when persisted runtime settings differ from bootstrap settings, the final initialized resources are built from the resolved effective settings, not the original bootstrap values.
   - Status: completed