# BanGUI — Task List

This document breaks the entire BanGUI project into development stages, ordered so that each stage builds on the previous one. Every task is described in prose with enough detail for a developer to begin work. References point to the relevant documentation.

Reference: `Docs/Refactoring.md` for full analysis of each issue.

---

## Open Issues

---

### Task 1 — Move `Fail2BanConnectionError` and `Fail2BanProtocolError` to `app/exceptions.py`

**Found in:** `backend/app/utils/fail2ban_client.py` lines 128 and 142. Every router that catches these must currently `from app.utils.fail2ban_client import Fail2BanConnectionError`, importing a transport-layer utility directly.

**Goal:** Move (or re-export) both exception classes into `app/exceptions.py` alongside the rest of the domain error hierarchy. Routers and services should import them from `app.exceptions`, not from a utility module. Either `fail2ban_client.py` keeps the class definitions and `exceptions.py` re-exports them, or the definitions move to `exceptions.py` and `fail2ban_client.py` imports them from there; pick whichever direction avoids a circular import.

**Possible traps and issues:**

- `fail2ban_client.py` is in the `utils` layer; `exceptions.py` is in `app`. Moving the definition to `exceptions.py` and importing it back into `fail2ban_client.py` (for raising) is the natural direction, but verify there is no circular import chain (`exceptions` → `utils` → `exceptions`).
- All callers across `routers/` and `services/` must be updated to import from the new location. A global grep is needed before starting.
- `main.py` registers global exception handlers for both; that import path must be updated too.

**Docs changes needed:** Update `Docs/Refactoring.md` to mark this issue resolved. No user-facing API change.

**Why this is needed:** All domain exceptions should live in one registry so that global exception handlers, tests, and consumers have a single import target.
Having transport-layer utilities define exceptions that the entire router layer must catch defeats the purpose of `app/exceptions.py`.

**Status:** Completed ✅

---

### Task 2 — Move the five `ConfigFile*` exceptions out of `raw_config_io_service.py`

**Found in:** `backend/app/services/raw_config_io_service.py` lines 66–100 (`ConfigDirError`, `ConfigFileNotFoundError`, `ConfigFileExistsError`, `ConfigFileWriteError`, `ConfigFileNameError`). `backend/app/routers/file_config.py` lines 55–61 imports them directly from the service.

**Goal:** Move all five class definitions into `app/exceptions.py`. Update `raw_config_io_service.py` to import and raise them from there. Update `file_config.py` to import them from `app.exceptions`.

**Possible traps and issues:**

- The existing names must be preserved (they are part of the public raise/catch contract); do not rename them.
- `file_config.py` currently imports from `app.services.raw_config_io_service` — that import line must change. Check whether any test files also import these classes directly from the service.
- After the move, confirm that `raw_config_io_service.py` still raises them (not the old local definitions that are now gone).

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Why this is needed:** Routers should not reach into service internals to import exception types. All domain exceptions belong in the central registry so `main.py` global handlers and any future middleware can catch them uniformly.

**Status:** Completed ✅

---

### Task 3 — Eliminate the `config_file_service.py` delegation façade

**Found in:** `backend/app/services/config_file_service.py` (1 136 lines). Every public function is a one-liner that delegates to `jail_config_service`, `filter_config_service`, or `action_config_service`. Those three sub-services import `config_file_service` back via lazy imports scattered across ~10 function bodies each, creating a hidden circular dependency.

**Goal:** Delete `config_file_service.py`.
Update every caller (routers: `config_misc.py`, `jail_config.py`; services: anything that calls through the façade) to import the appropriate sub-service directly. Remove all lazy-import call sites (`from app.services import config_file_service as _cfs`) in `jail_config_service.py`, `filter_config_service.py`, and `action_config_service.py`.

**Possible traps and issues:**

- `config_file_service.py` also re-exports `start_daemon` and `wait_for_fail2ban` from `app.utils`. Find all callers of those two names and update their imports.
- The lazy imports exist specifically to avoid the circular dependency — simply removing them may surface an import cycle. Trace the actual call graph before deleting the file.
- `config_misc.py` router imports `config_file_service` for several operations; it will need multiple new imports to replace the façade.
- Large file change — high risk of regression. Comprehensive test run required before and after.

**Docs changes needed:** Update `Docs/Refactoring.md` and `Docs/Architekture.md` if it mentions this service.

**Status:** Completed ✅

**Why this is needed:** Pure delegation façades add indirection with no abstraction benefit and obscure the true dependencies of the system. The hidden circular dependency via lazy imports is a structural risk — a refactor inside any of the three sub-services could easily break the cycle in unexpected ways.

---

### Task 4 — Move raw SQL query out of `history_service.py` into a repository

**Found in:** `backend/app/services/history_service.py` line 70: `db.execute("SELECT MAX(timeofban) FROM history_archive")`.

**Goal:** Add a `get_max_timeofban() -> datetime | None` function to `backend/app/repositories/history_repo.py` (or the appropriate history archive repository). Replace the inline `db.execute` call in the service with a call to that repository function.

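
As a rough illustration of the Task 4 goal, the repository function might look like the sketch below. It is not the project's code: it uses the synchronous standard-library `sqlite3` module as a stand-in for `aiosqlite`, takes the connection as an explicit `db` parameter, and assumes `timeofban` is stored as a Unix epoch in seconds. All three points are assumptions to check against the existing repository functions.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional


def get_max_timeofban(db: sqlite3.Connection) -> Optional[datetime]:
    """Return the newest ban time in history_archive, or None if the table is empty."""
    row = db.execute("SELECT MAX(timeofban) FROM history_archive").fetchone()
    # MAX() over an empty table still yields one row, containing NULL.
    if row is None or row[0] is None:
        return None
    # Assumption: timeofban is a Unix epoch in seconds.
    return datetime.fromtimestamp(row[0], tz=timezone.utc)


# Usage against a throwaway in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE history_archive (timeofban INTEGER)")
empty_result = get_max_timeofban(db)
db.executemany("INSERT INTO history_archive VALUES (?)", [(100,), (300,), (200,)])
latest = get_max_timeofban(db)
```

With a function of this shape, the service calls the repository and never sees the SQL, so service tests can mock `get_max_timeofban` instead of a database connection.
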

**Possible traps and issues:**

- Check whether `history_archive` is queried via the same `aiosqlite.Connection` that the repository layer already uses. Confirm the repository file exists and is the right home (there may be separate `history_repo` and `history_archive_repo`).
- The column `timeofban` must be mapped to the correct Python type (likely `datetime`) — match the convention of the surrounding repository functions.
- Unit tests for `history_service` that mock the db connection will need to be updated to mock the repository call instead.

**Docs changes needed:** None beyond `Refactoring.md`.

**Status:** Completed ✅

**Why this is needed:** Raw SQL in the service layer bypasses the repository abstraction. It makes the service harder to test (requires a real DB schema) and harder to maintain (schema changes must be tracked in the service, not just the repository).

---

### Task 5 — Remove the reversed dependency from `blocklist_service.py` to `app.tasks`

**Found in:** `backend/app/services/blocklist_service.py` line 530: `from app.tasks import blocklist_import as blocklist_import_task`. Services must not import from tasks; tasks are consumers of services, not dependencies of them.

**Goal:** Invert the dependency. The functionality needed from the task (triggering or querying a scheduled blocklist import) should be expressed as a callable or scheduler reference that is passed into the service function, or the service should expose a pure data/state function that the task calls rather than the service calling into the task.

**Possible traps and issues:**

- Understand what exactly the service does with `blocklist_import_task` — if it schedules or triggers a job, the job scheduler (`app.state.scheduler`) should be passed in as a parameter rather than imported from tasks.
- This may require adding a parameter to the affected service function and updating its callers (the router).
- If the task import is conditional/lazy, it still violates the dependency rule regardless of when it resolves.

**Docs changes needed:** Update `Docs/Refactoring.md` and `Docs/Architekture.md` (task→service dependency direction).

**Status:** Done ✅

**Why this is needed:** Circular layer dependencies (`service → task → service`) make it impossible to test services in isolation and create hidden initialisation-order coupling that can cause import errors or subtle bugs at startup.

---

### Task 6 — Extract log reading and service status out of `config_service.py`

**Found in:** `backend/app/services/config_service.py` line 687 (`read_fail2ban_log`) and line 781 (`get_service_status`). Both are unrelated to configuration management and do not belong in this service.

**Goal:** Move `read_fail2ban_log` into `log_service.py` (the dedicated log-reading service). Move `get_service_status` into `health_service.py`. Update the calling routers (`config_misc.py` uses both) to import from the correct service.

**Possible traps and issues:**

- Check whether `log_service.py` and `health_service.py` already have similar functions; avoid duplication and reconcile signatures.
- `config_misc.py` currently imports `config_service` for these; after the move it will need two additional service imports.
- Existing tests targeting `config_service` that cover these functions will need to be moved to the appropriate test file.

**Docs changes needed:** Update `Docs/Refactoring.md`. Adjust any architecture diagram that shows `config_service` as the home for log/status operations.

**Why this is needed:** Single-responsibility principle. A configuration management service owning log reading and daemon status checks makes the service harder to reason about and violates the established service split.


---

### Task 7 — Deduplicate `get_map_color_thresholds` / `update_map_color_thresholds`

**Found in:** Implementations exist in `backend/app/services/config_service.py`, `backend/app/services/setup_service.py`, and `backend/app/utils/setup_utils.py` (the latter used by both). The router `config_misc.py` calls `setup_service.get_map_color_thresholds()` for ongoing map color operations that have nothing to do with first-run setup.

**Goal:** Consolidate the single authoritative implementation in one appropriate service — likely a dedicated `settings_service.py` or inside `config_service.py`. Remove the duplicate copies. Update `config_misc.py` to import from the canonical location. Remove `get_map_color_thresholds` / `set_map_color_thresholds` from `setup_utils.py` if they only exist to serve this use case.

**Possible traps and issues:**

- `setup_utils.py` imports from `repositories` (itself a violation — see Task 10). Consolidating here creates a dependency chain; resolve Task 10 either before or as part of this task.
- `setup_service.py` is conceptually a first-run service; if callers of the consolidated function include both first-run setup and runtime settings, choose the runtime home and let `setup_service` call it, not the other way.
- Tests that mock `setup_service.get_map_color_thresholds` in the router tests will need updating.

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Why this is needed:** Triplicated implementation violates DRY and means a change to the threshold schema must be made in three places. Using `setup_service` for ongoing runtime settings is conceptually wrong and misleads maintainers.


**Status:** Completed ✅

---

### Task 8 — Move the `activate_jail` 3-step restart workflow out of `config_misc.py` router

**Found in:** `backend/app/routers/config_misc.py` lines 193–197: the `restart_fail2ban` handler directly orchestrates `config_file_service.stop_daemon()` → `config_file_service.start_daemon()` → `config_file_service.wait_for_fail2ban()` as a three-step inline sequence.

**Goal:** Extract this orchestration into a `restart_daemon()` function inside `config_file_service.py` (or a more appropriate service). The router handler should call that single function.

**Possible traps and issues:**

- Error handling differs between the three steps; the new service function must preserve the existing error semantics (what happens if `start_daemon` fails must be the same as before).
- If `config_file_service.py` is being deleted (Task 3), coordinate which service owns this function after the façade is removed.
- `wait_for_fail2ban` is a polling operation — confirm any timeout behaviour is preserved.

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Why this is needed:** Multi-step orchestration workflows are business logic and do not belong in HTTP handler functions. Keeping them in the service layer makes the workflow testable in isolation and reusable if a second endpoint ever needs to restart the daemon.

**Status:** Completed ✅

---

### Task 9 — Move `sign_session_token` call out of `auth.py` router into `auth_service.login()`

**Found in:** `backend/app/routers/auth.py` line 74 calls `sign_session_token(token, secret)` after calling `auth_service.login()`. The raw token is returned by the service; the signing is performed in the router.

**Goal:** `auth_service.login()` should return an already-signed token. The router should receive the final cookie value from the service without needing to know about the signing step. Move the `sign_session_token` call into `auth_service.login()`.

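
To make the Task 9 goal concrete, here is a minimal sketch of a `login()` that signs the token internally. The HMAC-SHA256 `token.signature` format, the plain-text password comparison, and the `verify_session_token` helper are all illustrative assumptions; the real `auth_service` presumably checks a stored password hash and may use a different signing scheme.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def _sign_session_token(token: str, secret: str) -> str:
    """Append an HMAC-SHA256 signature so tampering can be detected."""
    digest = hmac.new(secret.encode(), token.encode(), hashlib.sha256).hexdigest()
    return f"{token}.{digest}"


def verify_session_token(signed: str, secret: str) -> Optional[str]:
    """Return the raw token if the signature matches, otherwise None."""
    token, _, _digest = signed.rpartition(".")
    if not token:
        return None
    expected = _sign_session_token(token, secret)
    return token if hmac.compare_digest(signed, expected) else None


def login(password: str, expected_password: str, secret: str) -> Optional[str]:
    """Return the final, signed cookie value, or None when the password is wrong.

    Hypothetical signature: a real service would verify a password hash.
    """
    if not hmac.compare_digest(password.encode(), expected_password.encode()):
        return None
    raw_token = secrets.token_urlsafe(32)
    # Signing happens here, so no caller can forget it.
    return _sign_session_token(raw_token, secret)


cookie = login("hunter2", "hunter2", "s3cret")
```

The router would then set `cookie` directly and never import the signing helper.
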

**Possible traps and issues:**

- `auth_service.login()` currently has a specific return type; changing what it returns (raw token → signed token) must be reflected in the return type annotation and any tests that assert on the raw token value.
- `sign_session_token` is currently a public function exported from `auth_service`. After this change it may become a private implementation detail (`_sign_session_token`). Check for any other callers before hiding it.
- Tests that test `login()` in isolation and currently check for the raw token will need to be updated to verify the signed format instead (or mock `sign_session_token` separately).

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Why this is needed:** Security-sensitive token construction is business logic. Letting the HTTP layer build cookie values means a future maintainer adding a second login endpoint could forget the signing step, creating a silent security regression.

**Status:** Completed ✅

---

### Task 10 — Remove the repository import from `utils/setup_utils.py`

**Found in:** `backend/app/utils/setup_utils.py` line 5: `from app.repositories import settings_repo`. The `utils` layer must not import from `repositories`; that is the service layer's job.

**Goal:** Move any function in `setup_utils.py` that depends on `settings_repo` into `setup_service.py` (which is already allowed to import repositories). The remaining pure utility functions (password hashing, etc.) can stay in `setup_utils.py`.

**Possible traps and issues:**

- `setup_utils.py` is imported by `auth_service.py` (`from app.utils.setup_utils import get_password_hash`). This import is fine and must be preserved; only the repository-dependent functions move.
- Identify every function in `setup_utils.py` that directly or indirectly calls `settings_repo`. Move the entire call chain, not just the top-level function.
- The functions being moved may already have analogues in `setup_service.py`; check for duplication before moving.


**Docs changes needed:** Update `Docs/Refactoring.md` and `Docs/Architekture.md` (layer diagram).

**Why this is needed:** Utils are stateless helpers with no external dependencies. Allowing them to import from the repository layer breaks the architectural contract, makes the utilities untestable without a database, and obscures what is actually a service-layer operation.

**Status:** Completed ✅

---

### Task 11 — Centralise `DbDep` — remove local redefinitions in `blocklist.py` and `geo.py`

**Found in:** `backend/app/routers/blocklist.py` line 54 redefines `DbDep = Annotated[aiosqlite.Connection, Depends(get_db)]` locally. `backend/app/routers/geo.py` uses the inline form `Annotated[aiosqlite.Connection, Depends(get_db)]` directly in function signatures instead of importing `DbDep`.

**Goal:** Delete the local redefinitions. Import `DbDep` from `app.dependencies` in both routers.

**Possible traps and issues:**

- Confirm that `DbDep` is already exported from `app/dependencies.py` (verify the exact name used there).
- The inline form in `geo.py` may appear in multiple function signatures; all occurrences must be replaced.
- No behaviour change — purely a refactor. Run the test suite to confirm.

**Docs changes needed:** None beyond `Refactoring.md`.

**Why this is needed:** A single definition of `DbDep` ensures that any future change to the db dependency (e.g. adding row factory configuration) is applied uniformly across all routers.

**Status:** Completed ✅

---

### Task 12 — Complete the protocol injection layer or remove it

**Found in:** `backend/app/dependencies.py` defines `AuthServiceDep`, `JailServiceDep`, `ConfigServiceDep`, `GeoServiceDep`, `HistoryServiceDep`, `BlocklistServiceDep`, `HealthServiceDep`, `ServerServiceDep` as protocol-typed dependency aliases. Only `routers/auth.py` and `routers/jails.py` actually use these; the other 16 routers import concrete service modules directly.

**Goal:** Make a deliberate decision and enforce it consistently.
Option A: adopt the protocol injection pattern everywhere — update all 16 non-compliant routers to accept their service via the typed `Dep` alias.
Option B: acknowledge the pattern is unused overhead and remove the protocol aliases and `cast()` wrappers from `dependencies.py`, letting all routers import concrete services directly (the current de facto standard).
Option B is cheaper; Option A makes service substitution in tests easier.

**Decision:** Option B applied — service protocol aliases and provider wrappers have been removed in favor of direct concrete service imports.

**Status:** Completed ✅

**Possible traps and issues:**

- Option A requires updating all 16 routers and their test files simultaneously; this is a broad change with high regression risk. Stage it one router at a time.
- Option B means deleting `services/protocols.py` (398 lines) and all `cast("…Service", …)` calls in `dependencies.py`. Ensure nothing outside this layer references the Protocol classes (check test files).
- `dependencies.py` also has repository protocol aliases (`BlocklistRepository`, `ImportLogRepository`, etc.) — decide whether those follow the same fate.

**Docs changes needed:** Update `Docs/Architekture.md` to reflect the chosen pattern. Update `Docs/Refactoring.md`.

**Why this is needed:** Inconsistency between `auth.py`/`jails.py` (uses protocols) and every other router (bypasses them) makes the codebase confusing and the injection layer misleading. This is the most widespread single structural inconsistency in the router layer.

---

### Task 13 — Move `ban_ip`, `unban_ip`, and `get_active_bans` from `jail_service` to `ban_service`

**Status:** Completed ✅

**Found in:** `backend/app/services/jail_service.py` contains `ban_ip`, `unban_ip`, and `get_active_bans`. These operations conceptually belong in `ban_service.py`, which is the declared home for ban management. Routers `bans.py` and `blocklist.py` already import `jail_service` specifically for these functions.


**Goal:** Move the three functions into `ban_service.py`. Update `bans.py` and `blocklist.py` to import from `ban_service` instead of `jail_service`. Remove the now-redundant `jail_service` import from those routers if it is no longer needed.

**Possible traps and issues:**

- `ban_ip` and `unban_ip` likely use the fail2ban socket through utilities shared with the rest of `jail_service`. Confirm there is no shared private helper that needs to move with them.
- `get_active_bans` may overlap with existing functions in `ban_service`. Check for duplication and merge if needed.
- If `jail_service` tests cover these functions, move those test cases to the `ban_service` test file.

**Docs changes needed:** Update `Docs/Architekture.md` service responsibility table. Update `Docs/Refactoring.md`.

**Why this is needed:** Function placement that contradicts the declared service boundary makes it harder to find behaviour during maintenance and violates the principle of least surprise. The service name should predict where operations live.

---

### Task 14 — Lift geo-enrichment closure construction out of `history.py` router

**Found in:** `backend/app/routers/history.py` lines 110, 147, and 189 each build a `_enricher` async closure and pass it to `history_service` calls. The closure captures `geo_service` and `http_session` from the outer scope and adapts the geo lookup interface for the service.

**Goal:** Move the closure construction into `history_service.py` or into a shared helper in `geo_service.py`. The router should pass the `http_session` and let the service build the enricher internally, or the service should accept `geo_service` as an injected callable.

**Possible traps and issues:**

- `history_service` already accepts an optional `geo_enricher: GeoEnricher | None` parameter — this is the hook. The router's closures are just adapters for that parameter.
Either the service gains a higher-level function that takes `http_session` directly, or a factory function in `geo_service` builds the enricher.
- The same pattern exists in `geo.py` and `ban_service.py` — a consistent solution should handle all three callsites, not just `history.py`.
- Changing the `history_service` function signature will require updating tests.

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Status:** Completed ✅

**Why this is needed:** Adapter/closure construction is glue code that belongs at the service boundary or in a factory, not in the HTTP handler. Routers should not need to understand the geo service's lookup interface to serve history requests.

---

### Task 15 — Fix stale activation record on failed `activate_jail`

**Found in:** `backend/app/routers/jail_config.py` line 385. `record_activation(app, name)` is called unconditionally before the service call. When the service raises any exception, no handler clears `last_activation`. For the following 60 seconds, the health-check task will misattribute any fail2ban offline event to this failed activation, potentially creating a spurious `PendingRecovery`.

**Goal:** Wrap the service call in a try/except that clears `last_activation` on any exception before re-raising. Alternatively, only call `record_activation` after a successful return.

**Possible traps and issues:**

- Moving `record_activation` to after the `await` means the timing is slightly later (by the service call duration). This is acceptable because the 60-second window is generous.
- Ensure every exception branch (not just the ones currently handled) triggers the cleanup. Use a single catch-all `except` that clears the record and re-raises, or a `try/finally` guarded by a success flag, rather than individual `except` blocks; a bare `try/finally` would also clear the record after a successful activation.
- The unused `activation_time` return value of `record_activation` should be removed if moving the call eliminates the need for it.

**Docs changes needed:** Update `Docs/Refactoring.md`.

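
The cleanup described in the Task 15 goal could look roughly like the sketch below. `RuntimeState`, `activate_with_cleanup`, and the two stub coroutines are hypothetical stand-ins for the application state, the router handler, and the service call; only the clear-then-re-raise shape is the point.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class RuntimeState:
    # Hypothetical stand-in for the state that holds last_activation.
    last_activation: Optional[str] = None


async def activate_with_cleanup(
    state: RuntimeState,
    name: str,
    activate: Callable[[str], Awaitable[None]],
) -> None:
    state.last_activation = name  # stands in for record_activation(app, name)
    try:
        await activate(name)
    except BaseException:
        # Clear on any failure, including cancellation, then re-raise unchanged.
        state.last_activation = None
        raise


async def _ok(name: str) -> None:
    return None


async def _boom(name: str) -> None:
    raise RuntimeError("fail2ban rejected the jail")


def demo() -> tuple:
    state_ok = RuntimeState()
    asyncio.run(activate_with_cleanup(state_ok, "sshd", _ok))
    state_bad = RuntimeState()
    try:
        asyncio.run(activate_with_cleanup(state_bad, "sshd", _boom))
    except RuntimeError:
        pass  # the original exception still reaches the caller
    return state_ok.last_activation, state_bad.last_activation
```

After a successful call the record stays in place for the health check to use; after a failed call it is gone before the exception leaves the function.
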

**Status:** Completed ✅

**Why this is needed:** A false `PendingRecovery` presented to the user offers a rollback for a jail that was never successfully activated. This could confuse operators into performing a rollback that modifies configuration unnecessarily.

---

### Task 16 — Add a lock to `geo_service.py` module-level mutable state

**Found in:** `backend/app/services/geo_service.py` lines 99–110: `_cache: dict`, `_neg_cache: dict`, `_dirty: set`, and `_geoip_reader` are mutable module-level singletons with no concurrency control. Background tasks (`geo_cache_flush`, `geo_re_resolve`) and request handlers all read and write these concurrently.

**Goal:** Add an `asyncio.Lock` protecting mutations of `_cache`, `_neg_cache`, and `_dirty`. The lock should be acquired for any write and for the copy-then-clear pattern in `flush_dirty`. Read-only accesses that are single-await-free are safe without a lock in the single-threaded asyncio model, but mutation sites must be explicit.

**Possible traps and issues:**

- The comment in `flush_dirty` states that no `await` between copy and clear makes it safe today; a lock is still preferable to ensure the invariant is enforced rather than relied upon implicitly.
- Avoid holding the lock across network I/O (e.g. the ip-api.com fetch in `lookup`). Acquire it only around the dict/set mutation itself.
- `asyncio.Lock` is not thread-safe if `run_in_executor` is used anywhere in the geo path — verify `_cache` is never read from a thread pool.
- Introducing a lock is low risk but adds overhead on every cache write; profile if the geo cache is a hot path.

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Status:** Done ✅

**Why this is needed:** The current safety is implicit and fragile. A future change that adds an `await` inside the critical section (e.g. logging to a remote sink) would silently introduce data loss in the dirty-flush path. An explicit lock documents the intent and makes the safety guarantee unconditional.

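
The locking pattern from Task 16 can be sketched as follows. The `GeoCache` class is a hypothetical wrapper used only to keep the example self-contained (the real module keeps `_cache` and `_dirty` at module level), and the method names are illustrative. The lock is held only around the in-memory mutation and the copy-then-clear step, never across I/O.

```python
import asyncio


class GeoCache:
    """Illustrative container; the real state lives at module level."""

    def __init__(self) -> None:
        self._cache: dict[str, str] = {}
        self._dirty: set[str] = set()
        self._lock = asyncio.Lock()

    async def store(self, ip: str, country: str) -> None:
        # Any network lookup should already be finished before this point.
        async with self._lock:
            self._cache[ip] = country
            self._dirty.add(ip)

    async def flush_dirty(self) -> dict[str, str]:
        # Copy and clear atomically, then release the lock before persisting.
        async with self._lock:
            pending = {ip: self._cache[ip] for ip in self._dirty}
            self._dirty.clear()
        # A real implementation would write `pending` to the database here.
        return pending


async def _demo() -> tuple:
    cache = GeoCache()
    await asyncio.gather(*(cache.store(f"10.0.0.{i}", "DE") for i in range(5)))
    first = await cache.flush_dirty()
    second = await cache.flush_dirty()
    return len(first), len(second)


result = asyncio.run(_demo())
```

Because the snapshot is taken under the lock, an `await` added later inside `flush_dirty` cannot cause entries stored during the flush to be cleared without being written.
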

---

### Task 17 — Move crash-detection logic out of the `health_check` task

**Found in:** `backend/app/tasks/health_check.py` lines 70–113. The `_run_probe_with_resources` function contains a state machine that checks `last_activation` timing, detects `online→offline` transitions, and writes `PendingRecovery` records. This is domain business logic embedded in a scheduling function.

**Goal:** Extract the crash-detection state machine into a function in `backend/app/utils/runtime_state.py` or into a new `recovery_service.py`. The health-check task should call that function with the previous status, the new status, and the current runtime state — and receive back the updated state without containing the decision logic itself.

**Possible traps and issues:**

- The state machine references `datetime.datetime.now(UTC)` — ensure the extracted function remains deterministic/injectable for testing by accepting `now` as an optional parameter.
- `runtime_state.py` already holds `record_activation`, `create_pending_recovery`, `clear_pending_recovery`. The extracted logic is a natural neighbour there.
- When moving the logic, preserve the exact 60-second window (`_ACTIVATION_CRASH_WINDOW`) and the guard that prevents overwriting an unresolved record.
- Tests for the health check task will need to be refactored: the task becomes trivially thin; the logic can now be unit-tested without a running scheduler.

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Status:** Completed ✅

**Why this is needed:** Business rules about crash attribution timing should not live inside a scheduling artifact. Embedding decision logic in a background job makes it invisible, hard to test without the scheduler, and impossible to reuse from another trigger (e.g. a manual probe endpoint).

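
One possible shape for the extracted Task 17 decision function is sketched below. The `Activation` dataclass, the function name `detect_activation_crash`, its parameters, and the inclusive window boundary are assumptions made for illustration; the real state machine in `health_check.py` may track more than this.

```python
import datetime
from dataclasses import dataclass
from typing import Optional

_ACTIVATION_CRASH_WINDOW = datetime.timedelta(seconds=60)


@dataclass(frozen=True)
class Activation:
    # Hypothetical record of the most recent jail activation.
    jail_name: str
    at: datetime.datetime


def detect_activation_crash(
    was_online: bool,
    is_online: bool,
    last_activation: Optional[Activation],
    has_pending_recovery: bool,
    now: Optional[datetime.datetime] = None,
) -> Optional[str]:
    """Return the jail to blame for an online-to-offline transition, or None."""
    if not (was_online and not is_online):
        return None  # only an online -> offline transition counts
    if has_pending_recovery or last_activation is None:
        return None  # never overwrite an unresolved record
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # Assumption: the 60-second window is inclusive at its upper bound.
    if now - last_activation.at <= _ACTIVATION_CRASH_WINDOW:
        return last_activation.jail_name
    return None
```

Since `now` is injectable and the function has no side effects, each branch can be unit-tested with fixed timestamps and no scheduler; the task itself only has to act on the returned jail name.
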

---

### Task 18 — Replace `__import__("datetime")` antipattern in `health_check.py`

**Found in:** `backend/app/tasks/health_check.py` lines 165–166: `next_run_time=__import__("datetime").datetime.now(tz=__import__("datetime").timezone.utc)`. The `datetime` module is already imported at the top of the file on line 6.

**Goal:** Replace the `__import__` calls with the already-imported `datetime` name: `next_run_time=datetime.datetime.now(tz=datetime.timezone.utc)`.

**Possible traps and issues:**

- This is a trivial one-line change. Confirm the module-level `import datetime` exists and is not inside a `TYPE_CHECKING` block.
- Run the health-check related tests to confirm the scheduler still fires on startup.

**Docs changes needed:** None.

**Why this is needed:** `__import__()` is an implementation detail of Python's import system and should never appear in application code. It obscures the dependency, bypasses linting checks, and signals that the code was written reactively to solve a circular import that no longer exists.

---

### Task 19 — Remove the `setup_service` misuse in `config_misc.py` for runtime map-color settings

**Found in:** `backend/app/routers/config_misc.py` uses `setup_service.get_map_color_thresholds()` and `setup_service.update_map_color_thresholds()` for the ongoing `/api/config/map-colors` endpoints. `setup_service` is intended for first-run setup only.

**Goal:** After Task 7 consolidates the map-color functions into the canonical runtime service, update `config_misc.py` to call that service instead of `setup_service`. Ensure `setup_service` no longer exports these functions publicly.

**Possible traps and issues:**

- This task is a follow-on to Task 7; do not attempt it before Task 7 is complete.
- If `setup_service` still needs these functions during first-run setup, it should call the canonical service rather than owning the implementation.
- Router tests that mock `setup_service.get_map_color_thresholds` will need updating.


**Docs changes needed:** Update `Docs/Refactoring.md`.

**Status:** Completed ✅

**Why this is needed:** Using a first-run setup service for ongoing runtime operations misleads developers into modifying the wrong service when behaviour needs to change. It also creates an implicit dependency between setup state and runtime configuration.

---

### Task 20 — Add global exception handlers for domain exceptions in `main.py`

**Found in:** `backend/app/main.py`. Domain exceptions raised by services (`JailNotFoundError`, `ConfigValidationError`, `ConfigWriteError`, `FilterNotFoundError`, etc.) are caught individually with repetitive try/except blocks in each router handler. There are no global `@app.exception_handler` registrations for domain exceptions beyond `Fail2BanConnectionError` and `Fail2BanProtocolError`.

**Goal:** Register `@app.exception_handler` entries in `main.py` for the most commonly caught domain exception classes (at minimum: `JailNotFoundError` → 404, `ConfigValidationError` → 400, `ConfigWriteError` → 500). This allows router handlers to let these exceptions propagate naturally, removing boilerplate try/except blocks.

**Possible traps and issues:**

- Not all routers want the default status code; some endpoints return 409 or 400 for conditions that another endpoint maps to 500. Global handlers are only appropriate for exceptions with a consistent HTTP mapping across all consumers.
- Introduce global handlers incrementally: add the handler first, then remove the try/except from one router at a time and verify the test suite still passes.
- Ensure the handler response format matches the existing `{"detail": "…"}` convention so the frontend does not need updating.
- `HTTPException` must remain a local catch and not be accidentally swallowed by a broad domain handler.

**Docs changes needed:** Update `Docs/Refactoring.md`.

**Why this is needed:** The same try/except→HTTPException conversion is duplicated across every router endpoint.
Global handlers reduce boilerplate, make error mapping auditable in one place, and prevent inconsistencies where the same exception produces different status codes in different routes.

---

### Task 21 — Restrict tasks to opening their own DB connections only where justified; document the pattern

**Found in:** `backend/app/tasks/blocklist_import.py`, `geo_cache_flush.py`, `geo_re_resolve.py`, and `history_sync.py`. Each task calls `open_db(settings.database_path)` directly rather than receiving a DB connection via dependency injection. This is structurally inconsistent with the request-level `get_db` dependency used by routers.

**Goal:** Document in `Docs/Architekture.md` that background tasks are intentionally outside FastAPI's dependency injection scope and must manage their own DB connections. Add a short shared helper (e.g. `tasks/_db.py` or reuse `open_db`) with a consistent context-manager pattern so all four tasks open/close connections the same way. If a task already has correct cleanup (try/finally close), confirm it and leave it; do not change what works.

**Possible traps and issues:**

- Do not attempt to force APScheduler background jobs through FastAPI's DI system — this is not supported without a running request context and would require thread-unsafe hacks.
- Focus on consistency: all tasks should use the same open/close pattern and handle exceptions the same way.
- If a task currently leaks a connection on exception, fix the cleanup; if it is correct, only add the documentation.

**Docs changes needed:** Add a "Background Tasks and Database Access" section to `Docs/Architekture.md` explaining why tasks own their connections and how to write a new task.

**Why this is needed:** Without explanation, the inconsistency between router DI-provided connections and task-managed connections looks like an oversight.
Documentation prevents future developers from incorrectly trying to inject a DB connection into a task via `Depends`, which would fail silently at runtime.
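
The shared helper proposed in the Task 21 goal might be shaped like the sketch below. `task_db` is a hypothetical name, and the standard-library `sqlite3` module stands in for the project's `open_db` (whose exact signature is not shown in this document), so treat this as an outline of the open/close pattern rather than the actual helper.

```python
import asyncio
import sqlite3
from contextlib import asynccontextmanager
from typing import AsyncIterator


@asynccontextmanager
async def task_db(database_path: str) -> AsyncIterator[sqlite3.Connection]:
    """Open a task-owned connection and always close it, even on error."""
    # Stand-in for the project's open_db(settings.database_path).
    db = sqlite3.connect(database_path)
    try:
        yield db
    finally:
        db.close()


async def _demo() -> bool:
    leaked = None
    try:
        async with task_db(":memory:") as db:
            leaked = db
            raise RuntimeError("task failed mid-run")
    except RuntimeError:
        pass
    # A closed sqlite3 connection rejects further use.
    try:
        leaked.execute("SELECT 1")
    except sqlite3.ProgrammingError:
        return True
    return False


closed_after_error = asyncio.run(_demo())
```

Each of the four tasks would wrap its body in `async with task_db(...)`, which gives them identical cleanup behaviour without involving `Depends`.
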