Files
BanGUI/Docs/Tasks.md
Lukas fdfd24508f Refactor backend services and routers
- Reorganized dashboard router with improved structure
- Enhanced ban_service with better separation of concerns
- Updated history service with cleaner logic
- Improved constants and configuration handling
- Updated documentation of completed tasks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-23 16:06:10 +02:00

26 KiB
Raw Blame History

T-03 · Centralise _DEFAULT_PAGE_SIZE constant

Where found: backend/app/routers/dashboard.py:45, routers/history.py:34, services/ban_service.py:70, services/history_service.py:49

Why this is needed: Four independent definitions can drift. The router default and service default are currently coincidentally aligned at 100, but nothing enforces this.

Goal: Single definition in app/utils/constants.py, imported everywhere.

What to do:

  1. Add DEFAULT_PAGE_SIZE: Final[int] = 100 and MAX_PAGE_SIZE: Final[int] = 500 to app/utils/constants.py.
  2. Replace all four local _DEFAULT_PAGE_SIZE and _MAX_PAGE_SIZE declarations with imports.

Possible traps and issues: None significant. Pure search-and-replace.

Docs changes needed: None.

Doc references: app/utils/constants.py


T-04 · Encapsulate geo_service module-level mutable state in a class

Where found: backend/app/services/geo_service.py — module globals _cache, _neg_cache, _dirty, _geoip_reader, _geoip_initialized, _cache_lock

Why this is needed: Module-level mutable state is invisible to callers, cannot be injected, and requires test-escape-hatch functions (clear_cache()) that exist only because the state can't be reset otherwise. It also violates SRP — the module owns both the geo lookup logic and the cache lifecycle.

Goal: GeoCache class with instance state, instantiated once at startup and stored on app.state (consistent with InMemorySessionCache).

What to do:

  1. Create class GeoCache in geo_service.py or a new app/services/geo_cache.py with the cache dict, neg-cache, dirty set, and lock as instance attributes.
  2. Expose lookup, lookup_batch, flush_dirty, load_from_db, clear, etc. as methods.
  3. Instantiate in startup.py or main.py lifespan and store as app.state.geo_cache.
  4. Update dependencies.py to inject GeoCache rather than returning geo_service.lookup_batch directly.
  5. Remove clear_cache() and clear_neg_cache() module-level functions (move to instance methods).

Possible traps and issues:

  • Background tasks (geo_cache_flush.py, geo_re_resolve.py) reference module-level functions. They need to receive the GeoCache instance.
  • geo_service.is_cached() is called from ban_service — update call sites.
  • Many tests likely patch geo_service._cache or call geo_service.clear_cache() — all tests need updating.

Docs changes needed: Docs/Architekture.md — document GeoCache as a managed stateful service alongside session cache.

Doc references: Docs/Architekture.md, Docs/Backend-Development.md


T-05 · Remove app.state mutation from _build_app_context in dependencies.py

Where found: backend/app/dependencies.py_build_app_context() mutates state.session_cache on every request

Why this is needed: A function named _build_app_context (a "build" / read operation) has a side effect of mutating app.state. Startup configuration decisions belong in the lifespan handler, not in the per-request dependency graph.

Goal: _build_app_context is pure (read-only). Session cache type is decided once at startup in main.py.

What to do:

  1. Move the session cache swap logic (_session_cache_enabled check → replace NoOpSessionCache with InMemorySessionCache) to the lifespan handler in main.py.
  2. Remove the three if/elif/elif branches mutating state.session_cache from _build_app_context.
  3. Ensure settings-change flow (runtime settings update) also triggers a cache swap if needed.

Possible traps and issues:

  • If runtime_settings can change mid-process (setup wizard updates settings), the cache swap must still happen. Identify where runtime_settings is set and add a swap there.
  • Tests that call _build_app_context directly may rely on the mutation side effect.

Docs changes needed: None.

Doc references: backend/app/dependencies.py, backend/app/main.py


T-06 · Eliminate AppState Protocol / ApplicationContext dataclass redundancy

Where found: backend/app/dependencies.pyAppState Protocol (lines ~4060) and ApplicationContext dataclass (lines ~6275) describe identical fields.

Why this is needed: Every new field must be added to both. The Protocol is only used for a single cast() call inside _build_app_context. Maintenance burden with no benefit.

Goal: One typed representation. Remove AppState Protocol; cast directly or use ApplicationContext.

What to do:

  1. Delete the AppState Protocol class.
  2. Replace state = cast("AppState", request.app.state) with state = request.app.state (type: ignore or use ApplicationState directly since it's the concrete type set in create_app).
  3. Access fields from state directly using the ApplicationState / RuntimeState types.

Possible traps and issues:

  • mypy / pyright may report type errors on request.app.state accesses — use cast(ApplicationState, request.app.state) once, then access typed.
  • Ensure all fields accessed in _build_app_context are present on ApplicationState.

Docs changes needed: None.

Doc references: backend/app/dependencies.py, backend/app/utils/runtime_state.py


T-07 · Break cross-service import: jail_config_service imports jail_service

Where found: backend/app/services/jail_config_service.pyimport app.services.jail_service as jail_service

Why this is needed: Services at the same layer should not depend on each other. Shared logic should move to a lower-level utility. This creates a dependency cycle risk and makes both services harder to test independently.

Goal: Shared socket operations extracted to app/utils/ or app/utils/jail_socket.py. No service imports a sibling service.

What to do:

  1. Identify which functions from jail_service are called by jail_config_service.
  2. Extract shared low-level socket helpers to app/utils/jail_socket.py (or extend fail2ban_client.py).
  3. Update both services to import from the utility layer.

Possible traps and issues:

  • Some jail_service functions that are called may themselves import geo, config, or other services — trace the full dependency graph before extracting.
  • APScheduler tasks reference jail_service — ensure they still work after any reorganisation.

Docs changes needed: Docs/Architekture.md — add rule: services must not import sibling services.

Doc references: Docs/Architekture.md


T-08 · _SOCKET_TIMEOUT defined 6×constants.py constant unused

Where found: backend/app/utils/constants.py:16 (defines FAIL2BAN_SOCKET_TIMEOUT_SECONDS = 5.0 but is never imported); ban_service.py (5.0), jail_service.py (10.0), config_service.py (10.0), server_service.py (10.0), log_service.py (10.0), jail_config_service.py (10.0), config_file_utils.py (10.0)

Why this is needed: constants.py has a module docstring saying "import from this module rather than hard-coding values" — and then nothing imports the socket timeout. The values also disagree: ban_service uses 5.0 while all others use 10.0, creating silent inconsistency in timeout behaviour across endpoints.

Goal: One constant in constants.py, imported everywhere. Decide on a single default (or two named constants for fast/slow operations).

What to do:

  1. In constants.py, define FAIL2BAN_SOCKET_TIMEOUT_FAST: Final[float] = 5.0 (for health/metadata probes) and FAIL2BAN_SOCKET_TIMEOUT: Final[float] = 10.0 (for command operations) — or a single value if appropriate.
  2. Replace all local _SOCKET_TIMEOUT float literals with imports from constants.
  3. Confirm ban_service should actually be 5.0 or 10.0 (intentional vs accidental discrepancy).

Possible traps and issues:

  • Changing ban_service from 5.0 to 10.0 (or vice versa) changes observable timeout behaviour. Decide deliberately.
  • fail2ban_metadata_service.py also has socket_timeout: float = 5.0 hardcoded inline — include this in the sweep.

Docs changes needed: None.

Doc references: backend/app/utils/constants.py


T-09 · _since_unix has two divergent implementations — latent window-boundary bug

Where found: backend/app/services/ban_service.py:393 (uses time.time(), applies _TIME_RANGE_SLACK_SECONDS = 60); backend/app/services/history_service.py:53 (uses datetime.now(UTC).timestamp(), no slack)

Why this is needed: The ban_service docstring explicitly explains why time.time() is used and why the 60-second slack is needed for consistency with fail2ban's timestamp storage. history_service quietly omits both, so history queries return a slightly different window boundary than ban queries for the same TimeRange parameter. Inconsistent windows cause "same data, different count" bugs in the UI.

Goal: Single _since_unix function in a shared utility module, with documented rationale, used by both services.

What to do:

  1. Move _since_unix (with the time.time() + slack approach) to app/utils/time_utils.py (which already exists).
  2. Delete both local implementations.
  3. Import from time_utils in both services.
  4. Decide consciously whether history queries should also use the slack (probably yes for consistency).

Possible traps and issues:

  • Adding the 60-second slack to history queries may change row counts in tests that assert exact counts against seeded timestamps.
  • Confirm _TIME_RANGE_SLACK_SECONDS belongs in constants.py.

Docs changes needed: Docs/Backend-Development.md — document the timestamp handling rationale.

Doc references: backend/app/utils/time_utils.py, backend/app/utils/constants.py


T-10 · get_geo_batch_lookup is false injectability — module function pointer injection

Where found: backend/app/dependencies.pyget_geo_batch_lookup() returns geo_service.lookup_batch (a module-level function)

Why this is needed: The dependency provider exists to give the appearance of injectable geo lookup, but because geo_service uses module-level global state (T-04), tests that inject a different callable into routers still have the global cache active. The abstraction provides type-level indirection without runtime isolation.

Goal: Once T-04 is done (GeoCache as an object), inject the GeoCache instance and call methods on it directly. The GeoBatchLookup callable protocol becomes a method reference on the injected instance.

What to do:

  1. Complete T-04 first.
  2. Update get_geo_batch_lookup to retrieve GeoCache from app.state and return its lookup_batch method.
  3. Or inject GeoCache directly and let routers call .lookup_batch() on it.

Possible traps and issues: Blocked on T-04.

Docs changes needed: None beyond T-04.

Doc references: backend/app/dependencies.py


T-11 · Repositories injected as module references via cast() — structural type-safety gap

Where found: backend/app/dependencies.pyget_session_repo(), get_blocklist_repo(), get_settings_repo(), get_import_log_repo(), get_history_archive_repo(), get_geo_cache_repo(), get_fail2ban_db_repo() all return the module itself cast to the Protocol type.

Why this is needed: The cast() call is a signal that the type system is being overridden. Modules pass Protocol structural checks only because their top-level async def functions happen to match the Protocol method signatures. This is fragile — a module rename, a function rename, or an added required parameter will silently pass mypy but fail at runtime.

Goal: Repository modules become proper singleton instances, or the dependency providers are acknowledged as module-adapters with explicit documentation.

What to do (option A — correct):

  1. Convert each repository module's functions into a class with the same method signatures.
  2. Instantiate singletons at startup and store on app.state or as module-level instances.
  3. Update dependency providers to return the instance without cast().

What to do (option B — minimal):

  1. Document in each get_*_repo provider why the module-as-Protocol pattern is intentional.
  2. Add a CI check (or mypy plugin) that validates structural compatibility doesn't silently break.

Possible traps and issues:

  • Option A is a significant refactor affecting all repository call sites.
  • Option B risks the pattern silently breaking in future.

Docs changes needed: Docs/Backend-Development.md — document repository injection pattern and why it works.

Doc references: Docs/Backend-Development.md, backend/app/repositories/protocols.py


T-12 · Apply useListData consistently across all data-fetching hooks

Where found: frontend/src/hooks/useJailList.ts, useJailDetail.ts, useServerStatus.ts, useBanTrend.ts, useDashboardCountryData.ts — all re-implement abort-controller / loading / error state manually. useListData.ts exists and is used by useBlocklists, useJailConfigs, useActionList, useFilterList.

Why this is needed: At least 5 hooks implement the same 40-line pattern. Any fix to the pattern (e.g. abort-guard in .finally()) must be applied to every copy independently. useHistory has a real bug because of this (see T-18).

Goal: All hooks that load a list and need refresh semantics use useListData or a shared base.

What to do:

  1. Audit all hooks for the manual abort-controller pattern.
  2. Refactor useJailList first (cleanest candidate — no mutations).
  3. For hooks with side-effects beyond listing (e.g. useJailDetail), split into data hook + command hook (see T-13) and use useListData for the data half.
  4. Extend useListData if needed to support onSuccess callbacks returning non-array data (e.g. total).

Possible traps and issues:

  • useListData currently requires selector: (response) → TItem[]. Hooks that expose total alongside items need onSuccess to capture it — the onSuccess callback already exists in UseListDataOptions.
  • useServerStatus has a polling interval and window-focus refetch that useListData does not support — may need a usePolledData variant or extension.

Docs changes needed: None.

Doc references: frontend/src/hooks/useListData.ts


T-13 · Split useJailDetail — SRP violation (read state + write commands in one hook)

Where found: frontend/src/hooks/useJailDetail.ts

Why this is needed: The hook manages fetch state AND exposes 8 write operations (start, stop, reload, setIdle, addIp, removeIp, toggleIgnoreSelf, load). Read concerns and command concerns are independent. The consumer (JailDetailPage) passes them separately through props anyway, so the UI doesn't depend on them being co-located.

Goal: useJailData(name) for reading, useJailCommands(name, onSuccess) for mutations.

What to do:

  1. Create useJailData(name): { jail, ignoreList, ignoreSelf, loading, error, refresh } using useListData or the abort-controller pattern.
  2. Create useJailCommands(name, onSuccess: () => void): { start, stop, reload, setIdle, addIp, removeIp, toggleIgnoreSelf } — each command calls the API and then calls onSuccess() to trigger a refresh.
  3. In JailDetailPage, call both hooks and pass results to child components.
  4. Delete useJailDetail.ts.

Possible traps and issues:

  • JailDetailPage destructures all properties from useJailDetail in a single line — update the destructuring.
  • Commands currently call load() directly. The onSuccess callback pattern keeps them decoupled from the data hook's internals.

Docs changes needed: None.

Doc references: frontend/src/hooks/useJailDetail.ts, frontend/src/pages/JailDetailPage.tsx


T-14 · Move jail detail sub-sections from pages/jail/ to components/jail/

Where found: frontend/src/pages/jail/JailInfoSection, PatternsSection, BantimeEscalationSection, IgnoreListSection, jailDetailPageStyles.ts. BannedIpsSection is in frontend/src/components/jail/ (correct location).

Why this is needed: The project convention is pages/ for route-level components and components/ for reusable UI. Sub-sections are reusable UI building blocks, not route components. BannedIpsSection already lives in the right place. The inconsistency makes the directory structure misleading.

Goal: All *Section components for jail detail under components/jail/. pages/jail/ deleted or contains only route-level entry points.

What to do:

  1. Move JailInfoSection, PatternsSection, BantimeEscalationSection, IgnoreListSection, and jailDetailPageStyles.ts to frontend/src/components/jail/.
  2. Update imports in JailDetailPage.tsx.
  3. Remove empty pages/jail/ directory.

Possible traps and issues:

  • Confirm no other page imports from pages/jail/ before deleting it.
  • Update any path-based test imports.

Docs changes needed: Docs/Web-Development.md — document the pages/ vs components/ convention explicitly.

Doc references: Docs/Web-Development.md


T-15 · Replace window event bus for session expiry with React context callback

Where found: frontend/src/api/client.tswindow.dispatchEvent(new Event(SESSION_EXPIRED_EVENT)); frontend/src/providers/AuthProvider.tsxwindow.addEventListener(SESSION_EXPIRED_EVENT, ...)

Why this is needed: Using window as a side-channel bypasses React's component tree, breaks in non-browser environments (SSR, test environments without full JSDOM), and creates an invisible coupling between the API client and the auth provider. It also fires globally — any code anywhere that dispatches bangui:session-expired on window will trigger a logout, with no tracing possible.

Goal: The API client receives an onUnauthorized callback injected at the provider level, called directly instead of via a DOM event.

What to do:

  1. Add an onUnauthorized callback to the API client (e.g. a module-level setter setUnauthorizedHandler(fn)).
  2. In AuthProvider, call setUnauthorizedHandler(handleSessionExpired) on mount and reset it on unmount.
  3. In client.ts, call the handler directly instead of window.dispatchEvent.
  4. Remove SESSION_EXPIRED_EVENT string constant and the window.addEventListener in AuthProvider.

Possible traps and issues:

  • The module-level handler setter is still global state — an alternative is to pass the handler as a parameter to request() via a context object, but that changes the API signature more significantly.
  • Tests that mock window.dispatchEvent need updating.
  • SSR / Vitest environments that already mock window may need adjustment.

Docs changes needed: None.

Doc references: frontend/src/api/client.ts, frontend/src/providers/AuthProvider.tsx


T-16 · Centralise PAGE_SIZE frontend constants

Where found: frontend/src/hooks/useBans.ts:14 (PAGE_SIZE = 100); frontend/src/pages/HistoryPage.tsx:45 (PAGE_SIZE = 50)

Why this is needed: Page sizes can silently diverge from backend defaults. If the backend changes _DEFAULT_PAGE_SIZE, the frontend won't know. Having multiple files define the same concept differently is also misleading.

Goal: All pagination constants in frontend/src/utils/constants.ts.

What to do:

  1. Add BAN_PAGE_SIZE = 100, HISTORY_PAGE_SIZE = 50 to frontend/src/utils/constants.ts (create it if it doesn't exist).
  2. Replace local const PAGE_SIZE = ... in each hook/page with imports.

Possible traps and issues: Trivial. Verify test snapshots don't hard-code the old inline constant.

Docs changes needed: None.

Doc references: frontend/src/utils/constants.ts


T-17 · useHistory is missing abort-signal guards — stale state update bug

Where found: frontend/src/hooks/useHistory.ts.then(), .catch(), .finally() callbacks update state without checking abortRef.current.signal.aborted

Why this is needed: Every other data-fetching hook in the codebase guards all state-update callbacks against aborted signals. useHistory does not. If the component unmounts mid-request, setItems, setTotal, setLoading will all fire on an unmounted component. In React 18 this is a no-op but it still indicates a broken invariant and handleFetchError could misclassify the abort as a real error (depends on whether fetch threw AbortError or the API module swallowed it).

Goal: All callbacks in useHistory check the abort signal before mutating state.

What to do:

  1. Capture the controller in a local variable inside load() (already done: abortRef.current = new AbortController()).
  2. In .then(): add if (abortRef.current.signal.aborted) return; before setItems(...).
  3. In .catch(): add the same guard before handleFetchError(...).
  4. In .finally(): add if (!abortRef.current.signal.aborted) before setLoading(false).

Possible traps and issues:

  • abortRef.current may have been replaced by a new controller before the callback fires. Capture the controller in a closure variable at the top of load(): const controller = abortRef.current.

Docs changes needed: None.

Doc references: frontend/src/hooks/useHistory.ts


T-18 · Merge useDashboardCountryData and useMapData — near-identical hooks

Where found: frontend/src/hooks/useDashboardCountryData.ts and frontend/src/hooks/useMapData.ts

Why this is needed: Both hooks call fetchBansByCountry, maintain the same state shape (countries, countryNames, bans, total, loading, error), and implement the same abort-controller pattern. The only behavioural difference is that useMapData adds a 300ms debounce. Any bug fix must be applied to both.

Goal: A single useBansByCountry base hook; useMapData adds the debounce on top.

What to do:

  1. Create useBansByCountry(range, origin, source, countryCode?) — the shared fetch logic without debounce.
  2. Refactor useDashboardCountryData to wrap useBansByCountry.
  3. Refactor useMapData to wrap useBansByCountry and add the debounce layer.
  4. Keep the existing hook names as thin wrappers to preserve call sites.

Possible traps and issues:

  • useMapData returns { data } (the full response object) whereas useDashboardCountryData unpacks countries, countryNames, bans, total. Normalise the return shape before collapsing.
  • Tests for each hook test them independently — update or merge.

Docs changes needed: None.

Doc references: frontend/src/hooks/useDashboardCountryData.ts, frontend/src/hooks/useMapData.ts


T-19 · Move DashboardFilterProvider — page-scoped provider in wrong directory

Where found: frontend/src/providers/DashboardFilterProvider.tsx — instantiated only inside DashboardPage.tsx

Why this is needed: The providers/ directory implies app-wide providers (alongside AuthProvider, ThemeProvider, TimezoneProvider). DashboardFilterProvider wraps only DashboardPageContent and is not used anywhere else. Its placement implies reuse that doesn't exist, misleading future contributors about its scope.

Goal: Co-located with its only consumer.

What to do:

  1. Move DashboardFilterProvider.tsx to frontend/src/pages/ (alongside DashboardPage.tsx) or to frontend/src/pages/dashboard/ if the page is split into a subdirectory.
  2. Update imports in DashboardPage.tsx and any tests.

Possible traps and issues: Only DashboardPage.tsx imports it — confirm with grep before moving.

Docs changes needed: Docs/Web-Development.md — document what belongs in providers/ (app-wide) vs co-located.

Doc references: Docs/Web-Development.md


T-20 · Replace inline style={{}} objects with makeStyles classes

Where found: frontend/src/pages/map/MapBansTable.tsx (multiple), pages/JailDetailPage.tsx, pages/HistoryPage.tsx, pages/history/IpDetailView.tsx, components/WorldMap.tsx, components/TopCountriesPieChart.tsx, components/TopCountriesBarChart.tsx

Why this is needed: The project uses Fluent UI's makeStyles with atomic CSS caching. Inline style={{}} objects are allocated on every render, bypass the atomic CSS cache, and are inconsistent with the established pattern. Exceptions are acceptable only for truly dynamic values (e.g. tooltip left/top that change on mouse move) — static layout values must use makeStyles.

Goal: All static layout properties moved to makeStyles. Inline styles only for genuinely dynamic values.

What to do:

  1. Audit each file listed above.
  2. For static display: flex, gap, margin, padding — move to makeStyles in that component's style block.
  3. Keep inline style only where the value is truly dynamic at runtime (e.g. WorldMap tooltip position, TopCountriesBarChart chart height).

Possible traps and issues:

  • MapBansTable.tsx has several consecutive inline style objects for its pagination row — these can be collapsed into one named class.
  • Some components use both tokens.* values and inline styles — ensure makeStyles is imported where it isn't already.

Docs changes needed: Docs/Web-Development.md — add styling rule: use makeStyles for all static styles; inline style only for runtime-dynamic values.

Doc references: Docs/Web-Development.md, Fluent UI v9 docs on makeStyles


T-21 · Fail2BanMetadataService has inline socket timeout hardcoded

Where found: backend/app/services/fail2ban_metadata_service.py:64socket_timeout: float = 5.0

Why this is needed: Part of the same constant duplication as T-08. This file doesn't use _SOCKET_TIMEOUT as a module constant — it hardcodes 5.0 inline as a local variable. Should use the constant from constants.py once T-08 is done.

Goal: After T-08, replace with FAIL2BAN_SOCKET_TIMEOUT_FAST import.

What to do: Covered by T-08 sweep.

Possible traps and issues: None — dependent on T-08.

Docs changes needed: None.

Doc references: backend/app/services/fail2ban_metadata_service.py