BanGUI/Docs/Tasks.md
Lukas e86ab6dad1 10) Implement explicit startup DAG for resource initialization
- Created StartupDAG class to orchestrate startup stages with explicit dependencies
- Defined 6 startup stages: WORKER_MODE → DATABASE → GEO_CACHE → HTTP_SESSION → SCHEDULER → TASKS
- Each stage has prerequisites, error handling, and rollback support
- Refactored startup_shared_resources() to use the DAG
- Added StartupContext for resource tracking and failure management
- Partial failures automatically roll back all completed resources in reverse order
- Added health checks to verify all resources initialized successfully
- Comprehensive test coverage: 15 DAG unit tests + 3 integration tests + 6 existing tests
- Documented startup DAG in Architekture.md with detailed stage descriptions and failure modes

This replaces implicit ordering with explicit dependency tracking, making lifecycle
changes safe and failure modes predictable. Hidden order dependencies no longer exist.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-28 08:08:05 +02:00

10) Startup sequence depends on implicit ordering

  • Where found:
  • Why this is needed:
    • Hidden order dependencies make lifecycle changes risky.
  • Goal:
    • Explicitly define resource/task startup graph and required prerequisites.
  • What to do:
    • Document dependency graph.
    • Add startup assertions and health checks for each stage.
  • Possible traps and issues:
    • Partial startup failures may leave stale scheduler or open sessions.
  • Docs changes needed:
    • Add startup DAG and failure-mode behavior.
  • Doc references:
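
For task 10, the idea of an explicit startup graph with reverse-order rollback can be sketched as follows. This is a minimal illustration under stated assumptions: `Stage` and `StartupRunner` are hypothetical names, each stage is assumed to expose plain `start` and `stop` callables, and nothing here is meant to mirror the project's actual StartupDAG class.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Stage:
    """One startup stage (hypothetical shape, for illustration only)."""
    name: str
    requires: tuple[str, ...]
    start: Callable[[], None]
    stop: Callable[[], None]


@dataclass
class StartupRunner:
    stages: dict[str, Stage] = field(default_factory=dict)
    started: list[str] = field(default_factory=list)

    def add(self, stage: Stage) -> None:
        self.stages[stage.name] = stage

    def _order(self) -> list[str]:
        # Depth-first topological sort; rejects unknown or cyclic prerequisites.
        order: list[str] = []
        state: dict[str, str] = {}

        def visit(name: str) -> None:
            if name not in self.stages:
                raise KeyError(f"unknown stage: {name}")
            if state.get(name) == "done":
                return
            if state.get(name) == "visiting":
                raise ValueError(f"dependency cycle at {name}")
            state[name] = "visiting"
            for dep in self.stages[name].requires:
                visit(dep)
            state[name] = "done"
            order.append(name)

        for name in self.stages:
            visit(name)
        return order

    def run(self) -> None:
        for name in self._order():
            try:
                self.stages[name].start()
            except Exception:
                # A failed stage never counts as started; undo the rest.
                self.rollback()
                raise
            self.started.append(name)

    def rollback(self) -> None:
        # Stop completed stages in reverse start order.
        while self.started:
            self.stages[self.started.pop()].stop()
```

Registration order does not matter in this sketch: a stage registered before its prerequisites still starts after them, and a failure in a late stage stops the earlier ones in reverse order before the exception propagates.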

11) Logging semantics are inconsistent across backend modules

  • Where found:
  • Why this is needed:
    • Uneven level usage reduces observability quality.
  • Goal:
    • Standardize logging levels and event naming.
  • What to do:
    • Define logging conventions.
    • Align service/task logging with that convention.
  • Possible traps and issues:
    • Excessive log volume if levels are set too low globally.
  • Docs changes needed:
    • Add logging policy and examples.
  • Doc references:
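
For task 11, one way to enforce an event-naming convention is a thin helper around the standard `logging` module. The convention shown (dot-separated lowercase `domain.action[.outcome]` names) and the helper `log_event` are assumptions made for this sketch, not an agreed project policy.

```python
import logging
import re

# Hypothetical convention: "<domain>.<action>" or "<domain>.<action>.<outcome>",
# each segment lowercase snake_case.
EVENT_RE = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*){1,2}$")


def log_event(logger: logging.Logger, level: int, event: str, **fields: object) -> None:
    """Log a named event; reject names that break the convention."""
    if not EVENT_RE.match(event):
        raise ValueError(f"event name violates convention: {event!r}")
    # Structured fields travel on the record as the 'fields' attribute.
    logger.log(level, event, extra={"fields": fields})
```

A call such as `log_event(log, logging.WARNING, "blocklist.import.failed", source="x")` keeps the message stable and machine-searchable, while the level is still chosen explicitly at each call site.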

12) Prop drilling in jail overview page

  • Where found:
  • Why this is needed:
    • Large prop chains increase coupling and refactor cost.
  • Goal:
    • Move jail state/actions into dedicated context or controller hook.
  • What to do:
    • Introduce JailContext (or equivalent local state container).
    • Remove multi-hop prop forwarding.
  • Possible traps and issues:
    • An overly broad context can trigger wide re-renders unless it is split.
  • Docs changes needed:
    • Add frontend state ownership notes.
  • Doc references:

13) Config page is over-centralized

  • Where found:
  • Why this is needed:
    • Tab orchestration and UI concerns are too concentrated.
  • Goal:
    • Decompose page into focused route/tab controllers.
  • What to do:
    • Split tab state/routing logic from rendering components.
    • Extract domain-specific subcontainers.
  • Possible traps and issues:
    • Shared state sync across tabs can regress.
  • Docs changes needed:
    • Add config page composition map.
  • Doc references:

14) Error boundary granularity is too coarse


15) Fragmented async error UX handling in components

  • Where found:
  • Why this is needed:
    • Localized ad-hoc error handling leads to inconsistent user feedback.
  • Goal:
    • Centralize error reporting and use consistent UI feedback channels.
  • What to do:
    • Introduce notification/error service.
    • Standardize form operation error patterns.
  • Possible traps and issues:
    • Duplicate messaging if local and global handlers fire together.
  • Docs changes needed:
    • Add frontend error handling guideline.
  • Doc references:

16) API usage pattern is inconsistent across components/hooks


17) Weak typed error contracts in generic hooks

  • Where found:
  • Why this is needed:
    • Treating every error as unknown weakens actionable UX and diagnostics.
  • Goal:
    • Standardize API error typing and hook-level error model.
  • What to do:
    • Introduce discriminated error payloads.
    • Map ApiError and network errors consistently.
  • Possible traps and issues:
    • Type expansion can touch many call sites.
  • Docs changes needed:
    • Add typed error model examples.
  • Doc references:

18) Duplicate polling/list loading behavior across hooks

  • Where found:
  • Why this is needed:
    • Duplication multiplies maintenance bugs.
  • Goal:
    • Share a composable base for fetch lifecycle, cancellation, and polling.
  • What to do:
    • Extract shared primitives and keep hook-specific selectors minimal.
  • Possible traps and issues:
    • Generic abstractions can become too complex.
  • Docs changes needed:
    • Add hook architecture overview.
  • Doc references:

19) Provider dependency chain is implicit

  • Where found:
  • Why this is needed:
    • Order-sensitive providers can fail silently during future refactors.
  • Goal:
    • Make provider dependency order explicit and tested.
  • What to do:
    • Document provider order rationale.
    • Add provider composition tests.
  • Possible traps and issues:
    • Hidden assumptions in auth/timezone/theme initialization.
  • Docs changes needed:
    • Add provider order contract.
  • Doc references:

20) Loading UX lacks progressive/skeleton states

  • Where found:
  • Why this is needed:
    • Blank loading regions reduce perceived responsiveness.
  • Goal:
    • Introduce standardized loading placeholders.
  • What to do:
    • Build shared skeleton components for tables/charts/forms.
  • Possible traps and issues:
    • Skeleton mismatch with actual layout can cause content shift.
  • Docs changes needed:
    • Add loading UX component guidance.
  • Doc references:

21) Silent auth error swallow in fetch error utility

  • Where found:
  • Why this is needed:
    • Silent return can drop actionable errors when no auth handler is registered.
  • Goal:
    • Ensure auth errors are handled deterministically with fallback logging/handling.
  • What to do:
    • Enforce handler registration or fallback behavior.
  • Possible traps and issues:
    • Duplicate redirect flows if multiple auth handlers exist.
  • Docs changes needed:
    • Add auth error handling contract for utilities.
  • Doc references:

22) Magic strings are scattered in frontend storage keys


23) No global cancellation policy on route transitions

  • Where found:
  • Why this is needed:
    • Many hooks cancel individually, but route-wide cancellation remains inconsistent.
  • Goal:
    • Provide a global request lifecycle cancellation mechanism.
  • What to do:
    • Introduce navigation-aware cancellation context/manager.
  • Possible traps and issues:
    • Over-cancellation can unintentionally break long-lived background fetches.
  • Docs changes needed:
    • Add request lifecycle policy.
  • Doc references:

24) API response wrapper shape is inconsistent


25) No canonical snake_case/camelCase serialization policy


26) Pagination contract is not standardized across endpoints


27) Error response body shape is inconsistent

  • Where found:
  • Why this is needed:
    • Frontend cannot reliably branch on machine-readable error codes.
  • Goal:
    • Define a standard error response schema with code, detail, and metadata.
  • What to do:
    • Add shared error model and update handlers.
  • Possible traps and issues:
    • Legacy consumers parsing detail strings may break.
  • Docs changes needed:
    • Add backend error schema and mapping table.
  • Doc references:
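
For task 27, a shared error model with `code`, `detail`, and `metadata` could look roughly like the sketch below. The class name `ErrorBody`, the helper `error_response`, and the example codes in `ERROR_STATUS` are hypothetical; the real mapping table would come from the backend's actual error cases.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class ErrorBody:
    code: str    # stable, machine-readable identifier
    detail: str  # human-readable explanation, free to change
    metadata: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        return asdict(self)


# Hypothetical code-to-status mapping, for illustration only.
ERROR_STATUS = {
    "jail_not_found": 404,
    "invalid_credentials": 401,
}


def error_response(code: str, detail: str, **metadata: Any) -> tuple[int, dict[str, Any]]:
    """Return (HTTP status, JSON-serializable body); unknown codes map to 500."""
    status = ERROR_STATUS.get(code, 500)
    return status, ErrorBody(code, detail, metadata).to_dict()
```

With a shape like this the frontend can branch on `code` and treat `detail` as display text only, which is what lets legacy string parsing be retired gradually.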

28) Login failure delay can enable app-layer DoS

  • Where found:
  • Why this is needed:
    • A fixed 10-second await on every invalid login keeps the request open for the whole delay, so a burst of bad logins can exhaust request capacity.
  • Goal:
    • Keep brute-force resistance without exhausting request capacity.
  • What to do:
    • Replace fixed sleep with limiter-backed penalty strategy and concurrency protection.
  • Possible traps and issues:
    • Too little penalty weakens brute-force protection.
  • Docs changes needed:
    • Document authentication throttling strategy.
  • Doc references:
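
For task 28, the limiter-backed penalty can be sketched as a per-key lockout that is checked up front instead of sleeping inside the request. `LoginPenalty`, its parameters, and the exponential backoff policy are assumptions for illustration; concurrency protection (for example, a cap on in-flight login attempts) is not covered here.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class _Entry:
    failures: int = 0
    blocked_until: float = 0.0


class LoginPenalty:
    """Tracks failed logins per key (e.g. client IP) without blocking a worker."""

    def __init__(
        self,
        base_delay: float = 1.0,
        max_delay: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._base_delay = base_delay
        self._max_delay = max_delay
        self._clock = clock
        self._entries: dict[str, _Entry] = {}

    def retry_after(self, key: str) -> float:
        """Seconds the caller must still wait; 0.0 means an attempt is allowed."""
        entry = self._entries.get(key)
        if entry is None:
            return 0.0
        return max(0.0, entry.blocked_until - self._clock())

    def record_failure(self, key: str) -> None:
        entry = self._entries.setdefault(key, _Entry())
        entry.failures += 1
        # Exponential backoff, capped at max_delay.
        delay = min(self._max_delay, self._base_delay * 2 ** (entry.failures - 1))
        entry.blocked_until = self._clock() + delay

    def record_success(self, key: str) -> None:
        self._entries.pop(key, None)
```

A handler would call `retry_after` first and reject immediately (for example with a 429 and a Retry-After header) when it is non-zero, so a rejected attempt costs almost no request time.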

29) Blocklist URL validation has DNS-rebinding window


30) Setup persistence is non-atomic across DB contexts

  • Where found:
  • Why this is needed:
    • Partial commits during setup can leave inconsistent state.
  • Goal:
    • Make setup operations transactional and crash-safe.
  • What to do:
    • Introduce staged setup state and transaction boundaries.
  • Possible traps and issues:
    • SQLite transaction handling across multiple DB files is limited.
  • Docs changes needed:
    • Add setup state machine and rollback behavior.
  • Doc references:
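
For task 30, the staged-state idea can be sketched for the single-database case with the standard `sqlite3` module. The `setup_state` table, the stage names, and the function names are hypothetical, and the steps are assumed to issue only INSERT/UPDATE/DELETE statements. The sketch does not solve the multi-file limitation noted above; it only shows one transaction boundary around one connection.

```python
import sqlite3
from typing import Callable, Iterable


def run_setup(conn: sqlite3.Connection, steps: Iterable[Callable[[sqlite3.Connection], None]]) -> None:
    """Run all setup steps in one transaction; mark completion in the same commit."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS setup_state ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), stage TEXT NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO setup_state (id, stage) VALUES (1, 'pending')")
    conn.commit()
    # The connection context manager commits on success and rolls back on error.
    with conn:
        for step in steps:
            step(conn)
        conn.execute("UPDATE setup_state SET stage = 'complete' WHERE id = 1")


def setup_stage(conn: sqlite3.Connection) -> str:
    return conn.execute("SELECT stage FROM setup_state WHERE id = 1").fetchone()[0]
```

Because the stage marker is updated in the same transaction as the step data, a crash or exception leaves the marker at `pending` with none of the step writes applied.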

31) Fire-and-forget reschedule may fail silently

  • Where found:
  • Why this is needed:
    • Schedule update requests can succeed while background reschedule fails.
  • Goal:
    • Make schedule updates deterministic and observable.
  • What to do:
    • Await reschedule path or persist task outcome status and surface errors.
  • Possible traps and issues:
    • Blocking request path might add latency if scheduler is busy.
  • Docs changes needed:
    • Document scheduling reliability guarantees.
  • Doc references:
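
For task 31, the "await the reschedule path and surface errors" option can be sketched with `asyncio`. The function `update_schedule`, its `save` and `reschedule` parameters, and the result dictionary are hypothetical; the timeout is one possible answer to the latency concern above, not a measured value.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def update_schedule(
    save: Callable[[], Awaitable[None]],
    reschedule: Callable[[], Awaitable[None]],
    timeout_s: float = 5.0,
) -> dict[str, Any]:
    """Persist the new schedule, then await the reschedule and report its outcome."""
    await save()
    try:
        # Bound the wait so a busy scheduler cannot stall the request forever.
        await asyncio.wait_for(reschedule(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return {"saved": True, "rescheduled": False, "error": "reschedule timed out"}
    except Exception as exc:
        return {"saved": True, "rescheduled": False, "error": str(exc)}
    return {"saved": True, "rescheduled": True, "error": None}
```

The caller can then decide whether a saved-but-not-rescheduled result is returned as a warning or as an error, instead of the failure disappearing in a background task.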

32) RateLimiter cleanup function is not scheduled/used

  • Where found:
  • Why this is needed:
    • Rate limiter state can grow over long runtimes.
  • Goal:
    • Ensure periodic cleanup or bounded memory strategy.
  • What to do:
    • Add scheduled cleanup or auto-eviction structure.
  • Possible traps and issues:
    • Too frequent a cleanup cadence adds overhead.
  • Docs changes needed:
    • Add operational notes for auth throttling lifecycle.
  • Doc references:
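
For task 32, a cleanup pass that bounds limiter state can be sketched as below. `SlidingWindowLimiter` is a stand-in written for this example and is not the project's RateLimiter; the assumption is that a periodic job (for instance a scheduler task) would call `cleanup()`.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    def __init__(self, limit: int, window_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window_s = window_s
        self._clock = clock
        self._hits: dict[str, deque] = {}

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have left the window for this key.
        while hits and now - hits[0] >= self._window_s:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True

    def cleanup(self) -> int:
        """Evict keys whose newest hit is outside the window; return evicted count."""
        now = self._clock()
        stale = [k for k, h in self._hits.items() if not h or now - h[-1] >= self._window_s]
        for key in stale:
            del self._hits[key]
        return len(stale)

    def __len__(self) -> int:
        return len(self._hits)
```

Without the `cleanup()` call, keys that are never seen again stay in the dictionary indefinitely, which is the growth this task describes.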

33) Trusted proxy configuration is hardcoded in auth router

  • Where found:
  • Why this is needed:
    • Incorrect client IP extraction can break per-IP rate limiting behind proxies.
  • Goal:
    • Move trusted proxies to validated runtime config.
  • What to do:
    • Add settings for trusted proxy IPs/CIDRs.
    • Validate and use these in client IP extraction.
  • Possible traps and issues:
    • Over-trusting headers can enable spoofing.
  • Docs changes needed:
    • Add reverse-proxy deployment configuration section.
  • Doc references:
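
For task 33, validated trusted-proxy settings and client IP extraction can be sketched with the standard `ipaddress` module. The function names and the right-to-left walk over `X-Forwarded-For` are assumptions for illustration; the header is only honored when the direct peer is itself a trusted proxy.

```python
import ipaddress
from typing import Optional, Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_trusted(cidrs: Sequence[str]) -> list:
    """Validate configured IPs/CIDRs; raises ValueError on a malformed entry."""
    return [ipaddress.ip_network(c, strict=False) for c in cidrs]


def _is_trusted(ip: str, trusted: Sequence[Network]) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in trusted)


def client_ip(peer: str, forwarded_for: Optional[str], trusted: Sequence[Network]) -> str:
    """Return the first untrusted hop, walking X-Forwarded-For from right to left."""
    if not forwarded_for or not _is_trusted(peer, trusted):
        # Headers from untrusted peers are ignored to prevent spoofing.
        return peer
    hops = [h.strip() for h in forwarded_for.split(",") if h.strip()]
    for hop in reversed(hops):
        try:
            if not _is_trusted(hop, trusted):
                return hop
        except ValueError:
            # Malformed header entry: fall back to the socket peer.
            return peer
    return peer
```

Walking from the right means a client-supplied leftmost entry cannot override the address that the nearest trusted proxy actually observed.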

34) Setup redirect allowlist uses broad prefix matching

  • Where found:
  • Why this is needed:
    • Prefix-based allow rules are fragile for future route additions.
  • Goal:
    • Use exact path or route-level allow policy.
  • What to do:
    • Replace startswith matching with explicit allowlist checks.
  • Possible traps and issues:
    • API docs and setup flow paths must remain reachable.
  • Docs changes needed:
    • Add setup guard route policy documentation.
  • Doc references:
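
For task 34, replacing prefix matching with an explicit allowlist can be sketched as below. The paths in `SETUP_ALLOWED_EXACT` are hypothetical examples; the real set would list the setup flow and API docs routes that must stay reachable.

```python
# Hypothetical paths that remain reachable before setup is complete.
SETUP_ALLOWED_EXACT = frozenset({
    "/api/setup",
    "/api/health",
    "/docs",
    "/openapi.json",
})


def is_allowed_during_setup(path: str) -> bool:
    """Exact-match check; a trailing slash is tolerated, prefixes are not."""
    normalized = path.rstrip("/") or "/"
    return normalized in SETUP_ALLOWED_EXACT
```

With `startswith("/api/setup")`, a later route such as `/api/setup-export` would be let through by accident; the exact check rejects it until it is added deliberately.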

35) API client sends JSON and CSRF header for every request method

  • Where found:
  • Why this is needed:
    • Extra headers on GET requests trigger unnecessary CORS preflights and add noise.
  • Goal:
    • Apply headers by method/body requirements.
  • What to do:
    • Only set Content-Type for requests with JSON body.
    • Send CSRF header for mutating cookie-authenticated requests only.
  • Possible traps and issues:
    • CSRF protection assumptions must still hold for all mutating paths.
  • Docs changes needed:
    • Update frontend API client contract and CSRF notes.
  • Doc references:

36) Polling continues when tab is not visible

  • Where found:
  • Why this is needed:
    • Unnecessary backend load and client resource usage in background tabs.
  • Goal:
    • Pause/reduce polling when page is hidden.
  • What to do:
    • Add visibility-aware polling strategy and optional backoff.
  • Possible traps and issues:
    • Data may appear stale immediately after tab restore if refresh is delayed.
  • Docs changes needed:
    • Add frontend polling lifecycle policy.
  • Doc references:

37) Multi-worker safety check depends on one environment variable

  • Where found:
  • Why this is needed:
    • Other process managers can still launch multiple workers without this variable.
  • Goal:
    • Enforce scheduler single-executor safety regardless of launcher.
  • What to do:
    • Add robust single-run lock/leader mechanism for scheduler ownership.
  • Possible traps and issues:
    • Locking strategy must be reliable in container orchestration.
  • Docs changes needed:
    • Expand deployment constraints and supported run modes.
  • Doc references:
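
For task 37, one launcher-independent option is an advisory file lock that only one process can hold. The sketch below uses `fcntl.flock`, which is Unix-only and only coordinates processes that share the same lock file; `SchedulerLock` and the lock path are hypothetical, and deployments spread across hosts would need a different mechanism.

```python
import fcntl
import os
from typing import Optional


class SchedulerLock:
    """Non-blocking exclusive lock; the holder is the only scheduler owner."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._fd: Optional[int] = None

    def acquire(self) -> bool:
        if self._fd is not None:
            return True  # already held by this instance
        fd = os.open(self._path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Another process (or open file description) holds the lock.
            os.close(fd)
            return False
        self._fd = fd
        return True

    def release(self) -> None:
        if self._fd is not None:
            fcntl.flock(self._fd, fcntl.LOCK_UN)
            os.close(self._fd)
            self._fd = None
```

A worker that fails to acquire the lock would skip starting the scheduler and serve requests only. The kernel drops the lock when the holding process exits, so a crashed owner does not block the next start.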

38) History archive query paths may need explicit indexing plan


39) No explicit DI container strategy for backend service graph

  • Where found:
  • Why this is needed:
    • Dependency construction and lifecycle are partly implicit.
  • Goal:
    • Define a clear dependency wiring pattern for services and repositories.
  • What to do:
    • Create service composition root pattern and document usage.
  • Possible traps and issues:
    • Over-engineering if container abstraction is too heavy for current size.
  • Docs changes needed:
    • Add dependency wiring chapter.
  • Doc references:
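
For task 39, a lightweight composition root (no container library) can be sketched as below. `JailRepository`, `JailService`, `Services`, and `build_services` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


class JailRepository:
    """Hypothetical repository; holds only what it needs to reach storage."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path


class JailService:
    """Hypothetical service; receives its dependencies, never constructs them."""

    def __init__(self, repo: JailRepository) -> None:
        self.repo = repo


@dataclass(frozen=True)
class Services:
    jail_repo: JailRepository
    jail_service: JailService


def build_services(db_path: str) -> Services:
    """Composition root: the single place where objects are created and wired."""
    repo = JailRepository(db_path)
    return Services(jail_repo=repo, jail_service=JailService(repo))
```

Keeping construction in one function makes the dependency graph readable in a single place and lets tests build the same graph with substitutes, without adding a heavy container abstraction.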

40) Frontend and backend observability are not aligned

  • Where found:
  • Why this is needed:
    • Backend uses structured logging while frontend error telemetry is mostly local and ad-hoc.
  • Goal:
    • Define unified error telemetry and correlation approach.
  • What to do:
    • Introduce frontend error reporting pipeline and request correlation IDs.
  • Possible traps and issues:
    • PII/sensitive payload leakage risk in client-side telemetry.
  • Docs changes needed:
    • Add observability and privacy-safe logging guidelines.
  • Doc references: