This commit is contained in:
2026-04-06 21:02:48 +02:00
parent 95f72018f7
commit ca4b0ed324

View File

@@ -8,73 +8,41 @@ Reference: `Docs/Refactoring.md` for full analysis of each issue.
## Open Issues ## Open Issues
--- ### Backend Architecture Review Findings
### Backend Architecture - `backend/app/routers/*` often reads config directly from `request.app.state.settings` instead of using dependency injection. This bypasses the dependency layer and creates hidden coupling between routers and application state.
- Fix: replace direct `request.app.state.settings` access with `SettingsDep` or other explicit dependencies such as `ServerStatusDep` and `PendingRecoveryDep` in router function signatures.
- Expected outcome: routers become easier to unit test, composition is more explicit, and shared state access is only available through documented FastAPI dependencies.
- **Replace the single shared SQLite connection.** ✅ - Several utility modules under `backend/app/utils/` import service layer code (`app.services.*`). Utilities should remain low-level helpers and not depend on higher-level service logic.
- Current startup code opens one `aiosqlite.Connection` and reuses it for every request. - Fix: move service-dependent helpers into `app/services/` or extract shared logic into a new `app/helpers/` layer, keeping `app/utils/` purely independent.
- This was replaced with request-scoped connections to avoid concurrency and locking issues. - Expected outcome: lower coupling between utility and service layers, cleaner dependency direction, and better maintainability.
- Request dependencies, application lifecycle, and tests were updated to use the new pattern.
- **Refactor dependency wiring and shared resource management.** ✅ - Background task modules in `backend/app/tasks/` check `app.state.db` before opening a DB connection, but no code sets `app.state.db`. That means the fast-path branch is dead and the DB handling is misleading.
- Remove hidden module-level import coupling between routers, services, and shared utilities. - Fix: remove the unused `app.state.db` branch and always open/close a dedicated task-local connection, or intentionally add a shared DB connection to `app.state` and manage its lifecycle.
- Introduce explicit factories or providers for shared resources such as DB, HTTP client session, scheduler, and settings. - Expected outcome: background jobs have predictable DB lifecycle, avoid hidden bugs from stale connection assumptions, and task code is simpler.
- Ensure routers depend on injected providers rather than global state or dynamic imports.
- **Harden fail2ban integration.** ✅ - `backend/app/dependencies.py` contains an in-memory process-local session cache for auth tokens. This optimization is valid for a single-process server, but it is not cluster-safe for multi-worker or distributed deployments.
- Remove the `sys.path` hack that locates `fail2ban-master` at runtime. - Fix: either document the single-process limitation clearly, or replace `_session_cache` with an external shared cache (Redis/Memcached) or eliminate it if eventual cluster support is required.
- Replace it with a deterministic packaging or configuration model so the backend does not depend on repository layout. - Expected outcome: authentication behavior is consistent across deployment modes, and session invalidation works correctly in multi-worker setups.
- Refactor `Fail2BanClient` so concurrency control is instance-based and not backed by hidden module globals.
- **Improve startup / setup guard behavior.** ✅ - `backend/app/main.py` uses local imports inside `_lifespan()` to avoid circular dependencies, indicating that startup logic is tightly coupled with services.
- Convert `SetupRedirectMiddleware` from an on-demand DB check into a startup/initialisation guard where possible. - Fix: evaluate whether the startup initialization can be moved into a dedicated `startup.py` or split service initialization into smaller modules; keep import order simple and explicit.
- Cache setup completion in a safe way and provide an explicit invalidation path if the application state changes. - Expected outcome: cleaner startup code with lower coupling, fewer hidden circular import risks, and easier maintenance.
- Reduce middleware responsibility and avoid DB access during normal request dispatch.
- **Make deployment configuration explicit.** ### Recommended refactors for an AI agent
- Move hard-coded environment assumptions such as CORS origins into settings.
- Ensure `fail2ban_socket`, `fail2ban_config_dir`, and startup commands are fully configurable via `Settings`.
- Document production-ready defaults separately from development defaults.
### Reliability and Resilience - Standardise dependency injection in routers by using `SettingsDep`, `ServerStatusDep`, `PendingRecoveryDep`, and other dependency definitions from `backend/app/dependencies.py`.
- Refactor `backend/app/utils/` so it does not import business-layer services. Move cross-layer helpers to the appropriate layer or introduce a shared helper package if needed.
- Simplify background task DB management in `backend/app/tasks/`: remove the dead `app.state.db` logic or implement a real shared connection and document its lifecycle.
- Document auth cache semantics in the code and project docs. If cluster deployments are intended, replace the process-local cache with a shared cache or remove it.
- Inspect `backend/app/main.py` startup wiring and reduce local import usage by extracting startup responsibilities into clearer components.
- **Add backend lifecycle tests for resource cleanup.** ✅ ### Goals and expectations for the fix
- Verify startup opens and initialises DB, HTTP session, scheduler, and geo cache correctly.
- Verify shutdown closes those resources cleanly.
- **Add concurrency/regression coverage for DB and fail2ban socket use.** - Preserve existing functionality while reducing hidden coupling.
- Add tests that simulate multiple concurrent requests using the same DB dependency. - Improve testability of routers and background tasks by making dependencies explicit.
- Add tests around fail2ban socket retries, protocol errors, and rate limiting. - Make the application startup and shared-state behavior easy to reason about.
- Ensure backend architecture is stable for future refactors, especially around authentication, config handling, and scheduled jobs.
- Provide enough detail so an AI agent can make the changes safely without altering business behavior.
- **Improve state caching and invalidation.** ✅
- Add tests for session cache invalidation on logout.
- Add tests for setup completion caching so stale state is never served.
### Backend Feature Work
- **Document and implement backend-safe environment-driven CORS.** ✅
- Add support for production and local development origins through configuration.
- Avoid a hardcoded Vite origin in the core app factory.
- **Centralise scheduler job registration.** ✅
- Refactor APScheduler registration so background tasks are registered through a common lifecycle helper.
- Ensure jobs can be discovered, replaced, and tested without requiring implicit `app.state` side effects.
- **Strengthen fail2ban error handling and reporting.** ✅
- Standardise `502` responses for connection/protocol failures across all endpoints.
- Add structured logging for retries and fatal socket failures.
- Ensure the UI can distinguish offline fail2ban from internal backend failures.
- **Improve documentation of backend responsibilities.** ✅
- Keep `Docs/Tasks.md` aligned with the backend architecture review.
- Add references to the backend modules, resource lifecycle, and dependency model in the documentation.
### Priority Execution Plan
1. ✅ Fix the global SQLite connection pattern and tests.
2. ✅ Refactor dependency injection / explicit shared resources.
3. ✅ Harden fail2ban client concurrency and packaging.
4. ✅ Convert setup guard to a safer startup-driven model.
5. ✅ Add deployment-safe configuration and production-ready CORS.
6. ✅ Add lifecycle and concurrency regression tests.