Issue #64: MEDIUM - External Logging Failure Silently Swallowed
Where found:
backend/app/main.py:192-213
Why this is needed: When Datadog, Papertrail, or Elasticsearch log handler initialization fails, the error is caught, logged as a warning to stdout, and the application continues. In production this means critical logs may never reach the monitoring system, and operators will not know until an incident occurs.
Goal: External logging failures are surfaced to operators at deployment time.
What to do:
- Promote the warning to an error and expose it via the health endpoint (Issue #57).
- Add a startup check: if EXTERNAL_LOG_REQUIRED=true and initialization fails, abort startup.
- Emit a metric/alert on initialization failure.
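The steps above could be sketched roughly as follows. This is a minimal illustration, not the code in backend/app/main.py: `init_external_logging`, `env_flag`, `ExternalLoggingInitError`, the module-level counter, and the shape of the returned status dict are all hypothetical names chosen for the example. Only the EXTERNAL_LOG_REQUIRED variable comes from this issue.

```python
import logging
import os
from typing import Callable, Mapping, Optional

logger = logging.getLogger("app.startup")

# Stand-in for a real metrics counter (hypothetical; swap in your metrics client).
external_logging_init_failures = 0


class ExternalLoggingInitError(RuntimeError):
    """Raised when external logging is required but could not be initialized."""


def env_flag(name: str, environ: Mapping[str, str], default: bool = False) -> bool:
    """Parse a boolean environment flag; unset means `default`."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def init_external_logging(
    init_handler: Callable[[], None],
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Run the handler setup and return a status dict a health endpoint could expose."""
    global external_logging_init_failures
    env = os.environ if environ is None else environ
    required = env_flag("EXTERNAL_LOG_REQUIRED", env)  # defaults to False
    try:
        init_handler()
    except Exception as exc:
        external_logging_init_failures += 1  # emit the failure metric
        logger.error("External logging initialization failed: %s", exc)
        if required:
            # Abort startup instead of continuing without external logs.
            raise ExternalLoggingInitError(
                "external logging is required but failed to initialize"
            ) from exc
        return {"external_logging": "failed", "required": required}
    return {"external_logging": "ok", "required": required}
```

With the flag unset, a failing handler yields a "failed" status that the health endpoint can report; with EXTERNAL_LOG_REQUIRED=true the same failure raises and stops startup.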
Possible traps and issues:
- Making startup fail on logging issues may be too strict for some environments; make EXTERNAL_LOG_REQUIRED optional and default to false.
Docs changes needed:
Docs/Deployment.md: document EXTERNAL_LOG_REQUIRED and the health check for logging status.
Doc references:
backend/app/main.py – logging initialization block
Issue #65: MEDIUM - Abort Selector Inconsistency in useFetchData
Where found:
frontend/src/hooks/useFetchData.ts:124-131
Why this is needed:
When a request is aborted, refresh() returns the raw response without running the selector() function. In non-aborted paths the selector runs. Callers of refresh() receive different types depending on the abort state, making the return type unreliable and causing subtle state shape mismatches.
Goal:
refresh() returns a consistent type regardless of abort state.
What to do:
- On abort, return null (or a typed sentinel) instead of the raw response, so callers can handle the aborted case explicitly.
- Update the TypeScript return type of refresh() to reflect the nullable result.
Possible traps and issues:
- Existing callers that ignore the return value are unaffected; callers that use it need to handle null.
Docs changes needed:
frontend/src/hooks/README.md: document the null return on abort.
Doc references:
frontend/src/hooks/README.md
Issue #67: LOW - Default Page Size Inconsistently Applied Across Routers
Where found:
- backend/app/routers/history.py:80-84 – uses DEFAULT_PAGE_SIZE constant
- Multiple other routers – may hardcode page size values
Why this is needed:
Endpoints with different default page sizes create an inconsistent API experience and make it hard to reason about server load. A client that does not pass page_size gets different result counts from different endpoints.
Goal: All paginated endpoints use the same default page size driven by a single constant.
What to do:
- Audit all page_size Query parameters across routers.
- Replace all hardcoded defaults with DEFAULT_PAGE_SIZE from constants.py.
- Add a linting check or unit test that asserts no hardcoded page size defaults exist in routers.
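One possible shape for the unit-test check is a small static scan built on Python's standard `ast` module. The sketch below is an assumption about how such a check might look, not existing project code: `find_hardcoded_page_sizes` and its helpers are hypothetical, and it only flags a `page_size` parameter whose default is an integer literal, either bare or passed to a call named `Query`.

```python
import ast
from typing import List, Optional


def _is_int_literal(node: Optional[ast.AST]) -> bool:
    """True for a plain integer constant such as 25 (bools are excluded)."""
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, int)
        and not isinstance(node.value, bool)
    )


def _is_query_with_literal(node: Optional[ast.AST]) -> bool:
    """True for Query(25, ...) or Query(default=25, ...)."""
    if not isinstance(node, ast.Call):
        return False
    func = node.func
    name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", "")
    if name != "Query":
        return False
    if node.args and _is_int_literal(node.args[0]):
        return True
    return any(kw.arg == "default" and _is_int_literal(kw.value) for kw in node.keywords)


def find_hardcoded_page_sizes(source: str) -> List[int]:
    """Return line numbers of page_size parameters with a literal default."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        positional = args.posonlyargs + args.args
        # Positional defaults align with the tail of the positional parameters.
        padding = [None] * (len(positional) - len(args.defaults))
        pairs = list(zip(positional, padding + list(args.defaults)))
        pairs += list(zip(args.kwonlyargs, args.kw_defaults))
        for arg, default in pairs:
            if arg.arg != "page_size":
                continue
            if _is_int_literal(default) or _is_query_with_literal(default):
                offenders.append(arg.lineno)
    return offenders
```

A test could read each router file, call `find_hardcoded_page_sizes` on its contents, and assert that the result is empty, with an explicit allowlist for endpoints that intentionally deviate.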
Possible traps and issues:
- Some endpoints may intentionally use a different page size for performance reasons; document exceptions explicitly.
Docs changes needed:
- API reference: document the default page size and how to override it.
Doc references:
backend/app/utils/constants.py – DEFAULT_PAGE_SIZE
Issue #68: LOW - No Reserved Keyword Validation for Jail Names
Where found:
- backend/app/models/jail.py – jail name validated against alphanumeric regex only
- backend/app/routers/jail_config.py
Why this is needed:
Fail2ban uses reserved jail names and command keywords (e.g., all, status, purge). A user-created jail with a reserved name could shadow fail2ban built-in commands or produce confusing behavior when management commands are issued.
Goal: Reject jail names that conflict with fail2ban reserved words at model validation time.
What to do:
- Define a FAIL2BAN_RESERVED_JAIL_NAMES set in constants.py.
- Add a Pydantic validator on the jail name field that rejects reserved words.
- Return a 422 with a descriptive error message.
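The core of such a validator might look like the sketch below. It is written as a plain function so it runs without Pydantic installed; a Pydantic field validator could call it directly, since a ValueError raised inside a validator becomes a validation error, which FastAPI reports as a 422 for request bodies. The three reserved names are only the examples given in this issue, not an authoritative list, and the name pattern and case-insensitive comparison are assumptions made for illustration.

```python
import re

# Only the examples named in this issue; the real set should be sourced
# from fail2ban documentation for the supported versions.
FAIL2BAN_RESERVED_JAIL_NAMES = frozenset({"all", "status", "purge"})

# Assumed stand-in for the existing alphanumeric check in the jail model.
_JAIL_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def validate_jail_name(name: str) -> str:
    """Return the name unchanged, or raise ValueError with a descriptive message."""
    if not _JAIL_NAME_PATTERN.fullmatch(name):
        raise ValueError("jail name may only contain letters, digits, '_' and '-'")
    # Case-insensitive comparison is an assumption, chosen to be conservative.
    if name.lower() in FAIL2BAN_RESERVED_JAIL_NAMES:
        raise ValueError("jail name is reserved by fail2ban; choose a different name")
    return name
```

The error messages deliberately do not echo the submitted name back to the caller.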
Possible traps and issues:
- The reserved word list may change across fail2ban versions; source it from fail2ban documentation and version-gate if necessary.
Docs changes needed:
- API reference: document the list of reserved jail names.
Doc references:
- Fail2ban documentation on reserved jail identifiers
Issue #69: LOW - Jail Names Echoed in Error Messages Without Sanitization
Where found:
backend/app/exceptions.py:138,351 – jail names interpolated directly into error strings
Why this is needed:
Although Python's repr() provides basic escaping, user-supplied jail names are reflected back in error messages. If these messages are ever rendered in an HTML context (e.g., a future admin UI or email notification), they become XSS vectors. They also act as confirmation oracles when combined with timing attacks.
Goal: Error messages referencing user input are sanitized before inclusion.
What to do:
- Pass user-supplied values through a dedicated sanitize_for_display() helper before interpolation.
- Ensure the helper strips or escapes HTML special characters.
- For API responses, always return the original (validated) field name rather than the raw user input.
Possible traps and issues:
- Over-escaping in JSON responses is not needed (JSON is not HTML); apply sanitization only at HTML render boundaries.
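A minimal sketch of the sanitize_for_display() helper described above, assuming it is built on the standard library's `html.escape`. The `max_length` parameter and the removal of non-printable characters are additions made for illustration and are not requirements stated in this issue.

```python
import html


def sanitize_for_display(value: str, max_length: int = 64) -> str:
    """Make a user-supplied string safe to embed in HTML output."""
    # Drop control characters and other non-printable code points.
    printable = "".join(ch for ch in value if ch.isprintable())
    # Bound the length before escaping so an entity is never cut in half.
    truncated = printable[:max_length]
    # Escape &, <, > and both quote characters.
    return html.escape(truncated, quote=True)
```

In line with the trap noted above, this would be called where a message is rendered into HTML (an admin page or an email template), not when building JSON API responses.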
Docs changes needed:
CONTRIBUTING.md: document the rule that user input must not be echoed raw in messages.
Doc references:
backend/app/exceptions.py