Implement frontend and backend observability alignment

Align frontend and backend error observability with correlation IDs and
structured telemetry for distributed tracing across systems.

Backend changes:
- Add CorrelationIdMiddleware to generate/extract correlation IDs
- Include correlation_id in all ErrorResponse objects
- Store correlation ID in structlog contextvars for automatic inclusion in logs
- Add correlation ID to response headers (X-Correlation-ID)

Frontend changes:
- API client automatically generates session-scoped UUID4 and includes
  X-Correlation-ID header in all requests
- Extract correlation ID from API error responses
- Update error handlers to use telemetry with correlation IDs
- Add telemetry logging to ErrorBoundary, PageErrorBoundary, SectionErrorBoundary
- Implement redaction utilities for privacy-safe logging of sensitive data

Documentation:
- Add observability guidelines to Web-Development.md
  * Correlation ID usage patterns
  * Privacy & security best practices
  * Telemetry event structure
  * Redaction utilities for sensitive data
- Add distributed tracing architecture section to Architecture.md
  * Correlation ID flow across frontend/backend
  * Example troubleshooting scenario
  * Implementation details for future enhancements

Testing:
- Add comprehensive tests for correlation middleware
- Update error boundary tests to verify telemetry integration
- Verify TypeScript and ESLint pass with no warnings

Fixes: Issue #40 - Frontend and backend observability are not aligned

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 18:32:19 +02:00
parent 9a43123b3a
commit 3d1a6f5538
16 changed files with 916 additions and 54 deletions

@@ -1608,7 +1608,88 @@ it("should render a row for each ban", () => {
---
## 15. Git & Workflow
## 15. Error Observability & Telemetry
Frontend errors must be reported with correlation IDs to enable distributed tracing across frontend and backend systems. This allows engineers to correlate errors in the UI with their corresponding backend logs.
### Correlation IDs
- **Automatic:** The API client automatically generates a **session-scoped UUID4** on first use and includes it in the `X-Correlation-ID` header for every request.
- **Backend responds:** The backend echoes the correlation ID in the `X-Correlation-ID` response header and in the `correlation_id` field of error responses.
- **Frontend extraction:** Error handlers automatically extract the correlation ID and log it with telemetry events for debugging.
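As a rough illustration of the client-side flow described above, the sketch below lazily creates one ID per session and attaches it to outgoing request headers. The names `uuid4Sketch`, `getCorrelationId`, and `withCorrelationHeader` are hypothetical and not taken from the project's API client, and the `Math.random`-based UUID4 helper is only a stand-in (a real client would more likely call `crypto.randomUUID()`).

```ts
// Hypothetical sketch; the real API client may be structured differently.

// Stand-in UUID4 generator. Math.random is not cryptographically secure,
// which is acceptable for a tracing ID but not for secrets.
function uuid4Sketch(): string {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    // "y" positions carry the UUID variant bits, so they become 8, 9, a, or b.
    const v = c === "x" ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

// Module-level cache: created on first use, then reused for the session.
let sessionCorrelationId: string | undefined;

function getCorrelationId(): string {
  if (sessionCorrelationId === undefined) {
    sessionCorrelationId = uuid4Sketch();
  }
  return sessionCorrelationId;
}

// Returns a copy of the given headers with X-Correlation-ID added.
function withCorrelationHeader(
  headers: Record<string, string> = {}
): Record<string, string> {
  return { ...headers, "X-Correlation-ID": getCorrelationId() };
}
```

Keeping the ID in one module-level variable means every request made during the session shares it, which is what lets a single search in the backend logs surface all related entries.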
### Error Telemetry
Use the `telemetry.ts` utilities to log errors with correlation IDs:
```ts
import { recordError, recordWarning, redact } from "../utils/telemetry";

// Log API errors with correlation ID
try {
  const data = await api.get("/jails");
} catch (error) {
  const correlationId = (error as ApiError).correlationId;
  recordError(
    "fetch_jails_failed",
    error instanceof Error ? error : new Error(String(error)),
    { endpoint: "/jails" },
    correlationId
  );
}

// Log validation errors
if (!validateEmail(email)) {
  recordWarning(
    "invalid_email_format",
    `Email format invalid: ${redact(email)}`,
    { field: "email" }
  );
}
```
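The example above reads `correlationId` from an `ApiError`. One way such an error could be populated from a failed response is sketched below. `ApiErrorSketch`, `toApiErrorSketch`, and the `detail` field are hypothetical names chosen for illustration, and preferring the body's `correlation_id` over the `X-Correlation-ID` header is an assumption, not confirmed project behaviour.

```ts
// Hypothetical sketch; not the project's actual ApiError class.

interface ErrorResponseBodySketch {
  detail?: string; // assumed field name for the error message
  correlation_id?: string;
}

class ApiErrorSketch extends Error {
  readonly status: number;
  readonly correlationId?: string;

  constructor(message: string, status: number, correlationId?: string) {
    super(message);
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "ApiErrorSketch";
    this.status = status;
    this.correlationId = correlationId;
  }
}

// Header names are case-insensitive, so compare in lower case.
function findHeader(
  headers: Record<string, string>,
  name: string
): string | undefined {
  const wanted = name.toLowerCase();
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === wanted) return headers[key];
  }
  return undefined;
}

function toApiErrorSketch(
  status: number,
  headers: Record<string, string>,
  body: ErrorResponseBodySketch | null
): ApiErrorSketch {
  // Assumption: use the body field first, then fall back to the header.
  const fromBody = body !== null ? body.correlation_id : undefined;
  const correlationId =
    fromBody !== undefined ? fromBody : findHeader(headers, "X-Correlation-ID");
  const message =
    body !== null && body.detail !== undefined
      ? body.detail
      : `Request failed with status ${status}`;
  return new ApiErrorSketch(message, status, correlationId);
}
```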
### Privacy & Security
**NEVER log sensitive data:**
- Passwords, tokens, session IDs
- Personal information (names, email addresses, IP addresses)
- Configuration secrets or API keys
- Request/response bodies containing passwords
**Redact sensitive fields before logging:**
```ts
import { redact, redactObject } from "../utils/telemetry";

// Redact URLs with query parameters
const safeUrl = redact("https://api.example.com/login?password=secret");
// Result: "https://api.example.com/login?password=[REDACTED]"

// Redact object fields
const safeConfig = redactObject({
  apiKey: "sk-1234567890",
  username: "john@example.com",
  serverUrl: "https://internal.api.example.com",
});
// Result: { apiKey: "[REDACTED]", username: "[REDACTED]", serverUrl: "..." }
```
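To make the redaction behaviour above more concrete, here is a minimal sketch of how key-based redaction could work. It is not the implementation in `telemetry.ts`: the function names `redactUrlSketch` and `redactObjectSketch` and both lists of sensitive names are assumptions chosen so that the results match the examples above.

```ts
// Hypothetical sketch; the real redact/redactObject utilities may differ.

const REDACTED = "[REDACTED]";

// Assumed list of sensitive query parameters (matched case-insensitively).
const SENSITIVE_QUERY_PARAM = /([?&](?:password|token|secret|api_?key)=)[^&#]*/gi;

// Assumed list of sensitive object keys. No "g" flag here, so repeated
// .test() calls do not carry state between keys.
const SENSITIVE_KEY = /password|token|secret|api[_-]?key|username|email/i;

// Replaces the value of each sensitive query parameter in a URL-like string.
function redactUrlSketch(value: string): string {
  return value.replace(SENSITIVE_QUERY_PARAM, "$1" + REDACTED);
}

// Returns a shallow copy with the values of sensitive keys replaced.
function redactObjectSketch(
  input: Record<string, unknown>
): Record<string, unknown> {
  const output: Record<string, unknown> = {};
  for (const key of Object.keys(input)) {
    output[key] = SENSITIVE_KEY.test(key) ? REDACTED : input[key];
  }
  return output;
}
```

Matching on key names rather than on values keeps the check cheap and predictable, at the cost of missing secrets stored under unexpected keys, so an allow-list of loggable fields may be the safer choice for new call sites.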
### Telemetry Event Structure
Every telemetry event has the following fields:
- `event`: Machine-readable event name in snake_case (e.g., `"auth_error"`, `"component_render_error"`)
- `severity`: One of `"debug"`, `"info"`, `"warning"`, `"error"`, `"critical"`
- `correlation_id`: UUID for distributed tracing (optional, but recommended for errors)
- `message`: Human-readable description (optional)
- `context`: Structured data bag for additional context (no PII)
- `timestamp`: ISO 8601 timestamp
- `error`: Error instance for stack traces (if applicable)
This mirrors the backend structlog format, enabling consistent log analysis across frontend and backend.
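The field list above can be expressed as a type. The following is a sketch only: the names `TelemetryEventSketch`, `EventOptionsSketch`, and `buildEventSketch` are hypothetical, and which fields are optional is inferred from the descriptions above rather than from `telemetry.ts`.

```ts
// Hypothetical sketch of the event shape; not the project's actual types.

type Severity = "debug" | "info" | "warning" | "error" | "critical";

interface TelemetryEventSketch {
  event: string; // machine-readable, snake_case
  severity: Severity;
  correlation_id?: string; // UUID for distributed tracing
  message?: string; // human-readable description
  context: Record<string, unknown>; // must not contain PII
  timestamp: string; // ISO 8601
  error?: Error; // carries the stack trace when present
}

interface EventOptionsSketch {
  correlation_id?: string;
  message?: string;
  context?: Record<string, unknown>;
  error?: Error;
}

const SNAKE_CASE = /^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$/;

// Builds an event, rejecting names that are not snake_case.
function buildEventSketch(
  event: string,
  severity: Severity,
  options: EventOptionsSketch = {}
): TelemetryEventSketch {
  if (!SNAKE_CASE.test(event)) {
    throw new Error(`Event name must be snake_case: ${event}`);
  }
  const result: TelemetryEventSketch = {
    event,
    severity,
    context: options.context !== undefined ? options.context : {},
    timestamp: new Date().toISOString(),
  };
  // Only set optional fields that were actually provided.
  if (options.correlation_id !== undefined) result.correlation_id = options.correlation_id;
  if (options.message !== undefined) result.message = options.message;
  if (options.error !== undefined) result.error = options.error;
  return result;
}
```

Using the backend's `correlation_id` spelling (snake_case) in the event keeps the field name identical on both sides, so one query can match frontend and backend records.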
---
## 16. Git & Workflow
- **Branch naming:** `feature/<short-description>`, `fix/<short-description>`, `chore/<short-description>`.
- **Commit messages:** imperative tense, max 72 chars first line (`Add ban table component`, `Fix date formatting in dashboard`).