Add Application Performance Monitoring (APM) with Prometheus metrics

- Backend: Implement Prometheus metrics collection
  - Add prometheus-client dependency
  - Create metrics utility module with HTTP request tracking counters, histograms, gauges
  - Implement MetricsMiddleware to track request latency, count, and active requests
  - Add /metrics endpoint to expose metrics in Prometheus text format
  - Normalize paths to prevent cardinality explosion (e.g., /api/{id} for UUIDs)
  - Exclude /metrics and /health from detailed tracking

- Frontend: Add web vitals and API metrics collection
  - Install web-vitals library (v4.0.0) for Core Web Vitals tracking
  - Create metrics utility module for FCP, LCP, CLS, INP, TTFB collection
  - Implement useTrackedFetch hook for automatic API call metrics (method, endpoint, status, duration)
  - Initialize web vitals tracking in App component on mount
  - Provide exportMetrics() for sending metrics to backend

- Testing:
  - Add comprehensive backend metrics tests (9 tests, 100% coverage)
  - Add comprehensive frontend metrics tests (10 tests)
  - All tests passing

- Documentation:
  - Expand Docs/Observability.md with complete APM section
  - Include metrics reference, integration examples (Prometheus, Datadog, NewRelic)
  - Add troubleshooting guide and best practices for cardinality management
  - Update Tasks.md to mark APM task as complete

Metrics exposed:
- bangui_http_requests_total: HTTP request count by method, endpoint, status
- bangui_http_request_duration_seconds: Request latency histogram
- bangui_http_active_requests: Active request gauge
- Web Vitals: CLS, FCP, INP, LCP, TTFB with ratings
- API metrics: endpoint, method, status, duration, timestamp

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
2026-05-01 18:33:14 +02:00
parent 37078b742b
commit 1af67eb0ce
14 changed files with 969 additions and 74 deletions

View File

@@ -45,6 +45,7 @@ from app.exceptions import (
)
from app.middleware.correlation import CorrelationIdMiddleware
from app.middleware.csrf import CsrfMiddleware
from app.middleware.metrics import MetricsMiddleware
from app.middleware.rate_limit import RateLimitMiddleware
from app.models.response import ErrorResponse
from app.routers import (
@@ -58,6 +59,7 @@ from app.routers import (
health,
history,
jails,
metrics,
server,
setup,
)
@@ -950,6 +952,7 @@ def create_app(settings: Settings | None = None) -> FastAPI:
app.add_middleware(CorrelationIdMiddleware)
app.add_middleware(SecurityHeadersMiddleware)
app.add_middleware(SetupRedirectMiddleware)
app.add_middleware(MetricsMiddleware)
app.add_middleware(CsrfMiddleware)
app.add_middleware(
RateLimitMiddleware,
@@ -995,6 +998,7 @@ def create_app(settings: Settings | None = None) -> FastAPI:
app.add_exception_handler(Exception, _unhandled_exception_handler)
# --- Routers ---
app.include_router(metrics.router)
app.include_router(health.router)
app.include_router(setup.router)
app.include_router(auth.router)