BanGUI/Docs/Tasks.md
Lukas 67b26a3ef7 Refactor pagination with cursor-based support and standardized response format
- Implement cursor-based pagination in pagination.py
- Update response models to standardize pagination structure
- Add cursor pagination utilities for repositories
- Update HistoryArchiveRepository and ImportLogRepository with new pagination
- Add comprehensive tests for cursor pagination
- Update documentation for backend development and task tracking

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-01 17:54:05 +02:00


[MEDIUM] Inefficient database pagination uses OFFSET

Where found

  • backend/app/utils/pagination.py — uses OFFSET (page-1) * page_size

Why this is needed

OFFSET makes the database scan and discard N rows before it returns the next page, so the cost grows with page depth. Fetching the last page of a 10M-row table takes 15 seconds ⚠️

Goal

Implement keyset pagination (cursor-based) for large result sets.

What to do

  1. Short-term: Add database indexes on sort columns
  2. Long-term: Implement cursor-based pagination using WHERE instead of OFFSET
  3. Frontend sends cursor (last row ID) instead of page number

Possible traps and issues

  • Cursor must be deterministic: sort on a unique column, or add a unique tie-breaker (e.g. the row ID) to the sort key
  • API contract changes
  • Cursor format must be opaque to client
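
A minimal sketch of the keyset approach, using the standard-library sqlite3 module so that it is self-contained. The `history` table, its columns, and the function names are assumptions for illustration; they are not taken from pagination.py. The cursor encodes the last row's sort key (`created_at` plus `id` as tie-breaker) as base64 so that clients treat it as opaque.

```python
import base64
import json
import sqlite3
from typing import List, Optional, Tuple


def encode_cursor(created_at: str, row_id: int) -> str:
    """Pack the last row's sort key into an opaque, URL-safe token."""
    raw = json.dumps([created_at, row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> Tuple[str, int]:
    created_at, row_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return created_at, row_id


def fetch_page(
    conn: sqlite3.Connection, limit: int, cursor: Optional[str] = None
) -> Tuple[List[tuple], Optional[str]]:
    """Return (rows, next_cursor), newest first, without using OFFSET."""
    where, params = "", []
    if cursor is not None:
        created_at, row_id = decode_cursor(cursor)
        # Seek past the last row seen; `id` breaks ties between equal timestamps.
        where = "WHERE created_at < ? OR (created_at = ? AND id < ?)"
        params = [created_at, created_at, row_id]
    rows = conn.execute(
        f"SELECT id, created_at, ip FROM history {where} "
        "ORDER BY created_at DESC, id DESC LIMIT ?",
        (*params, limit + 1),  # fetch one extra row to detect a further page
    ).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    next_cursor = encode_cursor(rows[-1][1], rows[-1][0]) if has_more else None
    return rows, next_cursor
```

For the seek to be served from an index rather than a scan, an index on `(created_at, id)` would be needed, which is the short-term step above.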

Docs changes needed

  • Update Docs/Backend-Development.md § Database Performance

Doc references

  • Docs/Backend-Development.md (database performance)

[MEDIUM] Session secret rotation not implemented

Where found

  • backend/app/config.py — single session_secret with no rotation support

Why this is needed

If the secret leaks, every session is compromised, and there is no way to invalidate old sessions without logging out all users.

Goal

Support gradual secret rotation without forcing logout.

What to do

  1. Store multiple secrets: current and previous
  2. Accept tokens signed with either key
  3. Re-sign tokens with current secret on validation
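
The three steps can be sketched with the standard-library hmac module. The token layout (`payload.signature`) and the function names are assumptions for illustration and may not match how config.py or the session layer actually signs tokens.

```python
import hashlib
import hmac
from typing import Optional, Tuple


def _mac(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()


def sign(payload: str, secret: bytes) -> str:
    """Return 'payload.signature' (hypothetical token layout)."""
    return f"{payload}.{_mac(payload, secret)}"


def verify_and_refresh(
    token: str, current: bytes, previous: Optional[bytes] = None
) -> Optional[Tuple[str, str]]:
    """Return (payload, token signed with `current`), or None if invalid."""
    payload, sep, mac = token.rpartition(".")
    if not sep:
        return None
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    if hmac.compare_digest(mac.encode(), _mac(payload, current).encode()):
        return payload, token
    if previous is not None and hmac.compare_digest(
        mac.encode(), _mac(payload, previous).encode()
    ):
        # Signed with the old secret: accept it, but hand back a re-signed token.
        return payload, sign(payload, current)
    return None
```

Dropping `previous` from the configuration once the rotation window has passed is what finally invalidates tokens signed with the old secret.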

Possible traps and issues

  • Rotation strategy must be documented
  • Metrics needed to track secret usage

Docs changes needed

  • Update Docs/Backend-Development.md § Session Management

Doc references

  • Docs/Backend-Development.md

[MEDIUM] No CORS configuration

Where found

  • backend/app/main.py — no CORS middleware added

Why this is needed

If the frontend is served from a different origin, browsers block its cross-origin requests unless the backend sends the right CORS headers.

Goal

Add CORS middleware with proper origin whitelisting.

What to do

  1. Add CORS middleware with specific origin whitelist
  2. Make configurable via environment variable
  3. Default to localhost for development
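
A small sketch of steps 2 and 3: read the whitelist from an environment variable and fall back to a localhost origin. The variable name `BANGUI_CORS_ORIGINS` and the default dev origin are assumptions, not existing settings.

```python
import os
from typing import List, Mapping

# Assumed development default; adjust to the actual frontend dev server.
DEFAULT_DEV_ORIGINS = ["http://localhost:5173"]


def load_cors_origins(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Parse a comma-separated origin whitelist from the environment."""
    raw = environ.get("BANGUI_CORS_ORIGINS", "")  # hypothetical variable name
    origins = [o.strip().rstrip("/") for o in raw.split(",") if o.strip()]
    if not origins:
        return list(DEFAULT_DEV_ORIGINS)
    if "*" in origins:
        # A wildcard cannot be combined with credentialed requests.
        raise ValueError("wildcard origin is not allowed; list origins explicitly")
    return origins
```

If the backend is FastAPI or Starlette (an assumption; the source names only backend/app/main.py), the returned list could be passed as `allow_origins` to `CORSMiddleware`, together with `allow_credentials=True`.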

Possible traps and issues

  • allow_origins=["*"] defeats CORS security
  • Credentials require specific origins, not wildcard
  • Missing config fails only in the browser: blocked requests show up in the browser console, not as backend errors

Docs changes needed

  • Update Docs/Deployment.md § CORS Configuration

Doc references

  • Docs/Deployment.md

[MEDIUM] Input validation missing for regex patterns (ReDoS)

Where found

  • backend/app/routers/config.py — regex validation accepts arbitrary patterns without timeout

Why this is needed

A malicious regex can cause catastrophic backtracking (ReDoS). Attacker sends pattern → evaluating it (usually matching, rarely compilation) hangs a worker → DoS.

Goal

Add timeout and complexity limits to regex validation.

What to do

  1. Add timeout to regex compilation and trial matching (2 seconds recommended)
  2. Add length limit (reject patterns > 1000 characters)
  3. Use signal.alarm() (Unix) or timeout library
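
One possible shape for these checks, shown as a sketch rather than the project's implementation. Instead of signal.alarm(), it runs a trial match in a child interpreter and relies on the `timeout` argument of subprocess.run, which also works outside Unix and outside the main thread, at the cost of a process start per validation. The probe string and function name are assumptions, and a single probe cannot detect every pathological pattern.

```python
import re
import subprocess
import sys
from typing import Optional

MAX_PATTERN_LENGTH = 1000
# Hypothetical probe input: a long run followed by a character that forces failure.
PROBE = "a" * 64 + "!"
_CHILD = "import re, sys; re.compile(sys.argv[1]).search(sys.argv[2])"


def validate_regex(pattern: str, timeout_s: float = 2.0) -> Optional[str]:
    """Return None if the pattern is accepted, otherwise an error message."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        return "pattern too long"
    try:
        re.compile(pattern)  # cheap syntax check before spawning a process
    except re.error as exc:
        return f"invalid pattern: {exc}"
    try:
        # The child is killed if the trial match exceeds the timeout.
        subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, PROBE],
            timeout=timeout_s,
            check=True,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        return "pattern timed out"
    except (subprocess.CalledProcessError, ValueError):
        return "pattern could not be evaluated"
    return None
```

A pattern such as `(a+)+$` is rejected by the timeout, while an ordinary IP-address pattern passes.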

Possible traps and issues

  • signal.alarm() is Unix-only and only works in the main thread
  • Some valid complex regexes may time out
  • Frontend should also validate (defense in depth)

Docs changes needed

  • Update API docs to document regex validation limits

Doc references

  • backend/app/routers/config.py

[MEDIUM] No structured logging to external system

Where found

  • Logs only go to stdout/file, no external aggregation

Why this is needed

Logs can't be searched across instances, and historical logs are lost when an instance is recycled.

Goal

Ship logs to centralized logging platform.

What to do

  1. Short-term: Ensure structlog JSON output is valid (already done)
  2. Long-term: Ship to logging platform (ELK, Datadog, Papertrail)

Possible traps and issues

  • External logging adds latency
  • Sensitive data must not be logged
  • Log volume can be massive
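
To illustrate the "sensitive data must not be logged" trap: structlog processors are callables that take `(logger, method_name, event_dict)` and return the event dict, so a redaction step can be an ordinary function. The key names below are assumptions and should be replaced with whatever the backend actually logs.

```python
# Hypothetical set of field names that must never reach the log platform.
SENSITIVE_KEYS = frozenset({"password", "session_secret", "token", "authorization"})


def redact_sensitive(logger, method_name, event_dict):
    """structlog-style processor: mask values of sensitive keys in place."""
    for key in event_dict:
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = "[REDACTED]"
    return event_dict
```

Placed in the processor chain before the JSON renderer, it would mask these fields before any log line leaves the process.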

Docs changes needed

  • Add Docs/Observability.md section on logging

Doc references

  • Docs/Observability.md (new)

[MEDIUM] No Application Performance Monitoring (APM)

Where found

  • Backend: no metrics collection, latency tracking
  • Frontend: no error tracking, performance metrics
  • No observability into request performance

Why this is needed

Without metrics, the team is blind in production: whether the API is slow and which endpoints fail most often are both unknown.

Goal

Add comprehensive metrics collection and monitoring.

What to do

  1. Backend metrics:

    • Add Prometheus metrics: request count, latency, active requests
    • Expose /metrics endpoint
  2. Frontend metrics:

    • Page load time, FCP, LCP using web-vitals
    • API error rates and latencies
  3. Aggregation:

    • Prometheus + Grafana, or Datadog/NewRelic
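
A dependency-free sketch of the backend part, to show what a /metrics endpoint would expose. A real deployment would more likely use an existing Prometheus client library; the class, metric name, and route below are assumptions for illustration. Labelling by route template rather than raw path keeps the number of label combinations bounded.

```python
from collections import defaultdict


class RequestMetrics:
    """In-memory request metrics rendered in Prometheus text format."""

    def __init__(self):
        self.count = defaultdict(int)
        self.seconds = defaultdict(float)

    def observe(self, method, route, status, seconds):
        # `route` should be the template ("/api/bans/{id}"), never the raw path.
        key = (method, route, str(status))
        self.count[key] += 1
        self.seconds[key] += seconds

    def render(self):
        """Return the body a /metrics endpoint would serve."""
        name = "http_request_duration_seconds"
        lines = [f"# TYPE {name} summary"]
        for key in sorted(self.count):
            labels = 'method="{}",route="{}",status="{}"'.format(*key)
            lines.append(f"{name}_count{{{labels}}} {self.count[key]}")
            lines.append(f"{name}_sum{{{labels}}} {self.seconds[key]:.6f}")
        return "\n".join(lines) + "\n"
```

A request middleware would call `observe()` once per response with the elapsed time, and the /metrics handler would return `render()` as plain text.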

Possible traps and issues

  • Metrics collection has performance cost
  • Cardinality explosion with tags
  • PII in metrics

Docs changes needed

  • Add Docs/Observability.md

Doc references

  • Docs/Observability.md (new)

[LOW] Frontend charts not memoized

Where found

  • frontend/src/components/TopCountriesPieChart.tsx
  • frontend/src/components/TopCountriesBarChart.tsx

Why this is needed

Charts re-render on every parent update, so Recharts reprocesses 5000+ points each time even when the data has not changed.

Goal

Memoize chart components.

What to do

  1. Wrap with React.memo with custom comparison
  2. Ensure data objects are stable

Possible traps and issues

  • Shallow comparison might not be enough
  • Memoization has memory cost

Docs changes needed

  • No documentation changes

Doc references

  • frontend/src/components/TopCountriesChart.tsx

[LOW] No request deduplication on frontend

Where found

  • frontend/src/hooks/useFetchData.ts — each call launches new request
  • User clicks "Refresh" twice → two identical requests

Why this is needed

Duplicate requests waste bandwidth and can race: if response 2 arrives first, response 1 then overwrites it with stale data.

Goal

Deduplicate identical in-flight requests.

What to do

  1. Implement request cache
  2. Clear cache entry when the response is received or the request fails
  3. Use in useFetchData

Possible traps and issues

  • Cache must be cleared on data mutation
  • Stale data in cache possible if not careful

Docs changes needed

  • No documentation changes

Doc references

  • frontend/src/hooks/useFetchData.ts