Commit Graph

4 Commits

Author SHA1 Message Date
0133489920 Update observability docs and task utilities
- Add Observability.md documentation
- Standardize task logging with correlation_id support
- Add log_sanitizer utility for PII masking
- Update Tasks.md tracking
- Update geo_cache tasks and other task modules with correlation_id

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 11:52:09 +02:00
52f237d5d4 Make background tasks idempotent - prevent duplicate bans on retry
CRITICAL FIX: Background tasks (especially blocklist_import) crashed mid-execution,
leaving partial state. On retry, the same bans were applied again, causing duplicates.

Solution: Content-hash based operation tracking for blocklist imports:
- Added import_runs table (migration 6) to track operations by source + content hash
- Before banning, check if this exact content has already been imported
- If completed: skip banning (already done), optionally re-warm cache
- If new or failed: proceed with ban and mark as completed or failed
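
The check-then-ban flow above can be sketched as follows. This is a minimal illustration, not the actual workflow code: the column names, the status values, and the helpers open_db, content_hash, and import_source are all assumptions, and sqlite3 stands in for whatever database the project uses.

```python
import hashlib
import sqlite3

# Hypothetical schema; the real import_runs table from migration 6 may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS import_runs (
    source TEXT NOT NULL,
    content_hash TEXT NOT NULL,
    status TEXT NOT NULL,
    PRIMARY KEY (source, content_hash)
)
"""


def open_db(path=":memory:"):
    """Open a database and make sure the tracking table exists."""
    db = sqlite3.connect(path)
    db.execute(SCHEMA)
    return db


def normalize(lines):
    """Strip blanks and duplicates so equivalent lists compare equal."""
    return sorted({line.strip() for line in lines if line.strip()})


def content_hash(lines):
    """SHA-256 over the normalized entries, used as the operation key."""
    return hashlib.sha256("\n".join(normalize(lines)).encode("utf-8")).hexdigest()


def set_status(db, source, digest, status):
    """Insert or overwrite the run record for this source and content."""
    db.execute(
        "INSERT OR REPLACE INTO import_runs (source, content_hash, status) "
        "VALUES (?, ?, ?)",
        (source, digest, status),
    )
    db.commit()


def import_source(db, source, lines, ban):
    """Apply bans once per (source, content hash); return the outcome."""
    digest = content_hash(lines)
    row = db.execute(
        "SELECT status FROM import_runs WHERE source = ? AND content_hash = ?",
        (source, digest),
    ).fetchone()
    if row is not None and row[0] == "completed":
        return "skipped"  # this exact content was already imported
    set_status(db, source, digest, "running")
    try:
        for entry in normalize(lines):
            ban(entry)
    except Exception:
        set_status(db, source, digest, "failed")  # a retry will run again
        raise
    set_status(db, source, digest, "completed")
    return "completed"
```

In this sketch a failed run is retried from the start, so the ban callable is assumed to tolerate being called again for entries that were applied before the crash.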

Changes:
- Database: Migration 6 adds import_runs table with operation state tracking
- Model: Added ImportRunEntry for import run records
- Repository: New import_run_repo module with CRUD operations
- Workflow: Updated blocklist_import_workflow to check operation history before banning
- Dependencies: Registered import_run_repo for dependency injection
- Tests: Added test_import_source_idempotent_on_retry and test_import_source_different_content_not_reused
- Documentation: Added Task Idempotency section to Backend-Development.md

Verification:
- All 7 import tests pass (5 existing + 2 new idempotency tests)
- Type checking: mypy --strict passes
- Linting: ruff passes
- No API changes, backwards compatible via automatic migration

Fixes: Background tasks not idempotent #CRITICAL

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 21:54:14 +02:00
3bd9848a08 Implement global rate limiter and refactor auth middleware
- Add global rate limiter utility with configurable limits and cleanup
- Move rate limiting logic to middleware for consistent application
- Update auth routes to use new rate limiter
- Add comprehensive tests for rate limiter functionality
- Update documentation with backend development guidelines and tasks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 21:26:31 +02:00
c2dd9f5f55 Add scheduled cleanup for rate limiter (#32)
Implement periodic cleanup of expired rate-limiter entries to prevent
unbounded memory growth during long runtimes.

Changes:
- Create rate_limiter_cleanup task that calls cleanup_expired() every 30 minutes
- Register the task in the startup DAG alongside other background jobs
- Update rate_limiter module documentation with operational notes about the
  cleanup lifecycle and memory management strategy

The cleanup is conservative and only removes IPs with no recent attempts
(all timestamps outside the rate-limit window), so active IPs are preserved.
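
The conservative cleanup rule described above might look like the sketch below. Only the method name cleanup_expired() comes from the commit message; the RateLimiter class, its record() method, and the internal per-IP timestamp dictionary are assumptions made for illustration.

```python
from __future__ import annotations

import time


class RateLimiter:
    """Hypothetical in-memory limiter keyed by client IP."""

    def __init__(self, max_attempts: int, window_seconds: float) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts: dict[str, list[float]] = {}

    def record(self, ip: str, now: float | None = None) -> bool:
        """Record an attempt; return True while the IP is within its limit."""
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_seconds
        recent = [t for t in self._attempts.get(ip, []) if t > cutoff]
        recent.append(now)
        self._attempts[ip] = recent
        return len(recent) <= self.max_attempts

    def cleanup_expired(self, now: float | None = None) -> int:
        """Remove IPs whose timestamps all lie outside the window.

        An IP with at least one timestamp inside the window is left
        untouched, so active clients keep their full history.
        Returns the number of IPs removed.
        """
        now = time.monotonic() if now is None else now
        cutoff = now - self.window_seconds
        stale = [
            ip
            for ip, stamps in self._attempts.items()
            if all(t <= cutoff for t in stamps)
        ]
        for ip in stale:
            del self._attempts[ip]
        return len(stale)
```

A scheduled task would then call cleanup_expired() on the shared limiter instance at the chosen interval (30 minutes in this commit); passing now explicitly is only a convenience for testing.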

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-29 19:28:45 +02:00