Commit Graph

20 Commits

Author SHA1 Message Date
52f237d5d4 Make background tasks idempotent - prevent duplicate bans on retry
CRITICAL FIX: Background tasks (especially blocklist_import) crashed mid-execution,
leaving partial state. On retry, the same bans were applied again, causing duplicates.

Solution: Content-hash based operation tracking for blocklist imports:
- Added import_runs table (migration 6) to track operations by source + content hash
- Before banning, check if this exact content has already been imported
- If completed: skip banning (already done), optionally re-warm cache
- If new or failed: proceed with ban and mark as completed or failed

Changes:
- Database: Migration 6 adds import_runs table with operation state tracking
- Model: Added ImportRunEntry for import run records
- Repository: New import_run_repo module with CRUD operations
- Workflow: Updated blocklist_import_workflow to check operation history before banning
- Dependencies: Registered import_run_repo for dependency injection
- Tests: Added test_import_source_idempotent_on_retry and test_import_source_different_content_not_reused
- Documentation: Added Task Idempotency section to Backend-Development.md

Verification:
- All 7 import tests pass (5 existing + 2 new idempotency tests)
- Type checking: mypy --strict 
- Linting: ruff 
- No API changes, backwards compatible via automatic migration

Fixes: Background tasks not idempotent #CRITICAL

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-30 21:54:14 +02:00
18036d53bf Fix issue #31: Make schedule reschedule deterministic and observable
Replace fire-and-forget reschedule pattern with proper async/await:
- Changed reschedule() from fire-and-forget to awaitable async function
- Errors are now properly propagated instead of silently failing
- Added structured logging for reschedule start and completion
- Schedule updates are now deterministic and observable to callers

Changes:
- app/tasks/blocklist_import.py: Convert reschedule to async, remove asyncio.ensure_future
- tests/test_tasks/test_blocklist_import.py: Add tests for error propagation and logging
- Docs/Features.md: Document scheduling reliability guarantees

All 15 blocklist_import tests pass with 100% coverage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-29 19:24:55 +02:00
e2560f5db0 TASK-032: Implement geo_cache retention policy and cleanup
Add automatic cleanup of stale geolocation cache entries to prevent
unbounded database growth. Resolves the issue where unique IP addresses
accumulated indefinitely in the geo_cache table, degrading query performance.

## Changes

### Database Schema (Migration 3)
- Add 'last_seen' column to geo_cache table tracking last reference time
- Existing entries default to current timestamp

### Repository Layer (geo_cache_repo.py)
- Update upsert_entry() to set/refresh last_seen on insert/update
- Update upsert_neg_entry() to set/refresh last_seen on negative cache hits
- Update bulk_upsert_entries() to set/refresh last_seen in batch operations
- Add delete_stale_entries(db, cutoff_iso) -> int for purging old entries

### Background Task (geo_cache_cleanup.py)
- New APScheduler task that runs nightly (24-hour interval)
- Calculates cutoff as 90 days ago from current time (UTC)
- Deletes all entries with last_seen older than cutoff
- Logs operation results (info when deleted > 0, debug when 0 deleted)
- Configurable retention period via GEO_CACHE_RETENTION_DAYS constant

### Application Startup (startup.py)
- Register geo_cache_cleanup task in scheduler during app startup
- Placed after geo_cache_flush in task registration order

### Tests
- Add delete_stale_entries test cases covering:
  * Removal of old entries beyond cutoff
  * No deletion when all entries are recent
  * Empty table edge case
- Update existing test fixtures to include last_seen column
- Add full test suite for cleanup task registration and execution

### Documentation
- Architekture.md: Document cleanup task, update schema/diagram
- Backend-Development.md: Add retention policy documentation

## Behavior

When an IP is accessed, its last_seen is refreshed. After 90 days of no
access, an IP is purged by the nightly cleanup. On next encounter, the IP
is re-resolved from MaxMind MMDB or ip-api.com (if configured).

This is acceptable because:
1. Stale geolocation data may become inaccurate over time
2. Re-resolution cost is minimal compared to unbounded storage growth
3. Active IPs maintain fresh data through their last_seen updates

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 19:24:34 +02:00
a5b55d1248 Add session cleanup task and update documentation
- Implement session_cleanup task for removing expired sessions
- Add comprehensive tests for session cleanup functionality
- Update architecture and task documentation
- Integrate cleanup task into application startup

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-26 12:49:13 +02:00
d467190eb1 Refactor geo caching and service layer tests
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-25 18:15:31 +02:00
5e5d7c34b2 Document task DB access and unify background task DB handling 2026-04-17 17:18:49 +02:00
7a1cb0c46c Extract health-check crash-detection logic into runtime state helper 2026-04-17 16:58:24 +02:00
73cc212e28 Invert blocklist scheduler dependency to task callback 2026-04-15 21:31:08 +02:00
58bb769a35 Refactor history sync into history_service and update docs/tests 2026-04-14 15:09:58 +02:00
ae81a8f5be Refactor periodic tasks to use injected scheduler resources 2026-04-11 20:32:36 +02:00
9cba5a9fcb Refactor blocklist import registration to async startup flow 2026-04-11 20:07:00 +02:00
e21f153946 Remove dead task DB fast-path and update task tests 2026-04-07 20:00:13 +02:00
ffaa14f864 Switch dashboard/map/history views to archive source for long-term data
Update fail2ban dbpurgeage to 648000 and history sync backfill/pagination for archive-based 7.5 day history.
2026-04-05 20:21:54 +02:00
876af46955 history archive router precedence + endpoint/source tests + history sync register test + task status update 2026-03-24 21:06:58 +01:00
a442836c5c refactor: complete Task 2/3 geo decouple + exceptions centralization; mark as done 2026-03-22 14:24:25 +01:00
1c0bac1353 refactor: improve backend type safety and import organization
- Add TYPE_CHECKING guards for runtime-expensive imports (aiohttp, aiosqlite)
- Reorganize imports to follow PEP 8 conventions
- Convert TypeAlias to modern PEP 695 type syntax (where appropriate)
- Use Sequence/Mapping from collections.abc for type hints (covariant)
- Replace string literals with cast() for improved type inference
- Fix casting of Fail2BanResponse and TypedDict patterns
- Add IpLookupResult TypedDict for precise return type annotation
- Reformat overlong lines for readability (120 char limit)
- Add asyncio_mode and filterwarnings to pytest config
- Update test fixtures with improved type hints

This improves mypy type checking and makes type relationships explicit.
2026-03-22 14:24:24 +01:00
bdcdd5d672 Fix geo_re_resolve async mocks and mark tasks complete 2026-03-22 14:24:24 +01:00
0966f347c4 feat: Task 3 — invalid jail config recovery (pre-validation, crash detection, rollback)
- Backend: extend activate_jail() with pre-validation and 4-attempt post-reload
  health probe; add validate_jail_config() and rollback_jail() service functions
- Backend: new endpoints POST /api/config/jails/{name}/validate,
  GET /api/config/pending-recovery, POST /api/config/jails/{name}/rollback
- Backend: extend JailActivationResponse with fail2ban_running + validation_warnings;
  add JailValidationIssue, JailValidationResult, PendingRecovery, RollbackResponse models
- Backend: health_check task tracks last_activation and creates PendingRecovery
  record when fail2ban goes offline within 60 s of an activation
- Backend: add fail2ban_start_command setting (configurable start cmd for rollback)
- Frontend: ActivateJailDialog — pre-validation on open, crash-detected callback,
  extended spinner text during activation+verify
- Frontend: JailsTab — Validate Config button for inactive jails, validation
  result panels (blocking errors + advisory warnings)
- Frontend: RecoveryBanner component — polls pending-recovery, shows full-width
  alert with Disable & Restart / View Logs buttons
- Frontend: MainLayout — mount RecoveryBanner at layout level
- Tests: 19 new backend service tests (validate, rollback, filter/action parsing)
  + 6 health_check crash-detection tests + 11 router tests; 5 RecoveryBanner
  frontend tests; fix mock setup in existing activate_jail tests
2026-03-14 14:13:07 +01:00
d6da81131f Add tests for background tasks and fail2ban client utility
- tests/test_tasks/test_blocklist_import.py: 14 tests, 96% coverage
- tests/test_tasks/test_health_check.py: 12 tests, 100% coverage
- tests/test_tasks/test_geo_cache_flush.py: 8 tests, 100% coverage
- tests/test_services/test_fail2ban_client.py: 24 new tests, 96% coverage

Total: 50 new tests (628 → 678 passing). Overall coverage 85% → 87%.
ruff, mypy --strict, tsc, and eslint all clean.
2026-03-13 10:29:22 +01:00
ea35695221 Add better jail configuration: file CRUD, enable/disable, log paths
Task 4 (Better Jail Configuration) implementation:
- Add fail2ban_config_dir setting to app/config.py
- New file_config_service: list/view/edit/create jail.d, filter.d, action.d files
  with path-traversal prevention and 512 KB content size limit
- New file_config router: GET/PUT/POST endpoints for jail files, filter files,
  and action files; PUT .../enabled for toggle on/off
- Extend config_service with delete_log_path() and add_log_path()
- Add DELETE /api/config/jails/{name}/logpath and POST /api/config/jails/{name}/logpath
- Extend geo router with re-resolve endpoint; add geo_re_resolve background task
- Update blocklist_service with revised scheduling helpers
- Update Docker compose files with BANGUI_FAIL2BAN_CONFIG_DIR env var and
  rw volume mount for the fail2ban config directory
- Frontend: new Jail Files, Filters, Actions tabs in ConfigPage; file editor
  with accordion-per-file, editable textarea, save/create; add/delete log paths
- Frontend: types in types/config.ts; API calls in api/config.ts and api/endpoints.ts
- 63 new backend tests (test_file_config_service, test_file_config, test_geo_re_resolve)
- 6 new frontend tests in ConfigPageLogPath.test.tsx
- ruff, mypy --strict, tsc --noEmit, eslint: all clean; 617 backend tests pass
2026-03-12 20:08:33 +01:00