Make background tasks idempotent - prevent duplicate bans on retry
CRITICAL FIX: Background tasks (especially blocklist_import) crashed mid-execution, leaving partial state. On retry, the same bans were applied again, causing duplicates. Solution: Content-hash based operation tracking for blocklist imports: - Added import_runs table (migration 6) to track operations by source + content hash - Before banning, check if this exact content has already been imported - If completed: skip banning (already done), optionally re-warm cache - If new or failed: proceed with ban and mark as completed or failed Changes: - Database: Migration 6 adds import_runs table with operation state tracking - Model: Added ImportRunEntry for import run records - Repository: New import_run_repo module with CRUD operations - Workflow: Updated blocklist_import_workflow to check operation history before banning - Dependencies: Registered import_run_repo for dependency injection - Tests: Added test_import_source_idempotent_on_retry and test_import_source_different_content_not_reused - Documentation: Added Task Idempotency section to Backend-Development.md Verification: - All 7 import tests pass (5 existing + 2 new idempotency tests) - Type checking: mypy --strict ✅ - Linting: ruff ✅ - No API changes, backwards compatible via automatic migration Fixes: Background tasks not idempotent #CRITICAL Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
62
backend/app/tasks/timeout_utils.py
Normal file
62
backend/app/tasks/timeout_utils.py
Normal file
@@ -0,0 +1,62 @@
|
||||
"""Timeout protection utilities for background tasks.
|
||||
|
||||
Provides helpers to wrap async task functions with asyncio.wait_for() timeout
|
||||
protection. Ensures tasks complete within bounded time or fail gracefully with
|
||||
proper logging and error handling.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import time
|
||||
from collections.abc import Awaitable
|
||||
from typing import TypeVar
|
||||
|
||||
import structlog
|
||||
|
||||
log: structlog.stdlib.BoundLogger = structlog.get_logger()
|
||||
|
||||
T = TypeVar("T")
|
||||
|
||||
|
||||
async def run_with_timeout(
|
||||
task_name: str,
|
||||
coro: Awaitable[T],
|
||||
timeout_seconds: int,
|
||||
) -> T:
|
||||
"""Run an async coroutine with timeout protection.
|
||||
|
||||
Args:
|
||||
task_name: Human-readable name of the task for logging.
|
||||
coro: The coroutine to execute.
|
||||
timeout_seconds: Maximum seconds to wait before timeout.
|
||||
|
||||
Raises:
|
||||
asyncio.TimeoutError: If the task exceeds the timeout.
|
||||
|
||||
Returns:
|
||||
The return value of the coroutine.
|
||||
"""
|
||||
start_time = time.monotonic()
|
||||
try:
|
||||
result: T = await asyncio.wait_for(coro, timeout=timeout_seconds)
|
||||
elapsed = time.monotonic() - start_time
|
||||
if elapsed > timeout_seconds * 0.8:
|
||||
log.warning(
|
||||
"task_approaching_timeout",
|
||||
task_name=task_name,
|
||||
timeout_seconds=timeout_seconds,
|
||||
elapsed_seconds=round(elapsed, 2),
|
||||
usage_percent=round((elapsed / timeout_seconds) * 100, 1),
|
||||
)
|
||||
return result
|
||||
except TimeoutError:
|
||||
elapsed = time.monotonic() - start_time
|
||||
log.warning(
|
||||
"task_timeout",
|
||||
task_name=task_name,
|
||||
timeout_seconds=timeout_seconds,
|
||||
elapsed_seconds=round(elapsed, 2),
|
||||
)
|
||||
raise
|
||||
|
||||
Reference in New Issue
Block a user