CRITICAL FIX: Background tasks (especially blocklist_import) crashed mid-execution, leaving partial state. On retry, the same bans were applied again, causing duplicates.

Solution: Content-hash based operation tracking for blocklist imports:
- Added import_runs table (migration 6) to track operations by source + content hash
- Before banning, check if this exact content has already been imported
- If completed: skip banning (already done), optionally re-warm cache
- If new or failed: proceed with ban and mark as completed or failed

Changes:
- Database: Migration 6 adds import_runs table with operation state tracking
- Model: Added ImportRunEntry for import run records
- Repository: New import_run_repo module with CRUD operations
- Workflow: Updated blocklist_import_workflow to check operation history before banning
- Dependencies: Registered import_run_repo for dependency injection
- Tests: Added test_import_source_idempotent_on_retry and test_import_source_different_content_not_reused
- Documentation: Added Task Idempotency section to Backend-Development.md

Verification:
- All 7 import tests pass (5 existing + 2 new idempotency tests)
- Type checking: mypy --strict ✅
- Linting: ruff ✅
- No API changes, backwards compatible via automatic migration

Fixes: Background tasks not idempotent #CRITICAL

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
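The content-hash check described in the commit message can be illustrated with a minimal sketch. It is not the project's code: `InMemoryImportRuns` is a hypothetical in-memory stand-in for the import_runs table, `import_blocklist` is an invented name, and appending to a list stands in for the real ban call. SHA-256 is assumed as the hash; the commit message does not say which one is used.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class InMemoryImportRuns:
    """Hypothetical stand-in for the import_runs table, keyed by (source, content hash)."""

    _status: dict[tuple[str, str], str] = field(default_factory=dict)

    def get_status(self, source: str, content_hash: str) -> str | None:
        return self._status.get((source, content_hash))

    def mark(self, source: str, content_hash: str, status: str) -> None:
        self._status[(source, content_hash)] = status


def import_blocklist(
    runs: InMemoryImportRuns,
    source: str,
    content: str,
    banned: list[str],
) -> bool:
    """Apply bans for ``content`` unless this exact content was already imported.

    Returns True if bans were applied, False if the run was skipped.
    """
    # Assumption: SHA-256 over the raw content identifies "this exact content".
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()

    if runs.get_status(source, content_hash) == "completed":
        return False  # Already imported: skip banning.

    try:
        for address in content.split():
            banned.append(address)  # Stand-in for the real ban operation.
    except Exception:
        runs.mark(source, content_hash, "failed")
        raise

    runs.mark(source, content_hash, "completed")
    return True
```

Calling `import_blocklist` twice with the same source and content applies the bans once; changing the content produces a new hash, so the import runs again. A run recorded as failed is retried in full, so the underlying ban call would still need to tolerate repeats in that case.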
63 lines
1.7 KiB
Python
"""Timeout protection utilities for background tasks.

Provides helpers to wrap async task functions with asyncio.wait_for() timeout
protection. Ensures tasks complete within bounded time or fail gracefully with
proper logging and error handling.
"""

from __future__ import annotations

import asyncio
import time
from collections.abc import Awaitable
from typing import TypeVar

import structlog

log: structlog.stdlib.BoundLogger = structlog.get_logger()

T = TypeVar("T")


async def run_with_timeout(
    task_name: str,
    coro: Awaitable[T],
    timeout_seconds: int,
) -> T:
    """Run an async coroutine with timeout protection.

    Args:
        task_name: Human-readable name of the task for logging.
        coro: The coroutine to execute.
        timeout_seconds: Maximum seconds to wait before timeout.

    Raises:
        asyncio.TimeoutError: If the task exceeds the timeout.

    Returns:
        The return value of the coroutine.
    """
    start_time = time.monotonic()
    try:
        result: T = await asyncio.wait_for(coro, timeout=timeout_seconds)
        elapsed = time.monotonic() - start_time
        if elapsed > timeout_seconds * 0.8:
            log.warning(
                "task_approaching_timeout",
                task_name=task_name,
                timeout_seconds=timeout_seconds,
                elapsed_seconds=round(elapsed, 2),
                usage_percent=round((elapsed / timeout_seconds) * 100, 1),
            )
        return result
    except TimeoutError:
        elapsed = time.monotonic() - start_time
        log.warning(
            "task_timeout",
            task_name=task_name,
            timeout_seconds=timeout_seconds,
            elapsed_seconds=round(elapsed, 2),
        )
        raise
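A caller might use the helper roughly as follows. This is a sketch under stated assumptions, not code from the repository: the file's import path is not shown, so `run_with_timeout` is re-declared here as a simplified stand-in (stdlib logging instead of structlog, no near-timeout warning, float timeouts allowed), and `slow_task` and `run_guarded` are hypothetical names.

```python
import asyncio
import logging

log = logging.getLogger(__name__)


async def run_with_timeout(task_name, coro, timeout_seconds):
    """Simplified stand-in for the helper above, kept only so the sketch is self-contained."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_seconds)
    except asyncio.TimeoutError:
        log.warning("task_timeout task_name=%s timeout_seconds=%s", task_name, timeout_seconds)
        raise


async def slow_task(delay: float) -> str:
    """Hypothetical background task that only sleeps."""
    await asyncio.sleep(delay)
    return "done"


async def run_guarded(delay: float, timeout_seconds: float) -> str:
    """Run the task under a timeout and map a timeout to a fallback value."""
    try:
        return await run_with_timeout("slow_task", slow_task(delay), timeout_seconds)
    except asyncio.TimeoutError:
        # The helper logs and re-raises, so the caller decides how to recover.
        return "timed out"


if __name__ == "__main__":
    print(asyncio.run(run_guarded(0.01, 1)))
    print(asyncio.run(run_guarded(1, 0.05)))
```

Because the helper re-raises after logging, the decision to retry, skip, or surface the failure stays with the caller. `asyncio.wait_for` cancels the wrapped coroutine on timeout, so the task should be written to clean up when cancelled.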