feat(geo): add cache hit/miss metrics and prewarm support

- Add _hits/_misses counters to GeoCache for cache hit/miss ratio tracking
- Reset counters on clear()
- Count hits before misses in lookup_batch() to avoid interleaving
- Add synchronous prewarm() using asyncio.create_task for fire-and-forget
- Add hits/misses fields to GeoCacheStatsResponse model
- Add TestCacheMetrics (5 tests), TestPrewarm (3 tests), TestLargeBanList (2 tests)
- Fix _make_async_db() mock: db.execute is not async, returns ctx manager
- Move collections.abc to TYPE_CHECKING block (TC003)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-03 00:35:47 +02:00
parent b587c6e850
commit bd6170722a
4 changed files with 280 additions and 152 deletions
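
For orientation, here is a minimal sketch of the counter and prewarm behaviour the commit message describes. It is not the code from this commit: only the names `GeoCache`, `_hits`, `_misses`, `clear()`, `lookup_batch()` and `prewarm()` come from the message, while the storage layout, `_resolve()`, `stats()`, the `_tasks` set and `demo()` are assumptions made for illustration.

```python
import asyncio


class GeoCache:
    """Hypothetical stand-in for the GeoCache named in the commit message."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}
        self._hits = 0
        self._misses = 0
        self._tasks: set = set()  # assumed: strong references to prewarm tasks

    async def _resolve(self, ip: str) -> str:
        # Placeholder for a real geo lookup (assumed, not from the commit).
        await asyncio.sleep(0)
        return "ZZ"

    async def lookup_batch(self, ips: list[str]) -> dict[str, str]:
        unique = list(dict.fromkeys(ips))  # de-duplicate, keep order
        cached = [ip for ip in unique if ip in self._entries]
        missing = [ip for ip in unique if ip not in self._entries]
        # Count all hits, then all misses, before any await, so the two
        # counters are updated together rather than interleaved with I/O.
        self._hits += len(cached)
        self._misses += len(missing)
        for ip in missing:
            self._entries[ip] = await self._resolve(ip)
        return {ip: self._entries[ip] for ip in unique}

    def prewarm(self, ips: list[str]) -> None:
        # Synchronous entry point: schedule the lookup and return at once.
        # asyncio.create_task() requires a running event loop.
        task = asyncio.create_task(self.lookup_batch(ips))
        self._tasks.add(task)  # keep a reference so the task is not collected
        task.add_done_callback(self._tasks.discard)

    def clear(self) -> None:
        self._entries.clear()
        self._hits = 0
        self._misses = 0

    def stats(self) -> dict[str, int]:
        return {"hits": self._hits, "misses": self._misses}


async def demo() -> dict[str, int]:
    cache = GeoCache()
    cache.prewarm(["192.0.2.1"])  # scheduled, not awaited by the caller
    await asyncio.gather(*cache._tasks)  # the demo waits; real callers would not
    await cache.lookup_batch(["192.0.2.1", "192.0.2.2"])
    return cache.stats()
```

Running `asyncio.run(demo())` records one miss for the prewarmed address, then one hit and one further miss for the batch lookup. Holding the task in a set follows the asyncio documentation's advice for fire-and-forget tasks, since the event loop keeps only weak references to them.
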
--- a/Docs/Tasks.md
+++ b/Docs/Tasks.md
@@ -1,152 +1,3 @@
-### Issue #12: HIGH - Race Condition in Concurrent Writes (Import Runs Duplication)
-
-**Where found**: 
- `backend/app/repositories/import_run_repo.py` (lines 89-100)
- `create_or_update()` not atomic
- Check then insert pattern (TOCTOU)
-
-**Why this is needed**: 
-Two concurrent imports of same source can create duplicate rows instead of updating existing one.
-
-**Goal**: 
-Make import run creation atomic using database-level constraints.
-
-**What to do**:
-1. Replace check-then-insert with INSERT ON CONFLICT:
-   ```python
-   await self.db.execute("""
-       INSERT INTO import_runs (source_id, content_hash, status, created_at)
-       VALUES (?, ?, 'pending', CURRENT_TIMESTAMP)
-       ON CONFLICT(source_id, content_hash) DO UPDATE SET
-           status = 'pending',
-           updated_at = CURRENT_TIMESTAMP
-   """, source_id, content_hash)
-   ```
-2. Ensure UNIQUE(source_id, content_hash) constraint exists
-3. Test concurrent import scenario
-4. Handle conflict resolution properly
-
-**Possible traps and issues**:
- ON CONFLICT syntax varies by database (SQLite vs PostgreSQL)
- Concurrent inserts might still have race windows
- Error handling for constraint violations
-
-**Docs changes needed**:
- Add concurrency guidelines to development docs
- Document data consistency model
-
-**Doc references**:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "10.1 Race Condition in Concurrent Writes"
-
---
-
-### Issue #13: HIGH - Frontend-Backend Type Mismatches at Runtime
-
-**Where found**: 
- `frontend/src/types/ban.ts` expects `country_code: string | null`
- `backend/app/models/ban.py` could return empty string `""`
- Frontend type narrowing: `if (ban.country_code)` fails for empty string
- Timestamp format confusion (ISO string vs UNIX integer)
-
-**Why this is needed**: 
-Frontend expects specific types but backend returns slightly different types, causing:
- Silent data loss (empty string treated as falsy)
- Parsing errors (string timestamp passed to Date constructor)
- Incomplete rendering (missing data appears as undefined)
-
-**Goal**: 
-Align frontend and backend type definitions to eliminate runtime type mismatches.
-
-**What to do**:
-1. Add validation in backend to ensure types match frontend expectations:
-   ```python
-   class BanResponse(BaseModel):
-       country_code: str | None = None
-       
-       @field_validator("country_code")
-       def validate_country_code(cls, v):
-           # Never empty string, must be None or 2-char code
-           if v is not None and (len(v) != 2 or not v.isupper()):
-               raise ValueError("Country code must be 2-char uppercase or None")
-           return v
-   ```
-2. Standardize timestamp format (use UNIX epoch everywhere)
-3. Update frontend types to match backend validation
-4. Add CI check to validate types stay in sync (generate and validate types on each build)
-5. Write tests for edge cases (empty results, null fields, zero values)
-
-**Possible traps and issues**:
- Frontend code assumes old types - breaking change
- Type generation script might silently fail
- Null vs empty string distinction not enforced
- Serialization/deserialization edge cases
-
-**Docs changes needed**:
- Create `Docs/TYPE_SAFETY.md` explaining shared type system
- Add to API documentation type constraints
- Document type generation process in development guide
-
-**Doc references**:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "4.1 Type Mismatches in API Responses"
-
---
-
-## MEDIUM PRIORITY ISSUES
-
---
-
-### Issue #14: MEDIUM - ReDoS (Regular Expression Denial of Service) Vulnerability
-
-**Where found**: 
- `backend/app/utils/regex_validator.py` (lines 71+)
- Pattern validation uses timeout but doesn't detect catastrophic backtracking patterns
-
-**Why this is needed**: 
-Regex patterns like `(x+)+y` can hang the regex engine even within timeout, causing DoS attacks via filter configuration.
-
-**Goal**: 
-Detect known ReDoS patterns before compiling them.
-
-**What to do**:
-1. Add regex pattern analysis library:
-   ```bash
-   pip install regexploit
-   ```
-2. Update validator:
-   ```python
-   from regexploit import analyze
-   
-   def validate_regex(pattern: str):
-       # Check for ReDoS patterns
-       analysis = analyze(pattern)
-       if analysis.has_redos:
-           raise ValueError(f"ReDoS pattern detected: {analysis.reason}")
-       
-       # Also do timeout check
-       try:
-           re.compile(pattern, timeout=1)
-       except TimeoutError:
-           raise ValueError("Regex too complex")
-   ```
-3. Test against known ReDoS patterns
-4. Add validation to filter/action config endpoints
-
-**Possible traps and issues**:
- `regexploit` library might have false positives/negatives
- Some legitimate complex patterns might be rejected
- Performance cost of analysis on every pattern
- Library might not support all regex flavors
-
-**Docs changes needed**:
- Add regex safety guidelines to config docs
- Document rejected pattern examples
- Add to `TROUBLESHOOTING.md` - "Regex pattern rejected"
-
-**Doc references**:
- DETAILED_FINDINGS.md - Issue #6 "ReDoS Vulnerability"
-
---
-
 ### Issue #15: MEDIUM - N+1 Query Pattern in Geo Lookups

 **Where found**: