feat(geo): add cache hit/miss metrics and prewarm support

- Add _hits/_misses counters to GeoCache for cache hit/miss ratio tracking
- Reset counters on clear()
- Count hits before misses in lookup_batch() to avoid interleaving
- Add synchronous prewarm() using asyncio.create_task for fire-and-forget
- Add hits/misses fields to GeoCacheStatsResponse model
- Add TestCacheMetrics (5 tests), TestPrewarm (3 tests), TestLargeBanList (2 tests)
- Fix _make_async_db() mock: db.execute is not async, returns ctx manager
- Move collections.abc to TYPE_CHECKING block (TC003)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
149
Docs/Tasks.md
@@ -1,152 +1,3 @@
### Issue #12: HIGH - Race Condition in Concurrent Writes (Import Runs Duplication)

**Where found**:
- `backend/app/repositories/import_run_repo.py` (lines 89-100)
- `create_or_update()` not atomic
- Check then insert pattern (TOCTOU)

**Why this is needed**:
Two concurrent imports of the same source can create duplicate rows instead of updating the existing one.

**Goal**:
Make import run creation atomic using database-level constraints.

**What to do**:
1. Replace check-then-insert with INSERT ON CONFLICT:
   ```python
   # Upsert: relies on the UNIQUE(source_id, content_hash) constraint from step 2.
   await self.db.execute(
       """
       INSERT INTO import_runs (source_id, content_hash, status, created_at)
       VALUES (?, ?, 'pending', CURRENT_TIMESTAMP)
       ON CONFLICT(source_id, content_hash) DO UPDATE SET
           status = 'pending',
           updated_at = CURRENT_TIMESTAMP
       """,
       (source_id, content_hash),  # `?` parameters are passed as one sequence
   )
   ```
2. Ensure UNIQUE(source_id, content_hash) constraint exists
3. Test concurrent import scenario
4. Handle conflict resolution properly

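The steps above can be tried out in isolation with the standard-library `sqlite3` module. This is only a minimal sketch: the table layout, the `create_or_update()` helper signature, and the sample values are assumptions made for illustration, not the project's actual schema or repository code. It assumes an SQLite build that supports upsert (3.24.0 or later).

```python
import sqlite3

# Hypothetical schema: only the columns named in the task, plus the
# UNIQUE constraint that ON CONFLICT depends on.
SCHEMA = """
CREATE TABLE import_runs (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL,
    content_hash TEXT NOT NULL,
    status TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT,
    UNIQUE (source_id, content_hash)
)
"""

UPSERT = """
INSERT INTO import_runs (source_id, content_hash, status, created_at)
VALUES (?, ?, 'pending', CURRENT_TIMESTAMP)
ON CONFLICT(source_id, content_hash) DO UPDATE SET
    status = 'pending',
    updated_at = CURRENT_TIMESTAMP
"""


def create_or_update(conn: sqlite3.Connection, source_id: int, content_hash: str) -> None:
    """Insert a run, or refresh the existing row with the same key, in one statement."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (source_id, content_hash))


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

create_or_update(conn, 1, "abc123")
create_or_update(conn, 1, "abc123")  # same key: updates instead of duplicating
create_or_update(conn, 1, "def456")  # different hash: new row

row_count = conn.execute("SELECT COUNT(*) FROM import_runs").fetchone()[0]
refreshed = conn.execute(
    "SELECT updated_at FROM import_runs WHERE content_hash = ?", ("abc123",)
).fetchone()[0]
```

Because the uniqueness check and the write happen inside a single statement, there is no window between "check" and "insert" for a second writer to slip into. Without the UNIQUE constraint, SQLite rejects the `ON CONFLICT(source_id, content_hash)` clause, which is why step 2 matters.
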
**Possible traps and issues**:
- ON CONFLICT syntax varies by database (SQLite vs PostgreSQL)
- Concurrent inserts might still have race windows
- Error handling for constraint violations

**Docs changes needed**:
- Add concurrency guidelines to development docs
- Document data consistency model

**Doc references**:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "10.1 Race Condition in Concurrent Writes"

---

### Issue #13: HIGH - Frontend-Backend Type Mismatches at Runtime

**Where found**:
- `frontend/src/types/ban.ts` expects `country_code: string | null`
- `backend/app/models/ban.py` could return empty string `""`
- Frontend type narrowing: `if (ban.country_code)` fails for empty string
- Timestamp format confusion (ISO string vs UNIX integer)

**Why this is needed**:
The frontend expects specific types but the backend returns slightly different ones, causing:
- Silent data loss (empty string treated as falsy)
- Parsing errors (string timestamp passed to Date constructor)
- Incomplete rendering (missing data appears as undefined)

**Goal**:
Align frontend and backend type definitions to eliminate runtime type mismatches.

**What to do**:
1. Add validation in backend to ensure types match frontend expectations:
   ```python
   from pydantic import BaseModel, field_validator

   class BanResponse(BaseModel):
       country_code: str | None = None

       @field_validator("country_code")
       @classmethod  # Pydantic v2 field validators are class methods
       def validate_country_code(cls, v: str | None) -> str | None:
           # Never empty string, must be None or 2-char code
           if v is not None and (len(v) != 2 or not v.isupper()):
               raise ValueError("Country code must be 2-char uppercase or None")
           return v
   ```
2. Standardize timestamp format (use UNIX epoch everywhere)
3. Update frontend types to match backend validation
4. Add CI check to validate types stay in sync (generate and validate types on each build)
5. Write tests for edge cases (empty results, null fields, zero values)

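As a rough illustration of steps 1 and 2, the normalization could happen before values reach the response model. The helper names `normalize_country_code` and `to_unix_epoch` are hypothetical, and treating naive datetimes as UTC is an assumption rather than something the task states.

```python
from __future__ import annotations

from datetime import datetime, timezone


def normalize_country_code(raw: str | None) -> str | None:
    """Map None, empty, and whitespace-only strings to None; pass other values through."""
    if raw is None:
        return None
    code = raw.strip()
    return code or None


def to_unix_epoch(value: int | float | str | datetime) -> int:
    """Convert an ISO 8601 string, a datetime, or a numeric epoch to whole seconds."""
    if isinstance(value, (int, float)):
        return int(value)
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        # Assumption: naive timestamps are stored in UTC.
        value = value.replace(tzinfo=timezone.utc)
    return int(value.timestamp())


empty_code = normalize_country_code("")
kept_code = normalize_country_code("DE")
epoch_from_iso = to_unix_epoch("2024-01-01T00:00:00+00:00")
epoch_from_int = to_unix_epoch(1704067200)
```

With this in place the frontend only ever sees `null` or a non-empty code, so a truthiness check such as `if (ban.country_code)` and the declared `string | null` type agree, and timestamps arrive as integers regardless of how they were stored.
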
**Possible traps and issues**:
- Frontend code assumes old types - breaking change
- Type generation script might silently fail
- Null vs empty string distinction not enforced
- Serialization/deserialization edge cases

**Docs changes needed**:
- Create `Docs/TYPE_SAFETY.md` explaining shared type system
- Add type constraints to API documentation
- Document type generation process in development guide

**Doc references**:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "4.1 Type Mismatches in API Responses"

---

## MEDIUM PRIORITY ISSUES

---

### Issue #14: MEDIUM - ReDoS (Regular Expression Denial of Service) Vulnerability

**Where found**:
- `backend/app/utils/regex_validator.py` (lines 71+)
- Pattern validation uses timeout but doesn't detect catastrophic backtracking patterns

**Why this is needed**:
Regex patterns like `(x+)+y` cause catastrophic backtracking and can hang the regex engine despite the timeout check, enabling DoS attacks via filter configuration.

**Goal**:
Detect known ReDoS patterns before compiling them.

**What to do**:
1. Add regex pattern analysis library:
   ```bash
   pip install regexploit
   ```
2. Update validator:
   ```python
   import re

   # NOTE: `analyze`, `has_redos`, and `reason` are the names used in this task;
   # verify them against regexploit's actual Python API before relying on them.
   from regexploit import analyze

   def validate_regex(pattern: str) -> None:
       # Check for ReDoS patterns
       analysis = analyze(pattern)
       if analysis.has_redos:
           raise ValueError(f"ReDoS pattern detected: {analysis.reason}")

       # Also reject patterns that do not compile. The stdlib re.compile()
       # accepts no `timeout` argument, so any time limit must be enforced
       # elsewhere (for example, by running matches in a worker process).
       try:
           re.compile(pattern)
       except re.error as exc:
           raise ValueError(f"Invalid regex: {exc}") from exc
   ```
3. Test against known ReDoS patterns
4. Add validation to filter/action config endpoints

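To make "detect known ReDoS patterns before compiling" concrete without depending on a third-party analyzer, the sketch below flags one well-known shape: an unbounded quantifier nested inside another unbounded quantifier, as in `(x+)+y`. It is a swapped-in heuristic, not regexploit's algorithm. It leans on CPython's private regex parser modules (`re._parser` on 3.11+, `sre_parse` before that), which are undocumented and may change, and the name `has_nested_quantifier` is hypothetical.

```python
try:  # Python 3.11+: the parser lives in private submodules of `re`
    import re._constants as sre_constants
    import re._parser as sre_parser
except ImportError:  # Python 3.10 and earlier
    import sre_constants
    import sre_parse as sre_parser

_REPEATS = (sre_constants.MAX_REPEAT, sre_constants.MIN_REPEAT)


def _has_nested_unbounded(node, inside_unbounded: bool) -> bool:
    """Walk the parse tree looking for an unbounded repeat inside another one."""
    for op, arg in node:
        if op in _REPEATS:
            _min_count, max_count, body = arg
            unbounded = max_count == sre_constants.MAXREPEAT
            if unbounded and inside_unbounded:
                return True
            if _has_nested_unbounded(body, inside_unbounded or unbounded):
                return True
        elif op == sre_constants.SUBPATTERN:
            # The group's body is the last element of the argument tuple.
            if _has_nested_unbounded(arg[-1], inside_unbounded):
                return True
        elif op == sre_constants.BRANCH:
            # arg is (None, [alternative, ...]).
            if any(_has_nested_unbounded(alt, inside_unbounded) for alt in arg[1]):
                return True
    # Lookarounds, conditionals, and atomic groups are not inspected here.
    return False


def has_nested_quantifier(pattern: str) -> bool:
    """Heuristic: True if the pattern nests unbounded quantifiers."""
    return _has_nested_unbounded(sre_parser.parse(pattern), False)


flag_evil = has_nested_quantifier("(x+)+y")
flag_simple = has_nested_quantifier("x+y")
flag_group = has_nested_quantifier("(ab)+c")
```

This check is deliberately coarse. It rejects some patterns that are actually safe (for example `(a+b)+`) and misses other ReDoS shapes such as overlapping alternations, so it would complement rather than replace a dedicated analyzer and a runtime limit.
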
**Possible traps and issues**:
- `regexploit` library might have false positives/negatives
- Some legitimate complex patterns might be rejected
- Performance cost of analysis on every pattern
- Library might not support all regex flavors

**Docs changes needed**:
- Add regex safety guidelines to config docs
- Document rejected pattern examples
- Add to `TROUBLESHOOTING.md` - "Regex pattern rejected"

**Doc references**:
- DETAILED_FINDINGS.md - Issue #6 "ReDoS Vulnerability"

---

### Issue #15: MEDIUM - N+1 Query Pattern in Geo Lookups

**Where found**:
