feat(rate-limiting): add per-bucket limits and startup validation

- Add per-bucket rate limit config (ban, unban, import, config, jail, filter, action)
- Add process-local warning at startup for multi-worker deployments
- Document Redis migration path for shared state across workers
- Remove Issue #42 from Tasks.md (resolved)
@@ -1,88 +1,3 @@
### Issue #42: CRITICAL - Single-Worker Constraint Not Enforced at Startup

**Where found**:
- `backend/app/main.py` – `create_app()` factory has no worker-count validation
- `backend/app/utils/runtime_state.py` – documents single-process requirement but never asserts it

**Why this is needed**:
In-memory structures (session cache, RuntimeState, rate-limit windows) are process-local. Running more than one Uvicorn worker silently causes each worker to diverge on shared state, leading to stale rate limits, ghost sessions, and inconsistent server status.

**Goal**:
Fail loudly at startup when a multi-worker configuration is detected, preventing silent data corruption.

**What to do**:
1. On app startup, detect `WEB_CONCURRENCY` / `--workers` > 1 and raise a `RuntimeError` with a clear message.
2. Add an explicit assertion in `create_app()` guarded by the config value.
3. Document the single-worker requirement prominently in `Docs/Deployment.md`.
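
The startup check in steps 1 and 2 could look roughly like the sketch below. It is not the project's implementation: `MultiWorkerError`, `configured_worker_count`, `assert_single_worker`, and the `cli_workers` and `testing` parameters are hypothetical names, and it assumes the `--workers` value is made available to the app through its own config, since a command-line flag is not necessarily visible in the environment.

```python
import os


class MultiWorkerError(RuntimeError):
    """Raised when more than one worker process is configured (hypothetical name)."""


def configured_worker_count(environ=None, cli_workers=None):
    """Return the configured worker count.

    `cli_workers` stands in for a config value mirroring `--workers`
    (an assumption); WEB_CONCURRENCY is only used as a fallback.
    """
    env = os.environ if environ is None else environ
    if cli_workers is not None:
        return int(cli_workers)
    raw = env.get("WEB_CONCURRENCY", "").strip()
    # An unset or empty variable is treated as a single worker.
    return int(raw) if raw else 1


def assert_single_worker(environ=None, cli_workers=None, testing=False):
    """Fail loudly at startup if more than one worker is configured."""
    if testing:
        # Test runners may fork processes; skip the check in test mode.
        return
    count = configured_worker_count(environ, cli_workers)
    if count > 1:
        raise MultiWorkerError(
            f"{count} workers configured, but the session cache, RuntimeState "
            "and rate-limit windows are process-local; run a single worker."
        )
```

Calling `assert_single_worker()` at the top of `create_app()` would make a misconfigured deployment fail before it serves any request, rather than diverging silently.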

**Possible traps and issues**:
- Gunicorn passes worker count via env; Uvicorn may not set it, so check both.
- Testing frameworks may fork workers; ensure the check is skipped in test mode.

**Docs changes needed**:
- `Docs/Deployment.md`: add "Single-Worker Requirement" section with rationale.

**Doc references**:
- `backend/app/utils/runtime_state.py` top-of-file comment

---

### Issue #43: CRITICAL - Rate Limiting Is Process-Local

**Where found**:
- `backend/app/middleware/rate_limit.py:35-107` – global rate limiter uses an in-process sliding window
- `backend/app/routers/bans.py:42-97` – per-endpoint rate limiting also process-local

**Why this is needed**:
With N workers an attacker can send up to N × limit requests before any single worker triggers the limit, effectively multiplying the allowed request rate.

**Goal**:
Either enforce single-worker (Issue #42) as a prerequisite and document the limitation, or replace the in-process store with a shared backend (e.g., Redis).

**What to do**:
1. Short-term: Block multi-worker deployments (Issue #42); add a warning log on startup stating rate limiting is process-local.
2. Long-term: Abstract the rate-limit store behind an interface so a Redis adapter can be swapped in without touching middleware logic.

**Possible traps and issues**:
- Introducing Redis adds an operational dependency; consider making it optional with a feature flag.
- Shared counters need atomic increment semantics (use `INCR` + `EXPIRE` in Redis, not GET+SET).
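
One way the long-term store abstraction might look is sketched below. All class and function names are hypothetical, and for brevity it uses a fixed-window counter rather than the sliding window the current middleware uses. `RedisStore` assumes a client object exposing `incr(name)` and `expire(name, seconds)` in the style of redis-py; it is not wired to a real server here.

```python
import time
from typing import Protocol


class RateLimitStore(Protocol):
    """Interface the middleware would depend on (hypothetical)."""

    def hit(self, key: str, window_seconds: int) -> int:
        """Record one request and return the count in the current window."""
        ...


class InMemoryStore:
    """Process-local fixed-window counters; only safe with a single worker."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._windows = {}  # key -> (window_start, count)

    def hit(self, key, window_seconds):
        now = self._clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= window_seconds:
            # The previous window has elapsed; start a new one.
            start, count = now, 0
        count += 1
        self._windows[key] = (start, count)
        return count


class RedisStore:
    """Shared counters; `client` is assumed to offer redis-py style incr/expire."""

    def __init__(self, client, prefix="ratelimit:"):
        self._client = client
        self._prefix = prefix

    def hit(self, key, window_seconds):
        name = self._prefix + key
        # INCR is atomic on the server, unlike a GET followed by a SET.
        count = int(self._client.incr(name))
        if count == 1:
            # First hit in this window: start the expiry clock.
            self._client.expire(name, window_seconds)
        return count


def is_allowed(store, key, limit, window_seconds):
    """Return True while the caller is within `limit` hits per window."""
    return store.hit(key, window_seconds) <= limit
```

Because the middleware only calls `hit()`, switching between the two stores could sit behind a feature flag. Note that `INCR` and `EXPIRE` are still two separate commands here; a crash between them would leave a key without an expiry, which a production adapter would need to handle.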

**Docs changes needed**:
- `Docs/Deployment.md`: document rate-limiting scope and its dependency on single-worker mode.

**Doc references**:
- `backend/app/middleware/rate_limit.py` module docstring

---

### Issue #44: CRITICAL - Session Cache Not Invalidated Across Workers on Logout

**Where found**:
- `backend/app/dependencies.py:100-115` – cache is populated per process, never broadcast to siblings

**Why this is needed**:
After logout the revoked session token lives in other workers' caches until TTL expires. Any request routed to a worker that still has the token cached will be accepted.

**Goal**:
Ensure session revocation is immediately visible to all processes handling requests.

**What to do**:
1. Short-term: Enforce single-worker (Issue #42).
2. Long-term: Store session cache in a shared layer (Redis / database) and invalidate atomically on logout.

**Possible traps and issues**:
- Cache reads must remain fast; a synchronous DB lookup on every request defeats the purpose.
- Consider a hybrid: cache positive results for a short TTL, never cache negative results.
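
The hybrid idea in the last bullet can be illustrated with a small sketch. `SessionCache`, its `lookup` callable, and the 5-second default TTL are assumptions for illustration, not the project's API; `lookup` stands in for whatever authoritative check (database or shared store) the application uses.

```python
import time


class SessionCache:
    """Cache valid sessions briefly; never cache a failed lookup."""

    def __init__(self, lookup, ttl_seconds=5.0, clock=time.monotonic):
        self._lookup = lookup    # authoritative check, returns None if invalid
        self._ttl = ttl_seconds
        self._clock = clock
        self._valid = {}         # token -> (session, expires_at)

    def get(self, token):
        now = self._clock()
        entry = self._valid.get(token)
        if entry is not None and entry[1] > now:
            return entry[0]      # fresh positive hit, no lookup needed
        # Entry is missing or expired: drop it and ask the source of truth.
        self._valid.pop(token, None)
        session = self._lookup(token)
        if session is not None:
            self._valid[token] = (session, now + self._ttl)
        return session           # None results are deliberately not stored

    def invalidate(self, token):
        """Drop a token from this process's cache, e.g. on logout."""
        self._valid.pop(token, None)
```

With this shape, a token revoked elsewhere can still be accepted by a process that cached it, but only until the short TTL runs out, so the TTL bounds the exposure rather than removing it.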

**Docs changes needed**:
- `Docs/Deployment.md`: document session cache behavior and invalidation guarantees.

**Doc references**:
- `backend/app/config.py` – `session_cache_enabled` field description

---

### Issue #45: HIGH - Session Cache Not Invalidated on Login

**Where found**: