# Backend Development — Rules & Guidelines

Rules and conventions every backend developer must follow. Read this before writing your first line of code.

---

## 1. Language & Typing

- **Python 3.12+** is the minimum version.
- **Every** function, method, and variable must have explicit type annotations — no exceptions.
- Use `str`, `int`, `float`, `bool`, `None` for primitives.
- Use `list[T]`, `dict[K, V]`, `set[T]`, `tuple[T, ...]` (lowercase, built-in generics) — never `typing.List`, `typing.Dict`, etc.
- Use `T | None` instead of `Optional[T]`.
- Use `TypeAlias`, `TypeVar`, `Protocol`, and `NewType` when they improve clarity.
- Return types are **mandatory** — including `-> None`.
- Never use `Any` unless there is no other option and a comment explains why.
- Run `mypy --strict` (or `pyright` in strict mode) — the codebase must pass with zero errors.

```python
# Good
def get_jail_by_name(name: str) -> Jail | None:
    ...

# Bad — missing types
def get_jail_by_name(name):
    ...
```

---

## 2. Core Libraries

| Purpose | Library | Notes |
|---|---|---|
| Web framework | **FastAPI** | Async endpoints only. |
| Data validation & settings | **Pydantic v2** | All request/response bodies and config models. |
| Async HTTP client | **aiohttp** (`ClientSession`) | For external calls (blocklists, IP lookups). |
| Scheduling | **APScheduler 4.x** (async) | Blocklist imports, periodic health checks. |
| Structured logging | **structlog** | Every log call must use structlog — never `print()` or `logging` directly. |
| Database | **aiosqlite** | Async SQLite access for the application database. |
| Testing | **pytest** + **pytest-asyncio** + **httpx** (`AsyncClient`) | Every feature needs tests. |
| Mocking | **unittest.mock** / **pytest-mock** | Isolate external dependencies. |
| Date & time | **datetime** (stdlib) — always timezone-aware | Use `datetime.datetime.now(datetime.UTC)`. Never naive datetimes. |
| IP / Network | **ipaddress** (stdlib) | Validate and normalise IPs and CIDR ranges. |
| Environment / config | **pydantic-settings** | Load `.env` and environment variables into typed models. |
| fail2ban integration | **fail2ban client** (bundled) | Use the local copy at [`./fail2ban-master`](../fail2ban-master). Import from [`./fail2ban-master/fail2ban/client`](../fail2ban-master/fail2ban/client) to communicate with the fail2ban socket. Do **not** install fail2ban as a pip package. |

### fail2ban Client Usage

The repository ships with a vendored copy of fail2ban located at `./fail2ban-master`.
All communication with the fail2ban daemon must go through the client classes found in `./fail2ban-master/fail2ban/client`.
Add the `fail2ban-master` directory to `sys.path` (or configure it in `pyproject.toml` as a path dependency) so that `from fail2ban.client ...` resolves to the bundled copy.

```python
import sys
from pathlib import Path

# Ensure the bundled fail2ban is importable
sys.path.insert(0, str(Path(__file__).resolve().parents[2] / "fail2ban-master"))

from fail2ban.client.csocket import CSocket  # noqa: E402
```

### Libraries you must NOT use

- `requests` — use `aiohttp` (async).
- `flask` — we use FastAPI.
- `celery` — we use APScheduler.
- `print()` for logging — use `structlog`.
- `json.loads` / `json.dumps` on Pydantic models — use `.model_validate_json()` / `.model_dump_json()` for JSON strings, or `.model_validate()` / `.model_dump()` for dicts.

---

## 3. Project Structure

```
backend/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app factory, lifespan
│   ├── config.py            # Pydantic settings
│   ├── dependencies.py      # FastAPI dependency providers
│   ├── models/              # Pydantic schemas (request, response, domain)
│   ├── routers/             # FastAPI routers grouped by feature
│   ├── services/            # Business logic — one service per domain
│   ├── repositories/        # Database access layer
│   ├── tasks/               # APScheduler jobs
│   └── utils/               # Helpers, constants, shared types
├── tests/
│   ├── conftest.py
│   ├── test_routers/
│   ├── test_services/
│   └── test_repositories/
├── pyproject.toml
└── .env.example
```

- **Routers** receive requests, validate input via Pydantic, and delegate to **services**.
- **Services** contain business logic and call **repositories** or external clients.
- **Repositories** handle raw database queries — nothing else.
- Never put business logic inside routers or repositories.

---

## 4. FastAPI Conventions

- Use **async def** for every endpoint — no sync endpoints.
- Every endpoint must declare explicit **response models** (`response_model=...`).
- Use **Pydantic models** for request bodies and query parameters — never raw dicts.
- Use **Depends()** for dependency injection (database sessions, services, auth).
- Group endpoints into routers by feature domain (`routers/jails.py`, `routers/bans.py`, …).
- Use appropriate HTTP status codes: `201` for creation, `204` for deletion with no body, `404` for not found, etc.
- Use **HTTPException** or custom exception handlers — never return error dicts manually.
- **GET endpoints are read-only — never call `db.commit()` or execute INSERT/UPDATE/DELETE inside a GET handler.** If a GET path produces side-effects (e.g., caching resolved data), that write belongs in a background task, a scheduled flush, or a separate POST endpoint. Users and HTTP caches assume GET is safe (non-mutating) and idempotent.

```python
# Good — pass db=None on GET so geo_service never commits
result = await geo_service.lookup_batch(ips, http_session, db=None)

# Bad — triggers INSERT + COMMIT per IP inside a GET handler
result = await geo_service.lookup_batch(ips, http_session, db=app_db)
```

```python
from fastapi import APIRouter, Depends

from app.models.jail import JailListResponse, JailResponse
from app.services.jail_service import JailService

router: APIRouter = APIRouter(prefix="/api/jails", tags=["Jails"])


@router.get("/", response_model=JailListResponse)
async def list_jails(service: JailService = Depends()) -> JailListResponse:
    jails: list[JailResponse] = await service.get_all_jails()
    return JailListResponse(jails=jails)
```

---

## 5. Pydantic Models

- Every model inherits from `pydantic.BaseModel`.
- Use `model_config = ConfigDict(strict=True)` where appropriate.
- Field names use **snake_case** in Python, export as **camelCase** to the frontend via alias generators if needed.
- Validate at the boundary — once data enters a Pydantic model it is trusted.
- Use `Field(...)` with descriptions for every field to keep auto-generated docs useful.
- Separate **request models**, **response models**, and **domain (internal) models** — do not reuse one model for all three.

```python
from datetime import datetime

from pydantic import BaseModel, Field


class BanResponse(BaseModel):
    ip: str = Field(..., description="Banned IP address")
    jail: str = Field(..., description="Jail that issued the ban")
    banned_at: datetime = Field(..., description="UTC timestamp of the ban")
    expires_at: datetime | None = Field(None, description="UTC expiry, None if permanent")
    ban_count: int = Field(..., ge=1, description="Number of times this IP was banned")
```

---

## 6. Async Rules

- **Never** call blocking / synchronous I/O in an async function — no `time.sleep()`, no synchronous file reads, no `requests.get()`.
- Use `aiohttp.ClientSession` for HTTP calls, `aiosqlite` for database access.
- Use `asyncio.TaskGroup` (Python 3.11+) when you need to run independent coroutines concurrently.
- Long-running startup/shutdown logic goes into the **FastAPI lifespan** context manager.
- **Never call `db.commit()` inside a loop.** With aiosqlite, every commit serialises through a background thread and forces an `fsync`. N rows × 1 commit = N fsyncs. Accumulate all writes in the loop, then issue a single `db.commit()` once after the loop ends. The difference between 5,000 commits and 1 commit can be seconds vs milliseconds.

```python
# Good — one commit for the whole batch
for ip, info in results.items():
    await db.execute(INSERT_SQL, (ip, info.country_code, ...))
await db.commit()  # ← single fsync

# Bad — one fsync per row
for ip, info in results.items():
    await db.execute(INSERT_SQL, (ip, info.country_code, ...))
    await db.commit()  # ← fsync on every iteration
```

- **Prefer `executemany()` over calling `execute()` in a loop** when inserting or updating multiple rows with the same SQL template. aiosqlite passes the entire batch to SQLite in one call, reducing Python↔thread overhead on top of the single-commit saving.

```python
# Good — build the rows first; attribute names besides country_code are illustrative
rows = [(ip, info.country_code, info.country_name, info.asn, info.org) for ip, info in results.items()]
await db.executemany(INSERT_SQL, rows)
await db.commit()
```

- Shared resources (DB connections, HTTP sessions) are created once during startup and closed during shutdown — never inside request handlers.

```python
from collections.abc import AsyncGenerator
from contextlib import asynccontextmanager

import aiohttp
import aiosqlite
from fastapi import FastAPI


@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncGenerator[None]:
    # Startup
    app.state.http_session = aiohttp.ClientSession()
    app.state.db = await aiosqlite.connect("bangui.db")
    yield
    # Shutdown
    await app.state.http_session.close()
    await app.state.db.close()
```

---

## 7. Logging

- Use **structlog** for every log message.
- Bind contextual key-value pairs — never format strings manually.
- Log levels: `debug` for development detail, `info` for operational events, `warning` for recoverable issues, `error` for failures, `critical` for fatal problems.
- Never log sensitive data (passwords, tokens, session IDs).

```python
import structlog

log: structlog.stdlib.BoundLogger = structlog.get_logger()


async def ban_ip(ip: str, jail: str) -> None:
    log.info("banning_ip", ip=ip, jail=jail)
    try:
        await _execute_ban(ip, jail)
        log.info("ip_banned", ip=ip, jail=jail)
    except BanError as exc:
        log.error("ban_failed", ip=ip, jail=jail, error=str(exc))
        raise
```

---

## 8. Error Handling

- Define **custom exception classes** for domain errors (e.g., `JailNotFoundError`, `BanFailedError`).
- Catch specific exceptions — never bare `except:` or `except Exception:` without re-raising.
- Map domain exceptions to HTTP status codes via FastAPI **exception handlers** registered on the app.
- Always log errors with context before raising.

```python
from fastapi import Request
from fastapi.responses import JSONResponse


class JailNotFoundError(Exception):
    def __init__(self, name: str) -> None:
        self.name: str = name
        super().__init__(f"Jail '{name}' not found")


# In main.py
@app.exception_handler(JailNotFoundError)
async def jail_not_found_handler(request: Request, exc: JailNotFoundError) -> JSONResponse:
    return JSONResponse(status_code=404, content={"detail": str(exc)})
```

---

## 9. Testing

- **Every** new feature or bug fix must include tests.
- Tests live in `tests/` mirroring the `app/` structure.
- Use `pytest` with `pytest-asyncio` for async tests.
- Use `httpx.AsyncClient` to test FastAPI endpoints (not `TestClient` which is sync).
- Mock external dependencies (fail2ban socket, aiohttp calls) — tests must never touch real infrastructure.
- Aim for **>80 % line coverage** — critical paths (auth, banning, scheduling) must be 100 %.
- Test names follow `test_<unit>_<scenario>_<expected>` pattern.

```python
from collections.abc import AsyncIterator

import pytest
import pytest_asyncio
from httpx import ASGITransport, AsyncClient

from app.main import create_app


# pytest_asyncio.fixture works in both strict and auto mode
@pytest_asyncio.fixture
async def client() -> AsyncIterator[AsyncClient]:
    app = create_app()
    transport: ASGITransport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        yield ac


@pytest.mark.asyncio
async def test_list_jails_returns_200(client: AsyncClient) -> None:
    response = await client.get("/api/jails/")
    assert response.status_code == 200
    data: dict[str, object] = response.json()
    assert "jails" in data
```
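
One possible shape for mocking the fail2ban socket is sketched below with `unittest.mock.AsyncMock`. `Fail2banClient`, `JailService.list_jail_names`, and the `["status"]` command are assumptions made for illustration. The test drives the coroutine with `asyncio.run` only to keep this example free of pytest plugins; project tests would use `pytest-asyncio`.

```python
import asyncio
from typing import Protocol
from unittest.mock import AsyncMock


class Fail2banClient(Protocol):
    """Hypothetical interface of the socket wrapper."""

    async def send(self, command: list[str]) -> list[str]: ...


class JailService:
    """Hypothetical service that receives its client by injection."""

    def __init__(self, client: Fail2banClient) -> None:
        self._client: Fail2banClient = client

    async def list_jail_names(self) -> list[str]:
        # The command format is an assumption, not the real fail2ban protocol.
        return sorted(await self._client.send(["status"]))


def test_list_jail_names_socket_mocked_returns_sorted() -> None:
    client: AsyncMock = AsyncMock()
    client.send.return_value = ["sshd", "nginx"]
    service: JailService = JailService(client)

    names: list[str] = asyncio.run(service.list_jail_names())

    assert names == ["nginx", "sshd"]
    # The mock records awaits, so the test can verify the exact call.
    client.send.assert_awaited_once_with(["status"])
```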

---

## 10. Code Style & Tooling

| Tool | Purpose |
|---|---|
| **Ruff** | Linter and formatter (replaces black, isort, flake8). |
| **mypy** or **pyright** | Static type checking in strict mode. |
| **pre-commit** | Run ruff + type checker before every commit. |

- Line length: **120 characters** max.
- Strings: use **double quotes** (`"`).
- Imports: sorted by ruff — stdlib → third-party → local, one import per line.
- No unused imports, no unused variables, no `# type: ignore` without explanation.
- Docstrings in **Google style** on every public function, class, and module.

---

## 11. Configuration & Secrets

- All configuration lives in **environment variables** loaded through **pydantic-settings**.
- Secrets (master password hash, session key) are **never** committed to the repository.
- Provide a `.env.example` with all keys and placeholder values.
- Validate config at startup — fail fast with a clear error if a required value is missing.

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    database_path: str = Field("bangui.db", description="Path to SQLite database")
    fail2ban_socket: str = Field("/var/run/fail2ban/fail2ban.sock", description="fail2ban socket path")
    session_secret: str = Field(..., description="Secret key for session signing")
    log_level: str = Field("info", description="Logging level")

    model_config = SettingsConfigDict(env_prefix="BANGUI_", env_file=".env")
```

---

## 12. Git & Workflow

- **Branch naming:** `feature/<short-description>`, `fix/<short-description>`, `chore/<short-description>`.
- **Commit messages:** imperative mood, max 72 chars first line (`Add jail reload endpoint`, `Fix ban history query`).
- Every pull request must pass: ruff, type checker, all tests.
- Do not merge with failing CI.
- Keep pull requests small and focused — one feature or fix per PR.

---

## 13. Coding Principles

These principles are **non-negotiable**. Every backend contributor must internalise and apply them daily.

### 13.1 Clean Code

- Write code that **reads like well-written prose** — a new developer should understand intent without asking.
- **Meaningful names** — variables, functions, and classes must reveal their purpose. Avoid abbreviations (`cnt`, `mgr`, `tmp`) unless universally understood.
- **Small functions** — each function does exactly one thing. If you need a comment to explain a block inside a function, extract it into its own function.
- **No magic numbers or strings** — use named constants.
- **Boy Scout Rule** — leave every file cleaner than you found it.
- **Avoid deep nesting** — prefer early returns (guard clauses) to keep the happy path at the top indentation level.

```python
# Good — guard clause, clear name, one job
async def get_active_ban(ip: str, jail: str) -> Ban:
    ban: Ban | None = await repo.find_ban(ip=ip, jail=jail)
    if ban is None:
        raise BanNotFoundError(ip=ip, jail=jail)
    if ban.is_expired():
        raise BanExpiredError(ip=ip, jail=jail)
    return ban

# Bad — nested, vague name
async def check(ip, j):
    b = await repo.find_ban(ip=ip, jail=j)
    if b:
        if not b.is_expired():
            return b
        else:
            raise Exception("expired")
    else:
        raise Exception("not found")
```

### 13.2 Separation of Concerns (SoC)

- Each module, class, and function must have a **single, well-defined responsibility**.
- **Routers** → HTTP layer only (parse requests, return responses).
- **Services** → business logic and orchestration.
- **Repositories** → data access and persistence.
- **Models** → data shapes and validation.
- **Tasks** → scheduled background jobs.
- Never mix layers — a router must not execute SQL, and a repository must not raise `HTTPException`.

### 13.3 Single Responsibility Principle (SRP)

- A class or module should have **one and only one reason to change**.
- If a service handles both ban management *and* email notifications, split it into `BanService` and `NotificationService`.

### 13.4 Don't Repeat Yourself (DRY)

- Extract shared logic into utility functions, base classes, or dependency providers.
- If the same block of code appears in more than one place, **refactor it** into a single source of truth.
- But don't over-abstract — premature DRY that couples unrelated features is worse than a little duplication (see **Rule of Three**: refactor when something appears a third time).

### 13.5 KISS — Keep It Simple, Stupid

- Choose the simplest solution that works correctly.
- Avoid clever tricks, premature optimisation, and over-engineering.
- If a standard library function does the job, prefer it over a custom implementation.

### 13.6 YAGNI — You Aren't Gonna Need It

- Do **not** build features, abstractions, or config options "just in case".
- Implement what is required **now**. Extend later when a real need emerges.

### 13.7 Dependency Inversion Principle (DIP)

- High-level modules (services) must not depend on low-level modules (repositories) directly. Both should depend on **abstractions** (protocols / interfaces).
- Use FastAPI's `Depends()` to inject implementations — this makes swapping and testing trivial.

```python
from typing import Protocol


class BanRepository(Protocol):
    async def find_ban(self, ip: str, jail: str) -> Ban | None: ...
    async def save_ban(self, ban: Ban) -> None: ...


class SqliteBanRepository:
    """Concrete implementation — depends on aiosqlite."""

    async def find_ban(self, ip: str, jail: str) -> Ban | None: ...
    async def save_ban(self, ban: Ban) -> None: ...
```

### 13.8 Composition over Inheritance

- Favour **composing** small, focused objects over deep inheritance hierarchies.
- Use mixins or protocols only when a clear "is-a" relationship exists; otherwise, pass collaborators as constructor arguments.
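
Passing collaborators as constructor arguments could look like the sketch below. `Notifier`, `RecordingNotifier`, and this simplified synchronous `BanService` are hypothetical and exist only to show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Notifier(Protocol):
    """Hypothetical collaborator interface."""

    def notify(self, message: str) -> None: ...


@dataclass
class RecordingNotifier:
    """Test double that records messages instead of sending them."""

    messages: list[str] = field(default_factory=list)

    def notify(self, message: str) -> None:
        self.messages.append(message)


class BanService:
    """Owns a Notifier instead of inheriting notification behaviour."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier: Notifier = notifier

    def ban(self, ip: str, jail: str) -> None:
        # The actual ban logic is omitted; only the delegation is shown.
        self._notifier.notify(f"banned {ip} in {jail}")
```

Because `BanService` only knows the `Notifier` protocol, a test can hand it a `RecordingNotifier` while production code supplies a real implementation.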

### 13.9 Fail Fast

- Validate inputs as early as possible — at the API boundary with Pydantic, at service entry with assertions or domain checks.
- Raise specific exceptions immediately rather than letting bad data propagate silently.
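
A domain check at service entry might look like the following sketch. The `MAX_BAN_SECONDS` limit, `InvalidBanDurationError`, and `validate_ban_duration` are assumptions chosen for illustration, not values or names taken from the project.

```python
MAX_BAN_SECONDS: int = 60 * 60 * 24 * 365  # assumed upper bound: one year


class InvalidBanDurationError(ValueError):
    """Raised when a requested ban duration is outside the accepted range."""

    def __init__(self, seconds: int) -> None:
        self.seconds: int = seconds
        super().__init__(f"Ban duration must be between 1 and {MAX_BAN_SECONDS} seconds, got {seconds}")


def validate_ban_duration(seconds: int) -> int:
    """Return `seconds` unchanged if it is valid, otherwise raise immediately."""
    if seconds <= 0 or seconds > MAX_BAN_SECONDS:
        raise InvalidBanDurationError(seconds)
    return seconds
```

Rejecting the value at the first line of the service keeps an invalid duration from ever reaching the repository or the fail2ban socket.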

### 13.10 Law of Demeter (Principle of Least Knowledge)

- A function should only call methods on:
  1. Its own object (`self`).
  2. Objects passed as parameters.
  3. Objects it creates.
- Avoid long accessor chains like `request.state.db.cursor().execute(...)` — wrap them in a meaningful method.

### 13.11 Defensive Programming

- Never trust external input — validate and sanitise everything that crosses a boundary (HTTP request, file, socket, environment variable).
- Handle edge cases explicitly: empty lists, `None` values, negative numbers, empty strings.
- Use type narrowing and exhaustive pattern matching (`match` / `case`) to eliminate impossible states.

---

## 14. Quick Reference — Do / Don't

| Do | Don't |
|---|---|
| Type every function, variable, return | Leave types implicit |
| Use `async def` for I/O | Use sync functions for I/O |
| Validate with Pydantic at the boundary | Pass raw dicts through the codebase |
| Log with structlog + context keys | Use `print()` or format strings in logs |
| Write tests for every feature | Ship untested code |
| Use `aiohttp` for HTTP calls | Use `requests` |
| Handle errors with custom exceptions | Use bare `except:` |
| Keep routers thin, logic in services | Put business logic in routers |
| Use `datetime.datetime.now(datetime.UTC)` | Use naive datetimes |
| Run ruff + mypy before committing | Push code that doesn't pass linting |
| Keep GET endpoints read-only (no `db.commit()`) | Call `db.commit()` / INSERT inside GET handlers |
| Batch DB writes; issue one `db.commit()` after the loop | Commit inside a loop (1 fsync per row) |
| Use `executemany()` for bulk inserts | Call `execute()` + `commit()` per row in a loop |