# Backend Development — Rules & Guidelines

Rules and conventions every backend developer must follow. Read this before writing your first line of code.

---

## 1. Language & Typing

- **Python 3.12+** is the minimum version.
- **Every** function, method, and variable must have explicit type annotations — no exceptions.
- Use `str`, `int`, `float`, `bool`, `None` for primitives.
- Use `list[T]`, `dict[K, V]`, `set[T]`, `tuple[T, ...]` (lowercase, built-in generics) — never `typing.List`, `typing.Dict`, etc.
- Use `T | None` instead of `Optional[T]`.
- Use `TypeAlias`, `TypeVar`, `Protocol`, and `NewType` when they improve clarity.
- Return types are **mandatory** — including `-> None`.
- Never use `Any` unless there is no other option and a comment explains why.
- Run `mypy --strict` (or `pyright` in strict mode) — the codebase must pass with zero errors.

```python
# Good
def get_jail_by_name(name: str) -> Jail | None:
    ...

# Bad — missing types
def get_jail_by_name(name):
    ...
```

---

## 2. Core Libraries

| Purpose | Library | Notes |
|---|---|---|
| Web framework | **FastAPI** | Async endpoints only. |
| Data validation & settings | **Pydantic v2** | All request/response bodies and config models. |
| Async HTTP client | **aiohttp** (`ClientSession`) | For external calls (blocklists, IP lookups). |
| Scheduling | **APScheduler 4.x** (async) | Blocklist imports, periodic health checks. |
| Structured logging | **structlog** | Every log call must use structlog — never `print()` or `logging` directly. |
| Database | **aiosqlite** | Async SQLite access for the application database. |
| Testing | **pytest** + **pytest-asyncio** + **httpx** (`AsyncClient`) | Every feature needs tests. |
| Mocking | **unittest.mock** / **pytest-mock** | Isolate external dependencies. |
| Date & time | **datetime** (stdlib) — always timezone-aware | Use `datetime.datetime.now(datetime.UTC)`. Never naive datetimes. |
| IP / Network | **ipaddress** (stdlib) | Validate and normalise IPs and CIDR ranges. |
| Environment / config | **pydantic-settings** | Load `.env` and environment variables into typed models. |
| fail2ban integration | **fail2ban client** (bundled) | Use the local copy at [`./fail2ban-master`](../fail2ban-master). Import from [`./fail2ban-master/fail2ban/client`](../fail2ban-master/fail2ban/client) to communicate with the fail2ban socket. Do **not** install fail2ban as a pip package. |

### fail2ban Client Usage

The repository ships with a vendored copy of fail2ban located at `./fail2ban-master`. All communication with the fail2ban daemon must go through the client classes found in `./fail2ban-master/fail2ban/client`.

Add the `fail2ban-master` directory to `sys.path` (or configure it in `pyproject.toml` as a path dependency) so that `from fail2ban.client ...` resolves to the bundled copy.

```python
import sys
from pathlib import Path

# Ensure the bundled fail2ban is importable
sys.path.insert(0, str(Path(__file__).resolve().parents[2] / "fail2ban-master"))

from fail2ban.client.csocket import CSocket  # noqa: E402
```

### Libraries you must NOT use

- `requests` — use `aiohttp` (async).
- `flask` — we use FastAPI.
- `celery` — we use APScheduler.
- `print()` for logging — use `structlog`.
- `json.loads` / `json.dumps` on Pydantic models — use `.model_dump()` / `.model_validate()` (or `.model_dump_json()` / `.model_validate_json()` when working with JSON strings).

---

## 3. Project Structure

```
backend/
├── app/
│   ├── __init__.py
│   ├── main.py            # FastAPI app factory, lifespan
│   ├── config.py          # Pydantic settings
│   ├── dependencies.py    # FastAPI dependency providers
│   ├── models/            # Pydantic schemas (request, response, domain)
│   ├── routers/           # FastAPI routers grouped by feature
│   ├── services/          # Business logic — one service per domain
│   ├── repositories/      # Database access layer
│   ├── tasks/             # APScheduler jobs
│   └── utils/             # Helpers, constants, shared types
├── tests/
│   ├── conftest.py
│   ├── test_routers/
│   ├── test_services/
│   └── test_repositories/
├── pyproject.toml
└── .env.example
```

- **Routers** receive requests, validate input via Pydantic, and delegate to **services**.
- **Services** contain business logic and call **repositories** or external clients.
- **Repositories** handle raw database queries — nothing else.
- Never put business logic inside routers or repositories.

---

## 4. FastAPI Conventions

- Use **async def** for every endpoint — no sync endpoints.
- Every endpoint must declare explicit **response models** (`response_model=...`).
- Use **Pydantic models** for request bodies and query parameters — never raw dicts.
- Use **Depends()** for dependency injection (database sessions, services, auth).
- Group endpoints into routers by feature domain (`routers/jails.py`, `routers/bans.py`, …).
- Use appropriate HTTP status codes: `201` for creation, `204` for deletion with no body, `404` for not found, etc.
- Use **HTTPException** or custom exception handlers — never return error dicts manually.
- **GET endpoints are read-only — never call `db.commit()` or execute INSERT/UPDATE/DELETE inside a GET handler.** If a GET path produces side-effects (e.g., caching resolved data), that write belongs in a background task, a scheduled flush, or a separate POST endpoint. Users and HTTP caches assume GET is idempotent and non-mutating.
```python
# Good — pass db=None on GET so geo_service never commits
result = await geo_service.lookup_batch(ips, http_session, db=None)

# Bad — triggers INSERT + COMMIT per IP inside a GET handler
result = await geo_service.lookup_batch(ips, http_session, db=app_db)
```

```python
from fastapi import APIRouter, Depends

from app.models.jail import JailListResponse, JailResponse
from app.services.jail_service import JailService

router: APIRouter = APIRouter(prefix="/api/jails", tags=["Jails"])


@router.get("/", response_model=JailListResponse)
async def list_jails(service: JailService = Depends()) -> JailListResponse:
    jails: list[JailResponse] = await service.get_all_jails()
    return JailListResponse(jails=jails)
```

---

## 5. Pydantic Models

- Every model inherits from `pydantic.BaseModel`.
- Use `model_config = ConfigDict(strict=True)` where appropriate.
- Field names use **snake_case** in Python; export them as **camelCase** to the frontend via alias generators if needed.
- Validate at the boundary — once data enters a Pydantic model it is trusted.
- Use `Field(...)` with descriptions for every field to keep auto-generated docs useful.
- Separate **request models**, **response models**, and **domain (internal) models** — do not reuse one model for all three.

```python
from datetime import datetime

from pydantic import BaseModel, Field


class BanResponse(BaseModel):
    ip: str = Field(..., description="Banned IP address")
    jail: str = Field(..., description="Jail that issued the ban")
    banned_at: datetime = Field(..., description="UTC timestamp of the ban")
    expires_at: datetime | None = Field(None, description="UTC expiry, None if permanent")
    ban_count: int = Field(..., ge=1, description="Number of times this IP was banned")
```

---

## 6. Async Rules

- **Never** call blocking / synchronous I/O in an async function — no `time.sleep()`, no synchronous file reads, no `requests.get()`.
- Use `aiohttp.ClientSession` for HTTP calls, `aiosqlite` for database access.
- Use `asyncio.TaskGroup` (Python 3.11+) when you need to run independent coroutines concurrently.
- Long-running startup/shutdown logic goes into the **FastAPI lifespan** context manager.
- **Never call `db.commit()` inside a loop.** With aiosqlite, every commit serialises through a background thread and forces an `fsync`. N rows × 1 commit = N fsyncs. Accumulate all writes in the loop, then issue a single `db.commit()` once after the loop ends. The difference between 5,000 commits and 1 commit can be seconds vs milliseconds.

  ```python
  # Good — one commit for the whole batch
  for ip, info in results.items():
      await db.execute(INSERT_SQL, (ip, info.country_code, ...))
  await db.commit()  # ← single fsync

  # Bad — one fsync per row
  for ip, info in results.items():
      await db.execute(INSERT_SQL, (ip, info.country_code, ...))
      await db.commit()  # ← fsync on every iteration
  ```

- **Prefer `executemany()` over calling `execute()` in a loop** when inserting or updating multiple rows with the same SQL template. aiosqlite passes the entire batch to SQLite in one call, reducing Python↔thread overhead on top of the single-commit saving.

  ```python
  # Good (field names other than country_code are illustrative)
  await db.executemany(
      INSERT_SQL,
      [(ip, info.country_code, info.country_name, info.asn, info.org) for ip, info in results.items()],
  )
  await db.commit()
  ```

- Shared resources (DB connections, HTTP sessions) are created once during startup and closed during shutdown — never inside request handlers.

```python
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager

import aiohttp
import aiosqlite
from fastapi import FastAPI


@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncIterator[None]:
    # Startup
    app.state.http_session = aiohttp.ClientSession()
    app.state.db = await aiosqlite.connect("bangui.db")
    yield
    # Shutdown
    await app.state.http_session.close()
    await app.state.db.close()
```

---

## 7. Logging

- Use **structlog** for every log message.
- Bind contextual key-value pairs — never format strings manually.
- Log levels: `debug` for development detail, `info` for operational events, `warning` for recoverable issues, `error` for failures, `critical` for fatal problems.
- Never log sensitive data (passwords, tokens, session IDs).

```python
import structlog

log: structlog.stdlib.BoundLogger = structlog.get_logger()


async def ban_ip(ip: str, jail: str) -> None:
    log.info("banning_ip", ip=ip, jail=jail)
    try:
        await _execute_ban(ip, jail)
        log.info("ip_banned", ip=ip, jail=jail)
    except BanError as exc:
        log.error("ban_failed", ip=ip, jail=jail, error=str(exc))
        raise
```

---

## 8. Error Handling

- Define **custom exception classes** for domain errors (e.g., `JailNotFoundError`, `BanFailedError`).
- Catch specific exceptions — never bare `except:` or `except Exception:` without re-raising.
- Map domain exceptions to HTTP status codes via FastAPI **exception handlers** registered on the app.
- Always log errors with context before raising.

```python
from fastapi import Request
from fastapi.responses import JSONResponse


class JailNotFoundError(Exception):
    def __init__(self, name: str) -> None:
        self.name: str = name
        super().__init__(f"Jail '{name}' not found")


# In main.py
@app.exception_handler(JailNotFoundError)
async def jail_not_found_handler(request: Request, exc: JailNotFoundError) -> JSONResponse:
    return JSONResponse(status_code=404, content={"detail": f"Jail '{exc.name}' not found"})
```

---

## 9. Testing

- **Every** new feature or bug fix must include tests.
- Tests live in `tests/` mirroring the `app/` structure.
- Use `pytest` with `pytest-asyncio` for async tests.
- Use `httpx.AsyncClient` to test FastAPI endpoints (not `TestClient`, which is sync).
- Mock external dependencies (fail2ban socket, aiohttp calls) — tests must never touch real infrastructure.
- Aim for **>80 % line coverage** — critical paths (auth, banning, scheduling) must be 100 %.
- Test names follow `test___` pattern.
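The mocking rule can be illustrated with `unittest.mock.AsyncMock` from the standard library. This is a minimal sketch: `Fail2banClient`, `list_jail_names` and `count_jails` are hypothetical names chosen for illustration and are not the real fail2ban client API. In the actual suite the test would be an `async def` marked with `@pytest.mark.asyncio`; `asyncio.run` is used here only so the sketch runs without plugins.

```python
import asyncio
from typing import Protocol
from unittest.mock import AsyncMock


class Fail2banClient(Protocol):
    """Hypothetical abstraction over the fail2ban socket (not the real client API)."""

    async def list_jail_names(self) -> list[str]: ...


class JailService:
    """Hypothetical service that depends only on the protocol above."""

    def __init__(self, client: Fail2banClient) -> None:
        self._client: Fail2banClient = client

    async def count_jails(self) -> int:
        names: list[str] = await self._client.list_jail_names()
        return len(names)


def test_count_jails_returns_number_of_jails() -> None:
    # AsyncMock stands in for the socket client, so no fail2ban daemon is contacted.
    client: AsyncMock = AsyncMock()
    client.list_jail_names.return_value = ["sshd", "nginx-http-auth"]
    service: JailService = JailService(client)

    assert asyncio.run(service.count_jails()) == 2
    client.list_jail_names.assert_awaited_once_with()
```

Because the service receives its client through the constructor, the test needs no patching of module globals; the mock is simply passed in.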
```python
from collections.abc import AsyncIterator

import pytest
import pytest_asyncio
from httpx import ASGITransport, AsyncClient

from app.main import create_app


# Async fixtures need pytest_asyncio.fixture when pytest-asyncio runs in strict mode.
@pytest_asyncio.fixture
async def client() -> AsyncIterator[AsyncClient]:
    app = create_app()
    transport: ASGITransport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        yield ac


@pytest.mark.asyncio
async def test_list_jails_returns_200(client: AsyncClient) -> None:
    response = await client.get("/api/jails/")
    assert response.status_code == 200
    data: dict[str, object] = response.json()
    assert "jails" in data
```

---

## 10. Code Style & Tooling

| Tool | Purpose |
|---|---|
| **Ruff** | Linter and formatter (replaces black, isort, flake8). |
| **mypy** or **pyright** | Static type checking in strict mode. |
| **pre-commit** | Run ruff + type checker before every commit. |

- Line length: **120 characters** max.
- Strings: use **double quotes** (`"`).
- Imports: sorted by ruff — stdlib → third-party → local, one import per line.
- No unused imports, no unused variables, no `# type: ignore` without explanation.
- Docstrings in **Google style** on every public function, class, and module.

---

## 11. Configuration & Secrets

- All configuration lives in **environment variables** loaded through **pydantic-settings**.
- Secrets (master password hash, session key) are **never** committed to the repository.
- Provide a `.env.example` with all keys and placeholder values.
- Validate config at startup — fail fast with a clear error if a required value is missing.

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    database_path: str = Field("bangui.db", description="Path to SQLite database")
    fail2ban_socket: str = Field("/var/run/fail2ban/fail2ban.sock", description="fail2ban socket path")
    session_secret: str = Field(..., description="Secret key for session signing")
    log_level: str = Field("info", description="Logging level")

    model_config = SettingsConfigDict(env_prefix="BANGUI_", env_file=".env")
```

---

## 12. Git & Workflow

- **Branch naming:** `feature/`, `fix/`, `chore/`.
- **Commit messages:** imperative mood, max 72 chars first line (`Add jail reload endpoint`, `Fix ban history query`).
- Every pull request must pass: ruff, type checker, all tests.
- Do not merge with failing CI.
- Keep pull requests small and focused — one feature or fix per PR.

---

## 13. Coding Principles

These principles are **non-negotiable**. Every backend contributor must internalise and apply them daily.

### 13.1 Clean Code

- Write code that **reads like well-written prose** — a new developer should understand intent without asking.
- **Meaningful names** — variables, functions, and classes must reveal their purpose. Avoid abbreviations (`cnt`, `mgr`, `tmp`) unless universally understood.
- **Small functions** — each function does exactly one thing. If you need a comment to explain a block inside a function, extract it into its own function.
- **No magic numbers or strings** — use named constants.
- **Boy Scout Rule** — leave every file cleaner than you found it.
- **Avoid deep nesting** — prefer early returns (guard clauses) to keep the happy path at the top indentation level.

```python
# Good — guard clause, clear name, one job
async def get_active_ban(ip: str, jail: str) -> Ban:
    ban: Ban | None = await repo.find_ban(ip=ip, jail=jail)
    if ban is None:
        raise BanNotFoundError(ip=ip, jail=jail)
    if ban.is_expired():
        raise BanExpiredError(ip=ip, jail=jail)
    return ban


# Bad — nested, vague name
async def check(ip, j):
    b = await repo.find_ban(ip=ip, jail=j)
    if b:
        if not b.is_expired():
            return b
        else:
            raise Exception("expired")
    else:
        raise Exception("not found")
```

### 13.2 Separation of Concerns (SoC)

- Each module, class, and function must have a **single, well-defined responsibility**.
- **Routers** → HTTP layer only (parse requests, return responses).
- **Services** → business logic and orchestration.
- **Repositories** → data access and persistence.
- **Models** → data shapes and validation.
- **Tasks** → scheduled background jobs.
- Never mix layers — a router must not execute SQL, and a repository must not raise `HTTPException`.

### 13.3 Single Responsibility Principle (SRP)

- A class or module should have **one and only one reason to change**.
- If a service handles both ban management *and* email notifications, split it into `BanService` and `NotificationService`.

### 13.4 Don't Repeat Yourself (DRY)

- Extract shared logic into utility functions, base classes, or dependency providers.
- If the same block of code appears in more than one place, **refactor it** into a single source of truth.
- But don't over-abstract — premature DRY that couples unrelated features is worse than a little duplication (see **Rule of Three**: refactor when something appears a third time).

### 13.5 KISS — Keep It Simple, Stupid

- Choose the simplest solution that works correctly.
- Avoid clever tricks, premature optimisation, and over-engineering.
- If a standard library function does the job, prefer it over a custom implementation.

### 13.6 YAGNI — You Aren't Gonna Need It

- Do **not** build features, abstractions, or config options "just in case".
- Implement what is required **now**. Extend later when a real need emerges.

### 13.7 Dependency Inversion Principle (DIP)

- High-level modules (services) must not depend on low-level modules (repositories) directly. Both should depend on **abstractions** (protocols / interfaces).
- Use FastAPI's `Depends()` to inject implementations — this makes swapping and testing trivial.

```python
from typing import Protocol


class BanRepository(Protocol):
    async def find_ban(self, ip: str, jail: str) -> Ban | None: ...
    async def save_ban(self, ban: Ban) -> None: ...


class SqliteBanRepository:
    """Concrete implementation — depends on aiosqlite."""

    async def find_ban(self, ip: str, jail: str) -> Ban | None: ...
    async def save_ban(self, ban: Ban) -> None: ...
```

### 13.8 Composition over Inheritance

- Favour **composing** small, focused objects over deep inheritance hierarchies.
- Use mixins or protocols only when a clear "is-a" relationship exists; otherwise, pass collaborators as constructor arguments.

### 13.9 Fail Fast

- Validate inputs as early as possible — at the API boundary with Pydantic, at service entry with assertions or domain checks.
- Raise specific exceptions immediately rather than letting bad data propagate silently.

### 13.10 Law of Demeter (Principle of Least Knowledge)

- A function should only call methods on:
  1. Its own object (`self`).
  2. Objects passed as parameters.
  3. Objects it creates.
- Avoid long accessor chains like `request.state.db.cursor().execute(...)` — wrap them in a meaningful method.

### 13.11 Defensive Programming

- Never trust external input — validate and sanitise everything that crosses a boundary (HTTP request, file, socket, environment variable).
- Handle edge cases explicitly: empty lists, `None` values, negative numbers, empty strings.
- Use type narrowing and exhaustive pattern matching (`match` / `case`) to eliminate impossible states.

---

## 14. Quick Reference — Do / Don't

| Do | Don't |
|---|---|
| Type every function, variable, return | Leave types implicit |
| Use `async def` for I/O | Use sync functions for I/O |
| Validate with Pydantic at the boundary | Pass raw dicts through the codebase |
| Log with structlog + context keys | Use `print()` or format strings in logs |
| Write tests for every feature | Ship untested code |
| Use `aiohttp` for HTTP calls | Use `requests` |
| Handle errors with custom exceptions | Use bare `except:` |
| Keep routers thin, logic in services | Put business logic in routers |
| Use `datetime.datetime.now(datetime.UTC)` | Use naive datetimes |
| Run ruff + mypy before committing | Push code that doesn't pass linting |
| Keep GET endpoints read-only (no `db.commit()`) | Call `db.commit()` / INSERT inside GET handlers |
| Batch DB writes; issue one `db.commit()` after the loop | Commit inside a loop (1 fsync per row) |
| Use `executemany()` for bulk inserts | Call `execute()` + `commit()` per row in a loop |
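
The core-libraries table asks for IPs and CIDR ranges to be validated and normalised with `ipaddress`. A minimal sketch of what that could look like at a service boundary follows; the helper names `normalise_ip` and `normalise_network` are hypothetical and not part of the codebase.

```python
import ipaddress


def normalise_ip(raw: str) -> str:
    """Return the canonical text form of an IPv4/IPv6 address.

    Raises:
        ValueError: If ``raw`` is not a valid IP address.
    """
    return str(ipaddress.ip_address(raw.strip()))


def normalise_network(raw: str) -> str:
    """Return the canonical CIDR form of a network.

    ``strict=False`` masks off host bits instead of raising, so a value such as
    ``192.0.2.77/24`` is accepted and stored as its containing network.

    Raises:
        ValueError: If ``raw`` is not a valid network.
    """
    return str(ipaddress.ip_network(raw.strip(), strict=False))


def is_valid_ip(raw: str) -> bool:
    """Report whether ``raw`` parses as an IP address, without raising."""
    try:
        normalise_ip(raw)
    except ValueError:
        return False
    return True
```

Storing the normalised form means that two spellings of the same IPv6 address compare equal in the database. Whether host bits should be silently masked (`strict=False`) or rejected is a policy choice; the sketch assumes masking is acceptable.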