Files
BanGUI/Docs/Backend-Development.md
2026-02-28 20:52:29 +01:00

430 lines
18 KiB
Markdown

# Backend Development — Rules & Guidelines
Rules and conventions every backend developer must follow. Read this before writing your first line of code.
---
## 1. Language & Typing
- **Python 3.12+** is the minimum version.
- **Every** function, method, and variable must have explicit type annotations — no exceptions.
- Use `str`, `int`, `float`, `bool`, `None` for primitives.
- Use `list[T]`, `dict[K, V]`, `set[T]`, `tuple[T, ...]` (lowercase, built-in generics) — never `typing.List`, `typing.Dict`, etc.
- Use `T | None` instead of `Optional[T]`.
- Use `TypeAlias`, `TypeVar`, `Protocol`, and `NewType` when they improve clarity.
- Return types are **mandatory** — including `-> None`.
- Never use `Any` unless there is no other option and a comment explains why.
- Run `mypy --strict` (or `pyright` in strict mode) — the codebase must pass with zero errors.
```python
# Good
def get_jail_by_name(name: str) -> Jail | None:
...
# Bad — missing types
def get_jail_by_name(name):
...
```
---
## 2. Core Libraries
| Purpose | Library | Notes |
|---|---|---|
| Web framework | **FastAPI** | Async endpoints only. |
| Data validation & settings | **Pydantic v2** | All request/response bodies and config models. |
| Async HTTP client | **aiohttp** (`ClientSession`) | For external calls (blocklists, IP lookups). |
| Scheduling | **APScheduler 4.x** (async) | Blocklist imports, periodic health checks. |
| Structured logging | **structlog** | Every log call must use structlog — never `print()` or `logging` directly. |
| Database | **aiosqlite** | Async SQLite access for the application database. |
| Testing | **pytest** + **pytest-asyncio** + **httpx** (`AsyncClient`) | Every feature needs tests. |
| Mocking | **unittest.mock** / **pytest-mock** | Isolate external dependencies. |
| Date & time | **datetime** (stdlib) — always timezone-aware | Use `datetime.datetime.now(datetime.UTC)`. Never naive datetimes. |
| IP / Network | **ipaddress** (stdlib) | Validate and normalise IPs and CIDR ranges. |
| Environment / config | **pydantic-settings** | Load `.env` and environment variables into typed models. |
| fail2ban integration | **fail2ban client** (bundled) | Use the local copy at [`./fail2ban-master`](../fail2ban-master). Import from [`./fail2ban-master/fail2ban/client`](../fail2ban-master/fail2ban/client) to communicate with the fail2ban socket. Do **not** install fail2ban as a pip package. |
### fail2ban Client Usage
The repository ships with a vendored copy of fail2ban located at `./fail2ban-master`.
All communication with the fail2ban daemon must go through the client classes found in `./fail2ban-master/fail2ban/client`.
Add the project root to `sys.path` (or configure it in `pyproject.toml` as a path dependency) so that `from fail2ban.client ...` resolves to the bundled copy.
```python
import sys
from pathlib import Path
# Ensure the bundled fail2ban is importable
sys.path.insert(0, str(Path(__file__).resolve().parents[2] / "fail2ban-master"))
from fail2ban.client.csocket import CSSocket # noqa: E402
```
### Libraries you must NOT use
- `requests` — use `aiohttp` (async).
- `flask` — we use FastAPI.
- `celery` — we use APScheduler.
- `print()` for logging — use `structlog`.
- `json.loads` / `json.dumps` on Pydantic models — use `.model_dump()` / `.model_validate()`.
---
## 3. Project Structure
```
backend/
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI app factory, lifespan
│ ├── config.py # Pydantic settings
│ ├── dependencies.py # FastAPI dependency providers
│ ├── models/ # Pydantic schemas (request, response, domain)
│ ├── routers/ # FastAPI routers grouped by feature
│ ├── services/ # Business logic — one service per domain
│ ├── repositories/ # Database access layer
│ ├── tasks/ # APScheduler jobs
│ └── utils/ # Helpers, constants, shared types
├── tests/
│ ├── conftest.py
│ ├── test_routers/
│ ├── test_services/
│ └── test_repositories/
├── pyproject.toml
└── .env.example
```
- **Routers** receive requests, validate input via Pydantic, and delegate to **services**.
- **Services** contain business logic and call **repositories** or external clients.
- **Repositories** handle raw database queries — nothing else.
- Never put business logic inside routers or repositories.
---
## 4. FastAPI Conventions
- Use **async def** for every endpoint — no sync endpoints.
- Every endpoint must declare explicit **response models** (`response_model=...`).
- Use **Pydantic models** for request bodies and query parameters — never raw dicts.
- Use **Depends()** for dependency injection (database sessions, services, auth).
- Group endpoints into routers by feature domain (`routers/jails.py`, `routers/bans.py`, …).
- Use appropriate HTTP status codes: `201` for creation, `204` for deletion with no body, `404` for not found, etc.
- Use **HTTPException** or custom exception handlers — never return error dicts manually.
```python
from fastapi import APIRouter, Depends, HTTPException, status
from app.models.jail import JailResponse, JailListResponse
from app.services.jail_service import JailService
router: APIRouter = APIRouter(prefix="/api/jails", tags=["Jails"])
@router.get("/", response_model=JailListResponse)
async def list_jails(service: JailService = Depends()) -> JailListResponse:
jails: list[JailResponse] = await service.get_all_jails()
return JailListResponse(jails=jails)
```
---
## 5. Pydantic Models
- Every model inherits from `pydantic.BaseModel`.
- Use `model_config = ConfigDict(strict=True)` where appropriate.
- Field names use **snake_case** in Python, export as **camelCase** to the frontend via alias generators if needed.
- Validate at the boundary — once data enters a Pydantic model it is trusted.
- Use `Field(...)` with descriptions for every field to keep auto-generated docs useful.
- Separate **request models**, **response models**, and **domain (internal) models** — do not reuse one model for all three.
```python
from pydantic import BaseModel, Field
from datetime import datetime
class BanResponse(BaseModel):
ip: str = Field(..., description="Banned IP address")
jail: str = Field(..., description="Jail that issued the ban")
banned_at: datetime = Field(..., description="UTC timestamp of the ban")
expires_at: datetime | None = Field(None, description="UTC expiry, None if permanent")
ban_count: int = Field(..., ge=1, description="Number of times this IP was banned")
```
---
## 6. Async Rules
- **Never** call blocking / synchronous I/O in an async function — no `time.sleep()`, no synchronous file reads, no `requests.get()`.
- Use `aiohttp.ClientSession` for HTTP calls, `aiosqlite` for database access.
- Use `asyncio.TaskGroup` (Python 3.11+) when you need to run independent coroutines concurrently.
- Long-running startup/shutdown logic goes into the **FastAPI lifespan** context manager.
- Shared resources (DB connections, HTTP sessions) are created once during startup and closed during shutdown — never inside request handlers.
```python
from contextlib import asynccontextmanager
from collections.abc import AsyncGenerator
from fastapi import FastAPI
import aiohttp
import aiosqlite
@asynccontextmanager
async def lifespan(app: FastAPI) -> AsyncGenerator[None]:
# Startup
app.state.http_session = aiohttp.ClientSession()
app.state.db = await aiosqlite.connect("bangui.db")
yield
# Shutdown
await app.state.http_session.close()
await app.state.db.close()
```
---
## 7. Logging
- Use **structlog** for every log message.
- Bind contextual key-value pairs — never format strings manually.
- Log levels: `debug` for development detail, `info` for operational events, `warning` for recoverable issues, `error` for failures, `critical` for fatal problems.
- Never log sensitive data (passwords, tokens, session IDs).
```python
import structlog
log: structlog.stdlib.BoundLogger = structlog.get_logger()
async def ban_ip(ip: str, jail: str) -> None:
log.info("banning_ip", ip=ip, jail=jail)
try:
await _execute_ban(ip, jail)
log.info("ip_banned", ip=ip, jail=jail)
except BanError as exc:
log.error("ban_failed", ip=ip, jail=jail, error=str(exc))
raise
```
---
## 8. Error Handling
- Define **custom exception classes** for domain errors (e.g., `JailNotFoundError`, `BanFailedError`).
- Catch specific exceptions — never bare `except:` or `except Exception:` without re-raising.
- Map domain exceptions to HTTP status codes via FastAPI **exception handlers** registered on the app.
- Always log errors with context before raising.
```python
class JailNotFoundError(Exception):
def __init__(self, name: str) -> None:
self.name: str = name
super().__init__(f"Jail '{name}' not found")
# In main.py
@app.exception_handler(JailNotFoundError)
async def jail_not_found_handler(request: Request, exc: JailNotFoundError) -> JSONResponse:
return JSONResponse(status_code=404, content={"detail": f"Jail '{exc.name}' not found"})
```
---
## 9. Testing
- **Every** new feature or bug fix must include tests.
- Tests live in `tests/` mirroring the `app/` structure.
- Use `pytest` with `pytest-asyncio` for async tests.
- Use `httpx.AsyncClient` to test FastAPI endpoints (not `TestClient` which is sync).
- Mock external dependencies (fail2ban socket, aiohttp calls) — tests must never touch real infrastructure.
- Aim for **>80 % line coverage** — critical paths (auth, banning, scheduling) must be 100 %.
- Test names follow `test_<unit>_<scenario>_<expected>` pattern.
```python
import pytest
from httpx import AsyncClient, ASGITransport
from app.main import create_app
@pytest.fixture
async def client() -> AsyncClient:
app = create_app()
transport: ASGITransport = ASGITransport(app=app)
async with AsyncClient(transport=transport, base_url="http://test") as ac:
yield ac
@pytest.mark.asyncio
async def test_list_jails_returns_200(client: AsyncClient) -> None:
response = await client.get("/api/jails/")
assert response.status_code == 200
data: dict = response.json()
assert "jails" in data
```
---
## 10. Code Style & Tooling
| Tool | Purpose |
|---|---|
| **Ruff** | Linter and formatter (replaces black, isort, flake8). |
| **mypy** or **pyright** | Static type checking in strict mode. |
| **pre-commit** | Run ruff + type checker before every commit. |
- Line length: **120 characters** max.
- Strings: use **double quotes** (`"`).
- Imports: sorted by ruff — stdlib → third-party → local, one import per line.
- No unused imports, no unused variables, no `# type: ignore` without explanation.
- Docstrings in **Google style** on every public function, class, and module.
---
## 11. Configuration & Secrets
- All configuration lives in **environment variables** loaded through **pydantic-settings**.
- Secrets (master password hash, session key) are **never** committed to the repository.
- Provide a `.env.example` with all keys and placeholder values.
- Validate config at startup — fail fast with a clear error if a required value is missing.
```python
from pydantic_settings import BaseSettings
from pydantic import Field
class Settings(BaseSettings):
database_path: str = Field("bangui.db", description="Path to SQLite database")
fail2ban_socket: str = Field("/var/run/fail2ban/fail2ban.sock", description="fail2ban socket path")
session_secret: str = Field(..., description="Secret key for session signing")
log_level: str = Field("info", description="Logging level")
model_config = {"env_prefix": "BANGUI_", "env_file": ".env"}
```
---
## 12. Git & Workflow
- **Branch naming:** `feature/<short-description>`, `fix/<short-description>`, `chore/<short-description>`.
- **Commit messages:** imperative tense, max 72 chars first line (`Add jail reload endpoint`, `Fix ban history query`).
- Every merge request must pass: ruff, type checker, all tests.
- Do not merge with failing CI.
- Keep pull requests small and focused — one feature or fix per PR.
---
## 13. Coding Principles
These principles are **non-negotiable**. Every backend contributor must internalise and apply them daily.
### 13.1 Clean Code
- Write code that **reads like well-written prose** — a new developer should understand intent without asking.
- **Meaningful names** — variables, functions, and classes must reveal their purpose. Avoid abbreviations (`cnt`, `mgr`, `tmp`) unless universally understood.
- **Small functions** — each function does exactly one thing. If you need a comment to explain a block inside a function, extract it into its own function.
- **No magic numbers or strings** — use named constants.
- **Boy Scout Rule** — leave every file cleaner than you found it.
- **Avoid deep nesting** — prefer early returns (guard clauses) to keep the happy path at the top indentation level.
```python
# Good — guard clause, clear name, one job
async def get_active_ban(ip: str, jail: str) -> Ban:
ban: Ban | None = await repo.find_ban(ip=ip, jail=jail)
if ban is None:
raise BanNotFoundError(ip=ip, jail=jail)
if ban.is_expired():
raise BanExpiredError(ip=ip, jail=jail)
return ban
# Bad — nested, vague name
async def check(ip, j):
b = await repo.find_ban(ip=ip, jail=j)
if b:
if not b.is_expired():
return b
else:
raise Exception("expired")
else:
raise Exception("not found")
```
### 13.2 Separation of Concerns (SoC)
- Each module, class, and function must have a **single, well-defined responsibility**.
- **Routers** → HTTP layer only (parse requests, return responses).
- **Services** → business logic and orchestration.
- **Repositories** → data access and persistence.
- **Models** → data shapes and validation.
- **Tasks** → scheduled background jobs.
- Never mix layers — a router must not execute SQL, and a repository must not raise `HTTPException`.
### 13.3 Single Responsibility Principle (SRP)
- A class or module should have **one and only one reason to change**.
- If a service handles both ban management *and* email notifications, split it into `BanService` and `NotificationService`.
### 13.4 Don't Repeat Yourself (DRY)
- Extract shared logic into utility functions, base classes, or dependency providers.
- If the same block of code appears in more than one place, **refactor it** into a single source of truth.
- But don't over-abstract — premature DRY that couples unrelated features is worse than a little duplication (see **Rule of Three**: refactor when something appears a third time).
### 13.5 KISS — Keep It Simple, Stupid
- Choose the simplest solution that works correctly.
- Avoid clever tricks, premature optimisation, and over-engineering.
- If a standard library function does the job, prefer it over a custom implementation.
### 13.6 YAGNI — You Aren't Gonna Need It
- Do **not** build features, abstractions, or config options "just in case".
- Implement what is required **now**. Extend later when a real need emerges.
### 13.7 Dependency Inversion Principle (DIP)
- High-level modules (services) must not depend on low-level modules (repositories) directly. Both should depend on **abstractions** (protocols / interfaces).
- Use FastAPI's `Depends()` to inject implementations — this makes swapping and testing trivial.
```python
from typing import Protocol
class BanRepository(Protocol):
async def find_ban(self, ip: str, jail: str) -> Ban | None: ...
async def save_ban(self, ban: Ban) -> None: ...
class SqliteBanRepository:
"""Concrete implementation — depends on aiosqlite."""
async def find_ban(self, ip: str, jail: str) -> Ban | None: ...
async def save_ban(self, ban: Ban) -> None: ...
```
### 13.8 Composition over Inheritance
- Favour **composing** small, focused objects over deep inheritance hierarchies.
- Use mixins or protocols only when a clear "is-a" relationship exists; otherwise, pass collaborators as constructor arguments.
### 13.9 Fail Fast
- Validate inputs as early as possible — at the API boundary with Pydantic, at service entry with assertions or domain checks.
- Raise specific exceptions immediately rather than letting bad data propagate silently.
### 13.10 Law of Demeter (Principle of Least Knowledge)
- A function should only call methods on:
1. Its own object (`self`).
2. Objects passed as parameters.
3. Objects it creates.
- Avoid long accessor chains like `request.state.db.cursor().execute(...)` — wrap them in a meaningful method.
### 13.11 Defensive Programming
- Never trust external input — validate and sanitise everything that crosses a boundary (HTTP request, file, socket, environment variable).
- Handle edge cases explicitly: empty lists, `None` values, negative numbers, empty strings.
- Use type narrowing and exhaustive pattern matching (`match` / `case`) to eliminate impossible states.
---
## 14. Quick Reference — Do / Don't
| Do | Don't |
|---|---|
| Type every function, variable, return | Leave types implicit |
| Use `async def` for I/O | Use sync functions for I/O |
| Validate with Pydantic at the boundary | Pass raw dicts through the codebase |
| Log with structlog + context keys | Use `print()` or format strings in logs |
| Write tests for every feature | Ship untested code |
| Use `aiohttp` for HTTP calls | Use `requests` |
| Handle errors with custom exceptions | Use bare `except:` |
| Keep routers thin, logic in services | Put business logic in routers |
| Use `datetime.now(datetime.UTC)` | Use naive datetimes |
| Run ruff + mypy before committing | Push code that doesn't pass linting |