BanGUI/Docs/Tasks.md at 81f009e32304b7c14c6480bde1d4982389db784a

Files

Lukas 81f009e323 TASK-022: Hash session tokens in database for security

- Store session tokens as one-way SHA256 hashes instead of plaintext
- Hash tokens on write (create_session) and on read (get_session, delete_session)
- Add migration to drop plaintext sessions table and recreate with token_hash column
- Update Session model: token field still contains raw token for signing
- Add test to verify tokens are hashed in database, not plaintext
- Update Architekture.md to document session token hashing
- Update Backend-Development.md with implementation pattern and best practices

Prevents direct session token hijacking if database file is exposed to attacker.
If plaintext DB was readable, sessions are invalidated by the migration anyway.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2026-04-26 14:36:21 +02:00

23 KiB

Raw Blame History

TASK-022 — Session tokens stored in plaintext in SQLite

Severity: High

Where found

backend/app/db.py — sessions table schema: token TEXT NOT NULL UNIQUE. backend/app/repositories/session_repo.py — INSERT INTO sessions (token, ...) and SELECT ... WHERE token = ? both use the raw token value.

Why this is needed

If the BanGUI SQLite database file is exposed (volume mount misconfiguration, backup leak, path traversal via another vulnerability), all active session tokens are immediately usable — no cracking required. The attacker can directly use the token in the bangui_session cookie or Authorization: Bearer header.

Goal

Store a one-way hash of the session token in the database so that the DB file alone is not sufficient to hijack a session.

What to do

In session_repo.create_session(), store hashlib.sha256(token.encode()).hexdigest() instead of token in the token column.
In session_repo.get_session() and delete_session(), hash the supplied token before the SQL lookup.
The Session model's token field returned to the service layer still contains the raw token (for use in signing and response) — only the DB column changes.
Add a migration (_MIGRATIONS[2]) that renames the existing sessions table to sessions_old, creates a new one, and drops sessions_old (or simply truncates all sessions on upgrade, since they are all compromised anyway once the DB was readable in plaintext).

Possible traps and issues

Coordinate with TASK-025 (HMAC bypass) — both fixes invalidate all existing sessions. Do them in the same release.
The migration must be atomic (see TASK-023).
The Session.token field name is slightly misleading once it stores a hash — consider renaming the DB column to token_hash.

Docs changes needed

Architekture.md — update session data model description.
Backend-Development.md — document the session token hashing pattern.

Doc references

Architekture.md — authentication and session model
Backend-Development.md — security patterns

TASK-023 — Database migration is non-atomic

Severity: Medium

Where found

backend/app/db.py — _apply_migration(): calls db.executescript(migration_script) (which auto-commits per SQLite Python driver behavior) and then separately db.execute("INSERT INTO schema_migrations ...") + db.commit().

Why this is needed

executescript() issues an implicit COMMIT before executing the script, so the schema change and the migration record insertion are in two separate transactions. A process crash between them leaves the database in a migrated-but-unrecorded state. On next startup, the migration is re-applied. For a migration that is not idempotent (e.g., INSERT without OR IGNORE, ALTER TABLE ADD COLUMN without IF NOT EXISTS), this causes a runtime error or data duplication.

Goal

Wrap each migration's DDL and its schema_migrations record in a single atomic transaction.

What to do

Replace db.executescript(migration_script) with individual await db.execute(stmt) calls for each DDL statement in the migration (split on ;).
Wrap the entire migration (all DDL statements + the INSERT INTO schema_migrations) in an explicit BEGIN IMMEDIATE ... COMMIT transaction.
Test: verify that a simulated crash mid-migration (mocked execute that raises on the second statement) leaves the DB at its prior version.

Possible traps and issues

SQLite DDL in WAL mode: CREATE TABLE IF NOT EXISTS and CREATE INDEX IF NOT EXISTS are safe to re-run. ALTER TABLE ADD COLUMN is not — it must be guarded with a PRAGMA table_info check if used in future migrations.
Splitting a migration script on ; must handle semicolons inside string literals and comments. Consider storing each migration as a list[str] of individual statements instead of a single script string.

Docs changes needed

Backend-Development.md — migration authoring guidelines.

Doc references

Backend-Development.md — database schema and migrations

TASK-024 — No CSRF protection on state-mutating endpoints

Severity: High

Where found

All POST, PUT, DELETE routes in backend/app/routers/. Only SameSite=Lax on the session cookie provides any CSRF protection.

Why this is needed

SameSite=Lax blocks cross-site <form> POST requests but does not block fetch(..., {credentials: "include"}) initiated by JavaScript on a subdomain or a same-origin XSS injection. Without a CSRF token or Origin header check, a compromised subdomain can issue authenticated requests on behalf of the logged-in user.

Goal

Add explicit CSRF protection for all cookie-authenticated state-mutating endpoints.

What to do

Option A (recommended — custom header check):

Add a middleware that, for all POST/PUT/DELETE/PATCH requests authenticated via cookie (not Authorization: Bearer), requires the presence of a custom header: X-BanGUI-Request: 1.
The frontend API client (frontend/src/api/client.ts) already uses a shared request() function — add "X-BanGUI-Request": "1" to the default headers there.
Cross-site fetch() calls cannot set custom headers without CORS preflight, which the backend rejects (CORS is only configured for allowed origins).

Option B — Origin header validation: Add middleware that checks Origin or Referer matches the configured allowed origin for all mutating requests.

Possible traps and issues

The Bearer-token path (Authorization: Bearer) does not use cookies and is therefore not CSRF-vulnerable — do not apply the check to those requests.
Detecting cookie-vs-bearer authentication in middleware requires reading request headers before the auth dependency runs — check for Cookie: bangui_session= presence.
Do not apply CSRF checks to GET, HEAD, OPTIONS requests.

Docs changes needed

Architekture.md — document the CSRF protection mechanism.
Backend-Development.md — CSRF middleware.
Web-Development.md — document the X-BanGUI-Request header requirement.

Doc references

Architekture.md — security architecture
Backend-Development.md — security patterns
Web-Development.md — API client conventions

TASK-025 — `unwrap_session_token` legacy fallback bypasses HMAC check entirely

Severity: High

Where found

backend/app/services/auth_service.py — unwrap_session_token() lines 44–49:

if SESSION_TOKEN_SIGNATURE_SEPARATOR not in token:
    return token  # HMAC check skipped entirely

Why this is needed

Any token that does not contain the separator character is returned unchanged as a "valid" token — the HMAC signature is never verified. Combined with TASK-022 (plaintext DB), an attacker who reads the database can take a raw token (no separator) and use it directly, bypassing the HMAC layer entirely. The signing mechanism provides zero additional security once the DB is readable.

Goal

Remove the HMAC bypass. All tokens must carry a valid signature.

What to do

Remove the early-return branch: if SESSION_TOKEN_SIGNATURE_SEPARATOR not in token: return token.
If the separator is absent, raise ValueError("Invalid session token.").
This invalidates all sessions created before HMAC signing was introduced — coordinate with TASK-022 (all sessions should be invalidated during that migration anyway).
Update all tests that use raw unsigned tokens.

Possible traps and issues

Any test that constructs a raw token without a signature will start failing — this is intentional, update the tests.
The unwrap_session_token docstring mentions "backward compatibility with existing raw session tokens stored in the DB" — remove this rationale once TASK-022 hashes the DB column (raw tokens will no longer be in the DB).

Docs changes needed

Backend-Development.md — document the session token format (signed only).

Doc references

Backend-Development.md — authentication internals

TASK-026 — OpenAPI docs (`/docs`, `/redoc`) exposed without authentication in production

Severity: Medium

Where found

backend/app/main.py — FastAPI(title="BanGUI", ...) with no docs_url, redoc_url, or openapi_url override.

Why this is needed

FastAPI serves interactive API documentation at /docs (Swagger UI) and /redoc by default, accessible to any unauthenticated user. These pages expose every endpoint URL, all request and response schemas, and allow direct API invocation from the browser. An attacker can enumerate all available routes, understand input/output models, and attempt attacks without any prior knowledge of the API.

Goal

Disable API docs in production, or protect them behind authentication.

What to do

Add a BANGUI_ENABLE_DOCS: bool = Field(default=False, ...) setting to Settings.

In create_app():

docs_url = "/api/docs" if resolved_settings.enable_docs else None
redoc_url = "/api/redoc" if resolved_settings.enable_docs else None
openapi_url = "/api/openapi.json" if resolved_settings.enable_docs else None
app = FastAPI(..., docs_url=docs_url, redoc_url=redoc_url, openapi_url=openapi_url)

In compose.debug.yml, set BANGUI_ENABLE_DOCS: "true".
Production (compose.prod.yml) leaves BANGUI_ENABLE_DOCS unset (defaults to false).

Possible traps and issues

The SetupRedirectMiddleware must allow /api/docs and /api/openapi.json in _ALWAYS_ALLOWED (or behind auth — a protected docs endpoint requires a custom Starlette route with require_auth).
If you want protected docs (only accessible when logged in), use a custom route that returns the Swagger HTML after require_auth.

Docs changes needed

Backend-Development.md — document the BANGUI_ENABLE_DOCS flag.

Doc references

Backend-Development.md — configuration options

TASK-027 — Debug compose hardcodes a publicly known weak session secret

Severity: Medium

Where found

Docker/compose.debug.yml line ~63:

BANGUI_SESSION_SECRET: "${BANGUI_SESSION_SECRET:-dev-secret-do-not-use-in-production}"

Why this is needed

The fallback value dev-secret-do-not-use-in-production is now publicly visible in the repository. If compose.debug.yml is used in any environment where BANGUI_SESSION_SECRET is not set (e.g., a CI environment or a staging server that uses the debug compose file), all session tokens can be forged by anyone who has seen this repository.

Goal

Remove the insecure default. Require the secret to be set explicitly before the container starts.

What to do

Change to BANGUI_SESSION_SECRET: "${BANGUI_SESSION_SECRET:?BANGUI_SESSION_SECRET must be set — generate with: python -c 'import secrets; print(secrets.token_hex(32))'}".
Create a .env.example file at the project root with placeholder values and generation instructions.
Add .env to .gitignore (verify it is already there).

Possible traps and issues

This will break docker compose -f Docker/compose.debug.yml up without a .env file. Add a clear error message and setup instructions to the README or Instructions.md.
docker-compose.yml (the legacy file) already uses the :? pattern — follow the same approach.

Docs changes needed

Instructions.md — add first-run setup instructions for the .env file.

Doc references

Instructions.md — developer setup

TASK-028 — Fire-and-forget `asyncio.create_task()` silently discards exceptions

Severity: Low

Where found

backend/app/services/ban_service.py line ~614:

asyncio.create_task(  # noqa: RUF006
    geo_cache.lookup_batch(uncached, http_session, db=app_db),
    name="geo_bans_by_country",
)

Why this is needed

The task reference is immediately discarded. Any exception raised inside geo_cache.lookup_batch() — network errors, aiohttp timeouts, DB write failures — becomes an unhandled task exception. In Python 3.11+ this emits a RuntimeWarning to stderr but is otherwise silently swallowed. Errors in background geo resolution are invisible in structured logs.

Goal

Ensure exceptions in fire-and-forget tasks are always logged.

What to do

Wrap the task body in a logging wrapper:

async def _logged_task(coro: Coroutine[Any, Any, Any], name: str) -> None:
    try:
        await coro
    except Exception:
        log.exception("background_task_failed", task_name=name)

asyncio.create_task(_logged_task(geo_cache.lookup_batch(...), "geo_bans_by_country"))

Extract _logged_task into backend/app/utils/async_utils.py as a reusable helper so the same pattern is used for all fire-and-forget tasks.

Possible traps and issues

The done callback must not re-raise the exception — only log it.
log.exception() inside a callback/task captures the traceback automatically with structlog.

Docs changes needed

Backend-Development.md — fire-and-forget task conventions.

Doc references

Backend-Development.md — async patterns

TASK-029 — `Fail2BanConnectionError` leaks socket path in HTTP error responses

Severity: Medium

Where found

backend/app/exceptions.py — Fail2BanConnectionError.__init__() formats the message as f"{message} (socket: {socket_path})". backend/app/main.py — _fail2ban_connection_handler() returns {"detail": f"Cannot reach fail2ban: {exc}"} verbatim.

Why this is needed

Every 502 response caused by fail2ban being unreachable includes the full socket path (e.g., Cannot reach fail2ban: [Errno 2] No such file or directory (socket: /var/run/fail2ban/fail2ban.sock)) in the JSON error body. This discloses internal infrastructure details to unauthenticated users who can trigger the error. Similarly, _fail2ban_protocol_handler includes raw exception details that may expose internal parsing logic.

Goal

Return generic, user-friendly error messages in HTTP responses. Log full details server-side only.

What to do

In _fail2ban_connection_handler(), replace:

content={"detail": f"Cannot reach fail2ban: {exc}"}

with:

content={"detail": "Cannot reach the fail2ban service. Check the server status page."}

In _fail2ban_protocol_handler(), similarly return a generic message.
Both handlers already log error=str(exc) server-side — this is correct and should remain.

Possible traps and issues

Update any tests that assert the exact detail string in 502 responses.
If the frontend displays this error message directly to the user, ensure it still makes sense after genericizing.

Docs changes needed

Backend-Development.md — error message hygiene (no internal paths/details in responses).

Doc references

Backend-Development.md — error handling

TASK-030 — ip-api.com geo lookups use plain HTTP — IP addresses sent unencrypted

Severity: Medium

Where found

backend/app/services/geo_cache.py lines ~41–46:

_API_URL = "http://ip-api.com/json/{ip}?fields=..."
_BATCH_API_URL = "http://ip-api.com/batch?fields=..."

Why this is needed

All banned and monitored IP addresses are transmitted to ip-api.com in cleartext over HTTP. These are potentially sensitive data (PII under GDPR/CCPA — IP addresses identify users). Any network path between the BanGUI server and ip-api.com's servers can observe or modify the traffic. Forged responses would corrupt the geo database silently.

Goal

Use encrypted transport for all geo API calls, or switch to a local resolver.

What to do

ip-api.com's free tier does not support HTTPS. The recommended approach:

Promote the existing geoip_db_path setting (MaxMind GeoLite2-Country MMDB) to the primary resolver.
Use ip-api.com as a secondary fallback only when the MMDB is unavailable or returns no result.
Add documentation and compose file examples for downloading and mounting the GeoLite2 MMDB.
If ip-api.com HTTP is retained as a fallback, add a config flag BANGUI_GEOIP_ALLOW_HTTP_FALLBACK (default false) and warn clearly at startup when enabled.

Possible traps and issues

The MaxMind GeoLite2 database requires a free account and a license key to download — document the setup process.
The GeoLite2-Country MMDB does not include ASN or organisation data — these fields will be null when using the local resolver. The GeoInfo model must handle nullable asn and org.

Docs changes needed

Features.md — document the geo resolution mechanism and MMDB setup.
Architekture.md — update the external API dependency section.
Backend-Development.md — configuration for geoip_db_path.

Doc references

Features.md — geolocation
Architekture.md — external API dependencies

TASK-031 — bcrypt 72-byte truncation not enforced — long passwords silently equivalent to their prefix

Severity: Medium

Where found

backend/app/models/auth.py — LoginRequest.password: str = Field(...) (no max_length). backend/app/models/setup.py — SetupRequest.master_password has min_length=8 but no max_length.

Why this is needed

bcrypt silently truncates all input at 72 bytes before hashing. A user who sets a 100-character password can be authenticated by supplying only the first 72 characters. The extra characters provide no additional security. An attacker who has reduced the search space to 72 characters can brute-force the password more efficiently than the user intended.

Goal

Enforce a maximum password length of 72 bytes, or pre-hash before bcrypt to remove the limit entirely.

What to do

Option A (simple):

Add max_length=72 to SetupRequest.master_password and LoginRequest.password.
Update the setup wizard UI to reflect the 72-character maximum.

Option B (removes the 72-byte limit entirely):

Pre-hash the password with HMAC-SHA256 using the session_secret as the key before passing to bcrypt:

pre_hashed = hmac.new(secret.encode(), password.encode(), hashlib.sha256).digest()
bcrypt.hashpw(pre_hashed, bcrypt.gensalt())

Apply consistently in both run_setup() and _check_password().

Option A is recommended as the simpler, lower-risk fix. Option B is architecturally cleaner but requires a stored hash migration.

Possible traps and issues

Option A: Users who already have passwords longer than 72 characters will need to reset. For a single-admin app this is acceptable.
Option B: If the session_secret changes, all stored password hashes become invalid (since the pre-hash key changes). This is a hidden coupling — document it explicitly.

Docs changes needed

Features.md — document the password length constraint.
Backend-Development.md — bcrypt usage notes.

Doc references

Features.md — authentication and setup
Backend-Development.md — password hashing

TASK-032 — `geo_cache` table grows unboundedly — no eviction or purge

Severity: Medium

Where found

backend/app/repositories/geo_cache_repo.py — has upsert_entry, bulk_upsert_entries, upsert_neg_entry — but no DELETE functions. backend/app/db.py — geo_cache table has no last_seen or created_at column.

Why this is needed

Every unique IP address ever seen by fail2ban gets a row in geo_cache. The table is never trimmed. A BanGUI instance monitoring a busy server can accumulate millions of rows over months, increasing the DB file size and degrading query performance on every geo lookup.

Goal

Implement a retention policy that prunes geo cache entries not referenced recently.

What to do

Add a migration (_MIGRATIONS[2]) that adds a last_seen TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP column to geo_cache.
Update upsert_entry and bulk_upsert_entries to set last_seen = CURRENT_TIMESTAMP on every upsert.
Add delete_stale_entries(db: aiosqlite.Connection, cutoff_iso: str) -> int to geo_cache_repo.py.
Create backend/app/tasks/geo_cache_cleanup.py — a nightly task that calls delete_stale_entries with a 90-day cutoff.
Register the task in startup_shared_resources.

Possible traps and issues

Adding a column requires a migration. Coordinate with TASK-023 (migration atomicity) and TASK-022 (session hash migration) — all three migrations must be sequenced correctly as _MIGRATIONS[2], [3], etc.
IPs that have not been seen in 90 days will lose their geo data — on their next appearance they will be re-resolved from ip-api.com or the MMDB. This is acceptable.

Docs changes needed

Architekture.md — update the geo_cache table description and add the cleanup task.
Backend-Development.md — document the geo cache retention policy.

Doc references

Architekture.md — application database schema
Backend-Development.md — background tasks

Severity: Medium

Where found

backend/app/routers/auth.py — login() returns LoginResponse(token=signed_token, expires_at=expires_at) in the JSON body and sets the HttpOnly cookie. backend/app/models/auth.py — LoginResponse.token field.

Why this is needed

The LoginResponse JSON body contains the full signed session token. JavaScript running on the page (including third-party analytics scripts or a future XSS injection) can read the response body from a fetch() call and store the token in localStorage or a non-HttpOnly cookie. The Bearer-header authentication path (Authorization: Bearer <token>) then allows using that extracted token, completely bypassing the protections provided by the HttpOnly cookie.

Goal

Prevent the session token from being accessible to JavaScript when using cookie-based authentication.

What to do

For browser SPA consumers: Remove the token field from LoginResponse. The HttpOnly cookie is the only token the browser needs.
If an API-first (non-browser) token flow is required, create a separate endpoint POST /api/auth/token that returns a token in the body and does not set a cookie. Document this endpoint as "for programmatic API clients only, not for browser use".
Update the frontend — verify that AuthProvider does not use response.token (confirmed: it currently does not).

Possible traps and issues

Any existing API client that relies on the token in the LoginResponse body will break. Check tests.
The expires_at field in LoginResponse is useful for the frontend to know when to prompt for re-login — this can remain.
The Bearer-token path in require_auth (Authorization: Bearer) remains functional for programmatic clients using the dedicated token endpoint.

Docs changes needed

Features.md — document the authentication flow (cookie for browser, token endpoint for API clients).
Backend-Development.md — authentication endpoint design.
Web-Development.md — document that the frontend uses only the HttpOnly cookie.

Doc references

Features.md — authentication
Backend-Development.md — auth router design
Web-Development.md — AuthProvider

23 KiB Raw Blame History Unescape Escape

TASK-022 — Session tokens stored in plaintext in SQLite

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-023 — Database migration is non-atomic

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-024 — No CSRF protection on state-mutating endpoints

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-025 — unwrap_session_token legacy fallback bypasses HMAC check entirely

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-026 — OpenAPI docs (/docs, /redoc) exposed without authentication in production

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-027 — Debug compose hardcodes a publicly known weak session secret

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-028 — Fire-and-forget asyncio.create_task() silently discards exceptions

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-029 — Fail2BanConnectionError leaks socket path in HTTP error responses

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-030 — ip-api.com geo lookups use plain HTTP — IP addresses sent unencrypted

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

Doc references

TASK-031 — bcrypt 72-byte truncation not enforced — long passwords silently equivalent to their prefix

Where found

Why this is needed

Goal

What to do

Possible traps and issues

Docs changes needed

23 KiB

Raw Blame History

TASK-025 — `unwrap_session_token` legacy fallback bypasses HMAC check entirely

TASK-026 — OpenAPI docs (`/docs`, `/redoc`) exposed without authentication in production

TASK-028 — Fire-and-forget `asyncio.create_task()` silently discards exceptions

TASK-029 — `Fail2BanConnectionError` leaks socket path in HTTP error responses

TASK-032 — `geo_cache` table grows unboundedly — no eviction or purge