BanGUI/Docs/PERFORMANCE.md
Lukas cc6dbcf3f0 feat: implement API versioning /api/v1/
- All backend routers moved to /api/v1/ prefix
- Frontend BASE_URL updated to /api/v1
- Setup redirect middleware updated to redirect to /api/v1/setup
- Health router path fixed: prefix=/api/v1/health, @router.get('')
- conftest.py: set server_status=online for test fixture
- Created Docs/API_VERSIONING.md with deprecation policy
- Updated Docs/Backend-Development.md with versioning section
- Updated Instructions.md curl examples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-02 21:29:30 +02:00


Performance Guidelines

Query optimization patterns for BanGUI backend services.


Never Load Unbounded Result Sets

Loading large result sets into Python memory causes out-of-memory (OOM) crashes, slow responses, and memory use that grows with the table. Every query that touches a large dataset must use one of the strategies below.

The Problem

With millions of ban records:

  • Loading all rows as Python dicts → 200-400 MB+ memory spike
  • Python loop aggregation (O(n) interpreted iterations, one per row) → seconds of CPU time
  • Offset pagination on large tables → SQLite reads and discards every skipped row (O(offset)) before returning results

The Solution: SQL Aggregation

SQL GROUP BY runs inside SQLite's query engine, uses indexes where available, and returns only the aggregated result (typically a few KB).

# BAD: loads 1M rows into Python
all_rows = await get_all_archived_history(db, since=since)
agg = {}
for row in all_rows:  # O(n) Python loop
    agg[row["ip"]] = agg.get(row["ip"], 0) + 1

# GOOD: SQL aggregation, returns lightweight {ip, count} pairs
ip_counts = await get_ip_ban_counts(db, since=since)
# [{"ip": "1.2.3.4", "event_count": 42}, ...]: one entry per distinct IP, not one per ban row
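
The GOOD pattern above can be sketched end to end with the standard-library sqlite3 module. This is a minimal illustration, not the project's implementation: the table layout (id, ip, jail, timeofban) and the function name count_bans_per_ip are assumptions, and the real get_ip_ban_counts() may differ.

```python
import sqlite3


def count_bans_per_ip(conn: sqlite3.Connection, since: int) -> list[dict]:
    """Aggregate in SQL so only one row per distinct IP reaches Python."""
    cur = conn.execute(
        "SELECT ip, COUNT(*) AS event_count "
        "FROM history_archive "
        "WHERE timeofban >= ? "
        "GROUP BY ip "
        "ORDER BY event_count DESC, ip",
        (since,),
    )
    return [{"ip": ip, "event_count": n} for ip, n in cur]


# Hypothetical schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history_archive ("
    "id INTEGER PRIMARY KEY, ip TEXT, jail TEXT, timeofban INTEGER)"
)
conn.executemany(
    "INSERT INTO history_archive (ip, jail, timeofban) VALUES (?, ?, ?)",
    [
        ("1.2.3.4", "sshd", 100),
        ("1.2.3.4", "sshd", 200),
        ("5.6.7.8", "nginx", 150),
        ("9.9.9.9", "sshd", 10),  # older than `since`, filtered out in SQL
    ],
)

ip_counts = count_bans_per_ip(conn, since=50)
```

Python only ever holds the aggregated rows, so memory use scales with the number of distinct IPs rather than with the number of ban records.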

Aggregation Reference

| Use Case | SQL Pattern | Repository Function |
| --- | --- | --- |
| Ban count per IP | SELECT ip, COUNT(*) FROM history_archive ... GROUP BY ip | get_ip_ban_counts() |
| Ban count per jail | SELECT jail, COUNT(*) FROM history_archive ... GROUP BY jail ORDER BY COUNT(*) DESC | get_jail_ban_counts() |
| Ban count per time bucket | SELECT CAST((timeofban - ?) / ? AS INTEGER) AS bucket_idx, COUNT(*) ... GROUP BY bucket_idx | get_ban_counts_by_bucket() |
| Paginated rows (no offset) | WHERE id < ? ORDER BY id DESC LIMIT ? | get_archived_history_keyset() |
| Total count | SELECT COUNT(*) FROM ... (fast when the WHERE clause is covered by an index) | included in get_jail_ban_counts() return |
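
The time-bucket pattern is the least obvious entry in the reference, so here is a hedged sketch of how it might be used. The function name ban_counts_by_bucket, its parameters, and the schema are assumptions for illustration; the real get_ban_counts_by_bucket() may have a different signature.

```python
import sqlite3


def ban_counts_by_bucket(
    conn: sqlite3.Connection, start: int, bucket_seconds: int, num_buckets: int
) -> list[int]:
    """Return one count per time bucket, computed by SQLite."""
    end = start + bucket_seconds * num_buckets
    rows = conn.execute(
        "SELECT CAST((timeofban - ?) / ? AS INTEGER) AS bucket_idx, COUNT(*) "
        "FROM history_archive "
        "WHERE timeofban >= ? AND timeofban < ? "
        "GROUP BY bucket_idx",
        (start, bucket_seconds, start, end),
    )
    counts = [0] * num_buckets  # buckets with no bans stay at zero
    for bucket_idx, n in rows:
        counts[bucket_idx] = n
    return counts


# Hypothetical schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history_archive ("
    "id INTEGER PRIMARY KEY, ip TEXT, jail TEXT, timeofban INTEGER)"
)
conn.executemany(
    "INSERT INTO history_archive (ip, jail, timeofban) VALUES ('1.2.3.4', 'sshd', ?)",
    [(10,), (100,), (150,), (200,), (299,), (300,)],
)

timeline = ban_counts_by_bucket(conn, start=0, bucket_seconds=100, num_buckets=3)
```

The result set has at most num_buckets rows, so a timeline over millions of bans still transfers only a handful of integers.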

Pagination vs Aggregation

Use aggregation when:

  • Displaying summary data (counts, totals, group-by results)
  • Building country/jail/timeline dashboards
  • You only need counts, not individual row data

Use pagination when:

  • Displaying individual records (ban list, history)
  • Clients need access to specific rows
  • Exporting or bulk operations

Batch Geo Lookups

When you need geo data for many IPs, batch in a single call rather than per-IP:

# BAD: N sequential API calls
for ip in unique_ips:
    geo = await geo_service.lookup(ip)  # 45 req/min rate limit × N calls

# GOOD: serve cache hits immediately; resolve all misses in one batch call, which handles rate limiting
geo_map, uncached = geo_cache_lookup(unique_ips)  # uses in-memory cache
if uncached:
    asyncio.create_task(geo_cache.lookup_batch(uncached, http_session))  # fire-and-forget

Index Requirements

SQLite needs indexes on:

  • Columns used in WHERE clauses (timeofban, jail, action)
  • Columns used in GROUP BY (ip, jail, bucket index)
  • Sort columns for pagination (id)

Current indexes on history_archive:

  • idx_history_archive_timeofban — for time-range filtering
  • idx_history_archive_jail_timeofban — for jail + time filtering
  • idx_history_archive_action_timeofban — for action + time filtering
  • idx_history_archive_id — for keyset pagination

Before adding a new query pattern, verify it uses an existing index or add one with a benchmark test.

Memory Monitoring

Watch for these warning signs:

  • Python RSS > 500 MB in container metrics
  • Response time > 5s for dashboard endpoints
  • Query time > 1s, measured with .timer on in the sqlite3 shell or with application-side timing (SQLite has no EXPLAIN ANALYZE)
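
Application-side timing for the 1 s query threshold can be as small as a context manager. This is an illustrative sketch: timed_query and the slow_queries list are hypothetical names, and a real service would more likely emit a log line or metric.

```python
import time
from contextlib import contextmanager

SLOW_QUERY_SECONDS = 1.0  # threshold taken from the warning signs above
slow_queries: list[tuple[str, float]] = []  # hypothetical sink for slow-query records


@contextmanager
def timed_query(label: str, threshold: float = SLOW_QUERY_SECONDS):
    """Record the wrapped block if it runs longer than `threshold` seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed > threshold:
            slow_queries.append((label, elapsed))


with timed_query("fast query"):
    pass  # finishes well under one second, so nothing is recorded

with timed_query("slow query", threshold=0.0):  # zero threshold to force a record
    time.sleep(0.01)
```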

Use EXPLAIN QUERY PLAN to verify index usage:

EXPLAIN QUERY PLAN SELECT ip, COUNT(*) FROM history_archive WHERE timeofban >= ? GROUP BY ip;

Expected: a SEARCH step with USING INDEX idx_history_archive_timeofban in the output. A SCAN step on history_archive with no index named means a full table scan.
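
The same check can be automated, for example in the benchmark test that accompanies a new index. The helper below is a sketch under assumptions: uses_index is a hypothetical name, the schema is illustrative, and the exact wording of plan rows varies between SQLite versions, so it only looks for the index name.

```python
import sqlite3


def uses_index(conn: sqlite3.Connection, sql: str, params: tuple, index_name: str) -> bool:
    """True if any EXPLAIN QUERY PLAN step mentions the given index."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row is the human-readable detail text.
    return any(index_name in row[3] for row in plan)


# Hypothetical schema, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history_archive ("
    "id INTEGER PRIMARY KEY, ip TEXT, jail TEXT, timeofban INTEGER)"
)
query = "SELECT ip, COUNT(*) FROM history_archive WHERE timeofban >= ? GROUP BY ip"

before = uses_index(conn, query, (0,), "idx_history_archive_timeofban")
conn.execute(
    "CREATE INDEX idx_history_archive_timeofban ON history_archive (timeofban)"
)
after = uses_index(conn, query, (0,), "idx_history_archive_timeofban")
```

Asserting on the plan in a test catches regressions where a query change or dropped index silently turns a search into a full scan.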