Fix empty error field in geo_lookup_request_failed log events
- Replace str(exc) with repr(exc) in lookup() and _batch_api_call() so the exception class name is always present, even for no-message errors (e.g. aiohttp.ServerDisconnectedError(), whose str() is empty)
- Add exc_type=type(exc).__name__ field to network-error log events for easy structured-log filtering
- Move import aiohttp to a runtime import; use aiohttp.ClientTimeout() instead of a raw float, removing the # type: ignore[arg-type] workarounds
- Add TestErrorLogging with 3 tests covering empty-message exceptions
743
Docs/Tasks.md
@@ -4,723 +4,100 @@ This document breaks the entire BanGUI project into development stages, ordered
|
||||
|
||||
---
|
||||
|
||||
## Stage 1 — Dashboard Charts Foundation
|
||||
## Task 1 — Make Geo-Cache Persistent ✅ DONE
|
||||
|
||||
### Task 1.1 — Install and configure a charting library
|
||||
**Goal:** Minimise calls to the external geo-IP lookup service by caching results in the database.
|
||||
|
||||
**Status:** `done`
|
||||
**Details:**
|
||||
|
||||
The frontend currently has no charting library. Install **Recharts** (`recharts`) as the project charting library. Recharts is React-native, composable, and integrates cleanly with Fluent UI v9 theming.
|
||||
|
||||
**Steps:**
|
||||
|
||||
1. Run `npm install recharts` in the `frontend/` directory.
|
||||
2. Verify the dependency appears in `package.json` under `dependencies`.
|
||||
3. Confirm the build still succeeds with `npm run build` (no type errors, no warnings).
|
||||
|
||||
No wrapper or configuration file is needed — Recharts components are imported directly where used.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- `recharts` is listed in `frontend/package.json`.
|
||||
- `npm run build` succeeds with zero errors or warnings.
|
||||
- Currently geo-IP results may only live in memory and are lost on restart. Persist every successful geo-lookup result into the database so the external service is called as rarely as possible.
|
||||
- On each geo-lookup request, first query the database for a cached entry for that IP. Only call the external service if no cached entry exists (or the entry has expired, if a TTL policy is desired).
|
||||
- After a successful external lookup, write the result back to the database immediately.
|
||||
- Review the existing implementation in `app/services/geo_service.py` and the related repository/model code. Verify that:
|
||||
- The DB table/model for geo-cache entries exists and has the correct schema (IP, country, city, latitude, longitude, looked-up timestamp, etc.).
|
||||
- The repository layer exposes `get_by_ip` and `upsert` (or equivalent) methods.
|
||||
- The service checks the cache before calling the external API.
|
||||
- Bulk inserts are used where multiple IPs need to be resolved at once (see Task 3).
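
The cache-first flow described above can be sketched as follows. This is a minimal illustration, not the code in `app/services/geo_service.py`: the `geo_cache` table layout, the `cached_lookup` helper, and the `fake_fetch` callback are hypothetical names, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import time
from typing import Callable, Optional


def init_cache(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) geo-cache table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geo_cache ("
        "ip TEXT PRIMARY KEY, country TEXT NOT NULL, looked_up_at REAL NOT NULL)"
    )


def cached_lookup(
    conn: sqlite3.Connection,
    ip: str,
    fetch: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the country for ip, calling fetch only on a cache miss."""
    row = conn.execute(
        "SELECT country FROM geo_cache WHERE ip = ?", (ip,)
    ).fetchone()
    if row is not None:
        return row[0]  # cache hit: the external service is not contacted

    country = fetch(ip)
    if country is not None:
        # Write the result back immediately so the next request is a hit.
        conn.execute(
            "INSERT OR REPLACE INTO geo_cache (ip, country, looked_up_at) "
            "VALUES (?, ?, ?)",
            (ip, country, time.time()),
        )
        conn.commit()
    return country


conn = sqlite3.connect(":memory:")
init_cache(conn)

calls: list[str] = []


def fake_fetch(ip: str) -> Optional[str]:
    """Stand-in for the external geo-IP service; records every call."""
    calls.append(ip)
    return "DE"


first = cached_lookup(conn, "192.0.2.10", fake_fetch)
second = cached_lookup(conn, "192.0.2.10", fake_fetch)
print(first, second, len(calls))
```

Failed lookups are deliberately not cached in this sketch, so a transient outage does not leave a permanent gap; whether the real service should cache negative results is a separate design decision.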
|
||||
|
||||
---
|
||||
|
||||
### Task 1.2 — Create a shared chart theme utility
|
||||
## Task 2 — Fix `geo_lookup_request_failed` Warnings ✅ DONE
|
||||
|
||||
**Status:** `done`
|
||||
**Goal:** Investigate and fix the frequent `geo_lookup_request_failed` log warnings that occur with an empty `error` field.
|
||||
|
||||
Create a small utility at `frontend/src/utils/chartTheme.ts` that exports a function (or constant object) mapping Fluent UI v9 design tokens to Recharts-compatible colour values. The charts must respect the current Fluent theme (light and dark mode). At minimum export:
|
||||
**Resolution:** The root cause was `str(exc)` returning `""` for aiohttp exceptions with no message (e.g. `ServerDisconnectedError`). Fixed by:
|
||||
- Replacing `error=str(exc)` with `error=repr(exc)` in both `lookup()` and `_batch_api_call()` so the exception class name is always present in the log.
|
||||
- Adding `exc_type=type(exc).__name__` field to every network-error log event for easy filtering.
|
||||
- Moving `import aiohttp` from the `TYPE_CHECKING` block to a regular runtime import and replacing the raw-float `timeout` arguments with `aiohttp.ClientTimeout(total=...)`, removing the `# type: ignore[arg-type]` workarounds.
|
||||
- Three new tests in `TestErrorLogging` verify empty-message exceptions are correctly captured.
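
The difference between `str(exc)` and `repr(exc)` for a message-less exception can be reproduced without any network code. The sketch below uses a hypothetical `ServerDisconnected` class as a stand-in for the aiohttp exception, and a hypothetical `describe` helper that builds the two log fields named above.

```python
class ServerDisconnected(Exception):
    """Hypothetical stand-in for a network error raised without a message."""


def describe(exc: BaseException) -> dict[str, str]:
    """Build the structured-log fields for a failed lookup."""
    return {"error": repr(exc), "exc_type": type(exc).__name__}


exc = ServerDisconnected()
fields = describe(exc)

# str() of an exception with no arguments is the empty string,
# which is why the old log events carried an empty error field.
print(repr(str(exc)))
print(fields)
```

Because `repr()` always includes the class name, the `error` field stays informative even when the exception carries no arguments, and `exc_type` gives log queries a stable key to filter on.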
|
||||
|
||||
- A palette of 5+ distinct categorical colours for pie/bar slices, derived from Fluent token aliases (e.g. `colorPaletteBlueBorderActive`, `colorPaletteRedBorderActive`, etc.).
|
||||
- Axis/grid/tooltip colours derived from `colorNeutralForeground2`, `colorNeutralStroke2`, `colorNeutralBackground1`, etc.
|
||||
- A helper that returns the CSS value of a Fluent token at runtime (since Recharts needs literal CSS colour strings, not CSS custom properties).
|
||||
|
||||
Keep the file under 60 lines. No React components here — pure utility.
|
||||
|
||||
**References:** [Web-Design.md](Web-Design.md) § colour tokens.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- The exported palette contains at least 5 distinct colours.
|
||||
- Colours change correctly between light and dark mode.
|
||||
- `tsc --noEmit` and `eslint` pass with zero warnings.
|
||||
|
||||
---
|
||||
|
||||
## Stage 2 — Country Pie Chart (Top 4 + Other)
|
||||
|
||||
### Task 2.1 — Create the `TopCountriesPieChart` component
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Create `frontend/src/components/TopCountriesPieChart.tsx`. This component renders a **pie chart (Kuchendiagramm)** showing the **top 4 countries by ban count** plus an **"Other"** slice that aggregates every remaining country.
|
||||
|
||||
**Data source:** The component receives the `countries` map (`Record<string, number>`) and `country_names` map (`Record<string, string>`) from the existing `/api/dashboard/bans/by-country` endpoint response (`BansByCountryResponse`). No new API endpoint is needed.
|
||||
|
||||
**Aggregation logic (frontend):**
|
||||
|
||||
1. Sort the `countries` entries descending by ban count.
|
||||
2. Take the top 4 entries.
|
||||
3. Sum all remaining entries into a single `"Other"` bucket.
|
||||
4. The result is exactly 5 slices (or fewer if fewer than 5 countries exist).
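
The aggregation steps above are independent of the charting library. As a language-neutral sketch (written in Python here, whereas the component itself is TypeScript), with `top_n_with_other` being a hypothetical helper name:

```python
def top_n_with_other(countries: dict[str, int], n: int = 4) -> list[tuple[str, int]]:
    """Return the n largest entries plus an "Other" bucket for the rest."""
    ranked = sorted(countries.items(), key=lambda item: item[1], reverse=True)
    slices = ranked[:n]
    remainder = sum(count for _, count in ranked[n:])
    if remainder > 0:
        # Only add the bucket when there is something left to aggregate.
        slices.append(("Other", remainder))
    return slices


sample = {"DE": 50, "US": 40, "CN": 30, "FR": 20, "NL": 5, "BR": 3}
slices = top_n_with_other(sample)
print(slices)
```

Omitting the "Other" slice when nothing remains keeps the "or fewer" case from the last step free of zero-sized slices.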
|
||||
|
||||
**Visual requirements:**
|
||||
|
||||
- Use `<PieChart>` and `<Pie>` from Recharts with `<Cell>` for per-slice colours from the chart theme palette (Task 1.2).
|
||||
- Display a `<Tooltip>` on hover showing the country name and ban count.
|
||||
- Display a `<Legend>` listing each slice with its country name (full name from `country_names`, not just the code) and percentage.
|
||||
- Label each slice with the percentage (use Recharts `label` prop or `<Label>`).
|
||||
- Use `makeStyles` for any layout styling. Follow [Web-Design.md](Web-Design.md) spacing and card conventions.
|
||||
- Wrap the chart in a responsive container so it scales with its parent.
|
||||
|
||||
**Props interface:**
|
||||
|
||||
```ts
interface TopCountriesPieChartProps {
  countries: Record<string, number>;
  countryNames: Record<string, string>;
}
```
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Always renders exactly 5 slices (or fewer when data has < 5 countries).
|
||||
- The "Other" slice correctly sums all countries outside the top 4.
|
||||
- Tooltip displays country name + count on hover.
|
||||
- Legend shows country name + percentage.
|
||||
- Responsive — no horizontal overflow on narrow viewports.
|
||||
- `tsc --noEmit` passes. No `any` types. ESLint clean.
|
||||
|
||||
---
|
||||
|
||||
### Task 2.2 — Create a `useDashboardCountryData` hook
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Create `frontend/src/hooks/useDashboardCountryData.ts`. This hook wraps the existing `GET /api/dashboard/bans/by-country` call and returns the data the dashboard charts need. The existing `useMapData` hook is designed for the map page and should not be reused because it is coupled to map-specific debouncing and state.
|
||||
|
||||
**Signature:**
|
||||
|
||||
```ts
function useDashboardCountryData(
  timeRange: TimeRange,
  origin: BanOriginFilter,
): {
  countries: Record<string, number>;
  countryNames: Record<string, string>;
  bans: DashboardBanItem[];
  total: number;
  isLoading: boolean;
  error: string | null;
};
```
|
||||
|
||||
**Behaviour:**
|
||||
|
||||
- Call `GET /api/dashboard/bans/by-country?range={timeRange}` with optional `origin` query param (omit when `"all"`).
|
||||
- Use the typed API client from `api/client.ts`.
|
||||
- Set `isLoading` while fetching, populate `error` on failure.
|
||||
- Re-fetch when `timeRange` or `origin` changes.
|
||||
- Mirror the data-fetching patterns used by `useBans` / `useMapData`.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Returns typed data matching `BansByCountryResponse`.
|
||||
- Re-fetches on param change.
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
|
||||
---
|
||||
|
||||
### Task 2.3 — Integrate the pie chart into `DashboardPage`
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Add the `TopCountriesPieChart` below the `ServerStatusBar` and above the "Ban List" section on the `DashboardPage`. The chart must share the same `timeRange` and `originFilter` state that already exists on the page.
|
||||
|
||||
**Layout:**
|
||||
|
||||
- Place the pie chart inside a new section card (reuse the `section` / `sectionHeader` pattern from the existing ban-list section).
|
||||
- Section title: **"Top Countries"**.
|
||||
- The pie chart card sits in a future row of chart cards (see Task 3.3). For now, render it full-width. Use a CSS class name like `chartsRow` so the bar chart can be added beside it later.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- The pie chart renders on the dashboard, respecting the selected time range and origin filter.
|
||||
- Changing the time range or origin filter re-renders the chart with new data.
|
||||
- The loading and error states from the hook are handled (show `<Spinner>` while loading, `<MessageBar>` on error).
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
|
||||
---
|
||||
|
||||
## Stage 3 — Country Bar Chart (Top 20)
|
||||
|
||||
### Task 3.1 — Create the `TopCountriesBarChart` component
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Create `frontend/src/components/TopCountriesBarChart.tsx`. This component renders a **horizontal bar chart (Balkendiagramm)** showing the **top 20 countries by ban count**.
|
||||
|
||||
**Data source:** Same `countries` and `country_names` maps from `BansByCountryResponse` — passed as props identical to the pie chart.
|
||||
|
||||
**Aggregation logic (frontend):**
|
||||
|
||||
1. Sort the `countries` entries descending by ban count.
|
||||
2. Take the top 20 entries.
|
||||
3. No "Other" bucket — the bar chart is detail-focused.
|
||||
|
||||
**Visual requirements:**
|
||||
|
||||
- Use `<BarChart>` (horizontal via `layout="vertical"`) from Recharts with `<Bar>`, `<XAxis>`, `<YAxis>`, `<CartesianGrid>`, and `<Tooltip>`.
|
||||
- Y-axis shows country names (full name from `country_names`, truncated to ~20 chars with ellipsis if needed).
|
||||
- X-axis shows ban count (numeric).
|
||||
- Bars are coloured with the primary colour from the chart theme palette.
|
||||
- Tooltip shows the full country name and exact ban count.
|
||||
- Chart height should be dynamic based on the number of bars (e.g. `barCount * 36px` min), with a reasonable minimum height.
|
||||
- Wrap in a `<ResponsiveContainer>` for width.
|
||||
|
||||
**Props interface:**
|
||||
|
||||
```ts
interface TopCountriesBarChartProps {
  countries: Record<string, number>;
  countryNames: Record<string, string>;
}
```
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Renders up to 20 bars, sorted descending.
|
||||
- Country names readable on the Y-axis; tooltip provides full detail.
|
||||
- Responsive width, dynamic height.
|
||||
- `tsc --noEmit` passes. No `any`. ESLint clean.
|
||||
|
||||
---
|
||||
|
||||
### Task 3.2 — Integrate the bar chart into `DashboardPage`
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Add the `TopCountriesBarChart` to the dashboard alongside the pie chart.
|
||||
|
||||
**Layout:**
|
||||
|
||||
- The charts section now contains two cards side-by-side in a responsive grid row (the `chartsRow` class from Task 2.3):
|
||||
- Left: **Top Countries** pie chart (Task 2.1).
|
||||
- Right: **Top 20 Countries** bar chart (Task 3.1).
|
||||
- On narrow screens (< 768 px viewport width) the cards should stack vertically.
|
||||
- Both charts consume data from the **same** `useDashboardCountryData` hook call — do not fetch twice.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Both charts render side by side on wide screens, stacked on narrow screens.
|
||||
- A single API call feeds both charts.
|
||||
- Time range / origin filter controls affect both charts.
|
||||
- Loading / error states handled for both.
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
|
||||
---
|
||||
|
||||
## Stage 4 — Bans-Over-Time Trend Chart
|
||||
|
||||
### Task 4.1 — Add a backend endpoint for time-series ban aggregation
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Added `GET /api/dashboard/bans/trend`. New Pydantic models `BanTrendBucket` and
|
||||
`BanTrendResponse` (plus `BUCKET_SECONDS`, `BUCKET_SIZE_LABEL`, `bucket_count`
|
||||
helpers) in `ban.py`. Service function `ban_trend()` in `ban_service.py` groups
|
||||
`bans.timeofban` into equal-width buckets via SQL and fills empty buckets with
|
||||
zero so the frontend always receives a gap-free series. Route added to
|
||||
`dashboard.py`. 20 new tests (10 service, 10 router) — all pass, total suite
|
||||
480 passed, 83% coverage.
|
||||
|
||||
The existing endpoints return flat lists or country-aggregated counts but **no time-bucketed series**. A dashboard trend chart needs data grouped into time buckets.
|
||||
|
||||
Create a new endpoint: **`GET /api/dashboard/bans/trend`**.
|
||||
|
||||
**Query params:**
|
||||
|
||||
| Param | Type | Default | Description |
|---|---|---|---|
| `range` | `TimeRange` | `"24h"` | Time-range preset. |
| `origin` | `BanOrigin \| null` | `null` | Optional filter by ban origin. |
|
||||
|
||||
**Response model** (`BanTrendResponse`):
|
||||
|
||||
```python
class BanTrendBucket(BaseModel):
    timestamp: str  # ISO 8601 UTC start of the bucket
    count: int      # Number of bans in this bucket


class BanTrendResponse(BaseModel):
    buckets: list[BanTrendBucket]
    bucket_size: str  # Human-readable label: "1h", "6h", "1d", "7d"
```
|
||||
|
||||
**Bucket strategy:**
|
||||
|
||||
| Range | Bucket size | Bucket count |
|---|---|---|
| `24h` | 1 hour | 24 buckets |
| `7d` | 6 hours | 28 buckets |
| `30d` | 1 day | 30 buckets |
| `365d` | 7 days | ~52 buckets |
|
||||
|
||||
**Implementation:**
|
||||
|
||||
- Add the Pydantic models to `backend/app/models/ban.py`.
|
||||
- Add the service function in `backend/app/services/ban_service.py`. Query the fail2ban database (`bans` table), group rows by the computed bucket. Use SQL `CAST((banned_at - ?) / bucket_seconds AS INTEGER)` style bucketing.
|
||||
- Add the route in `backend/app/routers/dashboard.py`.
|
||||
- Follow the existing layering: router → service → repository.
|
||||
- Write tests for the new endpoint in `backend/tests/test_routers/` and `backend/tests/test_services/`.
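
The bucketing and zero-filling described above can be sketched against an in-memory SQLite database. This is an illustration under stated assumptions, not the project's `ban_trend()`: the `timeofban` column name follows the task summary, the simplified `bans` schema and the `RANGE_SECONDS` table are assumptions, and plain dicts stand in for the Pydantic models.

```python
import sqlite3
from datetime import datetime, timezone

# Bucket widths from the bucket-strategy table; RANGE_SECONDS is an assumption.
BUCKET_SECONDS = {"24h": 3600, "7d": 6 * 3600, "30d": 86400, "365d": 7 * 86400}
RANGE_SECONDS = {"24h": 86400, "7d": 7 * 86400, "30d": 30 * 86400, "365d": 365 * 86400}


def ban_trend(conn: sqlite3.Connection, time_range: str, now: int) -> list[dict]:
    """Group bans into equal-width buckets and fill empty buckets with zero."""
    bucket = BUCKET_SECONDS[time_range]
    span = RANGE_SECONDS[time_range]
    since = now - span
    n_buckets = -(-span // bucket)  # ceiling division

    rows = conn.execute(
        "SELECT CAST((timeofban - ?) / ? AS INTEGER) AS idx, COUNT(*) "
        "FROM bans WHERE timeofban >= ? AND timeofban < ? GROUP BY idx",
        (since, bucket, since, now),
    ).fetchall()
    counts = dict(rows)  # bucket index -> number of bans

    return [
        {
            "timestamp": datetime.fromtimestamp(
                since + i * bucket, tz=timezone.utc
            ).isoformat(),
            "count": counts.get(i, 0),  # gap-free: missing buckets become 0
        }
        for i in range(n_buckets)
    ]


NOW = 1_700_000_000
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bans (jail TEXT, ip TEXT, timeofban INTEGER)")
conn.executemany(
    "INSERT INTO bans VALUES (?, ?, ?)",
    [
        ("sshd", "192.0.2.1", NOW - 10),
        ("sshd", "192.0.2.2", NOW - 20),
        ("nginx", "192.0.2.3", NOW - 7205),
        ("sshd", "192.0.2.4", NOW - 90000),  # outside the 24h window
    ],
)
series = ban_trend(conn, "24h", NOW)
print(len(series), series[-1])
```

Filling the gaps on the server keeps the frontend free of date arithmetic: every response for a given range has the same number of points regardless of how sparse the data is.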
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- `GET /api/dashboard/bans/trend?range=24h` returns 24 hourly buckets.
|
||||
- Each bucket has a correct ISO 8601 timestamp and count.
|
||||
- Origin filter is applied correctly.
|
||||
- Empty buckets (zero bans) are included so the frontend has a continuous series.
|
||||
- Tests pass and cover happy path + empty data + origin filter.
|
||||
- `ruff check` and `mypy --strict` pass.
|
||||
|
||||
---
|
||||
|
||||
### Task 4.2 — Create the `BanTrendChart` component
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Created `frontend/src/components/BanTrendChart.tsx` — an area chart using Recharts
|
||||
`AreaChart` with a gradient fill, human-readable X-axis time labels (format varies by
|
||||
time range), and a custom tooltip. Added `BanTrendBucket`/`BanTrendResponse` types to
|
||||
`types/ban.ts`, `dashboardBansTrend` constant to `api/endpoints.ts`, `fetchBanTrend()`
|
||||
to `api/dashboard.ts`, and the `useBanTrend` hook at `hooks/useBanTrend.ts`. Component
|
||||
handles loading (Spinner), error (MessageBar), and empty states inline.
|
||||
`tsc --noEmit` and ESLint pass with zero warnings.
|
||||
|
||||
Create `frontend/src/components/BanTrendChart.tsx`. This component renders an **area/line chart** showing the number of bans over time.
|
||||
|
||||
**Data source:** A new `useBanTrend` hook that calls `GET /api/dashboard/bans/trend`.
|
||||
|
||||
**Visual requirements:**
|
||||
|
||||
- Use `<AreaChart>` (or `<LineChart>`) from Recharts with `<Area>`, `<XAxis>`, `<YAxis>`, `<CartesianGrid>`, `<Tooltip>`.
|
||||
- X-axis: time labels formatted human-readably (e.g. "Mon 14:00", "Mar 5").
|
||||
- Y-axis: ban count.
|
||||
- Area fill with a semi-transparent version of the primary chart colour.
|
||||
- Tooltip shows exact timestamp + count.
|
||||
- Responsive via `<ResponsiveContainer>`.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Displays a continuous time-series line with the correct number of data points for each range.
|
||||
- Readable axis labels for all four time ranges.
|
||||
- Responsive.
|
||||
- `tsc --noEmit`, ESLint clean.
|
||||
|
||||
---
|
||||
|
||||
### Task 4.3 — Integrate the trend chart into `DashboardPage`
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Added a "Ban Trend" full-width section card to `DashboardPage` between the
|
||||
`ServerStatusBar` and the "Top Countries" section. The section renders
|
||||
`<BanTrendChart timeRange={timeRange} origin={originFilter} />`, sharing the
|
||||
same state already used by the country charts and ban list. Loading, error,
|
||||
and empty states are handled inside `BanTrendChart` itself. `tsc --noEmit` and
|
||||
ESLint pass with zero warnings.
|
||||
|
||||
Add the `BanTrendChart` to the dashboard page **above** the two country charts and **below** the `ServerStatusBar`.
|
||||
|
||||
**Layout:**
|
||||
|
||||
- Full-width section card.
|
||||
- Section title: **"Ban Trend"**.
|
||||
- Shares the same `timeRange` and `originFilter` state.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- The trend chart renders on the dashboard showing bans over time.
|
||||
- Responds to time-range and origin-filter changes.
|
||||
- Loading/error states handled.
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
|
||||
---
|
||||
|
||||
## Stage 5 — Jail Distribution Chart
|
||||
|
||||
### Task 5.1 — Add a backend endpoint for ban counts per jail
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Added `GET /api/dashboard/bans/by-jail`. New Pydantic models `JailBanCount` and
|
||||
`BansByJailResponse` added to `ban.py`. Service function `bans_by_jail()` in
|
||||
`ban_service.py` queries the `bans` table with `GROUP BY jail ORDER BY COUNT(*) DESC`
|
||||
and applies the origin filter. Route added to `dashboard.py`. 7 new service tests
|
||||
(happy path, total equality, empty DB, time-window exclusion, origin filter variants)
|
||||
and 10 new router tests — all pass, total suite 497 passed, 83% coverage.
|
||||
`ruff check` and `mypy --strict` pass.
|
||||
|
||||
The existing `GET /api/jails` endpoint returns jail metadata with `status.currently_banned` — but this counts **currently active** bans, not historical bans in the selected time window. The dashboard needs historical ban counts per jail within the selected time range.
|
||||
|
||||
Create a new endpoint: **`GET /api/dashboard/bans/by-jail`**.
|
||||
|
||||
**Query params:**
|
||||
|
||||
| Param | Type | Default | Description |
|---|---|---|---|
| `range` | `TimeRange` | `"24h"` | Time-range preset. |
| `origin` | `BanOrigin \| null` | `null` | Optional origin filter. |
|
||||
|
||||
**Response model** (`BansByJailResponse`):
|
||||
|
||||
```python
class JailBanCount(BaseModel):
    jail: str
    count: int


class BansByJailResponse(BaseModel):
    jails: list[JailBanCount]
    total: int
```
|
||||
|
||||
**Implementation:**
|
||||
|
||||
- Query the `bans` table: `SELECT jail, COUNT(*) FROM bans WHERE timestart >= ? GROUP BY jail ORDER BY COUNT(*) DESC`.
|
||||
- Apply origin filter by checking whether `jail == 'blocklist-import'`.
|
||||
- Add models, service function, route, and tests following existing patterns.
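
A minimal sketch of the grouped query with the origin filter, run against an in-memory SQLite database. The `timeofban` column name, the simplified `bans` schema, and the `"blocklist"` / `"selfblock"` origin values are assumptions taken from other parts of this document; plain dicts stand in for the Pydantic models.

```python
import sqlite3
from typing import Optional

BLOCKLIST_JAIL = "blocklist-import"


def bans_by_jail(
    conn: sqlite3.Connection, since: int, origin: Optional[str] = None
) -> dict:
    """Count bans per jail since a timestamp, optionally filtered by origin."""
    where = ["timeofban >= ?"]
    params: list = [since]
    if origin == "blocklist":
        where.append("jail = ?")
        params.append(BLOCKLIST_JAIL)
    elif origin == "selfblock":
        where.append("jail != ?")
        params.append(BLOCKLIST_JAIL)

    # Only fixed fragments are interpolated; all values are bound parameters.
    sql = (
        "SELECT jail, COUNT(*) AS n FROM bans "
        f"WHERE {' AND '.join(where)} GROUP BY jail ORDER BY n DESC"
    )
    rows = conn.execute(sql, params).fetchall()
    return {
        "jails": [{"jail": jail, "count": count} for jail, count in rows],
        "total": sum(count for _, count in rows),
    }


NOW = 1_700_000_000
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bans (jail TEXT, ip TEXT, timeofban INTEGER)")
conn.executemany(
    "INSERT INTO bans VALUES (?, ?, ?)",
    [
        ("sshd", "192.0.2.1", NOW - 100),
        ("sshd", "192.0.2.2", NOW - 200),
        ("sshd", "192.0.2.3", NOW - 300),
        ("blocklist-import", "192.0.2.4", NOW - 400),
        ("blocklist-import", "192.0.2.5", NOW - 500),
        ("nginx", "192.0.2.6", NOW - 600),
        ("sshd", "192.0.2.7", NOW - 90000),  # outside the 24h window
    ],
)
all_bans = bans_by_jail(conn, NOW - 86400)
blocklist_only = bans_by_jail(conn, NOW - 86400, origin="blocklist")
selfblock_only = bans_by_jail(conn, NOW - 86400, origin="selfblock")
print(all_bans)
```

Computing `total` from the grouped rows rather than with a second query guarantees that it always equals the sum of the per-jail counts.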
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Returns jail names with ban counts descending, within the selected time window.
|
||||
- Origin filter works correctly.
|
||||
- Tests covering happy path, empty data, and filter.
|
||||
- `ruff check` and `mypy --strict` pass.
|
||||
|
||||
---
|
||||
|
||||
### Task 5.2 — Create the `JailDistributionChart` component
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Created `frontend/src/components/JailDistributionChart.tsx` — a horizontal
|
||||
bar chart using Recharts `BarChart` showing ban counts per jail sorted descending.
|
||||
Added `JailBanCount`/`BansByJailResponse` types to `types/ban.ts`,
|
||||
`dashboardBansByJail` constant to `api/endpoints.ts`, `fetchBansByJail()` to
|
||||
`api/dashboard.ts`, and the `useJailDistribution` hook at
|
||||
`hooks/useJailDistribution.ts`. Component handles loading (Spinner), error
|
||||
(MessageBar), and empty states inline. `tsc --noEmit` and ESLint pass with zero
|
||||
warnings.
|
||||
|
||||
Create `frontend/src/components/JailDistributionChart.tsx`. This component renders a **horizontal bar chart** showing the distribution of bans across jails.
|
||||
|
||||
**Why this is useful and not covered by existing views:** The current Jails page shows configuration details and live counters per jail, but **does not** provide a visual comparison of which jails are catching the most threats within a selectable time window. An admin reviewing the dashboard benefits from an at-a-glance answer to: *"Which services are being attacked most frequently right now?"* — this is fundamentally different from the country-based charts (which answer *"where"*) and from the ban trend (which answers *"when"*). The jail distribution answers **"what service is targeted"** and helps prioritise hardening efforts.
|
||||
|
||||
**Data source:** A new `useJailDistribution` hook calling `GET /api/dashboard/bans/by-jail`.
|
||||
|
||||
**Visual requirements:**
|
||||
|
||||
- Horizontal `<BarChart>` from Recharts.
|
||||
- Y-axis: jail names.
|
||||
- X-axis: ban count.
|
||||
- Colour-coded bars from the chart theme.
|
||||
- Tooltip with jail name and exact count.
|
||||
- Responsive.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Renders one bar per jail, sorted descending.
|
||||
- Responsive.
|
||||
- `tsc --noEmit`, ESLint clean.
|
||||
|
||||
---
|
||||
|
||||
### Task 5.3 — Integrate the jail distribution chart into `DashboardPage`
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Added a full-width "Jail Distribution" section card to `DashboardPage` below the
|
||||
"Top Countries" section (2-column country charts on row 1, jail chart full-width
|
||||
on row 2). The section renders `<JailDistributionChart timeRange={timeRange}
|
||||
origin={originFilter} />`, sharing the same state already used by the other
|
||||
charts. Loading, error, and empty states are handled inside
|
||||
`JailDistributionChart` itself. `tsc --noEmit` and ESLint pass with zero warnings.
|
||||
|
||||
Add the `JailDistributionChart` as a third chart card alongside the two country charts, or in a second chart row below them if space is constrained.
|
||||
|
||||
**Layout decision:**
|
||||
|
||||
- If three cards fit side-by-side at the standard breakpoint, place all three in one row.
|
||||
- Otherwise, use a 2-column + 1-column stacked layout (pie + bar on row 1, jail chart full-width on row 2). Choose whichever looks cleaner.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- The jail distribution chart renders on the dashboard.
|
||||
- Shares time-range and origin-filter controls with the other charts.
|
||||
- Loading/error states handled.
|
||||
- Responsive layout.
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
|
||||
---
|
||||
|
||||
## Stage 6 — Polish and Final Review
|
||||
|
||||
### Task 6.1 — Ensure consistent loading, error, and empty states across all charts
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Review all four chart components and ensure:
|
||||
|
||||
1. **Loading state**: Each shows a Fluent UI `<Spinner>` centred in its card while data is fetching.
|
||||
2. **Error state**: Each shows a Fluent UI `<MessageBar intent="error">` with a retry button.
|
||||
3. **Empty state**: When the data set has zero bans, each chart shows a friendly message (e.g. "No bans in this time range") instead of an empty or broken chart.
|
||||
|
||||
Extract a small shared wrapper if three or more charts duplicate the same loading/error/empty pattern (e.g. `ChartCard` or `ChartStateWrapper`).
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- All charts handle loading, error, and empty states consistently.
|
||||
- No broken or blank chart renders when data is unavailable.
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
|
||||
---
|
||||
|
||||
### Task 6.2 — Write frontend tests for chart components
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Add tests for each chart component to confirm:
|
||||
|
||||
- Correct number of rendered slices/bars given known test data.
|
||||
- "Other" aggregation logic in the pie chart.
|
||||
- Top-N truncation in the bar chart.
|
||||
- Hook re-fetch on prop change.
|
||||
- Loading and error states render the expected UI.
|
||||
|
||||
Follow the project's existing frontend test setup and conventions.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Each chart component has at least one happy-path and one edge-case test.
|
||||
- Tests pass.
|
||||
- ESLint clean.
|
||||
|
||||
---
|
||||
|
||||
### Task 6.3 — Full build and lint check
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Run the complete quality-assurance pipeline:
|
||||
|
||||
1. Backend: `ruff check`, `mypy --strict`, `pytest` with coverage.
|
||||
2. Frontend: `tsc --noEmit`, `eslint`, `npm run build`.
|
||||
3. Fix any warnings or errors introduced during stages 1–6.
|
||||
4. Verify overall test coverage remains ≥ 80 %.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- Zero lint warnings/errors on both backend and frontend.
|
||||
- All tests pass.
|
||||
- Build artifacts generated successfully.
|
||||
|
||||
---
|
||||
|
||||
## Stage 7 — Global Dashboard Filter Bar
|
||||
|
||||
The time-range and origin-filter controls currently live inside the "Ban List" section header, but they control **every** section on the dashboard (Ban Trend, Top Countries, Jail Distribution, **and** Ban List). This creates a misleading UX: the buttons appear scoped to the ban list when they are actually global. This stage extracts those controls into a dedicated, always-visible filter bar at the top of the dashboard, directly below the `ServerStatusBar`.
|
||||
|
||||
### Task 7.1 — Create the `DashboardFilterBar` component
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Create `frontend/src/components/DashboardFilterBar.tsx`. This is a self-contained toolbar component that renders the time-range presets and origin filter as two groups of toggle buttons.
|
||||
|
||||
**Props interface:**
|
||||
|
||||
```ts
interface DashboardFilterBarProps {
  timeRange: TimeRange;
  onTimeRangeChange: (value: TimeRange) => void;
  originFilter: BanOriginFilter;
  onOriginFilterChange: (value: BanOriginFilter) => void;
}
```
|
||||
|
||||
**Visual requirements:**
|
||||
|
||||
- Render inside a card-like container using the existing `section` style pattern (neutral background, border, border-radius, padding) — but **without** a section title. The toolbar **is** the content.
|
||||
- Layout: a single row with two `<Toolbar>` groups separated by a visual divider (use Fluent UI `<Divider vertical>` or a horizontal gap of `spacingHorizontalXL`).
|
||||
- **Left group** — "Time Range" label + four `<ToggleButton>` presets:
|
||||
- `Last 24 h` (value `"24h"`)
|
||||
- `Last 7 days` (value `"7d"`)
|
||||
- `Last 30 days` (value `"30d"`)
|
||||
- `Last 365 days` (value `"365d"`)
|
||||
- **Right group** — "Filter" label + three `<ToggleButton>` options:
|
||||
- `All` (value `"all"`)
|
||||
- `Blocklist` (value `"blocklist"`)
|
||||
- `Selfblock` (value `"selfblock"`)
|
||||
- Each group label is a `<Text weight="semibold" size={300}>` rendered inline before the buttons.
|
||||
- Use `size="small"` on all toggle buttons. The active button uses `checked={true}` and `aria-pressed={true}`.
|
||||
- On narrow viewports (< 640 px), the two groups should **wrap** onto separate lines (use `flexWrap: "wrap"` on the outer container).
|
||||
- Reuse `TIME_RANGE_LABELS` and `BAN_ORIGIN_FILTER_LABELS` from `types/ban.ts` — no hard-coded label strings.
|
||||
- Use `makeStyles` for all styling. Follow [Web-Design.md](Web-Design.md) spacing conventions: `spacingHorizontalM` between buttons within a group, `spacingHorizontalXL` between groups, `spacingVerticalS` for vertical padding.
|
||||
|
||||
**Behaviour:**
|
||||
|
||||
- Clicking a time-range button calls `onTimeRangeChange(value)`.
|
||||
- Clicking an origin-filter button calls `onOriginFilterChange(value)`.
|
||||
- Exactly one button per group is active at any time (mutually exclusive — not multi-select).
|
||||
- Component is fully controlled: it does not own state, it receives and reports values only.
|
||||
|
||||
**File structure rules:**
|
||||
|
||||
- One component per file. No barrel exports needed — import directly.
|
||||
- Keep under 100 lines.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- The component renders two labelled button groups in a single row.
|
||||
- Calls the correct callback with the correct value when a button is clicked.
|
||||
- Buttons reflect the current selection via `checked` / `aria-pressed`.
|
||||
- Wraps gracefully on narrow viewports.
|
||||
- `tsc --noEmit` passes. No `any`. ESLint clean.
|
||||
|
||||
---
|
||||
|
||||
### Task 7.2 — Integrate `DashboardFilterBar` into `DashboardPage`
|
||||
|
||||
**Status:** `done`
|
||||
|
||||
Move the global filter controls out of the "Ban List" section and replace them with the new `DashboardFilterBar`, placed at the top of the page.
|
||||
|
||||
**Changes to `DashboardPage.tsx`:**
|
||||
|
||||
1. **Add** `<DashboardFilterBar>` immediately **below** `<ServerStatusBar />` and **above** the "Ban Trend" section. Pass the existing `timeRange`, `setTimeRange`, `originFilter`, and `setOriginFilter` as props.
|
||||
2. **Remove** the two `<Toolbar>` blocks (time-range selector and origin filter) that are currently inside the "Ban List" section header. The section header should keep only the `<Text as="h2">Ban List</Text>` title — no filter buttons.
|
||||
3. **Remove** the `TIME_RANGES` and `ORIGIN_FILTERS` local constant arrays from `DashboardPage.tsx` since the `DashboardFilterBar` component now owns the iteration. (If `DashboardFilterBar` re-uses these arrays, it defines them locally or imports them from `types/ban.ts`.)
|
||||
4. **Keep** the `timeRange` and `originFilter` state (`useState`) in `DashboardPage` — the page still owns the state; it just no longer renders the buttons directly.
|
||||
5. **Verify** that all sections (Ban Trend, Top Countries, Jail Distribution, Ban List) still receive the filter values as props and re-render when they change — this should already work since the state location is unchanged.
|
||||
|
||||
**Layout after change:**
|
||||
**Observed behaviour (from container logs):**
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────┐
|
||||
│ ServerStatusBar │
|
||||
├──────────────────────────────────────┤
|
||||
│ DashboardFilterBar │ ← NEW location
|
||||
│ [24h] [7d] [30d] [365d] │ [All] [Blocklist] [Selfblock] │
|
||||
├──────────────────────────────────────┤
|
||||
│ Ban Trend (chart) │
|
||||
├──────────────────────────────────────┤
|
||||
│ Top Countries (pie + bar) │
|
||||
├──────────────────────────────────────┤
|
||||
│ Jail Distribution (bar) │
|
||||
├──────────────────────────────────────┤
|
||||
│ Ban List (table) │ ← filters REMOVED from here
|
||||
└──────────────────────────────────────┘
|
||||
{"ip": "197.221.98.153", "error": "", "event": "geo_lookup_request_failed", ...}
|
||||
{"ip": "197.231.178.38", "error": "", "event": "geo_lookup_request_failed", ...}
|
||||
{"ip": "197.234.201.154", "error": "", "event": "geo_lookup_request_failed", ...}
|
||||
{"ip": "197.234.206.108", "error": "", "event": "geo_lookup_request_failed", ...}
|
||||
```
|
||||
|
||||
**Acceptance criteria:**
|
||||
**Details:**
|
||||
|
||||
- The filter bar is visible at the top of the dashboard, below the status bar.
|
||||
- Changing a filter updates all four sections simultaneously.
|
||||
- The "Ban List" section header no longer contains filter buttons.
|
||||
- No functional regression — the dashboard behaves identically, filters are just relocated.
|
||||
- `tsc --noEmit` and ESLint pass.
|
||||
- Open `app/services/geo_service.py` and trace the code path that emits the `geo_lookup_request_failed` event.
|
||||
- The `error` field is empty, which suggests the request may silently fail (e.g. the external service returns a non-200 status, an empty body, or the response parsing swallows the real error).
|
||||
- Ensure the actual HTTP status code and response body (or exception message) are captured and logged in the `error` field so failures are diagnosable.
|
||||
- Check whether the external geo-IP service has rate-limiting or IP-range restrictions that could explain the failures.
|
||||
- Add proper error handling: distinguish between transient errors (timeout, 429, 5xx) and permanent ones (invalid IP, 404) so retries can be applied only when appropriate.
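
The logging and classification points above can be sketched as follows. This is a minimal illustration, not the project's implementation: the helper names `error_fields` and `is_transient` are hypothetical, and the set of exception types treated as transient is an assumption (a real version would probably also list the HTTP client library's own connection errors).

```python
from __future__ import annotations

import asyncio


def error_fields(exc: BaseException) -> dict[str, str]:
    """Build log fields that stay informative when str(exc) is empty."""
    return {
        # The class name is always present, even for message-less errors.
        "exc_type": type(exc).__name__,
        # repr() includes the class name and arguments; str() may be "".
        "error": repr(exc),
    }


def is_transient(status: int | None, exc: BaseException | None = None) -> bool:
    """Return True when a retry is plausible (hypothetical policy).

    Timeouts and connection-level errors are treated as transient, as are
    HTTP 429 and 5xx responses. Everything else (e.g. 404) is permanent.
    """
    if exc is not None:
        return isinstance(exc, (asyncio.TimeoutError, TimeoutError, ConnectionError))
    return status is not None and (status == 429 or 500 <= status <= 599)
```

A caller could then log `**error_fields(exc)` alongside the IP and retry only when `is_transient(...)` returns `True`.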
|
||||
|
||||
---
|
||||
|
||||
### Task 7.3 — Write tests for `DashboardFilterBar`
|
||||
## Task 3 — Non-Blocking Web Requests & Bulk DB Operations
|
||||
|
||||
**Status:** `done`
|
||||
**Goal:** Ensure the web UI remains responsive while geo-IP lookups and database writes are in progress.
|
||||
|
||||
Create `frontend/src/components/__tests__/DashboardFilterBar.test.tsx`.
|
||||
**Details:**
|
||||
|
||||
**Test cases:**
|
||||
|
||||
1. **Renders all time-range buttons** — confirm four buttons with correct labels appear.
|
||||
2. **Renders all origin-filter buttons** — confirm three buttons with correct labels appear.
|
||||
3. **Active state matches props** — given `timeRange="7d"` and `originFilter="blocklist"`, the corresponding buttons have `aria-pressed="true"` and the others `"false"`.
|
||||
4. **Time-range click fires callback** — click the "Last 30 days" button, assert `onTimeRangeChange` was called with `"30d"`.
|
||||
5. **Origin-filter click fires callback** — click the "Selfblock" button, assert `onOriginFilterChange` was called with `"selfblock"`.
|
||||
6. **Already-active button click still fires callback** — clicking the currently active button should still call the callback (no no-op guard).
|
||||
|
||||
**Test setup:**
|
||||
|
||||
- Wrap the component in `<FluentProvider theme={webLightTheme}>` (required for Fluent UI token resolution).
|
||||
- Use `vi.fn()` for the callback props.
|
||||
- Follow the existing test patterns in `frontend/src/components/__tests__/`.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- All 6 test cases pass.
|
||||
- Tests are fully typed — no `any`.
|
||||
- ESLint clean.
|
||||
- After the geo-IP service was integrated, web UI requests became slow or appeared to hang because geo lookups and individual DB writes block the async event loop.
|
||||
- **Bulk DB operations:** When multiple IPs need geo data at once (e.g. loading the ban list), collect all uncached IPs and resolve them in a single batch. Use bulk `INSERT … ON CONFLICT` (or equivalent) to write results to the DB in one round-trip instead of one query per IP.
|
||||
- **Non-blocking external calls:** Make sure all HTTP calls to the external geo-IP service use an async HTTP client (`httpx.AsyncClient` or similar) so the event loop is never blocked by network I/O.
|
||||
- **Non-blocking DB access:** Ensure all database operations use the async SQLAlchemy session (or are off-loaded to a thread) so they do not block request handling.
|
||||
- **Background processing:** Consider moving bulk geo-lookups into a background task (e.g. the existing task infrastructure in `app/tasks/`) so the API endpoint returns immediately and the UI is updated once results are ready.
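
The bulk-write point above can be sketched with a single `executemany` upsert. This sketch makes several assumptions: it uses the standard-library `sqlite3` module to stay self-contained (the same SQL should carry over to an async driver), the column names `ip` and `country_name` are hypothetical, and only the `geo_cache` table name and its `country_code` column come from the code in this commit. The `ON CONFLICT ... DO UPDATE` clause needs SQLite 3.24 or newer.

```python
from __future__ import annotations

import sqlite3
from collections.abc import Iterable

# One row per resolved IP: (ip, country_code, country_name).
GeoRow = tuple[str, "str | None", "str | None"]


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical geo_cache table keyed by IP address."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geo_cache ("
        "ip TEXT PRIMARY KEY, country_code TEXT, country_name TEXT)"
    )


def bulk_upsert(conn: sqlite3.Connection, rows: Iterable[GeoRow]) -> None:
    """Write all rows in one transaction instead of one commit per IP."""
    conn.executemany(
        "INSERT INTO geo_cache (ip, country_code, country_name) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(ip) DO UPDATE SET "
        "country_code = excluded.country_code, "
        "country_name = excluded.country_name",
        rows,
    )
    conn.commit()
```

Collecting the uncached IPs first and calling `bulk_upsert` once per batch keeps the number of commits proportional to the number of batches rather than the number of IPs.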
|
||||
|
||||
---
|
||||
|
||||
### Task 7.4 — Final lint, type-check, and build verification
|
||||
## Task 4 — Better Jail Configuration
|
||||
|
||||
**Status:** `done`
|
||||
**Goal:** Expose the full fail2ban configuration surface (jails, filters, actions) in the web UI.
|
||||
|
||||
Run the full quality-assurance pipeline after the filter-bar changes:
|
||||
Reference config directory: `/home/lukas/Volume/repo/BanGUI/Docker/fail2ban-dev-config/fail2ban/`
|
||||
|
||||
1. `tsc --noEmit` — zero errors.
|
||||
2. `npm run lint` — zero warnings, zero errors.
|
||||
3. `npm run build` — succeeds.
|
||||
4. `npm test` — all frontend tests pass (including the new `DashboardFilterBar` tests).
|
||||
5. Backend: `ruff check`, `mypy --strict`, `pytest` — still green (no backend changes expected, but verify no accidental modifications).
|
||||
### 4a — Activate / Deactivate Jail Configs
|
||||
|
||||
**Acceptance criteria:**
|
||||
- List all `.conf` and `.local` files in the jail config folder.
|
||||
- Allow the user to toggle inactive jails to active (and vice-versa) from the UI.
|
||||
|
||||
- Zero lint warnings/errors.
|
||||
- All tests pass on both frontend and backend.
|
||||
- Production build succeeds.
|
||||
### 4b — Editable Log Paths
|
||||
|
||||
---
|
||||
- Each jail has a `logpath` setting. Expose this in the UI as an editable text field so the user can point a jail at a different log file without SSH access.
|
||||
|
||||
## Stage 8 — Jails Router Test Coverage
|
||||
### 4c — Audit Missing Config Options
|
||||
|
||||
### Task 8.1 — Bring jails router to 100 % line coverage
|
||||
- Open every jail `.conf`/`.local` file in the dev-config directory and compare the available options with what the web UI currently exposes.
|
||||
- Write down all options that are **not yet implemented** in the UI (e.g. `findtime`, `bantime.increment`, `ignoreip`, `ignorecommand`, `maxretry`, `backend`, `usedns`, `destemail`, `sender`, `action`, etc.).
|
||||
- Document the findings in this task or a separate reference file so they can be implemented incrementally.
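
The audit described above could be partly automated. The following is a rough sketch under stated assumptions: it treats jail files as plain INI text readable by `configparser` (fail2ban's own reader adds include handling and interpolation that this ignores), and both `collect_options` and the `exposed` set are hypothetical names for illustration.

```python
from __future__ import annotations

import configparser


def collect_options(text: str) -> set[str]:
    """Return every option name that appears in an INI-style config text."""
    # interpolation=None keeps fail2ban-style %(name)s values as literal text.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    options = set(parser.defaults())
    for section in parser.sections():
        # options() also returns names inherited from [DEFAULT].
        options.update(parser.options(section))
    return options


def missing_options(text: str, exposed: set[str]) -> set[str]:
    """Options present in the config but not yet exposed in the UI."""
    return collect_options(text) - exposed
```

Running `missing_options` over each file and merging the results would give a starting list for the reference document; the output still needs a manual review.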
|
||||
|
||||
**Status:** `done`
|
||||
### 4d — Filter Configuration (`filter.d`)
|
||||
|
||||
`app/routers/jails.py` currently sits at **61 %** line coverage (54 of 138 lines uncovered). The missing lines are exclusively error-handling paths — the 502 `Fail2BanConnectionError` branch across every endpoint, several 404/409 branches in the jail-control and ignore-list endpoints, and the `toggle_ignore_self` endpoint which has no tests at all. These are critical banning-related paths that the Instructions require to be fully covered.
|
||||
- List all filter files in `filter.d/`.
|
||||
- Allow the user to activate, deactivate, view, and edit filter definitions from the UI.
|
||||
- Provide an option to create a brand-new filter file.
|
||||
|
||||
**Missing coverage (uncovered lines):**
|
||||
### 4e — Action Configuration (`action.d`)
|
||||
|
||||
| Lines | Endpoint | Missing path |
|
||||
|---|---|---|
|
||||
| 69 | `_bad_gateway` helper | One-time body — hit by first 502 test |
|
||||
| 120–121 | `GET /api/jails` | `Fail2BanConnectionError` → 502 |
|
||||
| 157–158 | `GET /api/jails/{name}` | `Fail2BanConnectionError` → 502 |
|
||||
| 195–198 | `POST /api/jails/reload-all` | `JailOperationError` → 409 and `Fail2BanConnectionError` → 502 |
|
||||
| 234–235 | `POST /api/jails/{name}/start` | `Fail2BanConnectionError` → 502 |
|
||||
| 270–273 | `POST /api/jails/{name}/stop` | `JailOperationError` → 409 and `Fail2BanConnectionError` → 502 |
|
||||
| 314–319 | `POST /api/jails/{name}/idle` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
|
||||
| 351–356 | `POST /api/jails/{name}/reload` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
|
||||
| 399–402 | `GET /api/jails/{name}/ignoreip` | `JailNotFoundError` → 404, `Fail2BanConnectionError` → 502 |
|
||||
| 449–454 | `POST /api/jails/{name}/ignoreip` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
|
||||
| 491–496 | `DELETE /api/jails/{name}/ignoreip` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
|
||||
| 529–542 | `POST /api/jails/{name}/ignoreself` | All paths (entirely untested) |
|
||||
- List all action files in `action.d/`.
|
||||
- Allow the user to activate, deactivate, view, and edit action definitions from the UI.
|
||||
- Provide an option to create a brand-new action file.
|
||||
|
||||
**Implementation:**
|
||||
|
||||
- Add new test classes / test methods to `backend/tests/test_routers/test_jails.py`.
|
||||
- Follow the naming pattern: `test_<unit>_<scenario>_<expected>`.
|
||||
- Each 502 test mocks the service function to raise `Fail2BanConnectionError`.
|
||||
- Each 404 test mocks the service to raise `JailNotFoundError`.
|
||||
- Each 409 test mocks the service to raise `JailOperationError`.
|
||||
- Wrap `toggle_ignore_self` tests in a `TestToggleIgnoreSelf` class covering: 200 (on), 200 (off), 404, 409, 502.
|
||||
- No changes to production code required — this is a pure test addition.
|
||||
|
||||
**Acceptance criteria:**
|
||||
|
||||
- `app/routers/jails.py` reaches **100 %** line coverage.
|
||||
- All new tests use `AsyncMock` and follow existing test patterns.
|
||||
- `ruff check` and `mypy --strict` pass (tests are type-clean).
|
||||
- Total test suite still passes (`497 + N` tests passing).
|
||||
### 4f — Create New Configuration Files
|
||||
|
||||
- Add a UI flow to create a new jail, filter, or action configuration file from scratch (or from a template).
|
||||
- Validate the new file before writing it to the config directory.
|
||||
|
||||
@@ -38,14 +38,15 @@ Usage::
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import time
|
||||
from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
import aiohttp
|
||||
import structlog
|
||||
|
||||
if TYPE_CHECKING:
|
||||
import aiohttp
|
||||
import aiosqlite
|
||||
import geoip2.database
|
||||
import geoip2.errors
|
||||
@@ -81,6 +82,14 @@ _REQUEST_TIMEOUT: float = 5.0
|
||||
#: eligible for a new API attempt. Default: 5 minutes.
|
||||
_NEG_CACHE_TTL: float = 300.0
|
||||
|
||||
#: Minimum delay in seconds between consecutive batch HTTP requests to
|
||||
#: ip-api.com. The free tier allows 45 requests/min; 1.5 s ≈ 40 req/min.
|
||||
_BATCH_DELAY: float = 1.5
|
||||
|
||||
#: Maximum number of retries for a batch chunk that fails with a
|
||||
#: transient error (e.g. connection reset due to rate limiting).
|
||||
_BATCH_MAX_RETRIES: int = 2
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Domain model
|
||||
# ---------------------------------------------------------------------------
|
||||
@@ -146,6 +155,49 @@ def clear_neg_cache() -> None:
|
||||
_neg_cache.clear()
|
||||
|
||||
|
||||
def is_cached(ip: str) -> bool:
|
||||
"""Return ``True`` if *ip* has a positive entry in the in-memory cache.
|
||||
|
||||
A positive entry is one with a non-``None`` ``country_code``. This is
|
||||
useful for skipping IPs that have already been resolved when building
|
||||
a list for :func:`lookup_batch`.
|
||||
|
||||
Args:
|
||||
ip: IPv4 or IPv6 address string.
|
||||
|
||||
Returns:
|
||||
``True`` when *ip* is in the cache with a known country code.
|
||||
"""
|
||||
return ip in _cache and _cache[ip].country_code is not None
|
||||
|
||||
|
||||
async def cache_stats(db: aiosqlite.Connection) -> dict[str, int]:
|
||||
"""Return diagnostic counters for the geo cache subsystem.
|
||||
|
||||
Queries the persistent store for the number of unresolved entries and
|
||||
combines it with in-memory counters.
|
||||
|
||||
Args:
|
||||
db: Open BanGUI application database connection.
|
||||
|
||||
Returns:
|
||||
Dict with keys ``cache_size``, ``unresolved``, ``neg_cache_size``,
|
||||
and ``dirty_size``.
|
||||
"""
|
||||
async with db.execute(
|
||||
"SELECT COUNT(*) FROM geo_cache WHERE country_code IS NULL"
|
||||
) as cur:
|
||||
row = await cur.fetchone()
|
||||
unresolved: int = int(row[0]) if row else 0
|
||||
|
||||
return {
|
||||
"cache_size": len(_cache),
|
||||
"unresolved": unresolved,
|
||||
"neg_cache_size": len(_neg_cache),
|
||||
"dirty_size": len(_dirty),
|
||||
}
|
||||
|
||||
|
||||
def init_geoip(mmdb_path: str | None) -> None:
|
||||
"""Initialise the MaxMind GeoLite2-Country database reader.
|
||||
|
||||
@@ -322,7 +374,7 @@ async def lookup(
|
||||
url: str = _API_URL.format(ip=ip)
|
||||
api_ok = False
|
||||
try:
|
||||
async with http_session.get(url, timeout=_REQUEST_TIMEOUT) as resp: # type: ignore[arg-type]
|
||||
async with http_session.get(url, timeout=aiohttp.ClientTimeout(total=_REQUEST_TIMEOUT)) as resp:
|
||||
if resp.status != 200:
|
||||
log.warning("geo_lookup_non_200", ip=ip, status=resp.status)
|
||||
else:
|
||||
@@ -345,7 +397,12 @@ async def lookup(
|
||||
message=data.get("message", "unknown"),
|
||||
)
|
||||
except Exception as exc: # noqa: BLE001
|
||||
log.warning("geo_lookup_request_failed", ip=ip, error=str(exc))
|
||||
log.warning(
|
||||
"geo_lookup_request_failed",
|
||||
ip=ip,
|
||||
exc_type=type(exc).__name__,
|
||||
error=repr(exc),
|
||||
)
|
||||
|
||||
if not api_ok:
|
||||
# Try local MaxMind database as fallback.
|
||||
@@ -421,9 +478,36 @@ async def lookup_batch(
|
||||
|
||||
log.info("geo_batch_lookup_start", total=len(uncached))
|
||||
|
||||
for chunk_start in range(0, len(uncached), _BATCH_SIZE):
|
||||
for batch_idx, chunk_start in enumerate(range(0, len(uncached), _BATCH_SIZE)):
|
||||
chunk = uncached[chunk_start : chunk_start + _BATCH_SIZE]
|
||||
chunk_result = await _batch_api_call(chunk, http_session)
|
||||
|
||||
# Throttle: pause between consecutive HTTP calls to stay within the
|
||||
# ip-api.com free-tier rate limit (45 req/min).
|
||||
if batch_idx > 0:
|
||||
await asyncio.sleep(_BATCH_DELAY)
|
||||
|
||||
# Retry transient failures (e.g. connection-reset from rate limit).
|
||||
chunk_result: dict[str, GeoInfo] | None = None
|
||||
for attempt in range(_BATCH_MAX_RETRIES + 1):
|
||||
chunk_result = await _batch_api_call(chunk, http_session)
|
||||
# If every IP in the chunk came back with country_code=None and the
|
||||
# batch wasn't tiny, that almost certainly means the whole request
|
||||
# was rejected (connection reset / 429). Retry after a back-off.
|
||||
all_failed = all(
|
||||
info.country_code is None for info in chunk_result.values()
|
||||
)
|
||||
if not all_failed or attempt >= _BATCH_MAX_RETRIES:
|
||||
break
|
||||
backoff = _BATCH_DELAY * (2 ** (attempt + 1))
|
||||
log.warning(
|
||||
"geo_batch_retry",
|
||||
attempt=attempt + 1,
|
||||
chunk_size=len(chunk),
|
||||
backoff=backoff,
|
||||
)
|
||||
await asyncio.sleep(backoff)
|
||||
|
||||
assert chunk_result is not None # noqa: S101
|
||||
|
||||
for ip, info in chunk_result.items():
|
||||
if info.country_code is not None:
|
||||
@@ -493,14 +577,19 @@ async def _batch_api_call(
|
||||
async with http_session.post(
|
||||
_BATCH_API_URL,
|
||||
json=payload,
|
||||
timeout=_REQUEST_TIMEOUT * 2, # type: ignore[arg-type]
|
||||
timeout=aiohttp.ClientTimeout(total=_REQUEST_TIMEOUT * 2),
|
||||
) as resp:
|
||||
if resp.status != 200:
|
||||
log.warning("geo_batch_non_200", status=resp.status, count=len(ips))
|
||||
return fallback
|
||||
data: list[dict[str, object]] = await resp.json(content_type=None)
|
||||
except Exception as exc: # noqa: BLE001
|
||||
log.warning("geo_batch_request_failed", count=len(ips), error=str(exc))
|
||||
log.warning(
|
||||
"geo_batch_request_failed",
|
||||
count=len(ips),
|
||||
exc_type=type(exc).__name__,
|
||||
error=repr(exc),
|
||||
)
|
||||
return fallback
|
||||
|
||||
out: dict[str, GeoInfo] = {}
|
||||
|
||||
@@ -572,3 +572,198 @@ class TestFlushDirty:
|
||||
assert not geo_service._dirty # type: ignore[attr-defined]
|
||||
db.commit.assert_awaited_once()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Rate-limit throttling and retry tests (Task 5)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestLookupBatchThrottling:
|
||||
"""Verify the inter-batch delay, retry, and give-up behaviour."""
|
||||
|
||||
async def test_lookup_batch_throttles_between_chunks(self) -> None:
|
||||
"""When more than _BATCH_SIZE IPs are sent, asyncio.sleep is called
|
||||
between consecutive batch HTTP calls with at least _BATCH_DELAY."""
|
||||
# Generate _BATCH_SIZE + 1 IPs so we get exactly 2 batch calls.
|
||||
batch_size: int = geo_service._BATCH_SIZE # type: ignore[attr-defined]
|
||||
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(batch_size + 1)]
|
||||
|
||||
def _make_result(chunk: list[str], _session: object) -> dict[str, GeoInfo]:
|
||||
return {
|
||||
ip: GeoInfo(country_code="DE", country_name="Germany", asn=None, org=None)
|
||||
for ip in chunk
|
||||
}
|
||||
|
||||
with (
|
||||
patch(
|
||||
"app.services.geo_service._batch_api_call",
|
||||
new_callable=AsyncMock,
|
||||
side_effect=_make_result,
|
||||
) as mock_batch,
|
||||
patch("app.services.geo_service.asyncio.sleep", new_callable=AsyncMock) as mock_sleep,
|
||||
):
|
||||
await geo_service.lookup_batch(ips, MagicMock())
|
||||
|
||||
# Two chunks → one sleep between them.
|
||||
assert mock_batch.call_count == 2
|
||||
mock_sleep.assert_awaited_once()
|
||||
delay_arg: float = mock_sleep.call_args[0][0]
|
||||
assert delay_arg >= geo_service._BATCH_DELAY # type: ignore[attr-defined]
|
||||
|
||||
async def test_lookup_batch_retries_on_full_chunk_failure(self) -> None:
|
||||
"""When a chunk returns all-None on first try, it retries and succeeds."""
|
||||
ips = ["1.2.3.4", "5.6.7.8"]
|
||||
|
||||
_empty = GeoInfo(country_code=None, country_name=None, asn=None, org=None)
|
||||
_success = {
|
||||
"1.2.3.4": GeoInfo(country_code="DE", country_name="Germany", asn=None, org=None),
|
||||
"5.6.7.8": GeoInfo(country_code="US", country_name="United States", asn=None, org=None),
|
||||
}
|
||||
_failure: dict[str, GeoInfo] = dict.fromkeys(ips, _empty)
|
||||
|
||||
call_count = 0
|
||||
|
||||
async def _side_effect(chunk: list[str], _session: object) -> dict[str, GeoInfo]:
|
||||
nonlocal call_count
|
||||
call_count += 1
|
||||
if call_count == 1:
|
||||
return _failure
|
||||
return _success
|
||||
|
||||
with (
|
||||
patch(
|
||||
"app.services.geo_service._batch_api_call",
|
||||
new_callable=AsyncMock,
|
||||
side_effect=_side_effect,
|
||||
),
|
||||
patch("app.services.geo_service.asyncio.sleep", new_callable=AsyncMock),
|
||||
):
|
||||
result = await geo_service.lookup_batch(ips, MagicMock())
|
||||
|
||||
assert call_count == 2
|
||||
assert result["1.2.3.4"].country_code == "DE"
|
||||
assert result["5.6.7.8"].country_code == "US"
|
||||
|
||||
async def test_lookup_batch_gives_up_after_max_retries(self) -> None:
|
||||
"""After _BATCH_MAX_RETRIES + 1 attempts, IPs end up in the neg cache."""
|
||||
ips = ["9.9.9.9"]
|
||||
_empty = GeoInfo(country_code=None, country_name=None, asn=None, org=None)
|
||||
_failure: dict[str, GeoInfo] = dict.fromkeys(ips, _empty)
|
||||
|
||||
max_retries: int = geo_service._BATCH_MAX_RETRIES # type: ignore[attr-defined]
|
||||
|
||||
with (
|
||||
patch(
|
||||
"app.services.geo_service._batch_api_call",
|
||||
new_callable=AsyncMock,
|
||||
return_value=_failure,
|
||||
) as mock_batch,
|
||||
patch("app.services.geo_service.asyncio.sleep", new_callable=AsyncMock) as mock_sleep,
|
||||
):
|
||||
result = await geo_service.lookup_batch(ips, MagicMock())
|
||||
|
||||
# Initial attempt + max_retries retries.
|
||||
assert mock_batch.call_count == max_retries + 1
|
||||
# IP should have no country.
|
||||
assert result["9.9.9.9"].country_code is None
|
||||
# Negative cache should contain the IP.
|
||||
assert "9.9.9.9" in geo_service._neg_cache # type: ignore[attr-defined]
|
||||
# Sleep called for each retry with exponential backoff.
|
||||
assert mock_sleep.call_count == max_retries
|
||||
backoff_values = [call.args[0] for call in mock_sleep.call_args_list]
|
||||
batch_delay: float = geo_service._BATCH_DELAY # type: ignore[attr-defined]
|
||||
for i, val in enumerate(backoff_values):
|
||||
expected = batch_delay * (2 ** (i + 1))
|
||||
assert val == pytest.approx(expected)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Error logging improvements (Task 2)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestErrorLogging:
|
||||
"""Verify that exception details are properly captured in log events.
|
||||
|
||||
Previously ``str(exc)`` was used which yields an empty string for
|
||||
aiohttp exceptions such as ``ServerDisconnectedError`` that carry no
|
||||
message. The fix uses ``repr(exc)`` so the exception class name is
|
||||
always present, and adds an ``exc_type`` field for easy log filtering.
|
||||
"""
|
||||
|
||||
async def test_empty_message_exception_logs_exc_type(self, caplog: pytest.LogCaptureFixture) -> None:
|
||||
"""When exception str() is empty, exc_type and repr are still logged."""
|
||||
|
||||
class _EmptyMessageError(Exception):
|
||||
"""Exception whose str() representation is empty."""
|
||||
|
||||
def __str__(self) -> str:
|
||||
return ""
|
||||
|
||||
session = MagicMock()
|
||||
mock_ctx = AsyncMock()
|
||||
mock_ctx.__aenter__ = AsyncMock(side_effect=_EmptyMessageError())
|
||||
mock_ctx.__aexit__ = AsyncMock(return_value=False)
|
||||
session.get = MagicMock(return_value=mock_ctx)
|
||||
|
||||
import structlog.testing
|
||||
|
||||
with structlog.testing.capture_logs() as captured:
|
||||
result = await geo_service.lookup("197.221.98.153", session) # type: ignore[arg-type]
|
||||
|
||||
assert result is not None
|
||||
assert result.country_code is None
|
||||
|
||||
request_failed = [e for e in captured if e.get("event") == "geo_lookup_request_failed"]
|
||||
assert len(request_failed) == 1
|
||||
event = request_failed[0]
|
||||
# exc_type must name the exception class — never empty.
|
||||
assert event["exc_type"] == "_EmptyMessageError"
|
||||
# repr() must include the class name even when str() is empty.
|
||||
assert "_EmptyMessageError" in event["error"]
|
||||
|
||||
async def test_connection_error_logs_exc_type(self, caplog: pytest.LogCaptureFixture) -> None:
|
||||
"""A standard OSError with message is logged both in error and exc_type."""
|
||||
session = MagicMock()
|
||||
mock_ctx = AsyncMock()
|
||||
mock_ctx.__aenter__ = AsyncMock(side_effect=OSError("connection refused"))
|
||||
mock_ctx.__aexit__ = AsyncMock(return_value=False)
|
||||
session.get = MagicMock(return_value=mock_ctx)
|
||||
|
||||
import structlog.testing
|
||||
|
||||
with structlog.testing.capture_logs() as captured:
|
||||
await geo_service.lookup("10.0.0.1", session) # type: ignore[arg-type]
|
||||
|
||||
request_failed = [e for e in captured if e.get("event") == "geo_lookup_request_failed"]
|
||||
assert len(request_failed) == 1
|
||||
event = request_failed[0]
|
||||
assert event["exc_type"] == "OSError"
|
||||
assert "connection refused" in event["error"]
|
||||
|
||||
async def test_batch_empty_message_exception_logs_exc_type(self) -> None:
|
||||
"""Batch API call: empty-message exceptions include exc_type in the log."""
|
||||
|
||||
class _EmptyMessageError(Exception):
|
||||
def __str__(self) -> str:
|
||||
return ""
|
||||
|
||||
session = MagicMock()
|
||||
mock_ctx = AsyncMock()
|
||||
mock_ctx.__aenter__ = AsyncMock(side_effect=_EmptyMessageError())
|
||||
mock_ctx.__aexit__ = AsyncMock(return_value=False)
|
||||
session.post = MagicMock(return_value=mock_ctx)
|
||||
|
||||
import structlog.testing
|
||||
|
||||
with structlog.testing.capture_logs() as captured:
|
||||
result = await geo_service._batch_api_call(["1.2.3.4"], session) # type: ignore[attr-defined]
|
||||
|
||||
assert result["1.2.3.4"].country_code is None
|
||||
|
||||
batch_failed = [e for e in captured if e.get("event") == "geo_batch_request_failed"]
|
||||
assert len(batch_failed) == 1
|
||||
event = batch_failed[0]
|
||||
assert event["exc_type"] == "_EmptyMessageError"
|
||||
assert "_EmptyMessageError" in event["error"]
|
||||
|
||||
|
||||