Fix empty error field in geo_lookup_request_failed log events

- Replace str(exc) with repr(exc) in lookup() and _batch_api_call() so exception class name is always present even for no-message errors (e.g. aiohttp.ServerDisconnectedError() whose str() is empty)
- Add exc_type=type(exc).__name__ field to network-error log events for easy structured-log filtering
- Move import aiohttp to runtime import; use aiohttp.ClientTimeout() instead of raw float, removing # type: ignore[arg-type] workarounds
- Add TestErrorLogging with 3 tests covering empty-message exceptions
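
A minimal sketch of why the `error` field came out empty and how `repr(exc)` plus `exc_type` avoids it. To keep it dependency-free it uses the built-in `TimeoutError` as a stand-in for a message-less network exception, and `describe_error` is a hypothetical helper, not a function from `geo_service.py`:

```python
def describe_error(exc: BaseException) -> dict[str, str]:
    """Build structured-log fields for a failed lookup (hypothetical helper)."""
    return {
        # repr() always includes the class name, even when there is no message.
        "error": repr(exc),
        # Separate field so log queries can filter on the exception type alone.
        "exc_type": type(exc).__name__,
    }


# An exception raised without a message stringifies to an empty string,
# which is what produced the empty "error" values in the log events.
empty = TimeoutError()
assert str(empty) == ""

print(describe_error(empty))
print(describe_error(ValueError("bad address")))
```

In the real service these fields would be passed as keyword arguments to the logger call that emits `geo_lookup_request_failed`; the exact call site is not shown here.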
2026-03-12 17:50:58 +01:00
parent 029c094e18
commit a61c9dc969
3 changed files with 351 additions and 690 deletions
--- a/Docs/Tasks.md
+++ b/Docs/Tasks.md
@@ -4,723 +4,100 @@ This document breaks the entire BanGUI project into development stages, ordered

 ---

-## Stage 1 — Dashboard Charts Foundation
+## Task 1 — Make Geo-Cache Persistent ✅ DONE

-### Task 1.1 — Install and configure a charting library
+**Goal:** Minimise calls to the external geo-IP lookup service by caching results in the database.

-**Status:** `done`
+**Details:**

-The frontend currently has no charting library. Install **Recharts** (`recharts`) as the project charting library. Recharts is React-native, composable, and integrates cleanly with Fluent UI v9 theming.
-
-**Steps:**
-
-1. Run `npm install recharts` in the `frontend/` directory.
-2. Verify the dependency appears in `package.json` under `dependencies`.
-3. Confirm the build still succeeds with `npm run build` (no type errors, no warnings).
-
-No wrapper or configuration file is needed — Recharts components are imported directly where used.
-
-**Acceptance criteria:**
-
- `recharts` is listed in `frontend/package.json`.
- `npm run build` succeeds with zero errors or warnings.
+- Currently geo-IP results may only live in memory and are lost on restart. Persist every successful geo-lookup result into the database so the external service is called as rarely as possible.
+- On each geo-lookup request, first query the database for a cached entry for that IP. Only call the external service if no cached entry exists (or the entry has expired, if a TTL policy is desired).
+- After a successful external lookup, write the result back to the database immediately.
+- Review the existing implementation in `app/services/geo_service.py` and the related repository/model code. Verify that:
+  - The DB table/model for geo-cache entries exists and has the correct schema (IP, country, city, latitude, longitude, looked-up timestamp, etc.).
+  - The repository layer exposes `get_by_ip` and `upsert` (or equivalent) methods.
+  - The service checks the cache before calling the external API.
+  - Bulk inserts are used where multiple IPs need to be resolved at once (see Task 3).

 ---

-### Task 1.2 — Create a shared chart theme utility
+## Task 2 — Fix `geo_lookup_request_failed` Warnings ✅ DONE

-**Status:** `done`
+**Goal:** Investigate and fix the frequent `geo_lookup_request_failed` log warnings that occur with an empty `error` field.

-Create a small utility at `frontend/src/utils/chartTheme.ts` that exports a function (or constant object) mapping Fluent UI v9 design tokens to Recharts-compatible colour values. The charts must respect the current Fluent theme (light and dark mode). At minimum export:
+**Resolution:** The root cause was `str(exc)` returning `""` for aiohttp exceptions with no message (e.g. `ServerDisconnectedError`). Fixed by:
+- Replacing `error=str(exc)` with `error=repr(exc)` in both `lookup()` and `_batch_api_call()` so the exception class name is always present in the log.
+- Adding `exc_type=type(exc).__name__` field to every network-error log event for easy filtering.
+- Moving `import aiohttp` from the `TYPE_CHECKING` block to a regular runtime import and replacing the raw-float `timeout` arguments with `aiohttp.ClientTimeout(total=...)`, removing the `# type: ignore[arg-type]` workarounds.
+- Three new tests in `TestErrorLogging` verify empty-message exceptions are correctly captured.

- A palette of 5+ distinct categorical colours for pie/bar slices, derived from Fluent token aliases (e.g. `colorPaletteBlueBorderActive`, `colorPaletteRedBorderActive`, etc.).
- Axis/grid/tooltip colours derived from `colorNeutralForeground2`, `colorNeutralStroke2`, `colorNeutralBackground1`, etc.
- A helper that returns the CSS value of a Fluent token at runtime (since Recharts needs literal CSS colour strings, not CSS custom properties).
-
-Keep the file under 60 lines. No React components here — pure utility.
-
-**References:** [Web-Design.md](Web-Design.md) § colour tokens.
-
-**Acceptance criteria:**
-
- The exported palette contains at least 5 distinct colours.
- Colours change correctly between light and dark mode.
- `tsc --noEmit` and `eslint` pass with zero warnings.
-
---
-
-## Stage 2 — Country Pie Chart (Top 4 + Other)
-
-### Task 2.1 — Create the `TopCountriesPieChart` component
-
-**Status:** `done`
-
-Create `frontend/src/components/TopCountriesPieChart.tsx`. This component renders a **pie chart (Kuchendiagramm)** showing the **top 4 countries by ban count** plus an **"Other"** slice that aggregates every remaining country.
-
-**Data source:** The component receives the `countries` map (`Record<string, number>`) and `country_names` map (`Record<string, string>`) from the existing `/api/dashboard/bans/by-country` endpoint response (`BansByCountryResponse`). No new API endpoint is needed.
-
-**Aggregation logic (frontend):**
-
-1. Sort the `countries` entries descending by ban count.
-2. Take the top 4 entries.
-3. Sum all remaining entries into a single `"Other"` bucket.
-4. The result is exactly 5 slices (or fewer if fewer than 5 countries exist).
-
-**Visual requirements:**
-
- Use `<PieChart>` and `<Pie>` from Recharts with `<Cell>` for per-slice colours from the chart theme palette (Task 1.2).
- Display a `<Tooltip>` on hover showing the country name and ban count.
- Display a `<Legend>` listing each slice with its country name (full name from `country_names`, not just the code) and percentage.
- Label each slice with the percentage (use Recharts `label` prop or `<Label>`).
- Use `makeStyles` for any layout styling. Follow [Web-Design.md](Web-Design.md) spacing and card conventions.
- Wrap the chart in a responsive container so it scales with its parent.
-
-**Props interface:**
-
-```ts
-interface TopCountriesPieChartProps {
-  countries: Record<string, number>;
-  countryNames: Record<string, string>;
-}
-```
-
-**Acceptance criteria:**
-
- Always renders exactly 5 slices (or fewer when data has < 5 countries).
- The "Other" slice correctly sums all countries outside the top 4.
- Tooltip displays country name + count on hover.
- Legend shows country name + percentage.
- Responsive — no horizontal overflow on narrow viewports.
- `tsc --noEmit` passes. No `any` types. ESLint clean.
-
---
-
-### Task 2.2 — Create a `useDashboardCountryData` hook
-
-**Status:** `done`
-
-Create `frontend/src/hooks/useDashboardCountryData.ts`. This hook wraps the existing `GET /api/dashboard/bans/by-country` call and returns the data the dashboard charts need. The existing `useMapData` hook is designed for the map page and should not be reused because it is coupled to map-specific debouncing and state.
-
-**Signature:**
-
-```ts
-function useDashboardCountryData(
-  timeRange: TimeRange,
-  origin: BanOriginFilter,
-): {
-  countries: Record<string, number>;
-  countryNames: Record<string, string>;
-  bans: DashboardBanItem[];
-  total: number;
-  isLoading: boolean;
-  error: string | null;
-};
-```
-
-**Behaviour:**
-
- Call `GET /api/dashboard/bans/by-country?range={timeRange}` with optional `origin` query param (omit when `"all"`).
- Use the typed API client from `api/client.ts`.
- Set `isLoading` while fetching, populate `error` on failure.
- Re-fetch when `timeRange` or `origin` changes.
- Mirror the data-fetching patterns used by `useBans` / `useMapData`.
-
-**Acceptance criteria:**
-
- Returns typed data matching `BansByCountryResponse`.
- Re-fetches on param change.
- `tsc --noEmit` and ESLint pass.
-
---
-
-### Task 2.3 — Integrate the pie chart into `DashboardPage`
-
-**Status:** `done`
-
-Add the `TopCountriesPieChart` below the `ServerStatusBar` and above the "Ban List" section on the `DashboardPage`. The chart must share the same `timeRange` and `originFilter` state that already exists on the page.
-
-**Layout:**
-
- Place the pie chart inside a new section card (reuse the `section` / `sectionHeader` pattern from the existing ban-list section).
- Section title: **"Top Countries"**.
- The pie chart card sits in a future row of chart cards (see Task 3.3). For now, render it full-width. Use a CSS class name like `chartsRow` so the bar chart can be added beside it later.
-
-**Acceptance criteria:**
-
- The pie chart renders on the dashboard, respecting the selected time range and origin filter.
- Changing the time range or origin filter re-renders the chart with new data.
- The loading and error states from the hook are handled (show `<Spinner>` while loading, `<MessageBar>` on error).
- `tsc --noEmit` and ESLint pass.
-
---
-
-## Stage 3 — Country Bar Chart (Top 20)
-
-### Task 3.1 — Create the `TopCountriesBarChart` component
-
-**Status:** `done`
-
-Create `frontend/src/components/TopCountriesBarChart.tsx`. This component renders a **horizontal bar chart (Balkendiagramm)** showing the **top 20 countries by ban count**.
-
-**Data source:** Same `countries` and `country_names` maps from `BansByCountryResponse` — passed as props identical to the pie chart.
-
-**Aggregation logic (frontend):**
-
-1. Sort the `countries` entries descending by ban count.
-2. Take the top 20 entries.
-3. No "Other" bucket — the bar chart is detail-focused.
-
-**Visual requirements:**
-
- Use `<BarChart>` (horizontal via `layout="vertical"`) from Recharts with `<Bar>`, `<XAxis>`, `<YAxis>`, `<CartesianGrid>`, and `<Tooltip>`.
- Y-axis shows country names (full name from `country_names`, truncated to ~20 chars with ellipsis if needed).
- X-axis shows ban count (numeric).
- Bars are coloured with the primary colour from the chart theme palette.
- Tooltip shows the full country name and exact ban count.
- Chart height should be dynamic based on the number of bars (e.g. `barCount * 36px` min), with a reasonable minimum height.
- Wrap in a `<ResponsiveContainer>` for width.
-
-**Props interface:**
-
-```ts
-interface TopCountriesBarChartProps {
-  countries: Record<string, number>;
-  countryNames: Record<string, string>;
-}
-```
-
-**Acceptance criteria:**
-
- Renders up to 20 bars, sorted descending.
- Country names readable on the Y-axis; tooltip provides full detail.
- Responsive width, dynamic height.
- `tsc --noEmit` passes. No `any`. ESLint clean.
-
---
-
-### Task 3.2 — Integrate the bar chart into `DashboardPage`
-
-**Status:** `done`
-
-Add the `TopCountriesBarChart` to the dashboard alongside the pie chart.
-
-**Layout:**
-
- The charts section now contains two cards side-by-side in a responsive grid row (the `chartsRow` class from Task 2.3):
-  - Left: **Top Countries** pie chart (Task 2.1).
-  - Right: **Top 20 Countries** bar chart (Task 3.1).
- On narrow screens (< 768 px viewport width) the cards should stack vertically.
- Both charts consume data from the **same** `useDashboardCountryData` hook call — do not fetch twice.
-
-**Acceptance criteria:**
-
- Both charts render side by side on wide screens, stacked on narrow screens.
- A single API call feeds both charts.
- Time range / origin filter controls affect both charts.
- Loading / error states handled for both.
- `tsc --noEmit` and ESLint pass.
-
---
-
-## Stage 4 — Bans-Over-Time Trend Chart
-
-### Task 4.1 — Add a backend endpoint for time-series ban aggregation
-
-**Status:** `done`
-
-Added `GET /api/dashboard/bans/trend`. New Pydantic models `BanTrendBucket` and
-`BanTrendResponse` (plus `BUCKET_SECONDS`, `BUCKET_SIZE_LABEL`, `bucket_count`
-helpers) in `ban.py`. Service function `ban_trend()` in `ban_service.py` groups
-`bans.timeofban` into equal-width buckets via SQL and fills empty buckets with
-zero so the frontend always receives a gap-free series. Route added to
-`dashboard.py`. 20 new tests (10 service, 10 router) — all pass, total suite
-480 passed, 83% coverage.
-
-The existing endpoints return flat lists or country-aggregated counts but **no time-bucketed series**. A dashboard trend chart needs data grouped into time buckets.
-
-Create a new endpoint: **`GET /api/dashboard/bans/trend`**.
-
-**Query params:**
-
-| Param | Type | Default | Description |
-|---|---|---|---|
-| `range` | `TimeRange` | `"24h"` | Time-range preset. |
-| `origin` | `BanOrigin \| null` | `null` | Optional filter by ban origin. |
-
-**Response model** (`BanTrendResponse`):
-
-```python
-class BanTrendBucket(BaseModel):
-    timestamp: str  # ISO 8601 UTC start of the bucket
-    count: int      # Number of bans in this bucket
-
-class BanTrendResponse(BaseModel):
-    buckets: list[BanTrendBucket]
-    bucket_size: str  # Human-readable label: "1h", "6h", "1d", "7d"
-```
-
-**Bucket strategy:**
-
-| Range | Bucket size | Example buckets |
-|---|---|---|
-| `24h` | 1 hour | 24 buckets |
-| `7d` | 6 hours | 28 buckets |
-| `30d` | 1 day | 30 buckets |
-| `365d` | 7 days | ~52 buckets |
-
-**Implementation:**
-
- Add the Pydantic models to `backend/app/models/ban.py`.
- Add the service function in `backend/app/services/ban_service.py`. Query the fail2ban database (`bans` table), group rows by the computed bucket. Use SQL `CAST((banned_at - ?) / bucket_seconds AS INTEGER)` style bucketing.
- Add the route in `backend/app/routers/dashboard.py`.
- Follow the existing layering: router → service → repository.
- Write tests for the new endpoint in `backend/tests/test_routers/` and `backend/tests/test_services/`.
-
-**Acceptance criteria:**
-
- `GET /api/dashboard/bans/trend?range=24h` returns 24 hourly buckets.
- Each bucket has a correct ISO 8601 timestamp and count.
- Origin filter is applied correctly.
- Empty buckets (zero bans) are included so the frontend has a continuous series.
- Tests pass and cover happy path + empty data + origin filter.
- `ruff check` and `mypy --strict` pass.
-
---
-
-### Task 4.2 — Create the `BanTrendChart` component
-
-**Status:** `done`
-
-Created `frontend/src/components/BanTrendChart.tsx` — an area chart using Recharts
-`AreaChart` with a gradient fill, human-readable X-axis time labels (format varies by
-time range), and a custom tooltip. Added `BanTrendBucket`/`BanTrendResponse` types to
-`types/ban.ts`, `dashboardBansTrend` constant to `api/endpoints.ts`, `fetchBanTrend()`
-to `api/dashboard.ts`, and the `useBanTrend` hook at `hooks/useBanTrend.ts`. Component
-handles loading (Spinner), error (MessageBar), and empty states inline.
-`tsc --noEmit` and ESLint pass with zero warnings.
-
-Create `frontend/src/components/BanTrendChart.tsx`. This component renders an **area/line chart** showing the number of bans over time.
-
-**Data source:** A new `useBanTrend` hook that calls `GET /api/dashboard/bans/trend`.
-
-**Visual requirements:**
-
- Use `<AreaChart>` (or `<LineChart>`) from Recharts with `<Area>`, `<XAxis>`, `<YAxis>`, `<CartesianGrid>`, `<Tooltip>`.
- X-axis: time labels formatted human-readably (e.g. "Mon 14:00", "Mar 5").
- Y-axis: ban count.
- Area fill with a semi-transparent version of the primary chart colour.
- Tooltip shows exact timestamp + count.
- Responsive via `<ResponsiveContainer>`.
-
-**Acceptance criteria:**
-
- Displays a continuous time-series line with the correct number of data points for each range.
- Readable axis labels for all four time ranges.
- Responsive.
- `tsc --noEmit`, ESLint clean.
-
---
-
-### Task 4.3 — Integrate the trend chart into `DashboardPage`
-
-**Status:** `done`
-
-Added a "Ban Trend" full-width section card to `DashboardPage` between the
-`ServerStatusBar` and the "Top Countries" section. The section renders
-`<BanTrendChart timeRange={timeRange} origin={originFilter} />`, sharing the
-same state already used by the country charts and ban list. Loading, error,
-and empty states are handled inside `BanTrendChart` itself. `tsc --noEmit` and
-ESLint pass with zero warnings.
-
-Add the `BanTrendChart` to the dashboard page **above** the two country charts and **below** the `ServerStatusBar`.
-
-**Layout:**
-
- Full-width section card.
- Section title: **"Ban Trend"**.
- Shares the same `timeRange` and `originFilter` state.
-
-**Acceptance criteria:**
-
- The trend chart renders on the dashboard showing bans over time.
- Responds to time-range and origin-filter changes.
- Loading/error states handled.
- `tsc --noEmit` and ESLint pass.
-
---
-
-## Stage 5 — Jail Distribution Chart
-
-### Task 5.1 — Add a backend endpoint for ban counts per jail
-
-**Status:** `done`
-
-Added `GET /api/dashboard/bans/by-jail`. New Pydantic models `JailBanCount` and
-`BansByJailResponse` added to `ban.py`. Service function `bans_by_jail()` in
-`ban_service.py` queries the `bans` table with `GROUP BY jail ORDER BY COUNT(*) DESC`
-and applies the origin filter. Route added to `dashboard.py`. 7 new service tests
-(happy path, total equality, empty DB, time-window exclusion, origin filter variants)
-and 10 new router tests — all pass, total suite 497 passed, 83% coverage.
-`ruff check` and `mypy --strict` pass.
-
-The existing `GET /api/jails` endpoint returns jail metadata with `status.currently_banned` — but this counts **currently active** bans, not historical bans in the selected time window. The dashboard needs historical ban counts per jail within the selected time range.
-
-Create a new endpoint: **`GET /api/dashboard/bans/by-jail`**.
-
-**Query params:**
-
-| Param | Type | Default | Description |
-|---|---|---|---|
-| `range` | `TimeRange` | `"24h"` | Time-range preset. |
-| `origin` | `BanOrigin \| null` | `null` | Optional origin filter. |
-
-**Response model** (`BansByJailResponse`):
-
-```python
-class JailBanCount(BaseModel):
-    jail: str
-    count: int
-
-class BansByJailResponse(BaseModel):
-    jails: list[JailBanCount]
-    total: int
-```
-
-**Implementation:**
-
- Query the `bans` table: `SELECT jail, COUNT(*) FROM bans WHERE timestart >= ? GROUP BY jail ORDER BY COUNT(*) DESC`.
- Apply origin filter by checking whether `jail == 'blocklist-import'`.
- Add models, service function, route, and tests following existing patterns.
-
-**Acceptance criteria:**
-
- Returns jail names with ban counts descending, within the selected time window.
- Origin filter works correctly.
- Tests covering happy path, empty data, and filter.
- `ruff check` and `mypy --strict` pass.
-
---
-
-### Task 5.2 — Create the `JailDistributionChart` component
-
-**Status:** `done`
-
-Created `frontend/src/components/JailDistributionChart.tsx` — a horizontal
-bar chart using Recharts `BarChart` showing ban counts per jail sorted descending.
-Added `JailBanCount`/`BansByJailResponse` types to `types/ban.ts`,
-`dashboardBansByJail` constant to `api/endpoints.ts`, `fetchBansByJail()` to
-`api/dashboard.ts`, and the `useJailDistribution` hook at
-`hooks/useJailDistribution.ts`. Component handles loading (Spinner), error
-(MessageBar), and empty states inline. `tsc --noEmit` and ESLint pass with zero
-warnings.
-
-Create `frontend/src/components/JailDistributionChart.tsx`. This component renders a **horizontal bar chart** showing the distribution of bans across jails.
-
-**Why this is useful and not covered by existing views:** The current Jails page shows configuration details and live counters per jail, but **does not** provide a visual comparison of which jails are catching the most threats within a selectable time window. An admin reviewing the dashboard benefits from an at-a-glance answer to: *"Which services are being attacked most frequently right now?"* — this is fundamentally different from the country-based charts (which answer *"where"*) and from the ban trend (which answers *"when"*). The jail distribution answers **"what service is targeted"** and helps prioritise hardening efforts.
-
-**Data source:** A new `useJailDistribution` hook calling `GET /api/dashboard/bans/by-jail`.
-
-**Visual requirements:**
-
- Horizontal `<BarChart>` from Recharts.
- Y-axis: jail names.
- X-axis: ban count.
- Colour-coded bars from the chart theme.
- Tooltip with jail name and exact count.
- Responsive.
-
-**Acceptance criteria:**
-
- Renders one bar per jail, sorted descending.
- Responsive.
- `tsc --noEmit`, ESLint clean.
-
---
-
-### Task 5.3 — Integrate the jail distribution chart into `DashboardPage`
-
-**Status:** `done`
-
-Added a full-width "Jail Distribution" section card to `DashboardPage` below the
-"Top Countries" section (2-column country charts on row 1, jail chart full-width
-on row 2). The section renders `<JailDistributionChart timeRange={timeRange}
-origin={originFilter} />`, sharing the same state already used by the other
-charts. Loading, error, and empty states are handled inside
-`JailDistributionChart` itself. `tsc --noEmit` and ESLint pass with zero warnings.
-
-Add the `JailDistributionChart` as a third chart card alongside the two country charts, or in a second chart row below them if space is constrained.
-
-**Layout decision:**
-
- If three cards fit side-by-side at the standard breakpoint, place all three in one row.
- Otherwise, use a 2-column + 1-column stacked layout (pie + bar on row 1, jail chart full-width on row 2). Choose whichever looks cleaner.
-
-**Acceptance criteria:**
-
- The jail distribution chart renders on the dashboard.
- Shares time-range and origin-filter controls with the other charts.
- Loading/error states handled.
- Responsive layout.
- `tsc --noEmit` and ESLint pass.
-
---
-
-## Stage 6 — Polish and Final Review
-
-### Task 6.1 — Ensure consistent loading, error, and empty states across all charts
-
-**Status:** `done`
-
-Review all four chart components and ensure:
-
-1. **Loading state**: Each shows a Fluent UI `<Spinner>` centred in its card while data is fetching.
-2. **Error state**: Each shows a Fluent UI `<MessageBar intent="error">` with a retry button.
-3. **Empty state**: When the data set has zero bans, each chart shows a friendly message (e.g. "No bans in this time range") instead of an empty or broken chart.
-
-Extract a small shared wrapper if three or more charts duplicate the same loading/error/empty pattern (e.g. `ChartCard` or `ChartStateWrapper`).
-
-**Acceptance criteria:**
-
- All charts handle loading, error, and empty states consistently.
- No broken or blank chart renders when data is unavailable.
- `tsc --noEmit` and ESLint pass.
-
---
-
-### Task 6.2 — Write frontend tests for chart components
-
-**Status:** `done`
-
-Add tests for each chart component to confirm:
-
- Correct number of rendered slices/bars given known test data.
- "Other" aggregation logic in the pie chart.
- Top-N truncation in the bar chart.
- Hook re-fetch on prop change.
- Loading and error states render the expected UI.
-
-Follow the project's existing frontend test setup and conventions.
-
-**Acceptance criteria:**
-
- Each chart component has at least one happy-path and one edge-case test.
- Tests pass.
- ESLint clean.
-
---
-
-### Task 6.3 — Full build and lint check
-
-**Status:** `done`
-
-Run the complete quality-assurance pipeline:
-
-1. Backend: `ruff check`, `mypy --strict`, `pytest` with coverage.
-2. Frontend: `tsc --noEmit`, `eslint`, `npm run build`.
-3. Fix any warnings or errors introduced during stages 1–6.
-4. Verify overall test coverage remains ≥ 80 %.
-
-**Acceptance criteria:**
-
- Zero lint warnings/errors on both backend and frontend.
- All tests pass.
- Build artifacts generated successfully.
-
---
-
-## Stage 7 — Global Dashboard Filter Bar
-
-The time-range and origin-filter controls currently live inside the "Ban List" section header, but they control **every** section on the dashboard (Ban Trend, Top Countries, Jail Distribution, **and** Ban List). This creates a misleading UX: the buttons appear scoped to the ban list when they are actually global. This stage extracts those controls into a dedicated, always-visible filter bar at the top of the dashboard, directly below the `ServerStatusBar`.
-
-### Task 7.1 — Create the `DashboardFilterBar` component
-
-**Status:** `done`
-
-Create `frontend/src/components/DashboardFilterBar.tsx`. This is a self-contained toolbar component that renders the time-range presets and origin filter as two groups of toggle buttons.
-
-**Props interface:**
-
-```ts
-interface DashboardFilterBarProps {
-  timeRange: TimeRange;
-  onTimeRangeChange: (value: TimeRange) => void;
-  originFilter: BanOriginFilter;
-  onOriginFilterChange: (value: BanOriginFilter) => void;
-}
-```
-
-**Visual requirements:**
-
- Render inside a card-like container using the existing `section` style pattern (neutral background, border, border-radius, padding) — but **without** a section title. The toolbar **is** the content.
- Layout: a single row with two `<Toolbar>` groups separated by a visual divider (use Fluent UI `<Divider vertical>` or a horizontal gap of `spacingHorizontalXL`).
-  - **Left group** — "Time Range" label + four `<ToggleButton>` presets:
-    - `Last 24 h` (value `"24h"`)
-    - `Last 7 days` (value `"7d"`)
-    - `Last 30 days` (value `"30d"`)
-    - `Last 365 days` (value `"365d"`)
-  - **Right group** — "Filter" label + three `<ToggleButton>` options:
-    - `All` (value `"all"`)
-    - `Blocklist` (value `"blocklist"`)
-    - `Selfblock` (value `"selfblock"`)
- Each group label is a `<Text weight="semibold" size={300}>` rendered inline before the buttons.
- Use `size="small"` on all toggle buttons. The active button uses `checked={true}` and `aria-pressed={true}`.
- On narrow viewports (< 640 px), the two groups should **wrap** onto separate lines (use `flexWrap: "wrap"` on the outer container).
- Reuse `TIME_RANGE_LABELS` and `BAN_ORIGIN_FILTER_LABELS` from `types/ban.ts` — no hard-coded label strings.
- Use `makeStyles` for all styling. Follow [Web-Design.md](Web-Design.md) spacing conventions: `spacingHorizontalM` between buttons within a group, `spacingHorizontalXL` between groups, `spacingVerticalS` for vertical padding.
-
-**Behaviour:**
-
- Clicking a time-range button calls `onTimeRangeChange(value)`.
- Clicking an origin-filter button calls `onOriginFilterChange(value)`.
- Exactly one button per group is active at any time (mutually exclusive — not multi-select).
- Component is fully controlled: it does not own state, it receives and reports values only.
-
-**File structure rules:**
-
- One component per file. No barrel exports needed — import directly.
- Keep under 100 lines.
-
-**Acceptance criteria:**
-
- The component renders two labelled button groups in a single row.
- Calls the correct callback with the correct value when a button is clicked.
- Buttons reflect the current selection via `checked` / `aria-pressed`.
- Wraps gracefully on narrow viewports.
- `tsc --noEmit` passes. No `any`. ESLint clean.
-
---
-
-### Task 7.2 — Integrate `DashboardFilterBar` into `DashboardPage`
-
-**Status:** `done`
-
-Move the global filter controls out of the "Ban List" section and replace them with the new `DashboardFilterBar`, placed at the top of the page.
-
-**Changes to `DashboardPage.tsx`:**
-
-1. **Add** `<DashboardFilterBar>` immediately **below** `<ServerStatusBar />` and **above** the "Ban Trend" section. Pass the existing `timeRange`, `setTimeRange`, `originFilter`, and `setOriginFilter` as props.
-2. **Remove** the two `<Toolbar>` blocks (time-range selector and origin filter) that are currently inside the "Ban List" section header. The section header should keep only the `<Text as="h2">Ban List</Text>` title — no filter buttons.
-3. **Remove** the `TIME_RANGES` and `ORIGIN_FILTERS` local constant arrays from `DashboardPage.tsx` since the `DashboardFilterBar` component now owns the iteration. (If `DashboardFilterBar` re-uses these arrays, it defines them locally or imports them from `types/ban.ts`.)
-4. **Keep** the `timeRange` and `originFilter` state (`useState`) in `DashboardPage` — the page still owns the state; it just no longer renders the buttons directly.
-5. **Verify** that all sections (Ban Trend, Top Countries, Jail Distribution, Ban List) still receive the filter values as props and re-render when they change — this should already work since the state location is unchanged.
-
-**Layout after change:**
+**Observed behaviour (from container logs):**

 ```
-┌──────────────────────────────────────┐
-│ ServerStatusBar                      │
-├──────────────────────────────────────┤
-│ DashboardFilterBar                   │  ← NEW location
-│  [24h] [7d] [30d] [365d]  │  [All] [Blocklist] [Selfblock] │
-├──────────────────────────────────────┤
-│ Ban Trend (chart)                    │
-├──────────────────────────────────────┤
-│ Top Countries (pie + bar)            │
-├──────────────────────────────────────┤
-│ Jail Distribution (bar)              │
-├──────────────────────────────────────┤
-│ Ban List (table)                     │  ← filters REMOVED from here
-└──────────────────────────────────────┘
+{"ip": "197.221.98.153", "error": "", "event": "geo_lookup_request_failed", ...}
+{"ip": "197.231.178.38", "error": "", "event": "geo_lookup_request_failed", ...}
+{"ip": "197.234.201.154", "error": "", "event": "geo_lookup_request_failed", ...}
+{"ip": "197.234.206.108", "error": "", "event": "geo_lookup_request_failed", ...}
 ```

-**Acceptance criteria:**
+**Details:**

- The filter bar is visible at the top of the dashboard, below the status bar.
- Changing a filter updates all four sections simultaneously.
- The "Ban List" section header no longer contains filter buttons.
- No functional regression — the dashboard behaves identically, filters are just relocated.
- `tsc --noEmit` and ESLint pass.
+- Open `app/services/geo_service.py` and trace the code path that emits the `geo_lookup_request_failed` event.
+- The `error` field is empty, which suggests the request may silently fail (e.g. the external service returns a non-200 status, an empty body, or the response parsing swallows the real error).
+- Ensure the actual HTTP status code and response body (or exception message) are captured and logged in the `error` field so failures are diagnosable.
+- Check whether the external geo-IP service has rate-limiting or IP-range restrictions that could explain the failures.
+- Add proper error handling: distinguish between transient errors (timeout, 429, 5xx) and permanent ones (invalid IP, 404) so retries can be applied only when appropriate.

 ---

-### Task 7.3 — Write tests for `DashboardFilterBar`
+## Task 3 — Non-Blocking Web Requests & Bulk DB Operations

-**Status:** `done`
+**Goal:** Ensure the web UI remains responsive while geo-IP lookups and database writes are in progress.

-Create `frontend/src/components/__tests__/DashboardFilterBar.test.tsx`.
+**Details:**

-**Test cases:**
-
-1. **Renders all time-range buttons** — confirm four buttons with correct labels appear.
-2. **Renders all origin-filter buttons** — confirm three buttons with correct labels appear.
-3. **Active state matches props** — given `timeRange="7d"` and `originFilter="blocklist"`, the corresponding buttons have `aria-pressed="true"` and the others `"false"`.
-4. **Time-range click fires callback** — click the "Last 30 days" button, assert `onTimeRangeChange` was called with `"30d"`.
-5. **Origin-filter click fires callback** — click the "Selfblock" button, assert `onOriginFilterChange` was called with `"selfblock"`.
-6. **Already-active button click still fires callback** — clicking the currently active button should still call the callback (no no-op guard).
-
-**Test setup:**
-
- Wrap the component in `<FluentProvider theme={webLightTheme}>` (required for Fluent UI token resolution).
- Use `vi.fn()` for the callback props.
- Follow the existing test patterns in `frontend/src/components/__tests__/`.
-
-**Acceptance criteria:**
-
- All 6 test cases pass.
- Tests are fully typed — no `any`.
- ESLint clean.
+- After the geo-IP service was integrated, web UI requests became slow or appeared to hang because geo lookups and individual DB writes block the async event loop.
+- **Bulk DB operations:** When multiple IPs need geo data at once (e.g. loading the ban list), collect all uncached IPs and resolve them in a single batch. Use bulk `INSERT … ON CONFLICT` (or equivalent) to write results to the DB in one round-trip instead of one query per IP.
+- **Non-blocking external calls:** Make sure all HTTP calls to the external geo-IP service use an async HTTP client (`httpx.AsyncClient` or similar) so the event loop is never blocked by network I/O.
+- **Non-blocking DB access:** Ensure all database operations use the async SQLAlchemy session (or are off-loaded to a thread) so they do not block request handling.
+- **Background processing:** Consider moving bulk geo-lookups into a background task (e.g. the existing task infrastructure in `app/tasks/`) so the API endpoint returns immediately and the UI is updated once results are ready.

 ---

-### Task 7.4 — Final lint, type-check, and build verification
+## Task 4 — Better Jail Configuration

-**Status:** `done`
+**Goal:** Expose the full fail2ban configuration surface (jails, filters, actions) in the web UI.

-Run the full quality-assurance pipeline after the filter-bar changes:
+Reference config directory: `/home/lukas/Volume/repo/BanGUI/Docker/fail2ban-dev-config/fail2ban/`

-1. `tsc --noEmit` — zero errors.
-2. `npm run lint` — zero warnings, zero errors.
-3. `npm run build` — succeeds.
-4. `npm test` — all frontend tests pass (including the new `DashboardFilterBar` tests).
-5. Backend: `ruff check`, `mypy --strict`, `pytest` — still green (no backend changes expected, but verify no accidental modifications).
+### 4a — Activate / Deactivate Jail Configs

-**Acceptance criteria:**
+- List all `.conf` and `.local` files in the jail config folder.
+- Allow the user to toggle inactive jails to active (and vice-versa) from the UI.

- Zero lint warnings/errors.
- All tests pass on both frontend and backend.
- Production build succeeds.
+### 4b — Editable Log Paths

---
+- Each jail has a `logpath` setting. Expose this in the UI as an editable text field so the user can point a jail at a different log file without SSH access.

-## Stage 8 — Jails Router Test Coverage
+### 4c — Audit Missing Config Options

-### Task 8.1 — Bring jails router to 100 % line coverage
+- Open every jail `.conf`/`.local` file in the dev-config directory and compare the available options with what the web UI currently exposes.
+- Write down all options that are **not yet implemented** in the UI (e.g. `findtime`, `bantime.increment`, `ignoreip`, `ignorecommand`, `maxretry`, `backend`, `usedns`, `destemail`, `sender`, `action`).
+- Document the findings in this task or a separate reference file so they can be implemented incrementally.

-**Status:** `done`
+### 4d — Filter Configuration (`filter.d`)

-`app/routers/jails.py` currently sits at **61 %** line coverage (54 of 138 lines uncovered). The missing lines are exclusively error-handling paths — the 502 `Fail2BanConnectionError` branch across every endpoint, several 404/409 branches in the jail-control and ignore-list endpoints, and the `toggle_ignore_self` endpoint which has no tests at all. These are critical banning-related paths that the Instructions require to be fully covered.
+- List all filter files in `filter.d/`.
+- Allow the user to activate, deactivate, view, and edit filter definitions from the UI.
+- Provide an option to create a brand-new filter file.

-**Missing coverage (uncovered lines):**
+### 4e — Action Configuration (`action.d`)

-| Lines | Endpoint | Missing path |
-|---|---|---|
-| 69 | `_bad_gateway` helper | One-time body — hit by first 502 test |
-| 120–121 | `GET /api/jails` | `Fail2BanConnectionError` → 502 |
-| 157–158 | `GET /api/jails/{name}` | `Fail2BanConnectionError` → 502 |
-| 195–198 | `POST /api/jails/reload-all` | `JailOperationError` → 409 and `Fail2BanConnectionError` → 502 |
-| 234–235 | `POST /api/jails/{name}/start` | `Fail2BanConnectionError` → 502 |
-| 270–273 | `POST /api/jails/{name}/stop` | `JailOperationError` → 409 and `Fail2BanConnectionError` → 502 |
-| 314–319 | `POST /api/jails/{name}/idle` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
-| 351–356 | `POST /api/jails/{name}/reload` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
-| 399–402 | `GET /api/jails/{name}/ignoreip` | `JailNotFoundError` → 404, `Fail2BanConnectionError` → 502 |
-| 449–454 | `POST /api/jails/{name}/ignoreip` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
-| 491–496 | `DELETE /api/jails/{name}/ignoreip` | `JailNotFoundError` → 404, `JailOperationError` → 409, `Fail2BanConnectionError` → 502 |
-| 529–542 | `POST /api/jails/{name}/ignoreself` | All paths (entirely untested) |
+- List all action files in `action.d/`.
+- Allow the user to activate, deactivate, view, and edit action definitions from the UI.
+- Provide an option to create a brand-new action file.

-**Implementation:**
-
- Add new test classes / test methods to `backend/tests/test_routers/test_jails.py`.
- Follow the naming pattern: `test_<unit>_<scenario>_<expected>`.
- Each 502 test mocks the service function to raise `Fail2BanConnectionError`.
- Each 404 test mocks the service to raise `JailNotFoundError`.
- Each 409 test mocks the service to raise `JailOperationError`.
- Wrap `toggle_ignore_self` tests in a `TestToggleIgnoreSelf` class covering: 200 (on), 200 (off), 404, 409, 502.
- No changes to production code required — this is a pure test addition.
-
-**Acceptance criteria:**
-
- `app/routers/jails.py` reaches **100 %** line coverage.
- All new tests use `AsyncMock` and follow existing test patterns.
- `ruff check` and `mypy --strict` pass (tests are type-clean).
- Total test suite still passes (`497 + N` tests passing).
+### 4f — Create New Configuration Files

+- Add a UI flow to create a new jail, filter, or action configuration file from scratch (or from a template).
+- Validate the new file before writing it to the config directory.
--- a/backend/app/services/geo_service.py
+++ b/backend/app/services/geo_service.py
@@ -38,14 +38,15 @@ Usage::

 from __future__ import annotations

+import asyncio
 import time
 from dataclasses import dataclass
 from typing import TYPE_CHECKING

+import aiohttp
 import structlog

 if TYPE_CHECKING:
-    import aiohttp
    import aiosqlite
    import geoip2.database
    import geoip2.errors
@@ -81,6 +82,14 @@ _REQUEST_TIMEOUT: float = 5.0
 #: eligible for a new API attempt.  Default: 5 minutes.
 _NEG_CACHE_TTL: float = 300.0

+#: Minimum delay in seconds between consecutive batch HTTP requests to
+#: ip-api.com.  The free tier allows 45 requests/min; 1.5 s ≈ 40 req/min.
+_BATCH_DELAY: float = 1.5
+
+#: Maximum number of retries for a batch chunk that fails with a
+#: transient error (e.g. connection reset due to rate limiting).
+_BATCH_MAX_RETRIES: int = 2
+
 # ---------------------------------------------------------------------------
 # Domain model
 # ---------------------------------------------------------------------------
@@ -146,6 +155,49 @@ def clear_neg_cache() -> None:
    _neg_cache.clear()


+def is_cached(ip: str) -> bool:
+    """Return ``True`` if *ip* has a positive entry in the in-memory cache.
+
+    A positive entry is one with a non-``None`` ``country_code``.  This is
+    useful for skipping IPs that have already been resolved when building
+    a list for :func:`lookup_batch`.
+
+    Args:
+        ip: IPv4 or IPv6 address string.
+
+    Returns:
+        ``True`` when *ip* is in the cache with a known country code.
+    """
+    return ip in _cache and _cache[ip].country_code is not None
+
+
+async def cache_stats(db: aiosqlite.Connection) -> dict[str, int]:
+    """Return diagnostic counters for the geo cache subsystem.
+
+    Queries the persistent store for the number of unresolved entries and
+    combines it with in-memory counters.
+
+    Args:
+        db: Open BanGUI application database connection.
+
+    Returns:
+        Dict with keys ``cache_size``, ``unresolved``, ``neg_cache_size``,
+        and ``dirty_size``.
+    """
+    async with db.execute(
+        "SELECT COUNT(*) FROM geo_cache WHERE country_code IS NULL"
+    ) as cur:
+        row = await cur.fetchone()
+        unresolved: int = int(row[0]) if row else 0
+
+    return {
+        "cache_size": len(_cache),
+        "unresolved": unresolved,
+        "neg_cache_size": len(_neg_cache),
+        "dirty_size": len(_dirty),
+    }
+
+
 def init_geoip(mmdb_path: str | None) -> None:
    """Initialise the MaxMind GeoLite2-Country database reader.

@@ -322,7 +374,7 @@ async def lookup(
    url: str = _API_URL.format(ip=ip)
    api_ok = False
    try:
-        async with http_session.get(url, timeout=_REQUEST_TIMEOUT) as resp:  # type: ignore[arg-type]
+        async with http_session.get(url, timeout=aiohttp.ClientTimeout(total=_REQUEST_TIMEOUT)) as resp:
            if resp.status != 200:
                log.warning("geo_lookup_non_200", ip=ip, status=resp.status)
            else:
@@ -345,7 +397,12 @@ async def lookup(
                    message=data.get("message", "unknown"),
                )
    except Exception as exc:  # noqa: BLE001
-        log.warning("geo_lookup_request_failed", ip=ip, error=str(exc))
+        log.warning(
+            "geo_lookup_request_failed",
+            ip=ip,
+            exc_type=type(exc).__name__,
+            error=repr(exc),
+        )

    if not api_ok:
        # Try local MaxMind database as fallback.
@@ -421,9 +478,36 @@ async def lookup_batch(

    log.info("geo_batch_lookup_start", total=len(uncached))

-    for chunk_start in range(0, len(uncached), _BATCH_SIZE):
+    for batch_idx, chunk_start in enumerate(range(0, len(uncached), _BATCH_SIZE)):
        chunk = uncached[chunk_start : chunk_start + _BATCH_SIZE]
-        chunk_result = await _batch_api_call(chunk, http_session)
+
+        # Throttle: pause between consecutive HTTP calls to stay within the
+        # ip-api.com free-tier rate limit (45 req/min).
+        if batch_idx > 0:
+            await asyncio.sleep(_BATCH_DELAY)
+
+        # Retry transient failures (e.g. connection-reset from rate limit).
+        chunk_result: dict[str, GeoInfo] | None = None
+        for attempt in range(_BATCH_MAX_RETRIES + 1):
+            chunk_result = await _batch_api_call(chunk, http_session)
+            # If every IP in the chunk came back with country_code=None, the
+            # whole request was most likely rejected (connection reset / 429).
+            # Retry after an exponential back-off.
+            all_failed = all(
+                info.country_code is None for info in chunk_result.values()
+            )
+            if not all_failed or attempt >= _BATCH_MAX_RETRIES:
+                break
+            backoff = _BATCH_DELAY * (2 ** (attempt + 1))
+            log.warning(
+                "geo_batch_retry",
+                attempt=attempt + 1,
+                chunk_size=len(chunk),
+                backoff=backoff,
+            )
+            await asyncio.sleep(backoff)
+
+        assert chunk_result is not None  # noqa: S101

        for ip, info in chunk_result.items():
            if info.country_code is not None:
@@ -493,14 +577,19 @@ async def _batch_api_call(
        async with http_session.post(
            _BATCH_API_URL,
            json=payload,
-            timeout=_REQUEST_TIMEOUT * 2,  # type: ignore[arg-type]
+            timeout=aiohttp.ClientTimeout(total=_REQUEST_TIMEOUT * 2),
        ) as resp:
            if resp.status != 200:
                log.warning("geo_batch_non_200", status=resp.status, count=len(ips))
                return fallback
            data: list[dict[str, object]] = await resp.json(content_type=None)
    except Exception as exc:  # noqa: BLE001
-        log.warning("geo_batch_request_failed", count=len(ips), error=str(exc))
+        log.warning(
+            "geo_batch_request_failed",
+            count=len(ips),
+            exc_type=type(exc).__name__,
+            error=repr(exc),
+        )
        return fallback

    out: dict[str, GeoInfo] = {}
--- a/backend/tests/test_services/test_geo_service.py
+++ b/backend/tests/test_services/test_geo_service.py
@@ -572,3 +572,198 @@ class TestFlushDirty:
        assert not geo_service._dirty  # type: ignore[attr-defined]
        db.commit.assert_awaited_once()

+
+# ---------------------------------------------------------------------------
+# Rate-limit throttling and retry tests (Task 5)
+# ---------------------------------------------------------------------------
+
+
+class TestLookupBatchThrottling:
+    """Verify the inter-batch delay, retry, and give-up behaviour."""
+
+    async def test_lookup_batch_throttles_between_chunks(self) -> None:
+        """When more than _BATCH_SIZE IPs are sent, asyncio.sleep is called
+        between consecutive batch HTTP calls with at least _BATCH_DELAY."""
+        # Generate _BATCH_SIZE + 1 IPs so we get exactly 2 batch calls.
+        batch_size: int = geo_service._BATCH_SIZE  # type: ignore[attr-defined]
+        ips = [f"10.0.{i // 256}.{i % 256}" for i in range(batch_size + 1)]
+
+        def _make_result(chunk: list[str], _session: object) -> dict[str, GeoInfo]:
+            return {
+                ip: GeoInfo(country_code="DE", country_name="Germany", asn=None, org=None)
+                for ip in chunk
+            }
+
+        with (
+            patch(
+                "app.services.geo_service._batch_api_call",
+                new_callable=AsyncMock,
+                side_effect=_make_result,
+            ) as mock_batch,
+            patch("app.services.geo_service.asyncio.sleep", new_callable=AsyncMock) as mock_sleep,
+        ):
+            await geo_service.lookup_batch(ips, MagicMock())
+
+        # Two chunks → one sleep between them.
+        assert mock_batch.call_count == 2
+        mock_sleep.assert_awaited_once()
+        delay_arg: float = mock_sleep.call_args[0][0]
+        assert delay_arg >= geo_service._BATCH_DELAY  # type: ignore[attr-defined]
+
+    async def test_lookup_batch_retries_on_full_chunk_failure(self) -> None:
+        """When a chunk returns all-None on first try, it retries and succeeds."""
+        ips = ["1.2.3.4", "5.6.7.8"]
+
+        _empty = GeoInfo(country_code=None, country_name=None, asn=None, org=None)
+        _success = {
+            "1.2.3.4": GeoInfo(country_code="DE", country_name="Germany", asn=None, org=None),
+            "5.6.7.8": GeoInfo(country_code="US", country_name="United States", asn=None, org=None),
+        }
+        _failure: dict[str, GeoInfo] = dict.fromkeys(ips, _empty)
+
+        call_count = 0
+
+        async def _side_effect(chunk: list[str], _session: object) -> dict[str, GeoInfo]:
+            nonlocal call_count
+            call_count += 1
+            if call_count == 1:
+                return _failure
+            return _success
+
+        with (
+            patch(
+                "app.services.geo_service._batch_api_call",
+                new_callable=AsyncMock,
+                side_effect=_side_effect,
+            ),
+            patch("app.services.geo_service.asyncio.sleep", new_callable=AsyncMock),
+        ):
+            result = await geo_service.lookup_batch(ips, MagicMock())
+
+        assert call_count == 2
+        assert result["1.2.3.4"].country_code == "DE"
+        assert result["5.6.7.8"].country_code == "US"
+
+    async def test_lookup_batch_gives_up_after_max_retries(self) -> None:
+        """After _BATCH_MAX_RETRIES + 1 attempts, IPs end up in the neg cache."""
+        ips = ["9.9.9.9"]
+        _empty = GeoInfo(country_code=None, country_name=None, asn=None, org=None)
+        _failure: dict[str, GeoInfo] = dict.fromkeys(ips, _empty)
+
+        max_retries: int = geo_service._BATCH_MAX_RETRIES  # type: ignore[attr-defined]
+
+        with (
+            patch(
+                "app.services.geo_service._batch_api_call",
+                new_callable=AsyncMock,
+                return_value=_failure,
+            ) as mock_batch,
+            patch("app.services.geo_service.asyncio.sleep", new_callable=AsyncMock) as mock_sleep,
+        ):
+            result = await geo_service.lookup_batch(ips, MagicMock())
+
+        # Initial attempt + max_retries retries.
+        assert mock_batch.call_count == max_retries + 1
+        # IP should have no country.
+        assert result["9.9.9.9"].country_code is None
+        # Negative cache should contain the IP.
+        assert "9.9.9.9" in geo_service._neg_cache  # type: ignore[attr-defined]
+        # Sleep called for each retry with exponential backoff.
+        assert mock_sleep.call_count == max_retries
+        backoff_values = [call.args[0] for call in mock_sleep.call_args_list]
+        batch_delay: float = geo_service._BATCH_DELAY  # type: ignore[attr-defined]
+        for i, val in enumerate(backoff_values):
+            expected = batch_delay * (2 ** (i + 1))
+            assert val == pytest.approx(expected)
+
+
+# ---------------------------------------------------------------------------
+# Error logging improvements (Task 2)
+# ---------------------------------------------------------------------------
+
+
+class TestErrorLogging:
+    """Verify that exception details are properly captured in log events.
+
+    Previously ``str(exc)`` was used which yields an empty string for
+    aiohttp exceptions such as ``ServerDisconnectedError`` that carry no
+    message.  The fix uses ``repr(exc)`` so the exception class name is
+    always present, and adds an ``exc_type`` field for easy log filtering.
+    """
+
+    async def test_empty_message_exception_logs_exc_type(self, caplog: pytest.LogCaptureFixture) -> None:
+        """When exception str() is empty, exc_type and repr are still logged."""
+
+        class _EmptyMessageError(Exception):
+            """Exception whose str() representation is empty."""
+
+            def __str__(self) -> str:
+                return ""
+
+        session = MagicMock()
+        mock_ctx = AsyncMock()
+        mock_ctx.__aenter__ = AsyncMock(side_effect=_EmptyMessageError())
+        mock_ctx.__aexit__ = AsyncMock(return_value=False)
+        session.get = MagicMock(return_value=mock_ctx)
+
+        import structlog.testing
+
+        with structlog.testing.capture_logs() as captured:
+            result = await geo_service.lookup("197.221.98.153", session)  # type: ignore[arg-type]
+
+        assert result is not None
+        assert result.country_code is None
+
+        request_failed = [e for e in captured if e.get("event") == "geo_lookup_request_failed"]
+        assert len(request_failed) == 1
+        event = request_failed[0]
+        # exc_type must name the exception class — never empty.
+        assert event["exc_type"] == "_EmptyMessageError"
+        # repr() must include the class name even when str() is empty.
+        assert "_EmptyMessageError" in event["error"]
+
+    async def test_connection_error_logs_exc_type(self, caplog: pytest.LogCaptureFixture) -> None:
+        """An OSError's message lands in error and its class name in exc_type."""
+        session = MagicMock()
+        mock_ctx = AsyncMock()
+        mock_ctx.__aenter__ = AsyncMock(side_effect=OSError("connection refused"))
+        mock_ctx.__aexit__ = AsyncMock(return_value=False)
+        session.get = MagicMock(return_value=mock_ctx)
+
+        import structlog.testing
+
+        with structlog.testing.capture_logs() as captured:
+            await geo_service.lookup("10.0.0.1", session)  # type: ignore[arg-type]
+
+        request_failed = [e for e in captured if e.get("event") == "geo_lookup_request_failed"]
+        assert len(request_failed) == 1
+        event = request_failed[0]
+        assert event["exc_type"] == "OSError"
+        assert "connection refused" in event["error"]
+
+    async def test_batch_empty_message_exception_logs_exc_type(self) -> None:
+        """Batch API call: empty-message exceptions include exc_type in the log."""
+
+        class _EmptyMessageError(Exception):
+            def __str__(self) -> str:
+                return ""
+
+        session = MagicMock()
+        mock_ctx = AsyncMock()
+        mock_ctx.__aenter__ = AsyncMock(side_effect=_EmptyMessageError())
+        mock_ctx.__aexit__ = AsyncMock(return_value=False)
+        session.post = MagicMock(return_value=mock_ctx)
+
+        import structlog.testing
+
+        with structlog.testing.capture_logs() as captured:
+            result = await geo_service._batch_api_call(["1.2.3.4"], session)  # type: ignore[attr-defined]
+
+        assert result["1.2.3.4"].country_code is None
+
+        batch_failed = [e for e in captured if e.get("event") == "geo_batch_request_failed"]
+        assert len(batch_failed) == 1
+        event = batch_failed[0]
+        assert event["exc_type"] == "_EmptyMessageError"
+        assert "_EmptyMessageError" in event["error"]
+
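
The difference between `str(exc)` and `repr(exc)` that this commit relies on can be reproduced without aiohttp or structlog. The sketch below is illustrative only: `EmptyMessageError` and `describe` are hypothetical names that are not part of `geo_service`, and the dict merely mirrors the `exc_type` and `error` fields added to the log events.

```python
class EmptyMessageError(Exception):
    """Hypothetical stand-in for an exception raised without a message."""


def describe(exc: BaseException) -> dict[str, str]:
    """Build the two log fields used in the diff: exc_type and error."""
    return {"exc_type": type(exc).__name__, "error": repr(exc)}


empty = EmptyMessageError()
with_msg = OSError("connection refused")

# str() of a no-argument exception is the empty string, which is what
# produced the blank `error` field before the change.
print(repr(str(empty)))

# repr() always carries the class name, with or without a message.
print(describe(empty))
print(describe(with_msg))
```

For an exception that does carry a message, `repr()` keeps the message text as well, so switching from `str()` should not lose information in the logged field.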