- Add Observability.md documentation - Standardize task logging with correlation_id support - Add log_sanitizer utility for PII masking - Update Tasks.md tracking - Update geo_cache tasks and other task modules with correlation_id Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
18 KiB
Issue #20: MEDIUM - No Correlation ID in Background Tasks
Where found:
- All task files in
backend/app/tasks/ - Background tasks don't propagate correlation ID
- Can't correlate task logs with triggering request
Why this is needed: Troubleshooting becomes hard:
- Task fails
- Logs show task name
- Can't find what triggered it
Goal: Track correlation ID through entire request lifecycle including background tasks.
What to do:
- Use contextvars for correlation ID:
from contextvars import ContextVar correlation_id_var: ContextVar[str] = ContextVar('correlation_id', default='bg-task') async def blocklist_import_task(source_id: str, correlation_id: str): token = correlation_id_var.set(correlation_id) try: logger.info(f"Starting import for source {source_id}") finally: correlation_id_var.reset(token) - Pass correlation ID to background tasks
- Include in structured logs
- Create task tracking UI showing correlation ID
Possible traps and issues:
- Correlation ID must flow through all async contexts
- Need to pass ID when scheduling tasks
- Multiple nested tasks might have parent/child correlation IDs
Docs changes needed:
- Add observability guide for background tasks
- Document correlation ID format
Doc references:
- DETAILED_FINDINGS.md - Issue #26 "Missing Correlation ID"
Issue #21: MEDIUM - Missing Database Indexes for Performance
Where found:
- fail2ban's
bans.timeofbancolumn likely has no index - BanGUI queries that filter by timeofban + jail do full table scan
- Large datasets cause slow dashboard loads
Why this is needed: Without indexes on commonly filtered columns, query performance degrades as dataset grows.
Goal: Add appropriate indexes to improve query performance.
What to do:
- Verify index exists on fail2ban database:
CREATE INDEX IF NOT EXISTS idx_bans_timeofban_jail ON bans(timeofban DESC, jail); - Check BanGUI database for missing indexes
- Add to history_archive:
CREATE INDEX IF NOT EXISTS idx_history_jail_timeofban ON history_archive(jail, timeofban DESC); - Monitor query performance before/after
- Add index creation to migration system
Possible traps and issues:
- Creating indexes on large tables locks table (need CONCURRENT on PostgreSQL)
- Wrong index type slows queries instead of helping
- Too many indexes slow writes
Docs changes needed:
- Document query optimization patterns
- Performance tuning guide
Doc references:
- DETAILED_FINDINGS.md - Issue #10 "Missing DB Index"
Issue #22: MEDIUM - Socket Cleanup Errors Silently Suppressed
Where found:
backend/app/utils/fail2ban_client.py(line 150)- Uses
contextlib.suppress(Exception)hiding close errors - Resource leaks possible
Why this is needed: Suppressing all errors hides real problems:
- Socket never actually closes
- Connection pool exhaustion
- File descriptors leak
Goal: Log errors properly while still cleaning up resources.
What to do:
- Replace suppress with logging:
finally: try: await socket.close() except Exception as e: logger.warning(f"Error closing fail2ban socket: {e}", exc_info=True) - Audit other resource cleanup (database connections, HTTP sessions)
- Write tests that verify cleanup happens even on errors
Possible traps and issues:
- Close might raise unexpected errors
- Logging exceptions in finally blocks can cause issues
- Resource exhaustion hard to debug if not logging
Docs changes needed:
- Add resource cleanup guidelines to dev guide
Doc references:
- DETAILED_FINDINGS.md - Issue #14 "Socket Cleanup Silent Fail"
Issue #23: MEDIUM - Missing Default Configuration Documentation
Where found:
.env.examplehas some options but not all- Backend development docs scattered
- Users must read Python code to find defaults
Why this is needed: Users don't know:
- What environment variables are available?
- What are the defaults?
- What values are valid?
Goal: Create comprehensive configuration reference documentation.
What to do:
- Create
Docs/CONFIGURATION.mdwith complete table:| Variable | Type | Default | Description | |----------|------|---------|-------------| | BANGUI_DATABASE_PATH | string | /data/bangui.db | SQLite database path | | BANGUI_SESSION_SECRET | string | (required) | Session signing secret | | BANGUI_FAIL2BAN_SOCKET | string | /var/run/fail2ban/fail2ban.sock | fail2ban socket | - Document each option with valid values and constraints
- Organize by section (database, security, performance, etc.)
- Cross-reference in README and deployment docs
Possible traps and issues:
- Documentation can become stale as config options change
- Too much detail makes it hard to find what's needed
Docs changes needed:
- Create comprehensive configuration reference
- Update README to link to it
- Add to API documentation
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "5.1, 5.5, 5.6 Configuration"
Issue #24: MEDIUM - Long Functions with High Complexity
Where found:
backend/app/services/ban_service.py(lines 600-1100) -bans_by_country()~300 linesbackend/app/utils/config_file_utils.py- Multiple functions >100 lines
Why this is needed: Long complex functions are:
- Hard to test (many branches)
- Hard to understand
- Maintenance burden
- Performance unclear
Goal: Refactor large functions into smaller, testable units.
What to do:
- Identify functions >100 lines
- Break into smaller functions, each with single responsibility:
async def bans_by_country(self): # Load data bans = await self._load_bans_paginated() # Aggregate by_country = self._aggregate_by_country(bans) # Enrich enriched = await self._enrich_with_geo(by_country) # Sort and return return self._sort_by_count(enriched) - Each smaller function is testable independently
- Add unit tests for each piece
Possible traps and issues:
- Breaking up functions might expose bugs that were hidden
- Performance might change (profile before/after)
- Error handling complexity might increase
Docs changes needed:
- Add code complexity guidelines to style guide
Doc references:
- DETAILED_FINDINGS.md - Issue #19 "Long Functions"
Issue #25: MEDIUM - Incomplete Type Hints in Error Handling
Where found:
backend/app/main.py(line 283)- Error metadata uses
dict[str, str | int | list[str]]instead of TypedDict
Why this is needed: Generic types don't enable proper type narrowing in exception handlers. Code can't safely access error fields.
Goal: Use TypedDict for type-safe error responses.
What to do:
- Define error response types:
class ErrorResponse(TypedDict): error_id: str timestamp: int message: str tracebacks: list[str] correlation_id: str - Use in exception handlers
- Type checker can verify correct field access
Possible traps and issues:
- TypedDict is Python 3.8+ only
- Need to maintain multiple error response types
Docs changes needed:
- Add type safety guidelines
Doc references:
- DETAILED_FINDINGS.md - Issue #21 "Incomplete Type Hints"
Issue #26: MEDIUM - Hardcoded Constants Not Configurable
Where found:
backend/app/utils/constants.py- MAX_PAGE_SIZE = 1000
- BLOCKLIST_PREVIEW_MAX_LINES = 100
- HISTORY_RETENTION_DAYS = 90
Why this is needed: Different deployments have different needs:
- Large deployment might want smaller pages
- User might want different preview size
- Some want longer history retention
Goal: Make limits configurable via environment variables.
What to do:
- Move constants to config:
class Settings(BaseSettings): max_page_size: int = Field(default=1000, env="BANGUI_MAX_PAGE_SIZE") blocklist_preview_max_lines: int = Field(default=100, env="BANGUI_PREVIEW_MAX_LINES") history_retention_days: int = Field(default=90, env="BANGUI_HISTORY_RETENTION") - Validate ranges (max_page_size > 0, < 10000)
- Update .env.example with all options
- Document in configuration guide
Possible traps and issues:
- Too many configuration options can be overwhelming
- Some limits have dependencies (page_size < max_records)
Docs changes needed:
- Add to configuration reference
Doc references:
- DETAILED_FINDINGS.md - Issue #20 "Hardcoded Constants"
Issue #27: MEDIUM - Inconsistent Error Handling Patterns
Where found:
ban_service.py- Raises exceptionsserver_service.py- Returns defaults silentlyjail_service.py- Mixed approach
Why this is needed: Different services have different error handling contracts. Callers don't know what to expect.
Goal: Establish clear error handling contract for all services.
What to do:
- Document error handling patterns:
class ServiceErrorContract: """ ABORT_ON_ERROR: Raise exception, let router handle RETURN_DEFAULT: Return empty result, log warning PARTIAL_RESULT: Return partial success with error list """ - Each service method documents which pattern it follows
- Routers handle errors consistently
- Update service protocols
Possible traps and issues:
- Users of service must know pattern
- Switching patterns is breaking change
Docs changes needed:
- Create service development guide with error patterns
Doc references:
- DETAILED_FINDINGS.md - Issue "Inconsistent Error Handling"
LOWER PRIORITY ISSUES (LOW-MEDIUM)
Issue #28: LOW-MEDIUM - Missing Pre-Commit Hooks
Where found:
- No
.pre-commit-config.yaml - Docs mention husky but no
.husky/directory
Why this is needed: Without pre-commit hooks, developers commit code that fails linting/tests, slowing down CI.
Goal: Enforce code quality checks before commit.
What to do:
- Create
.pre-commit-config.yaml:repos: - repo: https://github.com/pre-commit/pre-commit-hooks hooks: - id: check-yaml - id: end-of-file-fixer - repo: https://github.com/astral-sh/ruff-pre-commit hooks: - id: ruff - id: ruff-format - Setup husky for frontend
- Document in CONTRIBUTING.md
Docs changes needed:
- Create CONTRIBUTING.md with setup instructions
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "12.1 Missing Pre-Commit Hooks"
Issue #29: LOW-MEDIUM - Missing .editorconfig
Where found:
- No
.editorconfigfile
Why this is needed: Different developers use different editors with different default formatting, causing inconsistent code.
Goal: Enforce consistent formatting across all editors.
What to do:
- Create
.editorconfig:root = true [*] charset = utf-8 end_of_line = lf insert_final_newline = true [*.py] indent_style = space indent_size = 4 [*.{js,ts,tsx,jsx}] indent_style = space indent_size = 2 - Add editorconfig plugin to IDE guides
Docs changes needed:
- Add to development setup instructions
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "12.3 Missing .editorconfig"
Issue #30: LOW-MEDIUM - IPv4-Mapped IPv6 Address Duplicates
Where found:
backend/app/utils/ip_utils.py- Treats "192.168.1.1" and "::ffff:192.168.1.1" as different IPs
Why this is needed: Same IP can be banned twice in different formats, causing:
- Duplicate ban logs
- Geo cache duplicates
- Analytics skewed
Goal: Normalize IP addresses to canonical form.
What to do:
- Add normalization:
def normalize_ip(ip_str: str) -> str: ip = ipaddress.ip_address(ip_str) # Convert IPv4-mapped IPv6 to IPv4 if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped: return str(ip.ipv4_mapped) return str(ip) - Apply on all IP inputs (ban, import, etc.)
- Test with various formats
Docs changes needed:
- Document IP normalization
Doc references:
- DETAILED_FINDINGS.md - Issue #22 "IPv4-Mapped IPv6"
Issue #31: LOW-MEDIUM - Weak Master Password Validation
Where found:
backend/app/models/setup.py(line 22)- Requires uppercase, digit, special char but no dictionary check
Why this is needed: Passwords can still be weak (e.g., "Password1!" is common).
Goal: Prevent common passwords.
What to do:
- Add common passwords list or library:
import common_passwords @field_validator("password") def validate_password(cls, v): if v.lower() in common_passwords.PASSWORDS: raise ValueError("Password is too common, choose another") return v - Test against known weak passwords
Docs changes needed:
- Document password requirements
Doc references:
- DETAILED_FINDINGS.md - Issue #23 "Weak Password Validation"
Issue #32: LOW-MEDIUM - Missing Accessibility Features
Where found:
frontend/src/components/BanTable.tsx- No aria-label on tablefrontend/src/pages/HistoryPage.tsx- Button has tabIndex but no onKeyDown handler- World map missing alt text
Why this is needed: Application not usable by screen reader users or keyboard-only navigation.
Goal: Improve accessibility to WCAG AA compliance.
What to do:
- Add ARIA labels to major components
- Implement keyboard navigation handlers
- Test with screen readers
- Check color contrast ratios
Docs changes needed:
- Add accessibility guidelines
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "11 Accessibility"
Issue #33: LOW - Missing Architecture Decision Records (ADRs)
Where found:
- No
Docs/adr/directory
Why this is needed: New developers don't understand architectural choices, recreate debates, make wrong assumptions.
Goal: Document important decisions and their rationale.
What to do:
- Create
Docs/adr/directory - Add ADRs for major decisions:
- Why SQLite instead of PostgreSQL?
- Why FastAPI instead of Django?
- Why React instead of Vue?
- Why APScheduler instead of Celery?
- Why single-instance scheduler?
Docs changes needed:
- Create ADR template and examples
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "8.5 Missing ADRs"
Issue #34: LOW - No Troubleshooting Guide
Where found:
- Missing
Docs/TROUBLESHOOTING.md
Why this is needed: Users can't self-serve on common issues, create issues instead.
Goal: Document common problems and solutions.
What to do:
- Create
Docs/TROUBLESHOOTING.mdwith:- "502 Bad Gateway" - backend is down or not ready
- "Permission denied" - database directory not writable
- "fail2ban not found" - socket path wrong
- "Geo lookups empty" - GeoLite2 database missing
- "Rate limited (429)" - too many requests
- Expand based on real user issues
Docs changes needed:
- Create comprehensive troubleshooting guide
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "8.3 No Troubleshooting"
Issue #35: LOW - Missing Upgrade/Migration Guide
Where found:
- No
Docs/UPGRADING.md
Why this is needed: Users don't know how to safely upgrade without losing data.
Goal: Document upgrade process and breaking changes.
What to do:
- Create
Docs/UPGRADING.mdwith:- Backup procedure
- Breaking changes for each version
- Step-by-step upgrade procedure
- Rollback procedure if something goes wrong
Docs changes needed:
- Create upgrade guide for each major version
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "8.5 No Migration Guide"
Issue #36: LOW - No Backup Strategy Documented
Where found:
- No backup procedure in deployment docs
- No automated backup in Docker Compose
Why this is needed: Users don't know how to protect their data.
Goal: Document and automate database backups.
What to do:
- Create
Docs/BACKUP_RESTORE.md - Add backup script to Docker
- Document retention policy
- Document restore procedure
Docs changes needed:
- Create backup & restore guide
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "10.4 No Backup Strategy"
Issue #37: LOW - Missing CONTRIBUTING.md
Where found:
fail2ban-master/CONTRIBUTING.mdis from fail2ban, not BanGUI
Why this is needed: Contributors don't know project guidelines.
Goal: Document contribution guidelines.
What to do:
- Create
CONTRIBUTING.mdwith:- Development setup
- Branch naming conventions
- PR requirements
- Code style guidelines
- Testing requirements
- PR review process
Docs changes needed:
- Create CONTRIBUTING.md
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "12.5 No CONTRIBUTING.md"
Issue #38: LOW - No Test Coverage Minimum Enforced
Where found:
backend/pyproject.toml- Coverage report generated but no minimum threshold- CI doesn't fail on low coverage
Why this is needed: Code quality can degrade as coverage drops.
Goal: Enforce minimum test coverage.
What to do:
- Set minimum coverage threshold in CI (e.g., 80%)
- Fail build if coverage drops below threshold
- Add coverage badge to README
Docs changes needed:
- Document testing requirements
Doc references:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "12.6 Test Coverage Not Enforced"
DOCUMENTATION GAPS (Cross-Cutting)
Issue #39: DOCUMENTATION - Missing API Reference
Files affected: All routers
Create: Comprehensive API reference documenting:
- All endpoints
- Request/response formats
- Status codes
- Examples
References:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "8.1 Missing API Documentation"
Issue #40: DOCUMENTATION - Missing Deployment Best Practices
Files affected: Docs/Deployment.md, Docker configuration
Create/Update:
- Security best practices
- Performance tuning
- Monitoring setup
- Scaling guidelines
References:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "6 Build & Deployment"
Issue #41: DOCUMENTATION - Missing Database Schema Documentation
Create: Document:
- All tables and their purpose
- Relationships and constraints
- Indexes and performance notes
- Migration history
References:
- DATABASE_API_DEPLOYMENT_ISSUES.md - Issue "1 Database Design"