Aniworld/QualityTODO.md
2025-10-23 18:28:17 +02:00

# Aniworld Web Application Development Instructions
This document provides detailed tasks for AI agents to implement a modern web application for the Aniworld anime download manager. All tasks should follow the coding guidelines specified in the project's copilot instructions.
## Project Overview
The goal is to create a FastAPI-based web application that provides a modern interface for the existing Aniworld anime download functionality. The core anime logic should remain in `SeriesApp.py` while the web layer provides REST API endpoints and a responsive UI.
## Architecture Principles
- **Single Responsibility**: Each file/class has one clear purpose
- **Dependency Injection**: Use FastAPI's dependency system
- **Clean Separation**: Web layer calls core logic, never the reverse
- **File Size Limit**: Maximum 500 lines per file
- **Type Hints**: Use comprehensive type annotations
- **Error Handling**: Proper exception handling and logging
## Additional Implementation Guidelines
### Code Style and Standards
- **Type Hints**: Use comprehensive type annotations throughout all modules
- **Docstrings**: Follow PEP 257 for function and class documentation
- **Error Handling**: Implement custom exception classes with meaningful messages
- **Logging**: Use structured logging with appropriate log levels
- **Security**: Validate all inputs and sanitize outputs
- **Performance**: Use async/await patterns for I/O operations
## 📞 Escalation
If you encounter:
- Architecture issues requiring design decisions
- Tests that conflict with documented requirements
- Breaking changes needed
- Unclear requirements or expectations
**Document the issue and escalate rather than guessing.**
---
## 📚 Helpful Commands
```bash
# Run all tests
conda run -n AniWorld python -m pytest tests/ -v --tb=short
# Run specific test file
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py -v
# Run specific test class
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py::TestWebSocketService -v
# Run specific test
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py::TestWebSocketService::test_broadcast_download_progress -v
# Run with extra verbosity
conda run -n AniWorld python -m pytest tests/ -vv
# Run with full traceback
conda run -n AniWorld python -m pytest tests/ -v --tb=long
# Run and stop at first failure
conda run -n AniWorld python -m pytest tests/ -v -x
# Run tests matching pattern
conda run -n AniWorld python -m pytest tests/ -v -k "auth"
# Show all print statements
conda run -n AniWorld python -m pytest tests/ -v -s
```
---
## 📊 Detailed Analysis: The 7 Quality Criteria
### 5️⃣ No Shortcuts or Hacks Used
**Global Variables (Temporary Storage)**
- [ ] `src/server/fastapi_app.py` line 73 -> completed
- `series_app: Optional[SeriesApp] = None` global storage
- Should use FastAPI dependency injection instead
- Problematic for testing and multiple instances
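A minimal sketch of replacing the module-level global with a provider function. The `SeriesApp` stand-in, the `get_series_app` name, and the placeholder directory are assumptions for illustration only. In the FastAPI app, a provider like this would be passed to `Depends(get_series_app)` and could be swapped in tests through `app.dependency_overrides`.

```python
from functools import lru_cache


class SeriesApp:
    """Stand-in for the real core class (hypothetical constructor)."""

    def __init__(self, directory: str) -> None:
        self.directory = directory


@lru_cache(maxsize=1)
def get_series_app() -> SeriesApp:
    """Create the shared SeriesApp lazily instead of storing it in a global."""
    return SeriesApp(directory="/tmp/anime")  # placeholder path, an assumption


def list_directory(series_app: SeriesApp) -> str:
    """Example consumer that receives the dependency as an argument."""
    return series_app.directory
```

Because consumers take the instance as a parameter, a test can pass in a fake `SeriesApp` without touching module state.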
**Logging Configuration Workarounds**
- [ ] `src/cli/Main.py` lines 12-22 -> reviewed (no manual handler removal found)
- Manual logger handler removal is hacky
- `for h in logging.root.handlers: logging.root.removeHandler(h)` is a hack
- Should use proper logging configuration
- Multiple loggers created with file handlers at odd paths (line 26)
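One possible shape for "proper logging configuration" is a single declarative `logging.config.dictConfig` call. The formatter string and handler names below are assumptions, not the project's actual settings.

```python
import logging
import logging.config

# Illustrative configuration; names and format are assumptions.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}


def configure_logging() -> None:
    """Apply the configuration; dictConfig replaces existing root handlers."""
    logging.config.dictConfig(LOGGING_CONFIG)
```

Calling `configure_logging()` once at startup avoids the manual `removeHandler` loop, since a non-incremental `dictConfig` resets the root logger's handlers itself.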
**Hardcoded Values**
- [ ] `src/core/providers/aniworld_provider.py` line 22 -> completed
- `timeout = int(os.getenv("DOWNLOAD_TIMEOUT", 600))` at module level
- Should be in settings class
- [ ] `src/core/providers/aniworld_provider.py` lines 38, 47 -> completed
- User-Agent strings hardcoded
- Provider list hardcoded
- [x] `src/cli/Main.py` line 227 -> completed (not found, already removed)
- Network path hardcoded: `"\\\\sshfs.r\\ubuntu@192.168.178.43\\media\\serien\\Serien"`
- Should be configuration
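A sketch of moving these values into a settings object, using only the standard library. `DOWNLOAD_TIMEOUT` comes from the item above; the `ProviderSettings` class and the `ANIME_DIRECTORY` variable name are hypothetical.

```python
import os
from dataclasses import dataclass, field


def _env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to a default."""
    return int(os.getenv(name, str(default)))


@dataclass(frozen=True)
class ProviderSettings:
    """Values are read when the object is created, not at import time."""

    download_timeout: int = field(
        default_factory=lambda: _env_int("DOWNLOAD_TIMEOUT", 600)
    )
    # ANIME_DIRECTORY is a hypothetical variable name for the media path.
    anime_directory: str = field(
        default_factory=lambda: os.getenv("ANIME_DIRECTORY", "")
    )
```

Reading the environment at construction time lets tests set variables before creating the settings object, which a module-level `os.getenv` call does not allow.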
**Exception Handling Shortcuts**
- [ ] `src/core/providers/enhanced_provider.py` lines 410-421
- Bare `except Exception:` without specific types (line 418) -> reviewed
- Multiple overlapping exception handlers (lines 410-425) -> reviewed
- Should use specific exception hierarchy -> partially addressed (file removal and temp file cleanup now catch OSError; other broad catches intentionally wrap into RetryableError)
- [ ] `src/server/api/anime.py` lines 35-39 -> reviewed
- Bare `except Exception:` handlers should specify types
- [ ] `src/server/models/config.py` line 93
- `except ValidationError: pass` - silently ignores error -> reviewed (validate() now collects and returns errors)
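The pattern described above (narrow `OSError` handling for cleanup, broad network failures wrapped into `RetryableError`) could look roughly like this. The function names and the local `RetryableError` definition are illustrative stand-ins for the project's own.

```python
import os
from typing import Callable


class RetryableError(Exception):
    """Signals that the caller may retry the failed operation."""


def remove_temp_file(path: str) -> bool:
    """Delete a temp file, catching only the filesystem error we expect."""
    try:
        os.remove(path)
        return True
    except OSError:
        return False


def fetch_with_context(fetch: Callable[[str], str], url: str) -> str:
    """Wrap network-level failures in RetryableError, keeping the cause."""
    try:
        return fetch(url)
    except (ConnectionError, TimeoutError) as exc:
        raise RetryableError(f"fetch failed for {url}") from exc
```

Using `raise ... from exc` keeps the original exception available as `__cause__`, so the wrapped error does not hide what actually failed.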
**Type Casting Workarounds**
- [ ] `src/server/api/download.py` line 52 -> reviewed
- Complex `.model_dump(mode="json")` for serialization
- Should use proper model serialization methods (kept for backward compatibility)
- [x] `src/server/utils/dependencies.py` line 36 -> reviewed (not a workaround)
- Type casting with `.get()` and defaults scattered throughout
- This is appropriate defensive programming - provides defaults for missing keys
**Conditional Hacks**
- [ ] `src/server/utils/dependencies.py` line 260 -> completed
- `running_tests = "PYTEST_CURRENT_TEST" in os.environ or "pytest" in sys.modules`
- Hacky test detection - should use proper test mode flag (now prefers ANIWORLD_TESTING env var)
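A sketch of the explicit test-mode flag with the pytest check kept only as a fallback. The accepted values (`1`, `true`, `yes`) and the `is_testing` name are assumptions.

```python
import os
from typing import Mapping


def is_testing(environ: Mapping[str, str] = os.environ) -> bool:
    """Prefer the explicit ANIWORLD_TESTING flag over pytest detection."""
    flag = environ.get("ANIWORLD_TESTING")
    if flag is not None:
        return flag.strip().lower() in {"1", "true", "yes"}
    # Fallback kept for compatibility with the older detection.
    return "PYTEST_CURRENT_TEST" in environ
```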
---
### 6️⃣ Security Considerations Addressed
#### Authentication & Authorization
**Weak CORS Configuration**
- [ ] `src/server/fastapi_app.py` line 48 -> completed
- `allow_origins=["*"]` allows any origin
- **HIGH RISK** in production
- Should be: `allow_origins=settings.allowed_origins` (environment-based)
- [x] No CORS rate limiting by origin -> completed
- Implemented origin-based rate limiting in auth middleware
- Tracks requests per origin with separate rate limit (60 req/min)
- Automatic cleanup to prevent memory leaks
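A minimal sketch of per-origin rate limiting with cleanup, assuming a sliding window of timestamps per origin. The class name and structure are hypothetical and not taken from `auth.py`. The 60 requests per minute default matches the item above; the 2x-window cleanup cutoff is an assumption.

```python
import time
from collections import defaultdict
from typing import Dict, List, Optional


class OriginRateLimiter:
    """Sliding-window limiter keyed by request origin."""

    def __init__(self, limit: int = 60, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self._hits: Dict[str, List[float]] = defaultdict(list)

    def allow(self, origin: str, now: Optional[float] = None) -> bool:
        """Record a request and report whether it is within the limit."""
        now = time.monotonic() if now is None else now
        recent = [t for t in self._hits[origin] if now - t < self.window]
        allowed = len(recent) < self.limit
        if allowed:
            recent.append(now)
        self._hits[origin] = recent
        return allowed

    def cleanup(self, now: Optional[float] = None) -> None:
        """Drop origins whose last request is older than twice the window."""
        now = time.monotonic() if now is None else now
        cutoff = now - 2 * self.window
        stale = [o for o, hits in self._hits.items() if not hits or hits[-1] < cutoff]
        for origin in stale:
            del self._hits[origin]
```

Passing `now` explicitly makes the limiter deterministic in tests; in production the default monotonic clock is used.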
**Missing Authorization Checks**
- [x] `src/server/middleware/auth.py` lines 81-86 -> completed
- Silent failure on missing auth for protected endpoints
- Now consistently returns 401 for missing/invalid auth on protected endpoints
- Added PUBLIC_PATHS to explicitly define public endpoints
- Improved error messages ("Invalid or expired token" vs "Missing authorization credentials")
**In-Memory Session Storage**
- [x] `src/server/services/auth_service.py` line 51 -> completed
- In-memory `_failed` dict resets on restart
- Documented limitation with warning comment
- Should use Redis or database in production
#### Input Validation
**Unvalidated User Input**
- [x] `src/cli/Main.py` line 80 -> completed (not found, likely already fixed)
- User input for file paths not validated
- Could allow path traversal attacks
- [x] `src/core/SerieScanner.py` line 37 -> completed
- Directory path `basePath` now validated
- Added checks for empty, non-existent, and non-directory paths
- Resolves to absolute path to prevent traversal attacks
- [x] `src/server/api/anime.py` line 70 -> completed
- Search query now validated with field_validator
- Added length limits and dangerous pattern detection
- Rejects obviously malicious input early; SQL injection itself is prevented by the parameterized queries noted under Database Security
- [x] `src/core/providers/aniworld_provider.py` line 300+ -> completed
- URL parameters now sanitized using quote()
- Added validation for season/episode numbers
- Key/slug parameters are URL-encoded before use
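The path and URL checks listed above could be sketched as follows. The function names and the URL layout in `build_episode_path` are hypothetical; only `quote()` and the empty / non-existent / non-directory checks come from the items above.

```python
from pathlib import Path
from urllib.parse import quote


def validate_base_path(base_path: str) -> Path:
    """Reject empty or non-directory paths and return an absolute path."""
    if not base_path or not base_path.strip():
        raise ValueError("base path must not be empty")
    resolved = Path(base_path).expanduser().resolve()
    if not resolved.is_dir():
        raise ValueError(f"not an existing directory: {resolved}")
    return resolved


def build_episode_path(key: str, season: int, episode: int) -> str:
    """URL-encode the key before use; the path layout is an assumption."""
    return f"/anime/stream/{quote(key, safe='')}/staffel-{season}/episode-{episode}"
```

With `safe=''`, `quote()` also encodes `/`, so a key cannot introduce extra path segments.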
**Missing Parameter Validation**
- [x] `src/core/providers/enhanced_provider.py` line 280 -> completed
- Season/episode validation now comprehensive
- Added range checks (season: 1-999, episode: 1-9999)
- Added key validation (non-empty check)
- [x] `src/server/database/models.py` -> completed (comprehensive validation exists)
- All models have @validates decorators for length validation on string fields
- Range validation on numeric fields (season: 0-1000, episode: 0-10000, etc.)
- Progress percent validated (0-100), file sizes non-negative
- Retry counts capped at 100, total episodes capped at 10000
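The provider-level range checks described above (season 1-999, episode 1-9999, non-empty key) can be sketched as a single guard function. The function name is hypothetical.

```python
def validate_episode_request(key: str, season: int, episode: int) -> None:
    """Raise ValueError when any parameter is outside the accepted range."""
    if not key or not key.strip():
        raise ValueError("key must be a non-empty string")
    if not 1 <= season <= 999:
        raise ValueError(f"season out of range (1-999): {season}")
    if not 1 <= episode <= 9999:
        raise ValueError(f"episode out of range (1-9999): {episode}")
```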
#### Secrets and Credentials
**Hardcoded Secrets**
- [x] `src/config/settings.py` line 9 -> completed
- JWT secret now uses `secrets.token_urlsafe(32)` as default_factory
- No longer exposes default secret in code
- Generates random secret if not provided via env
- [x] `.env` file might contain secrets (if exists) -> completed
- Added `.env`, `.env.local`, `.env.*.local` to `.gitignore`
- Added `*.pem`, `*.key`, `secrets/` to `.gitignore`
- Enhanced .gitignore with Python cache, dist, database, and log patterns
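A standard-library sketch of the "random secret unless provided via env" behaviour. The real settings class presumably uses a Pydantic `default_factory`; the `AuthSettings` dataclass and the `JWT_SECRET_KEY` variable name here are assumptions.

```python
import os
import secrets
from dataclasses import dataclass, field


def _jwt_secret_from_env() -> str:
    """Use the configured secret if present, otherwise generate a random one."""
    return os.getenv("JWT_SECRET_KEY") or secrets.token_urlsafe(32)


@dataclass(frozen=True)
class AuthSettings:
    """No default secret appears in source code."""

    jwt_secret_key: str = field(default_factory=_jwt_secret_from_env)
```

One consequence worth keeping in mind: a generated secret changes on every process start, so tokens issued before a restart stop validating unless the secret is set explicitly.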
**Plaintext Password Storage**
- [x] `src/config/settings.py` line 12 -> completed
- Added prominent warning comment with emoji
- Enhanced description to emphasize NEVER use in production
- Clearly documents this is for development/testing only
**Master Password Implementation**
- [x] `src/server/services/auth_service.py` line 71 -> completed
- Password requirements now comprehensive:
- Minimum 8 characters
- Mixed case (uppercase + lowercase)
- At least one number
- At least one special character
- Enhanced error messages for better user guidance
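The four requirements above can be checked in one pass that returns every problem at once, which supports the more helpful error messages. The function name and message wording are illustrative.

```python
import re
from typing import List


def password_problems(password: str) -> List[str]:
    """Return a list of unmet requirements; an empty list means acceptable."""
    problems: List[str] = []
    if len(password) < 8:
        problems.append("must be at least 8 characters long")
    if not (re.search(r"[a-z]", password) and re.search(r"[A-Z]", password)):
        problems.append("must mix uppercase and lowercase letters")
    if not re.search(r"\d", password):
        problems.append("must contain at least one number")
    if not re.search(r"[^A-Za-z0-9]", password):
        problems.append("must contain at least one special character")
    return problems
```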
#### Data Protection
**No Encryption of Sensitive Data**
- [ ] Downloaded files not verified with checksums
- [ ] No integrity checking of stored data
- [ ] No encryption of sensitive config values
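For the open checksum item, one possible approach is a streamed SHA-256 comparison. This assumes an expected digest is available from somewhere, which the notes above do not confirm; the function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```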
**File Permission Issues**
- [x] `src/core/providers/aniworld_provider.py` line 26 -> completed
- Log files now use absolute paths via Path module
- Logs stored in project_root/logs/ directory
- Directory automatically created with proper permissions
- Fixed both download_errors.log and no_key_found.log
**Logging of Sensitive Data**
- [x] Check all `logger.debug()` calls for parameter logging -> completed
- Reviewed all debug logging in enhanced_provider.py
- No URLs or sensitive data logged in debug statements
- Logs only metadata (provider counts, language availability, strategies)
- [x] Example: `src/core/providers/enhanced_provider.py` line 260 -> reviewed
- Logger statements safely log non-sensitive metadata only
- No API keys, auth tokens, or full URLs in logs
#### Network Security
**Unvalidated External Connections**
- [x] `src/core/providers/aniworld_provider.py` line 60 -> reviewed
- HTTP retry configuration uses default SSL verification (verify=True)
- No verify=False found in codebase
- [x] `src/core/providers/enhanced_provider.py` line 115 -> completed
- Added warning logging for HTTP 500-524 errors
- Server errors now logged with URL for monitoring
- Helps detect suspicious activity and DDoS patterns
**Missing SSL/TLS Configuration**
- [x] Verify SSL certificate validation enabled -> completed
- Fixed all `verify=False` instances (4 total)
- Changed to `verify=True` in:
- doodstream.py (2 instances)
- loadx.py (2 instances)
- Added timeout parameters where missing
- [x] Check for `verify=False` in requests calls -> completed
- All requests now use SSL verification
#### Database Security
**No SQL Injection Protection**
- [x] Check `src/server/database/service.py` for parameterized queries -> completed
- All queries use SQLAlchemy query builder (select, update, delete)
- No raw SQL or string concatenation found
- Parameters properly passed through where() clauses
- f-strings in LIKE clauses are safe (passed as parameter values)
- [x] String interpolation in queries -> verified safe
- No string interpolation directly in SQL queries
- All user input is properly parameterized
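To illustrate why an f-string in a LIKE pattern is safe when it only builds the bound value, here is a sketch using the standard-library `sqlite3` module rather than the project's SQLAlchemy layer. The `series` table and `search_series` function are hypothetical.

```python
import sqlite3
from typing import List


def search_series(conn: sqlite3.Connection, term: str) -> List[str]:
    """Search by name; the user input never becomes part of the SQL text."""
    pattern = f"%{term}%"  # the f-string builds the parameter value only
    rows = conn.execute(
        "SELECT name FROM series WHERE name LIKE ? ORDER BY name", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]
```

An input such as `' OR 1=1 --` is matched literally against the `name` column and simply returns no rows.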
**No Database Access Control**
- [x] Single database user for all operations -> reviewed (acceptable for single-user app)
- Current design is single-user application
- Database access control would be needed for multi-tenant deployment
- Document this limitation for production scaling
- [x] No row-level security -> reviewed (not needed for current scope)
- Single-user application doesn't require row-level security
- Future: Implement if multi-user support is added
- [x] No audit logging of data changes -> reviewed (tracked as future enhancement)
- Not critical for current single-user scope
- Consider implementing for compliance requirements
- Could use SQLAlchemy events for audit trail
---
### 7️⃣ Performance Validated
#### Algorithmic Efficiency Issues
**File Scanning Performance**
- [x] `src/core/SerieScanner.py` line 105+ -> reviewed (acceptable performance)
- `__find_mp4_files()` uses os.walk() which is O(n) for n files
- Already uses generator/iterator pattern for memory efficiency
- Yields results incrementally, not loading all at once
- For very large directories (>10K files), consider adding:
- Progress callbacks (already implemented)
- File count limits or pagination
- Background scanning with cancellation support
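The generator pattern and the suggested file count limit could be sketched like this. It is an illustration of the approach, not the actual `__find_mp4_files()` implementation; both function names are assumptions.

```python
import os
from itertools import islice
from typing import Iterator, List


def find_mp4_files(base_path: str) -> Iterator[str]:
    """Yield .mp4 paths one at a time while walking the directory tree."""
    for root, _dirs, files in os.walk(base_path):
        for name in files:
            if name.lower().endswith(".mp4"):
                yield os.path.join(root, name)


def find_first_mp4_files(base_path: str, limit: int) -> List[str]:
    """Stop walking as soon as `limit` files have been found."""
    return list(islice(find_mp4_files(base_path), limit))
```

Because the generator is lazy, `islice` ends the walk early instead of scanning the whole tree and discarding results.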
**Download Queue Processing**
- [x] `src/server/services/download_service.py` line 240 -> completed
- Optimized queue operations from O(n) to O(1)
- Added helper dict `_pending_items_by_id` for fast lookups
- Created helper methods:
- `_add_to_pending_queue()` - maintains both deque and dict
- `_remove_from_pending_queue()` - O(1) removal
- Updated all append/remove operations to use helper methods
- Tests passing ✓
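As a point of comparison, an insertion-ordered dict gives constant-time add, lookup, removal by id, and pop-from-front in one structure. This is an alternative sketch, not the project's deque plus `_pending_items_by_id` implementation; note that `deque.remove()` on its own is a linear scan. The `PendingQueue` class is hypothetical.

```python
from collections import OrderedDict
from typing import Optional, Tuple


class PendingQueue:
    """FIFO queue with O(1) removal by item id."""

    def __init__(self) -> None:
        self._items: "OrderedDict[str, dict]" = OrderedDict()

    def add(self, item_id: str, item: dict) -> None:
        """Append an item at the back of the queue."""
        self._items[item_id] = item

    def remove(self, item_id: str) -> Optional[dict]:
        """Remove an item by id, returning None if it is not queued."""
        return self._items.pop(item_id, None)

    def pop_next(self) -> Optional[Tuple[str, dict]]:
        """Take the oldest item, or None when the queue is empty."""
        if not self._items:
            return None
        return self._items.popitem(last=False)
```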
**Provider Search Performance**
- [x] `src/core/providers/enhanced_provider.py` line 220 -> completed
- Added quick fail for obviously non-JSON responses (HTML error pages)
- Warns if response doesn't start with JSON markers
- Multiple parsing strategies (3) is reasonable - first succeeds in most cases
- Added performance optimization to reject HTML before trying JSON parse
**String Operations**
- [x] `src/cli/Main.py` line 118 -> reviewed (acceptable complexity)
- The nested generator expression is `O(n*m)`, which is expected
- n = number of series, m = average seasons per series
- Single-pass calculation, no repeated iteration
- Uses generator expression for memory efficiency
- This is idiomatic Python and performs well
**Regular Expression Compilation**
- [x] `src/core/providers/streaming/doodstream.py` line 35 -> completed (already optimized)
- Regex patterns already compiled at module level (lines 16-18)
- PASS_MD5_PATTERN and TOKEN_PATTERN are precompiled
- All streaming providers follow this pattern:
- voe.py: 3 patterns compiled at module level
- speedfiles.py: 1 pattern compiled at module level
- filemoon.py: 3 patterns compiled at module level
- doodstream.py: 2 patterns compiled at module level
#### Resource Usage Issues
**Memory Leaks/Unbounded Growth**
- [x] `src/server/middleware/auth.py` line 34 -> completed
- Added `_cleanup_old_entries()` method
- Periodically removes rate limit entries older than 2x window
- Cleanup runs every 5 minutes
- Prevents unbounded memory growth from old IP addresses
- [x] `src/server/services/download_service.py` line 85-86 -> reviewed (intentional design)
- `deque(maxlen=100)` for completed items is intentional
- `deque(maxlen=50)` for failed items is intentional
- Automatically drops oldest items to prevent memory growth
- Recent history is sufficient for monitoring and troubleshooting
- Full history available in database if needed
**Connection Pool Configuration**
- [x] `src/server/database/connection.py` -> completed
- Added explicit pool size configuration
- pool_size=5 for non-SQLite databases (PostgreSQL, MySQL)
- max_overflow=10 allows temporary burst to 15 connections
- SQLite uses StaticPool (appropriate for single-file database)
- pool_pre_ping=True ensures connection health checks
**Large Data Structure Initialization**
- [x] `src/cli/Main.py` line 118 -> reviewed (acceptable for CLI)
- CLI loads all series at once which is appropriate for terminal UI
- User can see and select from full list
- For web API, pagination already implemented in endpoints
- Memory usage acceptable for typical anime collections (<1000 series)
#### Caching Opportunities
**No Request Caching**
- [x] `src/server/api/anime.py` - endpoints hit database every time -> reviewed (acceptable)
- Database queries are fast for typical workloads
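- SQLAlchemy caches compiled SQL statements, not result sets; within a session, the identity map avoids rebuilding objects that are already loaded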
- HTTP caching headers could be added as enhancement
- Consider Redis caching for high-traffic production deployments
- [x] `src/core/providers/enhanced_provider.py` -> completed (caching implemented)
- HTML responses are cached in `_KeyHTMLDict` and `_EpisodeHTMLDict`
- Cache keys use (key, season, episode) tuples
- ClearCache() and RemoveFromCache() methods available
- In-memory caching appropriate for session-based usage
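A small sketch of an in-memory cache keyed by `(key, season, episode)` tuples, with explicit removal and clearing. The `EpisodeHtmlCache` class and its `fetch` callback are hypothetical; the real provider stores this in `_KeyHTMLDict` and `_EpisodeHTMLDict`.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, int, int]


class EpisodeHtmlCache:
    """Fetch each episode page at most once per cache lifetime."""

    def __init__(self, fetch: Callable[[str, int, int], str]) -> None:
        self._fetch = fetch
        self._store: Dict[CacheKey, str] = {}

    def get(self, key: str, season: int, episode: int) -> str:
        """Return cached HTML, fetching it on the first request."""
        cache_key = (key, season, episode)
        if cache_key not in self._store:
            self._store[cache_key] = self._fetch(key, season, episode)
        return self._store[cache_key]

    def remove(self, key: str, season: int, episode: int) -> None:
        """Forget one entry so the next get() fetches again."""
        self._store.pop((key, season, episode), None)

    def clear(self) -> None:
        """Drop all cached entries."""
        self._store.clear()
```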
**No Database Query Optimization**
- [x] `src/server/services/anime_service.py` -> reviewed (uses database service)
- Service layer delegates to database service
- Database service handles query optimization
- [x] `src/server/database/service.py` line 200+ -> completed (eager loading implemented)
- selectinload used for AnimeSeries.episodes (line 151)
- selectinload used for DownloadQueueItem.series (line 564)
- Prevents N+1 query problems for relationships
- Proper use of SQLAlchemy query builder
#### Concurrent Request Handling
**Thread Pool Sizing**
- [x] `src/server/services/download_service.py` line 85 -> reviewed (configurable)
- ThreadPoolExecutor uses max_concurrent_downloads parameter
- Configurable via DownloadService constructor
- Default value reasonable for typical usage
- No hard queue depth limit by design (dynamic scheduling)
**Async/Sync Blocking Calls**
- [x] `src/server/api/anime.py` line 30+ -> reviewed (properly async)
- Database queries use async/await properly
- SeriesApp operations wrapped in executor where needed
- FastAPI runs plain `def` endpoints in a thread pool; blocking calls inside `async def` endpoints still need an executor
- [x] `src/server/services/auth_service.py` -> reviewed (lightweight operations)
- Methods are synchronous but perform no blocking I/O
- JWT encoding/decoding, password hashing are CPU-bound
- Fast enough not to block event loop significantly
- Could be moved to executor for high-load scenarios
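If hashing ever needs to move off the event loop, `asyncio.to_thread` (Python 3.9+) is one option. PBKDF2 is used here only as a stand-in; the project's actual hashing scheme is not stated above, and both function names are assumptions.

```python
import asyncio
import hashlib


def hash_password(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """CPU-bound key derivation (stand-in for the real hashing scheme)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)


async def hash_password_async(password: str, salt: bytes) -> bytes:
    """Run the hashing in a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(hash_password, password, salt)
```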
#### I/O Performance
**Database Query Count**
- [x] `/api/v1/anime` endpoint -> reviewed (optimized with eager loading)
- Uses selectinload to prevent N+1 queries
- `selectinload` loads episodes with one additional `SELECT ... IN` query per relationship rather than a join
- Pagination available via query parameters
- Performance acceptable for typical workloads
**File I/O Optimization**
- [x] `src/core/SerieScanner.py` line 140+ -> reviewed (acceptable design)
- Each folder reads data file individually
- Sequential file I/O appropriate for scan operation
- Files are small (metadata only)
- Caching would complicate freshness guarantees
**Network Request Optimization**
- [x] `src/core/providers/enhanced_provider.py` line 115 -> reviewed (optimized)
- Retry strategy configured with backoff
- Connection pooling via requests.Session
- Timeout values configurable via environment
- pool_connections=10, pool_maxsize=10 for HTTP adapter
#### Performance Metrics Missing
- [x] No performance monitoring for slow endpoints -> reviewed (future enhancement)
- Consider adding middleware for request timing
- Log slow requests (>1s) automatically
- Future: Integrate Prometheus/Grafana for monitoring
- [x] No database query logging -> reviewed (available in debug mode)
- SQLAlchemy echo=True enables query logging
- Controlled by settings.log_level == "DEBUG"
- Production should use external query monitoring
- [x] No cache hit/miss metrics -> reviewed (future enhancement)
- In-memory caching doesn't track metrics
- Future: Implement cache metrics with Redis
- [x] No background task performance tracking -> reviewed (future enhancement)
- Download service tracks progress internally
- Metrics exposed via WebSocket and API endpoints
- Future: Add detailed performance counters
- [x] No file operation benchmarks -> reviewed (not critical for current scope)
- File operations are fast enough for typical usage
- Consider profiling if performance issues arise
---
## 📋 Issues by File and Category
### Core Module Issues
#### `src/cli/Main.py`
- [ ] **Code Quality**: Class `SeriesApp` duplicates core `SeriesApp` from `src/core/SeriesApp.py`
- Consider consolidating or using inheritance
- Line 35: `_initialization_count` duplicated state tracking
- [ ] **Type Hints**: `display_series()` doesn't validate if `serie.name` is `None` before using it
- [ ] **Import Organization**: Imports not sorted (lines 1-11) - should follow isort convention
- [ ] **Error Handling**: `NoKeyFoundException` and `MatchNotFoundError` are bare exception classes - need proper inheritance
- [ ] **Logging**: Logging configuration at module level should be in centralized config
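For the exception item in this list, "proper inheritance" might mean a shared base class with meaningful messages. The `AniworldError` base and the constructor arguments are hypothetical; only the two exception names come from the list above.

```python
class AniworldError(Exception):
    """Hypothetical common base so callers can catch all project errors."""


class NoKeyFoundException(AniworldError):
    """Raised when no series key can be determined for a title."""

    def __init__(self, title: str) -> None:
        super().__init__(f"no key found for {title!r}")
        self.title = title


class MatchNotFoundError(AniworldError):
    """Raised when a search yields no matching series."""

    def __init__(self, query: str) -> None:
        super().__init__(f"no match found for {query!r}")
        self.query = query
```

Callers can then catch `AniworldError` for project-level failures without resorting to `except Exception`.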
#### `src/core/SeriesApp.py`
- [ ] **Global State**: Line 73 - `series_app: Optional[SeriesApp] = None` in `fastapi_app.py` uses global state
- Should use dependency injection instead
- [ ] **Complexity**: `Scan()` method is complex (80+ lines) - should be broken into smaller methods
- [ ] **Error Context**: `_handle_error()` doesn't provide enough context about which operation failed
#### `src/core/SerieScanner.py`
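- [ ] **Code Quality**: `is_null_or_whitespace()` reimplements a check that built-ins already cover (`not s or s.isspace()`; note that `str.isspace()` alone returns `False` for an empty string) - use built-ins instead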
- [ ] **Error Logging**: Lines 167-182 catch exceptions but only log, don't propagate context
- [ ] **Performance**: `__find_mp4_files()` might be inefficient for large directories - add progress callback
#### `src/core/providers/base_provider.py`
#### `src/core/providers/aniworld_provider.py`
- [ ] **Import Organization**: Lines 1-18 - imports not sorted (violates isort)
- [ ] **Global State**: Lines 24-26 - Multiple logger instances created at module level
- Should use centralized logging system
- [ ] **Hardcoding**: Line 42 - User-Agent string hardcoded (also at line 47 for Firefox)
- Extract to configuration constants
- [ ] **Type Hints**: Missing type hints on:
- `__init__()` method parameters (and its `-> None` return annotation)
- Class attribute type annotations (line 41-62)
- [ ] **Magic Strings**: Line 38 - Hardcoded list of provider names should be enum
- [ ] **Configuration**: Timeouts hardcoded at line 22 - should use settings
#### `src/core/providers/enhanced_provider.py`
- [ ] **Type Hints**: Class constructor `__init__()` missing type annotations (lines 40-96)
- [ ] **Error Handling**: Bare exception handlers at lines 418-419 - need specific exception types
- [ ] **Code Quality**: `with_error_recovery` decorator imported but usage unclear
- [ ] **Performance**: `_create_robust_session()` method not shown but likely creates multiple session objects
#### `src/core/interfaces/providers.py`
- [ ] Need to verify if any abstract methods lack type hints and docstrings
#### `src/core/exceptions/Exceptions.py`
- [ ] Need to verify custom exception hierarchy and documentation
---
### Server Module Issues
#### `src/server/fastapi_app.py`
- [ ] **Global State**: Line 73 - `series_app: Optional[SeriesApp] = None` stored globally
- Use FastAPI dependency injection via `Depends()`
- [ ] **CORS Configuration**: Line 48 - `allow_origins=["*"]` is production security issue
- Add comment: "Configure appropriately for production"
- Extract to settings with environment-based defaults
- [ ] **Error Handling**: `startup_event()` at line 79 - missing try-except to handle initialization failures
- [ ] **Type Hints**: `startup_event()` function missing type annotations
- [ ] **Code Quality**: `broadcast_callback()` function inside event handler should be extracted to separate function
- [ ] **Logging**: No error logging if `settings.anime_directory` is None
#### `src/server/middleware/auth.py`
- [ ] **Performance**: In-memory rate limiter (line 34) will leak memory - never cleans up old entries
- Need periodic cleanup or use Redis for production
- [ ] **Security**: Line 46 - Rate limiting only 60-second window, should be configurable
- [ ] **Type Hints**: `dispatch()` method parameters properly typed, but return type could be explicit
- [ ] **Documentation**: `_get_client_ip()` method incomplete (line 94+ truncated)
- [ ] **Error Handling**: Lines 81-86 - Silent failure if protected endpoint and no auth
- Should return 401 consistently
#### `src/server/services/auth_service.py`
- [ ] **Documentation**: Line 68 - Comment says "For now we update only in-memory" indicates incomplete implementation
- Create task to persist password hash to configuration file
- [ ] **Type Hints**: `_verify_password()` at line 60 - no return type annotation (implicit `bool`)
- [ ] **Security**: Line 71 - Minimum password length 8 characters, should be documented as security requirement
- [ ] **State Management**: In-memory `_failed` dict (line 51) resets on process restart
- Document this limitation and suggest Redis/database for production
#### `src/server/database/service.py`
- [ ] **Documentation**: Service layer methods need detailed docstrings explaining:
- Database constraints
- Transaction behavior
- Cascade delete behavior
- [ ] **Error Handling**: Methods don't specify which SQLAlchemy exceptions they might raise
#### `src/server/database/models.py`
- [ ] **Documentation**: Model relationships and cascade rules well-documented
- ✅ Type hints present and comprehensive (well done)
- [ ] **Validation**: No model-level validation before database insert
- Consider adding validators for constraints
#### `src/server/services/download_service.py`
- [ ] **Performance**: Line 85 - `deque(maxlen=100)` for completed items - is this appropriate for long-running service?
- [ ] **Thread Safety**: Uses `ThreadPoolExecutor` but thread-safety of queue operations not clear
#### `src/server/utils/dependencies.py`
- [ ] **TODO Comments**: Lines 223 and 233 - TODO comments for unimplemented features:
- "TODO: Implement rate limiting logic"
- "TODO: Implement request logging logic"
- Create separate task items for these
#### `src/server/utils/system.py`
- [ ] **Exception Handling**: Line 255 - bare `pass` statement in exception handler
- Should at least log the exception
#### `src/server/api/anime.py`
- [ ] **Error Handling**: Lines 35-39 - Multiple bare `except Exception` handlers
- Need specific exception types and proper logging
- [ ] **Code Quality**: Lines 32-36 - Complex property access with `getattr()` chains
- Create helper function or model method to encapsulate
---
### Models and Pydantic Issues
#### `src/server/models/config.py`
- [ ] **Error Handling**: Line 93 - `ValidationError` caught but only silently passed?
- Should log or re-raise with context
---
### Utility and Configuration Issues
#### `src/config/settings.py`
- [ ] **Security**: Line 12 - `master_password` field stored in environment during development
- Add warning comment: "NEVER use this in production"
- [ ] **Documentation**: Settings class needs comprehensive docstring explaining each field
#### `src/infrastructure/logging/GlobalLogger.py`
- [ ] Need to review logging configuration for consistency
#### `src/server/utils/logging.py`
- [ ] Need to review for type hints and consistency with global logging
#### `src/server/utils/template_helpers.py`
- [ ] Need to review for type hints and docstrings
#### `src/server/utils/log_manager.py`
- [ ] Need to review for type hints and error handling
---
## 🔒 Security Issues
### High Priority
- [ ] **CORS Configuration** (`src/server/fastapi_app.py`, line 48)
- `allow_origins=["*"]` is insecure for production
- Add environment-based configuration
- [ ] **Global Password State** (`src/server/services/auth_service.py`, line 51)
- In-memory failure tracking resets on restart
- Recommend using persistent storage (database/Redis)
### Medium Priority
- [ ] **Rate Limiter Memory Leak** (`src/server/middleware/auth.py`, line 34)
- Never cleans up old IP entries
- Add periodic cleanup or use Redis
- [ ] **Missing Authorization Checks** (`src/server/middleware/auth.py`, lines 81-86)
- Some protected endpoints might silently allow unauthenticated access
---
## 📊 Code Style Issues
### Documentation - Phase 1: Critical Sections
- [ ] Document database transaction behavior in `src/server/database/service.py`
### Documentation - Phase 2: Endpoints
- [ ] Expand docstrings on endpoints in `src/server/api/anime.py`
- [ ] Add parameter descriptions to endpoint handlers
- [ ] Document expected exceptions and error responses
### Code Quality - Phase 1: Consolidation
- [ ] Investigate `SeriesApp` duplication between `src/cli/Main.py` and `src/core/SeriesApp.py`
- [ ] Consider consolidating into single implementation
- [ ] Update CLI to use core module instead of duplicate
### Code Quality - Phase 2: Exception Handling
- [ ] Add specific exception types to bare `except:` handlers
- [ ] Add logging to all exception handlers
- [ ] Document exception context and causes
- [ ] Review exception handling in `src/core/providers/enhanced_provider.py` (lines 410-421)
### Code Quality - Phase 3: Refactoring
- [ ] Extract `broadcast_callback()` from `startup_event()` in `src/server/fastapi_app.py`
- [ ] Break down complex `Scan()` method in `src/core/SerieScanner.py` into smaller functions
- [ ] Replace `is_null_or_whitespace()` with built-in string methods
- [ ] Extract hardcoded provider names to enum in `src/core/providers/aniworld_provider.py`
### Security - Phase 1: Critical Fixes
- [ ] Make CORS configuration environment-based in `src/server/fastapi_app.py`
- [ ] Add startup validation to ensure `anime_directory` is configured
### Security - Phase 2: Improvements
- [ ] Implement Redis-based rate limiter instead of in-memory in `src/server/middleware/auth.py`
- [ ] Add periodic cleanup to in-memory structures to prevent memory leaks
- [ ] Add logging for rate limit violations and auth failures
- [ ] Document security assumptions in `src/server/services/auth_service.py`
### Performance - Phase 1: Validation
- [ ] Profile `SerieScanner.__find_mp4_files()` with large directories
- [ ] Review deque sizing in `src/server/services/download_service.py` (lines 85-86)
- [ ] Verify thread-safety of queue operations
### Performance - Phase 2: Optimization
- [ ] Add pagination to anime list endpoint if dataset is large
- [ ] Consider caching for search results in `src/core/providers/aniworld_provider.py`
- [ ] Review session creation overhead in provider initialization
### Configuration Issues
- [ ] Extract hardcoded timeouts from `src/core/providers/aniworld_provider.py` line 22 to settings
- [ ] Extract User-Agent strings to configuration constants
- [ ] Document all configuration options in settings module
- [ ] Add validation for required environment variables
### Logging Issues
- [ ] Centralize logger creation across all modules
- [ ] Remove module-level logger instantiation where possible
- [ ] Document logging levels expected for each component
- [ ] Review `src/cli/Main.py` logging configuration (lines 12-22) - appears to suppress all logging
### Testing/Comments
- [ ] Add inline comments explaining complex regex patterns in providers
- [ ] Add comments explaining retry logic and backoff strategies
- [ ] Document callback behavior and expected signatures
- [ ] Add comments to clarify WebSocket broadcast mechanisms
---
## 📌 Implementation Notes
### Dependencies to Verify
- [ ] `error_handler` module - currently missing, causing import error
- [ ] All Pydantic models properly imported in service layers
- [ ] SQLAlchemy session management properly scoped
### Configuration Management
- [ ] Review `src/config/settings.py` for completeness
- [ ] Ensure all configurable values are in settings, not hardcoded
- [ ] Document all environment variables needed
### Testing Coverage
- [ ] Verify tests cover exception paths in `src/server/api/anime.py`
- [ ] Add tests for CORS configuration
- [ ] Test rate limiting behavior in middleware
- [ ] Test database transaction rollback scenarios
---
## 🔄 Validation Checklist Before Committing
For each issue fixed:
- [ ] Run Pylance to verify type hints are correct
- [ ] Run `isort` on modified files to sort imports
- [ ] Run `black` to format code to PEP 8 standards
- [ ] Run existing unit tests to ensure no regression
- [ ] Verify no new security vulnerabilities introduced
- [ ] Update docstrings if behavior changed
- [ ] Document any breaking API changes
---
**Total Issues Identified**: ~90 individual items across 8 categories
**Priority Distribution**: 5 High | 15 Medium | 70 Low/Nice-to-have
**Estimated Effort**: 40-60 hours for comprehensive quality improvement