28 KiB
Quality Issues and TODO List
Aniworld Web Application Development Instructions
This document provides detailed tasks for AI agents to implement a modern web application for the Aniworld anime download manager. All tasks should follow the coding guidelines specified in the project's copilot instructions.
Project Overview
The goal is to create a FastAPI-based web application that provides a modern interface for the existing Aniworld anime download functionality. The core anime logic should remain in SeriesApp.py while the web layer provides REST API endpoints and a responsive UI.
Architecture Principles
- Single Responsibility: Each file/class has one clear purpose
- Dependency Injection: Use FastAPI's dependency system
- Clean Separation: Web layer calls core logic, never the reverse
- File Size Limit: Maximum 500 lines per file
- Type Hints: Use comprehensive type annotations
- Error Handling: Proper exception handling and logging
Additional Implementation Guidelines
Code Style and Standards
- Type Hints: Use comprehensive type annotations throughout all modules
- Docstrings: Follow PEP 257 for function and class documentation
- Error Handling: Implement custom exception classes with meaningful messages
- Logging: Use structured logging with appropriate log levels
- Security: Validate all inputs and sanitize outputs
- Performance: Use async/await patterns for I/O operations
📞 Escalation
If you encounter:
- Architecture issues requiring design decisions
- Tests that conflict with documented requirements
- Breaking changes needed
- Unclear requirements or expectations
Document the issue and escalate rather than guessing.
📚 Helpful Commands
# Run all tests
conda run -n AniWorld python -m pytest tests/ -v --tb=short
# Run specific test file
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py -v
# Run specific test class
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py::TestWebSocketService -v
# Run specific test
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py::TestWebSocketService::test_broadcast_download_progress -v
# Run with extra verbosity
conda run -n AniWorld python -m pytest tests/ -vv
# Run with full traceback
conda run -n AniWorld python -m pytest tests/ -v --tb=long
# Run and stop at first failure
conda run -n AniWorld python -m pytest tests/ -v -x
# Run tests matching pattern
conda run -n AniWorld python -m pytest tests/ -v -k "auth"
# Show all print statements
conda run -n AniWorld python -m pytest tests/ -v -s
📊 Detailed Analysis: The 7 Quality Criteria
1️⃣ Code Follows PEP8 and Project Coding Standards
2️⃣ Type Hints Used Where Applicable
Missing Type Hints by Category
Abstract Base Classes (Critical)
Service Classes
API Endpoints
Dependencies and Utils
Core Classes
Invalid Type Hint Syntax
3️⃣ Clear, Self-Documenting Code Written
Missing or Inadequate Docstrings
Module-Level Docstrings
Class Docstrings
src/server/utils/dependencies.py-CommonQueryParamsclass lacks docstring
Method/Function Docstrings
src/server/utils/logging.py- Check helper functionssrc/server/utils/template_helpers.py- Check helper functions
Unclear Variable Names
Unclear Comments or Missing Context
src/core/providers/enhanced_provider.pyline 231- Comment style inconsistent with rest of codebase
src/server/api/download.pyline 51- Backward compatibility comment clear but needs more detail
4️⃣ Complex Logic Commented
Complex Algorithms Without Comments
Episode Discovery Pattern Matching
src/core/providers/streaming/*.pyfiles- Regex patterns for extracting streams (complex)
- Each provider has different extraction logic
- Need comments explaining regex patterns
- Example:
src/core/providers/streaming/doodstream.pyline 35
JSON/HTML Parsing Logic
Session Retry Configuration
src/core/providers/enhanced_provider.pylines 108-125- Session retry configuration with backoff
- Line 115: status codes list needs explanation
- Comment should explain why these specific codes need retry
5️⃣ No Shortcuts or Hacks Used
Code Smells and Shortcuts
src/core/providers/streaming/Provider.pyline 7 - Abstract method implementation should not have pass Bare Pass Statements (Incomplete Implementation)
Duplicate Code
src/cli/Main.pyvssrc/core/SeriesApp.py- Entire
SeriesAppclass duplicated (89 lines vs 590 lines) - CLI version is oversimplified - should use core version
- Line 35-50 in CLI duplicates core initialization logic
- Entire
src/core/providers/aniworld_provider.pyvssrc/core/providers/enhanced_provider.py- Headers dictionary duplicated (lines 38-50 similar)
- Provider list duplicated (line 38 vs line 45)
- User-Agent strings duplicated
Global Variables (Temporary Storage)
src/server/fastapi_app.pyline 73series_app: Optional[SeriesApp] = Noneglobal storage- Should use FastAPI dependency injection instead
- Problematic for testing and multiple instances
Logging Configuration Workarounds
src/cli/Main.pylines 12-22- Manual logger handler removal is hacky
for h in logging.root.handlers: logging.root.removeHandler(h)is a hack- Should use proper logging configuration
- Multiple loggers created with file handlers at odd paths (line 26)
Hardcoded Values
src/core/providers/aniworld_provider.pyline 22timeout = int(os.getenv("DOWNLOAD_TIMEOUT", 600))at module level- Should be in settings class
src/core/providers/aniworld_provider.pylines 38, 47- User-Agent strings hardcoded
- Provider list hardcoded
src/cli/Main.pyline 227- Network path hardcoded:
"\\\\sshfs.r\\ubuntu@192.168.178.43\\media\\serien\\Serien" - Should be configuration
- Network path hardcoded:
Exception Handling Shortcuts
src/core/providers/enhanced_provider.pylines 410-421- Bare
except Exception:without specific types (line 418) - Multiple overlapping exception handlers (lines 410-425)
- Should use specific exception hierarchy
- Bare
src/server/api/anime.pylines 35-39- Bare
except Exception:handlers should specify types
- Bare
src/server/models/config.pyline 93except ValidationError: pass- silently ignores error
Type Casting Workarounds
src/server/api/download.pyline 52- Complex
.model_dump(mode="json")for serialization - Should use proper model serialization methods
- Complex
src/server/utils/dependencies.pyline 36- Type casting with
.get()and defaults scattered throughout
- Type casting with
Conditional Hacks
src/server/utils/dependencies.pyline 260running_tests = "PYTEST_CURRENT_TEST" in os.environ or "pytest" in sys.modules- Hacky test detection - should use proper test mode flag
6️⃣ Security Considerations Addressed
Authentication & Authorization
Weak CORS Configuration
src/server/fastapi_app.pyline 48allow_origins=["*"]allows any origin- HIGH RISK in production
- Should be:
allow_origins=settings.allowed_origins(environment-based)
- No CORS rate limiting by origin
Missing Authorization Checks
src/server/middleware/auth.pylines 81-86- Silent failure on missing auth for protected endpoints
- Should consistently return 401 status
- Some endpoints might bypass auth silently
In-Memory Session Storage
src/server/services/auth_service.pyline 51- In-memory
_faileddict resets on restart - Attacker can restart process to bypass rate limiting
- Should use Redis or database
- In-memory
- Line 51 comment: "For now we update only in-memory"
- Indicates incomplete security implementation
Input Validation
Unvalidated User Input
src/cli/Main.pyline 80- User input for file paths not validated
- Could allow path traversal attacks
src/core/SerieScanner.pyline 37- Directory path
basePathnot validated - Could read files outside intended directory
- Directory path
src/server/api/anime.pyline 70- Search query not validated for injection
src/core/providers/aniworld_provider.pyline 300+- URL parameters not sanitized
Missing Parameter Validation
src/core/providers/enhanced_provider.pyline 280- Season/episode validation present but minimal
src/server/database/models.py- No length validation on string fields
- No range validation on numeric fields
Secrets and Credentials
Hardcoded Secrets
src/config/settings.pyline 9jwt_secret_key: str = Field(default="your-secret-key-here", env="JWT_SECRET_KEY")- Default secret exposed in code
- Should have NO default, or random default
.envfile might contain secrets (if exists)- Should be in .gitignore
Plaintext Password Storage
src/config/settings.pyline 12master_password: Optional[str]stored in env (development only)- Should NEVER be used in production
- Add bold warning comment
Master Password Implementation
src/server/services/auth_service.pyline 71- Minimum 8 character password requirement documented
- Should enforce stronger requirements (uppercase, numbers, symbols)
Data Protection
No Encryption of Sensitive Data
- Downloaded files not verified with checksums
- No integrity checking of stored data
- No encryption of sensitive config values
File Permission Issues
src/core/providers/aniworld_provider.pyline 26- Log file created with default permissions
- Path:
"../../download_errors.log"- relative path is unsafe - Should use absolute paths with secure permissions
Logging of Sensitive Data
- Check all
logger.debug()calls for parameter logging- URLs might contain API keys
- Search queries might contain sensitive terms
- Example:
src/core/providers/enhanced_provider.pyline 260logger.debug()might log URLs with sensitive data
Network Security
Unvalidated External Connections
src/core/providers/aniworld_provider.pyline 60- HTTP retry configuration but no SSL verification flag check
src/core/providers/enhanced_provider.pyline 115- HTTP error codes 500-524 auto-retry without logging suspicious activity
Missing SSL/TLS Configuration
- Verify SSL certificate validation enabled
- Check for
verify=Falsein requests calls (should beTrue)
Database Security
No SQL Injection Protection
- Check
src/server/database/service.pyfor parameterized queries- Should use SQLAlchemy properly (appears to be OK)
- String interpolation in queries should not exist
No Database Access Control
- Single database user for all operations
- No row-level security
- No audit logging of data changes
7️⃣ Performance Validated
Algorithmic Efficiency Issues
File Scanning Performance
src/core/SerieScanner.pyline 105+__find_mp4_files()- potential O(n²) complexity- Recursive directory traversal not profiled
- No caching or incremental scanning
- Large directories (>10K files) might cause timeout
Download Queue Processing
src/server/services/download_service.pyline 240self._pending_queue.remove(item)- O(n) operation in deque- Should use dict for O(1) lookup before removal
- Line 85-86: deque maxlen limits might cause data loss
Provider Search Performance
src/core/providers/enhanced_provider.pyline 220- Multiple parsing strategies tried sequentially
- Should fail fast on obvious errors instead of trying all 3
- No performance metrics logged
String Operations
src/cli/Main.pyline 118- Nested
sum()with comprehensions - O(n*m) complexity total_episodes = sum(sum(len(ep) for ep in serie.episodeDict.values()) for serie in series)- No streaming/generator pattern
- Nested
Regular Expression Compilation
src/core/providers/streaming/doodstream.pyline 35- Regex patterns compiled on every call
- Should compile once at module level
- Example:
r"\$\.get\('([^']*\/pass_md5\/[^']*)'"compiled repeatedly
Resource Usage Issues
Memory Leaks/Unbounded Growth
src/server/middleware/auth.pyline 34self._rate: Dict[str, Dict[str, float]]never cleaned- Old IP addresses accumulate forever
- Solution: add timestamp-based cleanup
src/server/services/download_service.pyline 85-86deque(maxlen=100)anddeque(maxlen=50)drop old items- Might lose important history
Connection Pool Configuration
src/server/database/connection.py- Check if connection pooling is configured
- No explicit pool size limits found
- Could exhaust database connections
Large Data Structure Initialization
src/cli/Main.pyline 118- Loading all series at once
- Should use pagination for large datasets
Caching Opportunities
No Request Caching
src/server/api/anime.py- endpoints hit database every time- No caching headers set
@cachedecorator could be used
src/core/providers/enhanced_provider.py- Search results not cached
- Same search query hits network repeatedly
No Database Query Optimization
src/server/services/anime_service.py- No eager loading (selectinload) for relationships
- N+1 query problems likely
src/server/database/service.pyline 200+- Check for missing
.selectinload()in queries
- Check for missing
Concurrent Request Handling
Thread Pool Sizing
src/server/services/download_service.pyline 85ThreadPoolExecutor(max_workers=max_concurrent_downloads)- Default is 2, should be configurable
- No queue depth limits
Async/Sync Blocking Calls
src/server/api/anime.pyline 30+- Series list operations might block
- Database queries appear async (OK)
src/server/services/auth_service.py- Methods are synchronous but called from async endpoints
- Should verify no blocking calls
I/O Performance
Database Query Count
/api/v1/animeendpoint- Likely makes multiple queries for each series
- Should use single query with joins/eager loading
- Test with N series to find N+1 issues
File I/O Optimization
src/core/SerieScanner.pyline 140+- Each folder reads data file
- Could batch reads or cache
Network Request Optimization
src/core/providers/enhanced_provider.pyline 115- Retry strategy good
- No connection pooling verification
- Should check request timeout values
Performance Metrics Missing
- No performance monitoring for slow endpoints
- No database query logging
- No cache hit/miss metrics
- No background task performance tracking
- No file operation benchmarks
📋 Issues by File and Category
Core Module Issues
src/cli/Main.py
- Code Quality: Class
SeriesAppduplicates coreSeriesAppfromsrc/core/SeriesApp.py- Consider consolidating or using inheritance
- Line 35:
_initialization_countduplicated state tracking
- Type Hints:
display_series()doesn't validate ifserie.nameisNonebefore using it - Import Organization: Imports not sorted (lines 1-11) - should follow isort convention
- Error Handling:
NoKeyFoundExceptionandMatchNotFoundErrorare bare except classes - need proper inheritance - Logging: Logging configuration at module level should be in centralized config
src/core/SeriesApp.py
- Global State: Line 73 -
series_app: Optional[SeriesApp] = Noneinfastapi_app.pyuses global state- Should use dependency injection instead
- Complexity:
Scan()method is complex (80+ lines) - should be broken into smaller methods - Error Context:
_handle_error()doesn't provide enough context about which operation failed
src/core/SerieScanner.py
- Code Quality:
is_null_or_whitespace()duplicates Python'sstr.isspace()- use built-in instead - Error Logging: Lines 167-182 catch exceptions but only log, don't propagate context
- Performance:
__find_mp4_files()might be inefficient for large directories - add progress callback
src/core/providers/base_provider.py
src/core/providers/aniworld_provider.py
- Import Organization: Lines 1-18 - imports not sorted (violates isort)
- Global State: Lines 24-26 - Multiple logger instances created at module level
- Should use centralized logging system
- Hardcoding: Line 42 - User-Agent string hardcoded (also at line 47 for Firefox)
- Extract to configuration constants
- Type Hints: Missing type hints on:
__init__()method parameters (no return type on implicit constructor)- Class attribute type annotations (line 41-62)
- Magic Strings: Line 38 - Hardcoded list of provider names should be enum
- Configuration: Timeouts hardcoded at line 22 - should use settings
src/core/providers/enhanced_provider.py
- Type Hints: Class constructor
__init__()missing type annotations (lines 40-96) - Documentation: Bare exception handlers at lines 418-419 - need specific exception types
- Code Quality:
with_error_recoverydecorator imported but usage unclear - Performance:
_create_robust_session()method not shown but likely creates multiple session objects
src/core/interfaces/providers.py
- Need to verify if any abstract methods lack type hints and docstrings
src/core/exceptions/Exceptions.py
- Need to verify custom exception hierarchy and documentation
Server Module Issues
src/server/fastapi_app.py
- Global State: Line 73 -
series_app: Optional[SeriesApp] = Nonestored globally- Use FastAPI dependency injection via
Depends()
- Use FastAPI dependency injection via
- CORS Configuration: Line 48 -
allow_origins=["*"]is production security issue- Add comment: "Configure appropriately for production"
- Extract to settings with environment-based defaults
- Error Handling:
startup_event()at line 79 - missing try-except to handle initialization failures - Type Hints:
startup_event()function missing type annotations - Documentation:
broadcast_callback()function inside event handler should be extracted to separate function - Logging: No error logging if
settings.anime_directoryis None
src/server/middleware/auth.py
- Performance: In-memory rate limiter (line 34) will leak memory - never cleans up old entries
- Need periodic cleanup or use Redis for production
- Security: Line 46 - Rate limiting only 60-second window, should be configurable
- Type Hints:
dispatch()method parameters properly typed, but return type could be explicit - Documentation:
_get_client_ip()method incomplete (line 94+ truncated) - Error Handling: Lines 81-86 - Silent failure if protected endpoint and no auth
- Should return 401 consistently
src/server/services/auth_service.py
- Documentation: Line 68 - Comment says "For now we update only in-memory" indicates incomplete implementation
- Create task to persist password hash to configuration file
- Type Hints:
_verify_password()at line 60 - no return type annotation (implicitbool) - Security: Line 71 - Minimum password length 8 characters, should be documented as security requirement
- State Management: In-memory
_faileddict (line 51) resets on process restart- Document this limitation and suggest Redis/database for production
src/server/database/service.py
- Documentation: Service layer methods need detailed docstrings explaining:
- Database constraints
- Transaction behavior
- Cascade delete behavior
- Error Handling: Methods don't specify which SQLAlchemy exceptions they might raise
src/server/database/models.py
- Documentation: Model relationships and cascade rules well-documented
- ✅ Type hints present and comprehensive (well done)
- Validation: No model-level validation before database insert
- Consider adding validators for constraints
src/server/services/download_service.py
- Performance: Line 85 -
deque(maxlen=100)for completed items - is this appropriate for long-running service? - Thread Safety: Uses
ThreadPoolExecutorbut thread-safety of queue operations not clear
src/server/utils/dependencies.py
- TODO Comments: Lines 223 and 233 - TODO comments for unimplemented features:
- "TODO: Implement rate limiting logic"
- "TODO: Implement request logging logic"
- Create separate task items for these
src/server/utils/system.py
- Exception Handling: Line 255 - bare
passstatement in exception handler- Should at least log the exception
src/server/api/anime.py
- Error Handling: Lines 35-39 - Multiple bare
except Exceptionhandlers- Need specific exception types and proper logging
- Code Quality: Lines 32-36 - Complex property access with
getattr()chains- Create helper function or model method to encapsulate
Models and Pydantic Issues
src/server/models/config.py
- Error Handling: Line 93 -
ValidationErrorcaught but only silently passed?- Should log or re-raise with context
Utility and Configuration Issues
src/config/settings.py
- Security: Line 12 -
master_passwordfield stored in environment during development- Add warning comment: "NEVER use this in production"
- Documentation: Settings class needs comprehensive docstring explaining each field
src/infrastructure/logging/GlobalLogger.py
- Need to review logging configuration for consistency
src/server/utils/logging.py
- Need to review for type hints and consistency with global logging
src/server/utils/template_helpers.py
- Need to review for type hints and docstrings
src/server/utils/log_manager.py
- Need to review for type hints and error handling
🔒 Security Issues
High Priority
- CORS Configuration (
src/server/fastapi_app.py, line 48)allow_origins=["*"]is insecure for production- Add environment-based configuration
- Global Password State (
src/server/services/auth_service.py, line 51)- In-memory failure tracking resets on restart
- Recommend using persistent storage (database/Redis)
Medium Priority
-
Rate Limiter Memory Leak (
src/server/middleware/auth.py, line 34)- Never cleans up old IP entries
- Add periodic cleanup or use Redis
-
Missing Authorization Checks (
src/server/middleware/auth.py, lines 81-86)- Some protected endpoints might silently allow unauthenticated access
📊 Code Style Issues
Documentation - Phase 1: Critical Sections
- Document database transaction behavior in
src/server/database/service.py
Documentation - Phase 2: Endpoints
- Expand docstrings on endpoints in
src/server/api/anime.py - Add parameter descriptions to endpoint handlers
- Document expected exceptions and error responses
Code Quality - Phase 1: Consolidation
- Investigate
SeriesAppduplication betweensrc/cli/Main.pyandsrc/core/SeriesApp.py - Consider consolidating into single implementation
- Update CLI to use core module instead of duplicate
Code Quality - Phase 2: Exception Handling
- Add specific exception types to bare
except:handlers - Add logging to all exception handlers
- Document exception context and causes
- Review exception handling in
src/core/providers/enhanced_provider.py(lines 410-421)
Code Quality - Phase 3: Refactoring
- Extract
broadcast_callback()fromstartup_event()insrc/server/fastapi_app.py - Break down complex
Scan()method insrc/core/SerieScanner.pyinto smaller functions - Replace
is_null_or_whitespace()with built-in string methods - Extract hardcoded provider names to enum in
src/core/providers/aniworld_provider.py
Security - Phase 1: Critical Fixes
- Make CORS configuration environment-based in
src/server/fastapi_app.py - Add startup validation to ensure
anime_directoryis configured
Security - Phase 2: Improvements
- Implement Redis-based rate limiter instead of in-memory in
src/server/middleware/auth.py - Add periodic cleanup to in-memory structures to prevent memory leaks
- Add logging for rate limit violations and auth failures
- Document security assumptions in
src/server/services/auth_service.py
Performance - Phase 1: Validation
- Profile
SerieScanner.__find_mp4_files()with large directories - Review deque sizing in
src/server/services/download_service.py(lines 85-86) - Verify thread-safety of queue operations
Performance - Phase 2: Optimization
- Add pagination to anime list endpoint if dataset is large
- Consider caching for search results in
src/core/providers/aniworld_provider.py - Review session creation overhead in provider initialization
Configuration Issues
- Extract hardcoded timeouts from
src/core/providers/aniworld_provider.pyline 22 to settings - Extract User-Agent strings to configuration constants
- Document all configuration options in settings module
- Add validation for required environment variables
Logging Issues
- Centralize logger creation across all modules
- Remove module-level logger instantiation where possible
- Document logging levels expected for each component
- Review
src/cli/Main.pylogging configuration (lines 12-22) - appears to suppress all logging
Testing/Comments
- Add inline comments explaining complex regex patterns in providers
- Add comments explaining retry logic and backoff strategies
- Document callback behavior and expected signatures
- Add comments to clarify WebSocket broadcast mechanisms
📌 Implementation Notes
Dependencies to Verify
error_handlermodule - currently missing, causing import error- All Pydantic models properly imported in service layers
- SQLAlchemy session management properly scoped
Configuration Management
- Review
src/config/settings.pyfor completeness - Ensure all configurable values are in settings, not hardcoded
- Document all environment variables needed
Testing Coverage
- Verify tests cover exception paths in
src/server/api/anime.py - Add tests for CORS configuration
- Test rate limiting behavior in middleware
- Test database transaction rollback scenarios
🔄 Validation Checklist Before Committing
For each issue fixed:
- Run Pylance to verify type hints are correct
- Run
isorton modified files to sort imports - Run
blackto format code to PEP8 standards - Run existing unit tests to ensure no regression
- Verify no new security vulnerabilities introduced
- Update docstrings if behavior changed
- Document any breaking API changes
Total Issues Identified: ~90 individual items across 8 categories Priority Distribution: 5 High | 15 Medium | 70 Low/Nice-to-have Estimated Effort: 40-60 hours for comprehensive quality improvement