Aniworld Web Application Development Instructions
This document provides detailed tasks for AI agents to implement a modern web application for the Aniworld anime download manager. All tasks should follow the coding guidelines specified in the project's copilot instructions.
Project Overview
The goal is to create a FastAPI-based web application that provides a modern interface for the existing Aniworld anime download functionality. The core anime logic should remain in SeriesApp.py while the web layer provides REST API endpoints and a responsive UI.
Architecture Principles
- Single Responsibility: Each file/class has one clear purpose
- Dependency Injection: Use FastAPI's dependency system
- Clean Separation: Web layer calls core logic, never the reverse
- File Size Limit: Maximum 500 lines per file
- Type Hints: Use comprehensive type annotations
- Error Handling: Proper exception handling and logging
Additional Implementation Guidelines
Code Style and Standards
- Type Hints: Use comprehensive type annotations throughout all modules
- Docstrings: Follow PEP 257 for function and class documentation
- Error Handling: Implement custom exception classes with meaningful messages
- Logging: Use structured logging with appropriate log levels
- Security: Validate all inputs and sanitize outputs
- Performance: Use async/await patterns for I/O operations
📞 Escalation
If you encounter:
- Architecture issues requiring design decisions
- Tests that conflict with documented requirements
- Breaking changes needed
- Unclear requirements or expectations
Document the issue and escalate rather than guessing.
📚 Helpful Commands
# Run all tests
conda run -n AniWorld python -m pytest tests/ -v --tb=short
# Run specific test file
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py -v
# Run specific test class
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py::TestWebSocketService -v
# Run specific test
conda run -n AniWorld python -m pytest tests/unit/test_websocket_service.py::TestWebSocketService::test_broadcast_download_progress -v
# Run with extra verbosity
conda run -n AniWorld python -m pytest tests/ -vv
# Run with full traceback
conda run -n AniWorld python -m pytest tests/ -v --tb=long
# Run and stop at first failure
conda run -n AniWorld python -m pytest tests/ -v -x
# Run tests matching pattern
conda run -n AniWorld python -m pytest tests/ -v -k "auth"
# Show all print statements
conda run -n AniWorld python -m pytest tests/ -v -s
📊 Detailed Analysis: The 7 Quality Criteria
5️⃣ No Shortcuts or Hacks Used
Global Variables (Temporary Storage)
- src/server/fastapi_app.py line 73 -> completed
  - series_app: Optional[SeriesApp] = None used as global storage
  - Should use FastAPI dependency injection instead
  - Problematic for testing and multiple instances
Logging Configuration Workarounds
- [ ] src/cli/Main.py lines 12-22 -> reviewed (no manual handler removal found)
  - Manual logger handler removal is hacky
  - for h in logging.root.handlers: logging.root.removeHandler(h) is a hack
  - Should use proper logging configuration
  - Multiple loggers created with file handlers at odd paths (line 26)
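
A centralized setup of the kind suggested above could replace manual handler removal. The sketch below uses the standard library's logging.config.dictConfig; the logger name aniworld.cli, the format string, and the console-only handler are assumptions for illustration, not the project's actual configuration.

```python
import logging
import logging.config

# Hypothetical central configuration; names and format are assumptions.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "default",
            "level": "INFO",
        },
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}


def configure_logging() -> None:
    """Apply the configuration once, typically at application startup."""
    logging.config.dictConfig(LOGGING_CONFIG)


configure_logging()
logging.getLogger("aniworld.cli").info("logging configured")
```

Because dictConfig replaces the root logger's handlers in one step, no loop over logging.root.handlers is needed.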
Hardcoded Values
- [ ] src/core/providers/aniworld_provider.py line 22 -> completed
  - timeout = int(os.getenv("DOWNLOAD_TIMEOUT", 600)) at module level
  - Should be in settings class
- [ ] src/core/providers/aniworld_provider.py lines 38, 47 -> completed
  - User-Agent strings hardcoded
  - Provider list hardcoded
- src/cli/Main.py line 227
  - Network path hardcoded: "\\\\sshfs.r\\ubuntu@192.168.178.43\\media\\serien\\Serien"
  - Should be configuration
Exception Handling Shortcuts
- src/core/providers/enhanced_provider.py lines 410-421
  - Broad except Exception: without specific types (line 418) -> reviewed
  - Multiple overlapping exception handlers (lines 410-425) -> reviewed
  - Should use specific exception hierarchy -> partially addressed (file removal and temp file cleanup now catch OSError; other broad catches intentionally wrap into RetryableError)
- src/server/api/anime.py lines 35-39 -> reviewed
  - Broad except Exception: handlers should specify types
- src/server/models/config.py line 93
  - except ValidationError: pass silently ignores the error -> reviewed (validate() now collects and returns errors)
Type Casting Workarounds
- src/server/api/download.py line 52 -> reviewed
  - Complex .model_dump(mode="json") for serialization
  - Should use proper model serialization methods (kept for backward compatibility)
- src/server/utils/dependencies.py line 36
  - Type casting with .get() and defaults scattered throughout
Conditional Hacks
- src/server/utils/dependencies.py line 260 -> completed
  - running_tests = "PYTEST_CURRENT_TEST" in os.environ or "pytest" in sys.modules
  - Hacky test detection; should use a proper test mode flag (now prefers ANIWORLD_TESTING env var)
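
An explicit flag with the old heuristic as a fallback might look like the following. This is a minimal sketch: the function name is_test_mode and the set of accepted truthy values are assumptions; only the ANIWORLD_TESTING variable name and the fallback checks come from the notes above.

```python
import os
import sys
from typing import Mapping, Optional

# Assumed set of values treated as "enabled"; adjust to project conventions.
_TRUTHY = {"1", "true", "yes", "on"}


def is_test_mode(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Prefer the explicit ANIWORLD_TESTING flag over pytest heuristics."""
    env = os.environ if environ is None else environ
    flag = env.get("ANIWORLD_TESTING")
    if flag is not None:
        # An explicit flag always wins, including an explicit "off".
        return flag.strip().lower() in _TRUTHY
    # Fallback: the earlier heuristic-based detection.
    return "PYTEST_CURRENT_TEST" in env or "pytest" in sys.modules
```

Passing the environment as a parameter keeps the function easy to test without mutating os.environ.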
6️⃣ Security Considerations Addressed
Authentication & Authorization
Weak CORS Configuration
- src/server/fastapi_app.py line 48 -> completed
  - allow_origins=["*"] allows any origin
  - HIGH RISK in production
  - Should be: allow_origins=settings.allowed_origins (environment-based)
  - No CORS rate limiting by origin
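
One way to derive an environment-based origin list is sketched below. The ALLOWED_ORIGINS variable name, the comma-separated format, the localhost default, and the decision to reject a wildcard outright are all assumptions made for illustration.

```python
import os
from typing import List, Mapping, Optional, Sequence


def load_allowed_origins(
    environ: Optional[Mapping[str, str]] = None,
    default: Sequence[str] = ("http://localhost:8000",),
) -> List[str]:
    """Parse a comma-separated origin list from the environment."""
    env = os.environ if environ is None else environ
    raw = env.get("ALLOWED_ORIGINS", "")  # hypothetical variable name
    origins = [origin.strip() for origin in raw.split(",") if origin.strip()]
    if "*" in origins:
        # Assumed policy: force origins to be listed explicitly.
        raise ValueError("Wildcard origin is not allowed; list origins explicitly")
    return origins or list(default)


example = load_allowed_origins(
    {"ALLOWED_ORIGINS": "https://a.example, https://b.example"}
)
```

The resulting list would then be passed as allow_origins when adding CORSMiddleware.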
Missing Authorization Checks
- src/server/middleware/auth.py lines 81-86
  - Silent failure on missing auth for protected endpoints
  - Should consistently return 401 status
  - Some endpoints might bypass auth silently
In-Memory Session Storage
- src/server/services/auth_service.py line 51 -> completed
  - In-memory _failed dict resets on restart
  - Documented limitation with warning comment
  - Should use Redis or database in production
Input Validation
Unvalidated User Input
- src/cli/Main.py line 80 -> completed (not found, likely already fixed)
  - User input for file paths not validated
  - Could allow path traversal attacks
- src/core/SerieScanner.py line 37 -> completed
  - Directory path basePath now validated
  - Added checks for empty, non-existent, and non-directory paths
  - Resolves to absolute path to prevent traversal attacks
- src/server/api/anime.py line 70 -> completed
  - Search query now validated with field_validator
  - Added length limits and dangerous pattern detection
  - Helps guard against SQL injection and other malicious inputs
- src/core/providers/aniworld_provider.py line 300+ -> completed
  - URL parameters now sanitized using quote()
  - Added validation for season/episode numbers
  - Key/slug parameters are URL-encoded before use
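
The directory checks listed for basePath (empty, non-existent, non-directory, resolve to absolute) can be sketched with pathlib. The function name validate_base_path and the error messages are hypothetical; this is not the code in SerieScanner.py.

```python
import tempfile
from pathlib import Path


def validate_base_path(base_path: str) -> Path:
    """Return a resolved directory path or raise ValueError."""
    if not base_path or not base_path.strip():
        raise ValueError("Base path must not be empty")
    # resolve() normalizes ".." segments and returns an absolute path.
    resolved = Path(base_path).expanduser().resolve()
    if not resolved.exists():
        raise ValueError(f"Base path does not exist: {resolved}")
    if not resolved.is_dir():
        raise ValueError(f"Base path is not a directory: {resolved}")
    return resolved


# Usage with a temporary directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    checked = validate_base_path(tmp)
```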
Missing Parameter Validation
- src/core/providers/enhanced_provider.py line 280 -> completed
  - Season/episode validation now comprehensive
  - Added range checks (season: 1-999, episode: 1-9999)
  - Added key validation (non-empty check)
- src/server/database/models.py
  - No length validation on string fields
  - No range validation on numeric fields
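
The provider-side checks described above (non-empty key, season 1-999, episode 1-9999, URL-encoded key) could be combined roughly as follows. The function name and the URL layout are hypothetical and are not taken from the provider code; only the ranges and the use of quote() come from the notes.

```python
from urllib.parse import quote


def build_episode_path(key: str, season: int, episode: int) -> str:
    """Validate inputs, then build a URL path with an encoded key."""
    if not key or not key.strip():
        raise ValueError("key must not be empty")
    if not 1 <= season <= 999:
        raise ValueError(f"season out of range (1-999): {season}")
    if not 1 <= episode <= 9999:
        raise ValueError(f"episode out of range (1-9999): {episode}")
    # safe="" also encodes "/", so a key cannot add path segments.
    safe_key = quote(key.strip(), safe="")
    # Hypothetical path layout, for illustration only.
    return f"/anime/stream/{safe_key}/staffel-{season}/episode-{episode}"


path = build_episode_path("a b/c", 1, 2)
```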
Secrets and Credentials
Hardcoded Secrets
- src/config/settings.py line 9 -> completed
  - JWT secret now uses secrets.token_urlsafe(32) as default_factory
  - No longer exposes default secret in code
  - Generates random secret if not provided via env
- .env file might contain secrets (if exists)
  - Should be in .gitignore
Plaintext Password Storage
- src/config/settings.py line 12 -> completed
  - Added prominent warning comment with emoji
  - Enhanced description to emphasize NEVER use in production
  - Clearly documents this is for development/testing only
Master Password Implementation
- src/server/services/auth_service.py line 71 -> completed
  - Password requirements now comprehensive:
    - Minimum 8 characters
    - Mixed case (uppercase + lowercase)
    - At least one number
    - At least one special character
  - Enhanced error messages for better user guidance
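
The four password rules above can be expressed as a small checker that returns every unmet requirement, which supports the more helpful error messages mentioned. This is a sketch: the function name is hypothetical, and treating string.punctuation as the set of special characters is an assumption.

```python
import string
from typing import List


def password_problems(password: str) -> List[str]:
    """Return a list of unmet requirements; empty means acceptable."""
    problems: List[str] = []
    if len(password) < 8:
        problems.append("must be at least 8 characters long")
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    if not (has_lower and has_upper):
        problems.append("must mix uppercase and lowercase letters")
    if not any(c.isdigit() for c in password):
        problems.append("must contain at least one number")
    # Assumption: "special character" means ASCII punctuation.
    if not any(c in string.punctuation for c in password):
        problems.append("must contain at least one special character")
    return problems
```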
Data Protection
No Encryption of Sensitive Data
- Downloaded files not verified with checksums
- No integrity checking of stored data
- No encryption of sensitive config values
File Permission Issues
- src/core/providers/aniworld_provider.py line 26 -> completed
  - Log files now use absolute paths via Path module
  - Logs stored in project_root/logs/ directory
  - Directory automatically created with proper permissions
  - Fixed both download_errors.log and no_key_found.log
Logging of Sensitive Data
- Check all logger.debug() calls for parameter logging
  - URLs might contain API keys
  - Search queries might contain sensitive terms
- Example: src/core/providers/enhanced_provider.py line 260
  - logger.debug() might log URLs with sensitive data
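
One possible mitigation is to redact query parameters before a URL reaches the log. The sketch below uses only urllib.parse; the function name and the list of parameter names considered sensitive are assumptions, and credentials embedded in the host part of a URL are not handled.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of sensitive parameter names; extend as needed.
SENSITIVE_PARAMS = {"token", "api_key", "apikey", "key", "password"}


def redact_url(url: str) -> str:
    """Replace the values of sensitive query parameters with a marker."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


safe = redact_url("https://example.com/v?id=5&token=abc")
```

A call site would then log redact_url(url) instead of the raw URL.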
Network Security
Unvalidated External Connections
- src/core/providers/aniworld_provider.py line 60
  - HTTP retry configuration but no SSL verification flag check
- src/core/providers/enhanced_provider.py line 115
  - HTTP error codes 500-524 auto-retry without logging suspicious activity
Missing SSL/TLS Configuration
- Verify SSL certificate validation enabled -> completed
  - Fixed all verify=False instances (4 total)
  - Changed to verify=True in:
    - doodstream.py (2 instances)
    - loadx.py (2 instances)
  - Added timeout parameters where missing
- Check for verify=False in requests calls -> completed
  - All requests now use SSL verification
Database Security
No SQL Injection Protection
- Check src/server/database/service.py for parameterized queries -> completed
  - All queries use SQLAlchemy query builder (select, update, delete)
  - No raw SQL or string concatenation found
  - Parameters properly passed through where() clauses
  - f-strings in LIKE clauses are safe (passed as parameter values)
- String interpolation in queries -> verified safe
  - No string interpolation directly in SQL queries
  - All user input is properly parameterized
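
The point about f-strings in LIKE clauses is that the f-string builds the pattern value, not the SQL text. The sketch below illustrates this with the standard library's sqlite3 module rather than SQLAlchemy, so that it runs without extra dependencies; the table and function names are made up for the example.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (name TEXT)")
conn.executemany(
    "INSERT INTO series (name) VALUES (?)", [("Naruto",), ("One Piece",)]
)


def search_series(connection: sqlite3.Connection, term: str) -> List[str]:
    """Search by substring; the term is bound as a parameter."""
    pattern = f"%{term}%"  # f-string builds the VALUE, not the SQL text
    rows = connection.execute(
        "SELECT name FROM series WHERE name LIKE ? ORDER BY name", (pattern,)
    ).fetchall()
    return [row[0] for row in rows]


# A hostile-looking term is treated as plain text and matches nothing.
hostile_result = search_series(conn, "x' OR '1'='1")
```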
No Database Access Control
- Single database user for all operations
- No row-level security
- No audit logging of data changes
7️⃣ Performance Validated
Algorithmic Efficiency Issues
File Scanning Performance
- src/core/SerieScanner.py line 105+
  - __find_mp4_files() has potential O(n²) complexity
  - Recursive directory traversal not profiled
  - No caching or incremental scanning
  - Large directories (>10K files) might cause timeout
Download Queue Processing
- src/server/services/download_service.py line 240 -> completed
  - Optimized queue operations from O(n) to O(1)
  - Added helper dict _pending_items_by_id for fast lookups
  - Created helper methods:
    - _add_to_pending_queue() maintains both deque and dict
    - _remove_from_pending_queue() provides O(1) removal
  - Updated all append/remove operations to use helper methods
  - Tests passing ✓
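
For comparison, constant-time lookup and removal with FIFO order can also be had from a single OrderedDict. The sketch below is a substitute technique shown for illustration, not the deque-plus-dict implementation in download_service.py; the class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Dict, Optional


class PendingQueue:
    """FIFO queue keyed by item id with O(1) add, remove, and pop."""

    def __init__(self) -> None:
        self._items: "OrderedDict[str, Dict[str, Any]]" = OrderedDict()

    def add(self, item_id: str, item: Dict[str, Any]) -> None:
        self._items[item_id] = item

    def remove(self, item_id: str) -> Optional[Dict[str, Any]]:
        # pop() on a dict is O(1); returns None if the id is unknown.
        return self._items.pop(item_id, None)

    def pop_next(self) -> Optional[Dict[str, Any]]:
        if not self._items:
            return None
        _, item = self._items.popitem(last=False)  # oldest entry first
        return item

    def __len__(self) -> int:
        return len(self._items)


queue = PendingQueue()
queue.add("a", {"episode": 1})
queue.add("b", {"episode": 2})
queue.add("c", {"episode": 3})
removed = queue.remove("b")
first = queue.pop_next()
```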
Provider Search Performance
- src/core/providers/enhanced_provider.py line 220
  - Multiple parsing strategies tried sequentially
  - Should fail fast on obvious errors instead of trying all 3
  - No performance metrics logged
String Operations
- src/cli/Main.py line 118
  - Nested sum() with comprehensions: O(n*m) complexity
  - total_episodes = sum(sum(len(ep) for ep in serie.episodeDict.values()) for serie in series)
  - No streaming/generator pattern
Regular Expression Compilation
- src/core/providers/streaming/doodstream.py line 35
  - Regex patterns compiled on every call
  - Should compile once at module level
  - Example: r"\$\.get\('([^']*\/pass_md5\/[^']*)'" compiled repeatedly
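
Hoisting the pattern to module level could look like the sketch below. The constant and function names, and the sample input string, are hypothetical; only the pattern itself is taken from the note above.

```python
import re
from typing import Optional

# Compiled once at import time instead of on every call.
PASS_MD5_PATTERN = re.compile(r"\$\.get\('([^']*\/pass_md5\/[^']*)'")


def extract_pass_md5_url(html: str) -> Optional[str]:
    """Return the first pass_md5 URL found in the page, if any."""
    match = PASS_MD5_PATTERN.search(html)
    return match.group(1) if match else None


# Made-up input for illustration.
sample = "$.get('/pass_md5/abc123/token', function(data) {"
found = extract_pass_md5_url(sample)
```

Python's re module keeps an internal cache of recently compiled patterns, so the runtime gain may be modest; the module-level constant mainly makes the pattern easier to name, document, and test.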
Resource Usage Issues
Memory Leaks/Unbounded Growth
- src/server/middleware/auth.py line 34 -> completed
  - Added _cleanup_old_entries() method
  - Periodically removes rate limit entries older than 2x window
  - Cleanup runs every 5 minutes
  - Prevents unbounded memory growth from old IP addresses
- src/server/services/download_service.py lines 85-86
  - deque(maxlen=100) and deque(maxlen=50) drop old items
  - Might lose important history
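
The rate limiter cleanup described for auth.py (drop entries older than twice the window, run at most every 5 minutes) can be sketched as follows. Apart from the _cleanup_old_entries name, the class, its attributes, and the injectable clock are assumptions for illustration.

```python
import time
from typing import Dict, List, Optional


class RateLimitStore:
    """Per-IP request timestamps with periodic pruning of stale entries."""

    def __init__(
        self,
        window_seconds: float = 60.0,
        cleanup_interval: float = 300.0,
        start: Optional[float] = None,
    ) -> None:
        self.window = window_seconds
        self.cleanup_interval = cleanup_interval
        self._hits: Dict[str, List[float]] = {}
        self._last_cleanup = time.monotonic() if start is None else start

    def record(self, ip: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        if now - self._last_cleanup >= self.cleanup_interval:
            self._cleanup_old_entries(now)
        self._hits.setdefault(ip, []).append(now)

    def _cleanup_old_entries(self, now: float) -> None:
        cutoff = now - 2 * self.window  # keep entries newer than 2x window
        for ip in list(self._hits):
            recent = [t for t in self._hits[ip] if t >= cutoff]
            if recent:
                self._hits[ip] = recent
            else:
                del self._hits[ip]  # forget IPs with no recent requests
        self._last_cleanup = now

    def tracked_ips(self) -> List[str]:
        return sorted(self._hits)


store = RateLimitStore(window_seconds=60.0, cleanup_interval=300.0, start=0.0)
store.record("203.0.113.1", now=1.0)
store.record("203.0.113.2", now=400.0)  # triggers cleanup of the stale IP
```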
Connection Pool Configuration
- src/server/database/connection.py
  - Check if connection pooling is configured
  - No explicit pool size limits found
  - Could exhaust database connections
Large Data Structure Initialization
- src/cli/Main.py line 118
  - Loading all series at once
  - Should use pagination for large datasets
Caching Opportunities
No Request Caching
- src/server/api/anime.py
  - Endpoints hit database every time
  - No caching headers set
  - @cache decorator could be used
- src/core/providers/enhanced_provider.py
  - Search results not cached
  - Same search query hits network repeatedly
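
A small time-limited cache in front of the provider search would avoid repeated network calls for the same query. The sketch below is one possible shape under stated assumptions: the class name, the 300-second TTL, and the fake fetch function are all made up, and eviction of old keys is left out.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLSearchCache:
    """Cache search results per normalized query for a limited time."""

    def __init__(
        self, fetch: Callable[[str], List[str]], ttl_seconds: float = 300.0
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def search(self, query: str, now: Optional[float] = None) -> List[str]:
        now = time.monotonic() if now is None else now
        key = query.strip().lower()  # normalize so "Naruto " hits "naruto"
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        results = self._fetch(key)
        self._entries[key] = (now, results)
        return results


calls: List[str] = []


def fake_fetch(query: str) -> List[str]:
    # Stand-in for the real network search.
    calls.append(query)
    return [query.title()]


cache = TTLSearchCache(fake_fetch, ttl_seconds=300.0)
cache.search("naruto", now=0.0)     # miss: calls fake_fetch
cache.search(" Naruto ", now=10.0)  # hit: served from cache
cache.search("naruto", now=400.0)   # expired: calls fake_fetch again
```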
No Database Query Optimization
- src/server/services/anime_service.py
  - No eager loading (selectinload) for relationships
  - N+1 query problems likely
- src/server/database/service.py line 200+
  - Check for missing .selectinload() in queries
Concurrent Request Handling
Thread Pool Sizing
- src/server/services/download_service.py line 85
  - ThreadPoolExecutor(max_workers=max_concurrent_downloads)
  - Default is 2, should be configurable
  - No queue depth limits
Async/Sync Blocking Calls
- src/server/api/anime.py line 30+
  - Series list operations might block
  - Database queries appear async (OK)
- src/server/services/auth_service.py
  - Methods are synchronous but called from async endpoints
  - Should verify no blocking calls
I/O Performance
Database Query Count
- /api/v1/anime endpoint
  - Likely makes multiple queries for each series
  - Should use single query with joins/eager loading
  - Test with N series to find N+1 issues
File I/O Optimization
- src/core/SerieScanner.py line 140+
  - Each folder reads data file
  - Could batch reads or cache
Network Request Optimization
- src/core/providers/enhanced_provider.py line 115
  - Retry strategy good
  - No connection pooling verification
  - Should check request timeout values
Performance Metrics Missing
- No performance monitoring for slow endpoints
- No database query logging
- No cache hit/miss metrics
- No background task performance tracking
- No file operation benchmarks
📋 Issues by File and Category
Core Module Issues
src/cli/Main.py
- Code Quality: Class SeriesApp duplicates core SeriesApp from src/core/SeriesApp.py
  - Consider consolidating or using inheritance
  - Line 35: _initialization_count duplicated state tracking
- Type Hints: display_series() doesn't validate if serie.name is None before using it
- Import Organization: Imports not sorted (lines 1-11); should follow isort convention
- Error Handling: NoKeyFoundException and MatchNotFoundError are bare exception classes; need proper inheritance
- Logging: Logging configuration at module level should be in centralized config
src/core/SeriesApp.py
- Global State: Line 73 - series_app: Optional[SeriesApp] = None in fastapi_app.py uses global state
  - Should use dependency injection instead
- Complexity: Scan() method is complex (80+ lines); should be broken into smaller methods
- Error Context: _handle_error() doesn't provide enough context about which operation failed
src/core/SerieScanner.py
- Code Quality: is_null_or_whitespace() overlaps with Python's built-in string methods; use not s or s.isspace() (or not s.strip()) instead, since "".isspace() is False
- Error Logging: Lines 167-182 catch exceptions but only log, don't propagate context
- Performance: __find_mp4_files() might be inefficient for large directories; add progress callback
src/core/providers/base_provider.py
src/core/providers/aniworld_provider.py
- Import Organization: Lines 1-18 - imports not sorted (violates isort)
- Global State: Lines 24-26 - Multiple logger instances created at module level
  - Should use centralized logging system
- Hardcoding: Line 42 - User-Agent string hardcoded (also at line 47 for Firefox)
  - Extract to configuration constants
- Type Hints: Missing type hints on:
  - __init__() method parameters (no return type on implicit constructor)
  - Class attribute type annotations (lines 41-62)
- Magic Strings: Line 38 - Hardcoded list of provider names should be enum
- Configuration: Timeouts hardcoded at line 22 - should use settings
src/core/providers/enhanced_provider.py
- Type Hints: Class constructor __init__() missing type annotations (lines 40-96)
- Documentation: Broad exception handlers at lines 418-419 - need specific exception types
- Code Quality: with_error_recovery decorator imported but usage unclear
- Performance: _create_robust_session() method not shown but likely creates multiple session objects
src/core/interfaces/providers.py
- Need to verify if any abstract methods lack type hints and docstrings
src/core/exceptions/Exceptions.py
- Need to verify custom exception hierarchy and documentation
Server Module Issues
src/server/fastapi_app.py
- Global State: Line 73 - series_app: Optional[SeriesApp] = None stored globally
  - Use FastAPI dependency injection via Depends()
- CORS Configuration: Line 48 - allow_origins=["*"] is a production security issue
  - Add comment: "Configure appropriately for production"
  - Extract to settings with environment-based defaults
- Error Handling: startup_event() at line 79 - missing try-except to handle initialization failures
- Type Hints: startup_event() function missing type annotations
- Documentation: broadcast_callback() function inside event handler should be extracted to separate function
- Logging: No error logging if settings.anime_directory is None
src/server/middleware/auth.py
- Performance: In-memory rate limiter (line 34) will leak memory - never cleans up old entries
  - Need periodic cleanup or use Redis for production
- Security: Line 46 - Rate limiting only 60-second window, should be configurable
- Type Hints: dispatch() method parameters properly typed, but return type could be explicit
- Documentation: _get_client_ip() method incomplete (line 94+ truncated)
- Error Handling: Lines 81-86 - Silent failure if protected endpoint and no auth
  - Should return 401 consistently
src/server/services/auth_service.py
- Documentation: Line 68 - Comment says "For now we update only in-memory", which indicates incomplete implementation
  - Create task to persist password hash to configuration file
- Type Hints: _verify_password() at line 60 - no return type annotation (implicit bool)
- Security: Line 71 - Minimum password length 8 characters, should be documented as security requirement
- State Management: In-memory _failed dict (line 51) resets on process restart
  - Document this limitation and suggest Redis/database for production
src/server/database/service.py
- Documentation: Service layer methods need detailed docstrings explaining:
  - Database constraints
  - Transaction behavior
  - Cascade delete behavior
- Error Handling: Methods don't specify which SQLAlchemy exceptions they might raise
src/server/database/models.py
- Documentation: Model relationships and cascade rules well-documented
- ✅ Type hints present and comprehensive (well done)
- Validation: No model-level validation before database insert
  - Consider adding validators for constraints
src/server/services/download_service.py
- Performance: Line 85 - deque(maxlen=100) for completed items - is this appropriate for long-running service?
- Thread Safety: Uses ThreadPoolExecutor but thread-safety of queue operations not clear
src/server/utils/dependencies.py
- TODO Comments: Lines 223 and 233 - TODO comments for unimplemented features:
  - "TODO: Implement rate limiting logic"
  - "TODO: Implement request logging logic"
  - Create separate task items for these
src/server/utils/system.py
- Exception Handling: Line 255 - bare pass statement in exception handler
  - Should at least log the exception
src/server/api/anime.py
- Error Handling: Lines 35-39 - Multiple broad except Exception handlers
  - Need specific exception types and proper logging
- Code Quality: Lines 32-36 - Complex property access with getattr() chains
  - Create helper function or model method to encapsulate
Models and Pydantic Issues
src/server/models/config.py
- Error Handling: Line 93 - ValidationError caught but silently passed
  - Should log or re-raise with context
Utility and Configuration Issues
src/config/settings.py
- Security: Line 12 - master_password field stored in environment during development
  - Add warning comment: "NEVER use this in production"
- Documentation: Settings class needs comprehensive docstring explaining each field
src/infrastructure/logging/GlobalLogger.py
- Need to review logging configuration for consistency
src/server/utils/logging.py
- Need to review for type hints and consistency with global logging
src/server/utils/template_helpers.py
- Need to review for type hints and docstrings
src/server/utils/log_manager.py
- Need to review for type hints and error handling
🔒 Security Issues
High Priority
- CORS Configuration (src/server/fastapi_app.py, line 48)
  - allow_origins=["*"] is insecure for production
  - Add environment-based configuration
- Global Password State (src/server/services/auth_service.py, line 51)
  - In-memory failure tracking resets on restart
  - Recommend using persistent storage (database/Redis)
Medium Priority
- Rate Limiter Memory Leak (src/server/middleware/auth.py, line 34)
  - Never cleans up old IP entries
  - Add periodic cleanup or use Redis
- Missing Authorization Checks (src/server/middleware/auth.py, lines 81-86)
  - Some protected endpoints might silently allow unauthenticated access
📊 Code Style Issues
Documentation - Phase 1: Critical Sections
- Document database transaction behavior in src/server/database/service.py
Documentation - Phase 2: Endpoints
- Expand docstrings on endpoints in src/server/api/anime.py
- Add parameter descriptions to endpoint handlers
- Document expected exceptions and error responses
Code Quality - Phase 1: Consolidation
- Investigate SeriesApp duplication between src/cli/Main.py and src/core/SeriesApp.py
- Consider consolidating into single implementation
- Update CLI to use core module instead of duplicate
Code Quality - Phase 2: Exception Handling
- Add specific exception types to bare except: handlers
- Add logging to all exception handlers
- Document exception context and causes
- Review exception handling in src/core/providers/enhanced_provider.py (lines 410-421)
Code Quality - Phase 3: Refactoring
- Extract broadcast_callback() from startup_event() in src/server/fastapi_app.py
- Break down complex Scan() method in src/core/SerieScanner.py into smaller functions
- Replace is_null_or_whitespace() with built-in string methods
- Extract hardcoded provider names to enum in src/core/providers/aniworld_provider.py
Security - Phase 1: Critical Fixes
- Make CORS configuration environment-based in src/server/fastapi_app.py
- Add startup validation to ensure anime_directory is configured
Security - Phase 2: Improvements
- Implement Redis-based rate limiter instead of in-memory in src/server/middleware/auth.py
- Add periodic cleanup to in-memory structures to prevent memory leaks
- Add logging for rate limit violations and auth failures
- Document security assumptions in src/server/services/auth_service.py
Performance - Phase 1: Validation
- Profile SerieScanner.__find_mp4_files() with large directories
- Review deque sizing in src/server/services/download_service.py (lines 85-86)
- Verify thread-safety of queue operations
Performance - Phase 2: Optimization
- Add pagination to anime list endpoint if dataset is large
- Consider caching for search results in src/core/providers/aniworld_provider.py
- Review session creation overhead in provider initialization
Configuration Issues
- Extract hardcoded timeouts from src/core/providers/aniworld_provider.py line 22 to settings
- Extract User-Agent strings to configuration constants
- Document all configuration options in settings module
- Add validation for required environment variables
Logging Issues
- Centralize logger creation across all modules
- Remove module-level logger instantiation where possible
- Document logging levels expected for each component
- Review src/cli/Main.py logging configuration (lines 12-22) - appears to suppress all logging
Testing/Comments
- Add inline comments explaining complex regex patterns in providers
- Add comments explaining retry logic and backoff strategies
- Document callback behavior and expected signatures
- Add comments to clarify WebSocket broadcast mechanisms
📌 Implementation Notes
Dependencies to Verify
- error_handler module - currently missing, causing import error
- All Pydantic models properly imported in service layers
- SQLAlchemy session management properly scoped
Configuration Management
- Review src/config/settings.py for completeness
- Ensure all configurable values are in settings, not hardcoded
- Document all environment variables needed
Testing Coverage
- Verify tests cover exception paths in src/server/api/anime.py
- Add tests for CORS configuration
- Test rate limiting behavior in middleware
- Test database transaction rollback scenarios
🔄 Validation Checklist Before Committing
For each issue fixed:
- Run Pylance to verify type hints are correct
- Run isort on modified files to sort imports
- Run black to format code to PEP 8 standards
- Run existing unit tests to ensure no regression
- Verify no new security vulnerabilities introduced
- Update docstrings if behavior changed
- Document any breaking API changes
Total Issues Identified: ~90 individual items across 8 categories
Priority Distribution: 5 High | 15 Medium | 70 Low/Nice-to-have
Estimated Effort: 40-60 hours for comprehensive quality improvement