feat: Enhanced anime add flow with sanitized folders and targeted scan

- Add sanitize_folder_name utility for filesystem-safe folder names
- Add sanitized_folder property to Serie entity
- Update SerieList.add() to use sanitized display names for folders
- Add scan_single_series() method for targeted episode scanning
- Enhance add_series endpoint: DB save -> folder create -> targeted scan
- Update response to include missing_episodes and total_missing
- Add comprehensive unit tests for new functionality
- Update API tests with proper mock support
2025-12-26 12:49:23 +01:00
parent f28dc756c5
commit 1b7ca7b4da
11 changed files with 1370 additions and 146 deletions


@@ -100,7 +100,7 @@ For each task completed:
- [ ] Performance validated
- [ ] Code reviewed
- [ ] Task marked as complete in instructions.md
- [ ] Infrastructure.md updated
- [ ] Infrastructure.md updated and other docs
- [ ] Changes committed to git; keep your messages in git short and clear
- [ ] Take the next task
@@ -121,114 +121,115 @@ For each task completed:
---
---
## Task: Enhanced Anime Add Flow
## Task: Add Database Transaction Support
### Overview
### Objective
Implement proper transaction handling across all database write operations using SQLAlchemy's transaction support. This ensures data consistency and prevents partial writes during compound operations.
### Background
Currently, the application uses SQLAlchemy sessions with auto-commit behavior through the `get_db_session()` generator. While individual operations are atomic, compound operations (multiple writes) can result in partial commits if an error occurs mid-operation.
Enhance the anime addition workflow to automatically persist anime to the database, scan for missing episodes immediately, and create folders using the anime display name instead of the internal key.
### Requirements
1. **All database write operations must be wrapped in explicit transactions**
2. **Compound operations must be atomic** - either all writes succeed or all fail
3. **Nested operations should use savepoints** for partial rollback capability
4. **Existing functionality must not break** - backward compatible changes only
5. **All tests must pass after implementation**
---
## Task: Graceful Shutdown Implementation ✅ COMPLETED
### Objective
Implement proper graceful shutdown handling so that Ctrl+C (SIGINT) or SIGTERM triggers a clean shutdown sequence that terminates all concurrent processes and prevents database corruption.
### Background
The application runs multiple concurrent services (WebSocket connections, download service with ThreadPoolExecutor, database sessions) that need to be properly cleaned up during shutdown. Without graceful shutdown, active downloads may corrupt state, database writes may be incomplete, and WebSocket clients won't receive disconnect notifications.
### Implementation Summary
The following components were implemented:
#### 1. WebSocket Service Shutdown ([src/server/services/websocket_service.py](src/server/services/websocket_service.py))
- Added `shutdown()` method to `ConnectionManager` that:
- Broadcasts `{"type": "server_shutdown"}` notification to all connected clients
- Gracefully closes each WebSocket connection with code 1001 (Going Away)
- Clears all connection tracking data structures
- Supports configurable timeout (default 5 seconds)
- Added `shutdown()` method to `WebSocketService` that delegates to the manager
#### 2. Download Service Stop ([src/server/services/download_service.py](src/server/services/download_service.py))
- Enhanced `stop()` method to:
- Persist active downloads back to "pending" status in database (allows resume on restart)
- Cancel active download tasks with proper timeout handling
- Shutdown ThreadPoolExecutor with `wait=True` and configurable timeout (default 10 seconds)
- Fall back to forced shutdown if timeout expires
#### 3. FastAPI Lifespan Shutdown ([src/server/fastapi_app.py](src/server/fastapi_app.py))
- Expanded shutdown sequence in proper order:
1. Broadcast shutdown notification via WebSocket
2. Stop download service and persist state
3. Clean up progress service (clear subscribers and active progress)
4. Close database connections with WAL checkpoint
- Added timeout protection (30 seconds total) with remaining time tracking
- Each step has individual timeout to prevent hanging
#### 4. Uvicorn Graceful Shutdown ([run_server.py](run_server.py))
- Added `timeout_graceful_shutdown=30` parameter to uvicorn.run()
- Ensures uvicorn allows sufficient time for lifespan shutdown to complete
- Updated docstring to document Ctrl+C behavior
#### 5. Stop Script ([stop_server.sh](stop_server.sh))
- Replaced `kill -9` (SIGKILL) with `kill -TERM` (SIGTERM)
- Added `wait_for_process()` function that waits up to 30 seconds for graceful shutdown
- Only falls back to SIGKILL if graceful shutdown times out
- Improved user feedback during shutdown process
#### 6. Database WAL Checkpoint ([src/server/database/connection.py](src/server/database/connection.py))
- Enhanced `close_db()` to run `PRAGMA wal_checkpoint(TRUNCATE)` for SQLite
- Ensures all pending WAL writes are flushed to main database file
- Prevents database corruption during shutdown
### How Graceful Shutdown Works
1. **Ctrl+C or SIGTERM received** → uvicorn catches signal
2. **uvicorn triggers lifespan shutdown** → FastAPI's lifespan context manager exits
3. **WebSocket broadcast** → All connected clients receive shutdown notification
4. **Download service stops** → Active downloads persisted, executor shutdown
5. **Progress service cleanup** → Event subscribers cleared
6. **Database cleanup** → WAL checkpoint, connections disposed
7. **Process exits cleanly** → No data loss or corruption
### Testing
```bash
# Start server
conda run -n AniWorld python run_server.py
# Press Ctrl+C to trigger graceful shutdown
# Or use the stop script:
./stop_server.sh
```
### Verification
- All existing tests pass (websocket, download service, database transactions)
- WebSocket clients receive disconnect notification before connection closes
- Active downloads are preserved and can resume on restart
- SQLite WAL file is checkpointed before shutdown
1. **After anime add → Save to database**: Ensure the anime is persisted to the database via `AnimeDBService.create_series()` immediately after validation
2. **After anime add → Scan for missing episodes**: Trigger a targeted episode scan for only the newly added anime (not the entire library)
3. **After anime add → Create folder with anime name**: Use the anime display name (sanitized) for the folder, not the anime key
### Implementation Steps
#### Step 1: Examine Current Implementation
1. Open and read `src/server/routes/anime_routes.py` - find the `add_series` endpoint
2. Open and read `src/core/SerieScanner.py` - understand how scanning works
3. Open and read `src/core/entities/Serie.py` and `src/core/entities/SerieList.py` - understand folder handling
4. Open and read `src/database/services/anime_db_service.py` - understand database operations
5. Open and read `src/core/providers/AniWorldProvider.py` - understand how folders are created
#### Step 2: Create Utility Function for Folder Name Sanitization
1. Create or update utility module at `src/utils/filesystem.py`
2. Implement `sanitize_folder_name(name: str) -> str` function that:
- Removes/replaces characters invalid for filesystems: `< > : " / \ | ? *`
- Trims leading/trailing whitespace and dots
- Handles edge cases (empty string, only invalid chars)
- Preserves Unicode characters (for Japanese titles, etc.)
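The rules above can be sketched roughly as follows. This is a minimal illustration, not the committed implementation: the `fallback` parameter and the choice to delete (rather than replace) invalid characters are assumptions made for the example.

```python
import re

# Characters the task lists as invalid for folder names: < > : " / \ | ? *
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*]')


def sanitize_folder_name(name: str, fallback: str = "untitled") -> str:
    """Return a filesystem-safe folder name derived from a display name.

    `fallback` is a hypothetical parameter used when nothing usable remains.
    """
    # Drop invalid characters; Unicode letters are left untouched.
    cleaned = _INVALID_CHARS.sub("", name)
    # Collapse runs of whitespace into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned)
    # Trim leading/trailing spaces and dots.
    cleaned = cleaned.strip(" .")
    # Edge case: empty input or only invalid characters.
    return cleaned or fallback
```

Stripping leading dots also removes a leading `..`, which helps with the path traversal concern, although the containment check on the final path is still needed.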
#### Step 3: Update Serie Entity
1. Open `src/core/entities/Serie.py`
2. Add a `folder` property that returns sanitized display name instead of key
3. Ensure backward compatibility with existing series
#### Step 4: Update SerieList to Use Display Name for Folders
1. Open `src/core/entities/SerieList.py`
2. In the `add()` method, use `serie.folder` (display name) instead of `serie.key` when creating directories
3. Ensure the folder path is correctly stored in the Serie object
#### Step 5: Add Targeted Episode Scan Method to SerieScanner
1. Open `src/core/SerieScanner.py`
2. Add new method `scan_single_series(self, key: str) -> List[Episode]`:
- Fetches the specific anime from database/SerieList by key
- Calls the provider to get available episodes
- Compares with local files to find missing episodes
- Returns list of missing episodes
- Does NOT trigger a full library rescan
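The comparison step at the heart of the targeted scan might look like the sketch below. The function name `find_missing_episodes` and the `season -> episode numbers` dictionary shape are assumptions for illustration; the real method works with the project's own `Episode` type and provider calls.

```python
from typing import Dict, List


def find_missing_episodes(
    available: Dict[int, List[int]],
    local: Dict[int, List[int]],
) -> Dict[int, List[int]]:
    """Return episodes the provider offers that are not present locally.

    Both arguments map a season number to a list of episode numbers
    (a hypothetical shape chosen for this sketch).
    """
    missing: Dict[int, List[int]] = {}
    for season, episodes in available.items():
        have = set(local.get(season, []))
        gaps = sorted(ep for ep in episodes if ep not in have)
        if gaps:
            missing[season] = gaps
    return missing


def count_missing(missing: Dict[int, List[int]]) -> int:
    """Total number of missing episodes across all seasons."""
    return sum(len(episodes) for episodes in missing.values())
```

Because only one series' episode lists are passed in, nothing here touches the rest of the library, which is the point of the targeted scan.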
#### Step 6: Update add_series Endpoint
1. Open `src/server/routes/anime_routes.py`
2. Modify the `add_series` endpoint to:
- **Step A**: Validate the request (existing)
- **Step B**: Create Serie object with sanitized folder name
- **Step C**: Save to database via `AnimeDBService.create_series()`
- **Step D**: Add to SerieList (which creates the folder)
- **Step E**: Call `SerieScanner.scan_single_series(key)` for targeted scan
- **Step F**: Return response including:
- Success status
- Created folder path
- List of missing episodes found (if any)
#### Step 7: Update Provider Folder Handling
1. Open `src/core/providers/AniWorldProvider.py`
2. Ensure download operations use `serie.folder` for filesystem paths
3. If `EnhancedProvider.py` exists, update it similarly
### Acceptance Criteria
- [ ] When adding a new anime, it is immediately saved to the database
- [ ] When adding a new anime, only that anime is scanned for missing episodes (not full library)
- [ ] Folder is created using the sanitized display name (e.g., "Attack on Titan" not "attack-on-titan")
- [ ] Special characters in anime names are properly handled (`:`, `?`, etc.)
- [ ] Existing anime entries continue to work (backward compatibility)
- [ ] API response includes the created folder path and missing episodes count
- [ ] Unit tests cover the new functionality
- [ ] No regressions in existing tests
### Testing Requirements
1. **Unit Tests**:
- Test `sanitize_folder_name()` with various inputs (special chars, Unicode, edge cases)
- Test `Serie.folder` property returns sanitized name
- Test `SerieScanner.scan_single_series()` only scans the specified anime
- Test database persistence on anime add
2. **Integration Tests**:
- Test full add flow: request → database → folder creation → scan
- Test that folder is created with correct name
- Test API response contains expected fields
### Error Handling
- If database save fails, return appropriate error and don't create folder
- If folder creation fails (permissions, disk full), return error and rollback database entry
- If scan fails, still return success for add but indicate scan failure in response
- Log all operations with appropriate log levels
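One way to read these rules as control flow is sketched below. The function `add_series_flow` and its four callables are hypothetical stand-ins for the database service, folder creation, and scanner; the response keys are also assumptions, not the endpoint's actual schema.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def add_series_flow(
    save_to_db: Callable[[], None],
    delete_from_db: Callable[[], None],
    create_folder: Callable[[], str],
    scan: Callable[[], Dict[int, list]],
) -> Dict[str, Any]:
    """Run DB save, folder creation, then scan, applying the rules above."""
    # 1. Database save: on failure, stop before touching the filesystem.
    try:
        save_to_db()
    except Exception as exc:
        logger.error("Database save failed: %s", exc)
        return {"status": "error", "message": f"database save failed: {exc}"}

    # 2. Folder creation: on failure, roll back the database entry.
    try:
        folder = create_folder()
    except OSError as exc:
        logger.error("Folder creation failed, rolling back: %s", exc)
        delete_from_db()
        return {"status": "error", "message": f"folder creation failed: {exc}"}

    # 3. Scan: a failure here does not undo the add, it is only reported.
    result: Dict[str, Any] = {
        "status": "success",
        "folder": folder,
        "missing_episodes": {},
        "scan_error": None,
    }
    try:
        result["missing_episodes"] = scan()
    except Exception as exc:
        logger.warning("Targeted scan failed: %s", exc)
        result["scan_error"] = str(exc)
    return result
```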
### Security Considerations
- Sanitize folder names to prevent path traversal attacks
- Validate anime name length to prevent filesystem issues
- Ensure folder is created within the configured library path only
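A containment check along these lines could back up the sanitizer. `resolve_series_folder` and the 255-byte limit are assumptions for this sketch; the actual limit depends on the target filesystem.

```python
from pathlib import Path

# Assumed upper bound for a single path component; many common filesystems
# limit names to 255 bytes or characters, but this is not verified here.
MAX_NAME_BYTES = 255


def resolve_series_folder(library_root: str, folder_name: str) -> Path:
    """Return the absolute folder path, refusing anything outside the library."""
    if not folder_name or len(folder_name.encode("utf-8")) > MAX_NAME_BYTES:
        raise ValueError("folder name is empty or too long")

    root = Path(library_root).resolve()
    # resolve() normalizes ".." segments and follows existing symlinks.
    candidate = (root / folder_name).resolve()

    # The candidate must be strictly inside the library root.
    if root not in candidate.parents:
        raise ValueError(f"folder escapes library path: {folder_name!r}")
    return candidate
```

Checking the resolved path rather than the raw string means inputs such as `../outside` or an absolute path are rejected even if a sanitizer bug lets them through.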
---