Compare commits

...

135 Commits

Author SHA1 Message Date
dbaf80e941 chore: bump version 2026-06-03 21:49:02 +02:00
4fc597c5de fix: disable startup media scan
Skip incomplete series check on server startup to reduce
startup time. Can be re-enabled by restoring _check_media_scan_status().
2026-06-03 21:47:55 +02:00
a77bb371df chore: bump version 2026-06-03 21:41:30 +02:00
420d10bb34 Fix async/await bug in folder_rename_service
- _update_series_folder: make async, await AnimeSeriesService.update()
- test: fix mock method name get_by_key -> get_by_folder

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-03 21:19:03 +02:00
e29918488c fix: correct key_resolution_service import path in scheduler
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-03 21:14:12 +02:00
9c3f03d610 refactor(scheduler): separate scheduler logic from scan/rescan logic
- Extract rescan logic into new RescanService (src/server/services/rescan_service.py)
- SchedulerService now only handles APScheduler cron scheduling
- Move scheduler sub-services (folder_rename, folder_scan, key_resolution) to scheduler/ folder
- Keep RescanOrchestrator as backward-compatible alias
- Update all imports across api/, server/, and test files
2026-06-03 20:58:30 +02:00
9d64241230 fix(db): add missing legacy_key_cleanup_completed column migration
Add migrate_schema_if_needed() to handle adding missing columns to existing
tables for backward compatibility. Automatically adds legacy_key_cleanup_completed
BOOLEAN column to system_settings table if missing, preventing 'no such column'
errors on startup for existing databases.
2026-06-03 20:22:59 +02:00
49cd84f3e5 chore: bump version 2026-06-02 20:59:42 +02:00
e46759347e backup 2026-06-02 20:59:13 +02:00
75f743e6cc fix: fetch series name from provider when scanning
Avoid 'Series name cannot be empty' error when a Serie is loaded
from a serie_file with an empty name by fetching the title from the
provider after year-fetching in the scan() method.

fix #?
2026-06-02 20:57:27 +02:00
4dc5ffa19e copy version to docker file 2026-06-02 20:53:11 +02:00
1649a22418 chore: bump version 2026-06-02 20:39:18 +02:00
246752e2fc Add dynamic version from Docker/VERSION file
- Create version.py utility to read version from Docker/VERSION
- Replace hardcoded version '1.0.1' with APP_VERSION from version.py
- Add version logging on FastAPI startup
- Use APP_VERSION in health endpoints and template context
2026-06-02 20:38:42 +02:00
84b24ed79e chore: bump version 2026-06-02 20:33:01 +02:00
bf3954587a fix(folder_rename_service): use get_by_folder instead of get_by_key when looking up by folder name
Update_database_paths and duplicate folder cleanup were using get_by_key()
(provider key lookup) instead of get_by_folder() when operating on folder names.
This caused orphaned DB records when removing duplicate folders like 'Hells Paradise'
that mapped to an already-existing 'Hell\'s Paradise (2023)'.
2026-06-02 20:09:47 +02:00
ed8f5cae10 chore: bump version 2026-06-01 21:38:37 +02:00
a54c285994 fix(folder_scan): await NFO repair before folder rename
folder_rename_service depends on clean NFO files but repair tasks
were fire-and-forget. Now collect all repair tasks and await them
with asyncio.gather before validate_and_rename_series_folders runs.

Also update tests that mock asyncio.create_task to also mock
asyncio.gather since perform_nfo_repair_scan now awaits tasks.
2026-06-01 21:37:28 +02:00
c58b42dfa5 feat(services): add key resolution for orphaned anime folders
- Add key_resolution_service.py to resolve provider keys for folders without key/data files
- Search anime provider and match folder names (case-insensitive, exact match required)
- Only save to DB if exactly one match found; otherwise skip
- Add comprehensive unit tests (28 tests)
- Integrate into scheduler_service after nfo_repair scan
- Update ARCHITECTURE.md documentation
2026-06-01 20:43:13 +02:00
6dfb24de7e backup 2026-06-01 20:07:58 +02:00
6021cdef28 feat: add anime metadata editing and NFO diagnostics
- Add PUT /anime/{key} endpoint for updating anime key, tmdb_id, tvdb_id
- Add NFO diagnostics and repair endpoints (GET/POST /nfo/diagnostics)
- Add edit modal UI with context menu integration
- Add frontend JS modules for context-menu and edit-modal
- Add comprehensive tests for edit, rename, and NFO repair flows
2026-05-31 18:31:56 +02:00
5517ccbab0 style: reformat folder_rename_service import 2026-05-30 12:20:40 +02:00
94ed013172 Revert "feat: add manual TMDB/TVDB ID entry for failed lookups"
This reverts commit 30858f441c.
2026-05-30 12:17:48 +02:00
76b849fc91 chore: bump version 2026-05-30 12:02:48 +02:00
00b26c8cbc fix: validate generated keys before creating Serie objects
- Add is_valid_key check in SerieScanner._read_data_from_file() to prevent
  passing invalid keys to Serie constructor (caused ValueError)
- Improve error message for key generation failures
- Add warning log before removing duplicate source folders in rename service
2026-05-30 11:42:19 +02:00
a6f2399aca chore: bump version 2026-05-29 19:25:30 +02:00
cf001563b3 refactor: add folder rename configuration and service
Add configurable folder rename patterns via settings with anime_folder_rename_regex and custom_pattern options. Integrate into SerieScanner and SeriesApp for consistent episode organization.
2026-05-29 19:24:09 +02:00
38c12638a4 fix HLS stream warning by disabling native downloader and retrying with ffmpeg
- Set hls_prefer_native: False to skip yt-dlp's native HLS downloader which emits
  'Live HLS streams are not supported' warning
- Add retry logic that catches HLS-related exceptions and retries with
  downloader=ffmpeg and hls_use_mpegts=True
2026-05-29 18:53:47 +02:00
765e43c684 fix(key_utils): drop apostrophes in generate_key_from_folder 2026-05-29 18:20:20 +02:00
5190d32665 chore: bump version 2026-05-28 22:03:52 +02:00
4e6afa31b5 Remove legacy key file support after DB migration
- SerieScanner: Remove key file fallback, keep data file fallback
- SystemSettings: Add legacy_key_cleanup_completed flag
- initialization_service: Add cleanup task to remove key files from folders with DB entries
- Tests updated to reflect key file removal from legacy path

Key files caused duplicate key errors on folder rename. DB is now sole source of truth.
2026-05-28 22:01:37 +02:00
1ef59c5283 feat: add duplicate folder detection and /duplicate-folders API endpoint
- Add DuplicateFolderGroup and DuplicateFoldersResponse Pydantic models
- Add /duplicate-folders GET endpoint for listing pre-existing duplicates
- Add _scan_for_pre_existing_duplicates() function for NFO-based detection
- Add _try_merge_duplicate_group() for auto-merging empty/symlink-only duplicates
- Integrate duplicate detection into validate_and_rename_series_folders workflow
- Skip rename for flagged duplicates to prevent data loss during merge
2026-05-28 21:46:08 +02:00
239341629c Add orphaned folder cleanup after rename
- Add _cleanup_orphaned_folder() to delete/move old folder contents after rename
- Empty folders: delete directly via rmdir()
- Non-empty folders: move contents to new path, then delete old folder
- Handle PermissionError and OSError gracefully with logging
- Add dry_run parameter to preview changes without applying them
- Add --dry-run support to validate_and_rename_series_folders()
- Add unit tests for _cleanup_orphaned_folder and dry-run mode
- All 66 related tests pass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 21:24:13 +02:00
51b7f349f8 fix(scheduler): strip null legacy alias fields from config.json on save
SchedulerConfig.__init__ maps legacy auto_download/folder_scan keys to the
primary auto_download_after_rescan/folder_scan_enabled fields. However,
model_dump() was including auto_download=null and folder_scan=null in
serialised output. When this was written to config.json and reloaded,
those keys were present (albeit null), so the alias mapping was skipped
and the primary fields retained default False values instead of the
configured True values.

Fix:
- Override SchedulerConfig.model_dump() to drop None-valued alias fields
  before returning the serialised dict.
- ConfigService.save_config() re-serialises the scheduler field through
  its overridden model_dump() so the fix applies when writing to disk.

Tests added:
- test_roundtrip_excludes_none_alias_fields: verifies model_dump omits
  null auto_download/folder_scan keys.
- test_save_and_load_scheduler_flags_roundtrip: end-to-end roundtrip
  through ConfigService confirms raw JSON and loaded values match.

Pre-existing failure in test_core_error_handler.py is unrelated.
2026-05-28 21:18:16 +02:00
14b8ef7f06 Add Step 4 fallback: generate key from folder name
- SerieScanner: generate key from folder when no key/data files exist
- Handle edge cases: non-Latin characters, special symbols in folder names
- anime_service: expose loading_status and loading_error fields
- Update tests to match new fallback behavior
2026-05-28 18:48:43 +02:00
7abba0dae2 Fix download provider errors with exponential backoff and playmogo support
- Add exponential backoff retry logic to RecoveryStrategies (1s, 2s, 4s...)
- Add TimeoutError to network failure handling for HTTPS timeouts
- Add playmogo.com referer header for Doodstream provider
- Add Optional import to error_handler.py
- Add sanitize_url_for_logging utility function

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 18:47:05 +02:00
30858f441c feat: add manual TMDB/TVDB ID entry for failed lookups
- Add PATCH /api/anime/{key}/metadata-ids endpoint to update IDs
- Add POST /api/anime/{key}/refresh-nfo endpoint to force NFO regeneration
- Add Edit Metadata IDs modal in frontend
- Add showEditMetadataModal, saveMetadataIds, refreshSeriesNfo JS functions
- Add edit-metadata-btn to series cards with database icon
- IDs validated as positive integers or null

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 18:38:34 +02:00
33f63ca304 feat(SerieScanner): add folder ignore patterns for non-anime content
- Add NFO_FOLDER_IGNORE_PATTERNS setting to skip TV shows like
  The Last of Us, Loki, Chernobyl, Star Trek Discovery
- Update SerieScanner.__find_mp4_files() to skip ignored folders
- Update SerieList.load_series() to skip ignored folders
- Add should_ignore_folder() method for pattern matching
- Add folder_ignore_patterns property for pattern parsing
- Add comprehensive tests for ignore pattern functionality
- Update NFO_GUIDE.md with ignore patterns documentation
- Update CONFIGURATION.md with new setting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-28 18:11:45 +02:00
fe9284b80e feat(SerieScanner): add warning event for duplicate series keys
- Add on_warning event system with subscribe/unsubscribe methods
- Change duplicate key handling from error to warning
- Fire on_warning event when duplicate series detected
- Include metadata: key, duplicate_folder, existing_folder
2026-05-28 18:05:07 +02:00
12e5526991 chore: remove obsolete migration script
Migration logic moved to serie_scanner. No longer needed.
2026-05-28 17:55:00 +02:00
bc87bee416 refactor(scheduler): drop separate scheduler.db in favour of MemoryJobStore
Scheduler used a separate SQLite file (scheduler.db) only to persist one
cron job. This was originally required because APScheduler's
SQLAlchemyJobStore is sync-only, creating an async/sync driver conflict
when accessing the same file.

The job is rebuilt from config.json on every startup regardless
(replace_existing=True), so the persisted state only served misfire
detection. Moved misfire detection into the app layer by querying
system_settings.last_scan_timestamp on startup: if the last scan is
>23h but <25h ago, an immediate rescan is triggered.

Change summary:
- Remove SQLAlchemyJobStore; use default MemoryJobStore instead
- Add _check_missed_run() that reads last_scan_timestamp from aniworld.db
- Update docs/DEVELOPMENT.md scheduler troubleshooting section
- Update the scheduler unit test that verified SQLAlchemyJobStore
2026-05-27 22:09:18 +02:00
7ded5a6e4d chore: bump version 2026-05-27 21:41:45 +02:00
d596902ca3 Parse existing NFO for TMDB ID to skip redundant search
Check existing tvshow.nfo for TMDB ID before querying TMDB API.
If found, fetch details directly using cached ID instead of searching.
Reduces API calls and improves performance for already-indexed series.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-27 21:22:24 +02:00
d358a07290 fix async handling in SerieScanner and add image_downloader cleanup
- SerieScanner.scan() now detects running event loop and uses create_task()
  when already in async context, avoids RuntimeError
- NFOService.close() now also closes image_downloader to prevent resource leaks
- Add integration tests for TMDBClient lifecycle management

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-27 20:47:29 +02:00
b9c55f9e7a fix: remove double-call on AsyncSession in SerieScanner
get_async_session_factory() returns session directly, not factory.
Calling result again with () caused 'AsyncSession' object is not callable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-27 20:19:34 +02:00
fc4e52f1a2 chore: bump version 2026-05-26 20:23:20 +02:00
6d30747f25 Fix stale data file updates on download completion
When episodes are downloaded successfully, the in-memory Serie.episodeDict
is updated, but the deprecated data file was not being synced. This caused
UI to show episodes as missing when already downloaded.

Changes:
- Update data file in _remove_episode_from_memory when download completes
- DB is authoritative; data file is optional backup (deprecated)
- Gracefully skip update if data file doesn't exist

New integration tests for episode download sync:
- Verify episode removed from missing list after download
- Verify in-memory cache updated after download
- Verify data file updated after download (when it exists)
- Verify downloads work without data file
2026-05-26 18:57:04 +02:00
ceb6a2aeb4 Rename sync_series_from_data_files to sync_legacy_series_to_db
- Rename function to reflect its legacy status
- Add deprecation warning log on execution
- Update all callers (initialization_service, api/config, fastapi_app)
- Update tests to use new name
- Add deprecation notice to DEVELOPMENT.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 18:45:22 +02:00
53d6da5dac Add database loading methods to SerieList
- Add load_all_from_db() for bulk loading series from DB
- Add _load_single_series_from_db() for loading single series by folder
- Add invalidate_cache() to clear in-memory cache
- Add tests for all new methods

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 18:26:25 +02:00
102d83e947 feat(scanner): replace file writes with DB persistence for series
- SerieScanner.scan() now calls _persist_serie_to_db() instead of serie.save_to_file()
- Added _sync_episodes_to_db() helper to handle episode CRUD during sync
- EpisodeService gains delete_by_series() for targeted episode deletion
- SerieList gains add_to_db() async method for DB-based series addition
- test_serie_scanner_db_writes.py covers create/update/preserve/sync scenarios
- DATABASE.md updated with Series Persistence Flow section

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 18:12:01 +02:00
841368bf85 feat(SerieScanner): DB lookup primary, deprecate key file fallback
DB now source of truth for folder -> Serie resolution.

Changes:
- AnimeSeriesService.get_by_folder(): new async lookup by folder name
- SerieScanner.__read_data_from_file(): query DB first, then provider callback, then legacy key file (temporary, removed v3.0.0)
- Serie: reconstruct from DB record with episode dict
- Key file: warn on use, scheduled removal v3.0.0

Add unit tests for DB hit/miss/callback/fallback edge cases

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-26 17:56:37 +02:00
cbd53ef2a0 feat: add legacy key/data file migration to database
- Add migration_legacy_files_completed flag to SystemSettings model
- Create legacy_file_migration service to migrate series from key/data files
- Integrate legacy migration into initialization_service startup flow
- Add integration tests for legacy file migration
- Update DATABASE.md documentation with migration details
- Fix various test and service issues (nfo_repair, tmdb_client, download_service)
- Add test_database_schema unit tests
2026-05-26 17:44:42 +02:00
50a77976d5 chore: bump version 2026-05-26 13:28:12 +02:00
dfc28b8e66 fix(scheduler): ensure scheduler starts after setup/config update
Add ensure_started() to SchedulerService as idempotent entry point.
Start scheduler in auth setup run_initialization() after NFO scan.
Sync anime_directory and start scheduler in config update endpoint.
Add unit and endpoint tests for ensure_started() behavior.
2026-05-26 13:23:48 +02:00
6c9605e896 chore: bump version 2026-05-26 08:58:37 +02:00
3947f6d266 refactor(scheduler): replace structlog with std logging and add extensive diagnostics
- Switch scheduler_service from structlog to standard logging for consistency
- Add detailed lifecycle logging in SchedulerService (start, stop, rescan)
- Add debug logging in fastapi_app scheduler initialization
- Fix test_add_series_episodes to mock EpisodeService.get_by_series
2026-05-26 07:51:22 +02:00
a3176f5ac1 chore: bump version 2026-05-25 21:31:46 +02:00
9a81b04b65 fix: downloaded episodes no longer appear as missing
Use the database as the authoritative source for missing-episode lists so
that episodes marked is_downloaded=True are never shown as missing, even
when the in-memory state is stale.

Key changes:
- EpisodeService.get_by_series() gains only_missing flag
- AnimeService uses DB-backed episodeDict and preserves downloaded episodes
  during sync, skipping them when adding/removing missing episodes
- DownloadService broadcasts series_updated after marking an episode downloaded
  so the frontend reflects the change immediately
- Frontend filters out series with zero missing episodes client-side and
  fixes renderSeries to respect the active filter
- Unit tests updated to assert the broadcast is sent
2026-05-25 21:30:31 +02:00
a336733ea9 fix(nfo): add year field to series and create missing NFO files
- Add missing year field when building series list in anime_service
- Add _create_missing_nfo to generate minimal NFO for series without one
- Update perform_nfo_repair_scan to detect and create missing NFOs
- Add semaphore-protected async creation with TMDB rate limiting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-25 16:32:54 +02:00
ca93bb740a feat(providers): detect HTML encoding before parsing
Add chardet-based _decode_html_content() to aniworld_provider. Apply
to all BeautifulSoup parsing calls to prevent decoding warnings on
pages with mismatched encoding declarations. Falls back to utf-8
with errors='replace' when confidence < 0.7.

Also fix test_enhanced_provider HLS test signature and add HLS
pattern unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-25 16:30:36 +02:00
d5e955a731 Add year extraction from folder names for existing series
- New migration script: populate year from folder (YYYY) pattern
- SerieScanner: refactor year extraction logic
- anime_service: pass year when syncing from data files
2026-05-25 15:30:28 +02:00
e2a373816a feat(nfo): add minimal NFO fallback when TMDB fails
- Add create_minimal_nfo() method to NFOService for fallback when TMDB lookup fails
- Update API endpoints (single and batch) to use minimal NFO fallback on TMDBAPIError
- Document fallback behavior in NFO_GUIDE.md section 3.6
- Add unit tests for minimal NFO creation (11 tests passing)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-25 15:19:50 +02:00
a115215416 fix(providers): rotate, probe and fall back on 404
Iterate providers actually advertised on the episode page (ordered by
SUPPORTED_PROVIDERS preference) instead of always re-resolving VOE.
Each candidate is HEAD-probed before yt-dlp runs, so dead links are
skipped immediately; direct video URLs use a streaming fast path that
bypasses yt-dlp; total failure now logs the exhausted provider list.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-25 14:32:10 +02:00
c579235af0 feat(download): persist retry state and dead-letter
Retry count and queue status were in-memory only and lost on
restart, so failed downloads could not be safely resumed and
permanently-failed episodes silently blocked re-queueing via the
episode-id unique index.

- Add status + retry_count columns to DownloadQueueItem
- Replace unique(episode_id) with unique(episode_id, status) so
  permanently_failed rows do not block new pending entries
- Add PERMANENTLY_FAILED to DownloadStatus enum
- Persist retry_count on each failure; mark permanently_failed once
  max_retries reached
- QueueRepository reads status/retry_count from DB instead of
  defaulting to PENDING/0
- Stop double-incrementing retry_count in retry_failed_items;
  increment only happens in _process_download on failure

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-25 14:24:31 +02:00
0ba2587bc8 refactor(download): mark episode downloaded instead of deleting
Change _remove_episode_from_missing_list to set is_downloaded=True
and populate file_path via EpisodeService.mark_downloaded, instead of
deleting the Episode row. Preserves download history so queries can
distinguish series with downloaded episodes from completely unwatched
series.

- Pass serie_folder to construct file_path
- Look up series_id via AnimeSeriesService.get_by_key
- Update tests to mock mark_downloaded path
- Document episode lifecycle in docs/DEVELOPMENT.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-25 14:14:33 +02:00
bda1fe4445 Fix scheduler next_run_time None check; add debug logging
- Fix race condition: next_run_time only available after scheduler.start()
- Handle None gracefully in logging
- Add debug logging to _perform_rescan and _run_rescan_job
- Document scheduler troubleshooting in DEVELOPMENT.md
2026-05-25 13:48:34 +02:00
810346bc8b chore: bump version 2026-05-24 21:17:39 +02:00
daa937bcb7 Fix test isolation: clear logging handlers and reset propagate flags
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 22:44:40 +02:00
1c505bd722 Use ffmpeg for HLS streams in aniworld provider
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 22:26:48 +02:00
3551838887 Add startup health checks and /health/ready endpoint
- Add _run_startup_health_checks() function in fastapi_app.py
  - Check ffmpeg availability (warning)
  - Check DNS resolution for aniworld.to and api.themoviedb.org (warning)
  - Check anime_directory configuration and writability (error)
- Store startup checks in app.state for health endpoint access
- Add /health/ready endpoint for container orchestrators
  - Returns not_ready with 503 when critical failures present
  - Includes critical_failures list for debugging
- Update /health endpoint to include startup check results
  - Status reflects worst check (error > warning > ok)
- Document health check endpoints in DEVELOPMENT.md
- Add unit tests for startup health checks
- Add unit tests for /health/ready endpoint

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 22:12:03 +02:00
9a20541598 feat(NFO): add TMDB search fallback with alt_titles support
- New _search_with_fallback() method tries multiple strategies:
  1. Primary query with year filter (de-DE locale)
  2. Alternative titles with ja-JP / en-US locales
  3. English search (en-US)
  4. Search without year constraint
  5. Punctuation-normalized query
- create_nfo() accepts new alt_titles param for Japanese/title fallback
- Better match rate for anime with non-English titles

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 21:57:00 +02:00
3f7651404d fix(tmdb): harden aiohttp session lifecycle
- Add async context manager to NFOService wrapping TMDBClient + ImageDownloader
- Add TMDBClient.__del__ warning when session leaks
- Log exc_info on session recreation for traceback visibility
- Document async-with usage in docs/DEVELOPMENT.md and docs/TESTING.md
- Add unit tests covering leak detection, context-manager cleanup, and connector-closed warning

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 21:34:26 +02:00
bee24406e6 Add runner.csx script 2026-05-23 21:28:54 +02:00
31eb0026cf Add queue deduplication to prevent duplicate entries
- In-memory dedup in add_to_queue() using _pending_by_episode dict
- Batch-local dedup via seen_in_batch set (handles duplicates within single call)
- Database unique index on episode_id via __table_args__
- 5-minute cooldown in _auto_download_missing() to prevent rapid re-triggers
- Updated _add_to_pending_queue() and _remove_from_pending_queue() to track episode keys
- Added TestQueueDeduplication with 4 test cases
- Updated DEVELOPMENT.md and TESTING.md with queue dedup docs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 21:27:41 +02:00
24ea12bbaf Add Docs/runner.csx 2026-05-23 21:19:15 +02:00
e74b602f60 Add ffmpeg for HLS stream download support
- Add ffmpeg to Dockerfile.app for container HLS support
- Configure yt-dlp with --downloader ffmpeg --hls-use-mpegts
- Add startup health check warns if ffmpeg missing
- Update DEVELOPMENT.md with ffmpeg prerequisites and troubleshooting
- Add tests for ffmpeg HLS options and health check
2026-05-23 21:18:39 +02:00
db65e28854 backup 2026-05-21 22:10:07 +02:00
11e231a4ab chore: bump version 2026-05-21 21:42:13 +02:00
a11f8c4fa0 fix(vpn): add explicit host route for health-check target
Without a /32 route in the main table, CHECK_HOST (1.1.1.1) fell through
to the VPN default route where source-address selection was defeated by
the priority-100 'from ETH0_IP' policy rule, causing pings to bypass
wg0 and be dropped by the kill switch.

Also add secondary google.com ping to distinguish IP vs DNS failures.
2026-05-21 21:41:51 +02:00
cf5a06af11 chore: bump version 2026-05-21 21:22:08 +02:00
e07f75432e backup 2026-05-21 21:21:13 +02:00
1696d5c65b chore: bump version 2026-05-21 21:04:51 +02:00
c8b386f47a chore: bump version 2026-05-20 20:00:45 +02:00
3888da352a feat(tmdb): improve rate limiting and retry resilience
- Increase max_retries from 3 to 5 with exponential backoff capped at 30s
- Add per-second rate limiter (~35 req/s) to stay under TMDB's ~40/s limit
- Replace small semaphore (4) with larger one (30) + token-bucket throttle
- Abort retries immediately on DNS/name-resolution failures
- Increase rate-limit fallback wait from default to max(delay*2, 10)s
2026-05-20 20:00:11 +02:00
06e104db42 chore: bump version 2026-05-20 19:41:58 +02:00
d4594bd1d9 chore: bump version 2026-05-20 19:40:17 +02:00
d866e836f6 backup 2026-05-20 19:39:08 +02:00
195dae13cb test: add integration tests for NFO content and repair
- test_add_anime_nfo_content.py: verify required NFO tags after anime add
- test_sacrificial_princess_nfo.py: test full NFO generation and repair path
2026-05-20 19:38:43 +02:00
51be777e7d fix: strip all trailing year suffixes to prevent duplication
- series.py: use regex to remove all trailing (YYYY) before appending year
- nfo_service.py: _extract_year_from_name strips all trailing year suffixes
- nfo_repair_service.py: add _read_tmdb_id() helper to extract TMDB ID from NFO
2026-05-20 19:38:37 +02:00
7930e49701 fix: prevent duplicate year suffixes in series name and folder creation
Apply the same duplicate-year prevention logic to additional code paths:

- Serie.name_with_year property: skip adding year suffix if name already ends with it
- add_series API endpoint: avoid duplicating year in folder_name_with_year
- Add integration test for Serie.name_with_year idempotency
- Add API test for add_series endpoint year deduplication

Complements the folder_rename_service fix for comprehensive coverage.
2026-05-19 21:25:21 +02:00
75c22fe296 fix(folder-rename): prevent duplicate year suffixes in series folder names
Use regex to strip all trailing year suffixes before adding the canonical
one, preventing duplication like 'Show (2021) (2021) (2021)'.

- Add regex pattern (\s*\(\d{4}\))+\s*$ to remove all existing year suffixes
- Ensure idempotent behavior across multiple folder rename runs
- Add 7 unit tests covering the bug cases and edge scenarios

Fixes: 86 Eighty Six (2021) (2021)..., Alma-chan (2025) (2025)...
2026-05-19 21:24:07 +02:00
7bcd0600d5 chore: bump version 2026-05-18 09:57:51 +02:00
a333329ae2 backup 2026-05-18 09:56:59 +02:00
363f7899f8 refactor(logging): reduce download log spam and set INFO level
- Pass app logger to yt-dlp so internal [download] progress lines
  are routed through the INFO-level logger instead of stdout.
- Throttle download_progress_handler debug logging to avoid
  flooding logs on every fragment tick.
- Switch key provider lifecycle messages to INFO (start/complete)
  while keeping verbose details at DEBUG.
- Set debug_enabled=False in development config so dev mode
  does not emit extra debug noise.
- Update config docstring example from DEBUG to INFO.
2026-05-18 09:56:19 +02:00
a08a8f7408 backup 2026-05-17 18:57:12 +02:00
54ac5e9ab7 chore: release v1.1.4 2026-05-17 18:51:09 +02:00
c93ac3e7b8 chore: release v1.1.3 2026-05-17 18:40:37 +02:00
68c4335348 feat(vpn): add startup connectivity checks and PersistentKeepalive
Add check_vpn_connectivity() that runs once after wg0 comes up:
- Waits for handshake (up to 15s) and prints public key if missing
- Measures RX bytes before/after curl to detect server-side routing issues
- Tests DNS resolution and dumps resolv.conf on failure
- On failure prints exact server-side commands to fix (sysctl, iptables, wg)

Add PersistentKeepalive=25 to wg0.conf to keep NAT mappings alive.
2026-05-17 18:40:24 +02:00
be87f2e230 chore: release v1.1.2 2026-05-17 18:31:59 +02:00
c56e0f507d fix(vpn): fix DNS iptables rules and add NET_RAW cap
DNS OUTPUT was restricted to -o wg0, but routing decision happens
after iptables OUTPUT — so DNS to VPN-internal addresses (198.18.0.x)
was blocked before the kernel selected the outgoing interface.
Allow DNS unconditionally; routing still sends it through wg0.

Add NET_RAW capability so ping works inside the container.
2026-05-17 18:31:38 +02:00
cb0a36ccc2 chore: release v1.1.1 2026-05-16 21:47:05 +02:00
3644b16447 feat(vpn): add version logging from VERSION file
- Read version from /etc/wireguard/VERSION instead of hardcoding
- Copy VERSION file into container image during build
- Update VERSION to v1.1.0
2026-05-16 21:46:19 +02:00
d5116e378e chore: release v0.1.0 2026-05-16 21:41:40 +02:00
50a7083ce5 fix(vpn): support AllowedIPs=0.0.0.0/0 and multi-DNS configs
- Parse AllowedIPs dynamically from WireGuard config instead of hardcoding routes
- Remove auto-created default route by wg setconf to prevent breaking endpoint connection
- Fix DNS parsing: write comma-separated DNS servers as separate nameserver lines
- Add test for AllowedIPs route verification and DNS configuration
- Update test to skip container runtime tests when not running as root
2026-05-16 21:41:27 +02:00
52c0ff2337 chore(docs): remove temporary planning file docs/bla 2026-05-16 21:22:44 +02:00
a5fd88e224 chore(vpn): update WireGuard endpoint and credentials
- Rotate to new VPN endpoint (91.148.236.64)
- Update private/public keys and client address
- Switch DNS to 198.18.0.1/0.2
- Add local network route preservation via PostUp/PostDown
- Align nl.conf and wg0.conf configurations
2026-05-16 21:22:04 +02:00
98d4edad14 feat(vpn): dynamic AllowedIPs routing and improved test coverage
- Parse AllowedIPs from WireGuard config in entrypoint.sh
- Add/remove routes dynamically instead of hardcoded 0.0.0.0/1 split
- Handle both 0.0.0.0/0 and custom AllowedIPs
- Add route cleanup on VPN stop (endpoint + AllowedIPs)
- Update test_vpn.py with AllowedIPs route verification
- Allow non-root build-only tests with automatic runtime skip
2026-05-16 21:21:56 +02:00
bc8059b453 feat(docker): add release script and enhance push script
- Add release.sh for automated versioning and image pushing
- Enhance push.sh with target selection (app/vpn/all)
- Add docker/podman engine auto-detection
- Improve usage docs and error handling
2026-05-16 21:21:45 +02:00
815a4f1520 chore: release v0.0.1 2026-05-16 21:20:20 +02:00
e3509f5c8f feat(scanner): add DB fallback for series key resolution
When SerieScanner encounters a folder without a local key or data file,
it now optionally falls back to a database lookup by folder name. This
prevents newly-added series from being silently skipped on rescan when
their metadata only lives in the DB.

Changes:
- SerieScanner accepts an optional db_lookup callable
- SeriesApp forwards db_lookup to SerieScanner
- AnimeSeriesService adds get_by_folder_sync() helper
- dependencies.py wires a sync DB lookup into get_series_app()
- Unit tests cover fallback hit, miss, and exception paths
2026-05-14 19:28:43 +02:00
69c2fd01f9 chore: bump version to 1.0.1 2026-05-14 17:30:13 +02:00
0f36afd88c refactor: move NFO repair from initialization_service to folder_scan_service
Moves perform_nfo_repair_scan and its helpers (_repair_one_series,
_NFO_REPAIR_SEMAPHORE) into folder_scan_service.py so NFO repair runs
during the scheduled folder scan instead of on startup.

- Removes NFO repair code from initialization_service.py
- Updates all test imports and patch targets
- Updates docs/NFO_GUIDE.md and docs/CHANGELOG.md references

All 174 related tests pass.
2026-05-14 17:01:01 +02:00
ceac22fc34 test: fix NFO workflow and background loader tests
- Add missing TMDB async mock methods (_ensure_session, close)
  to all TMDB mocks in test_nfo_workflow.py
- Refactor test_anime_add_nfo_isolation.py to mock get_nfo_factory()
  instead of asserting on series_app.nfo_service directly
- Patch get_nfo_factory in test_background_loader_service.py
  to align with factory-based NFOService creation

Fixes test failures caused by NFOService refactoring that introduced
explicit TMDB session lifecycle and NFO factory pattern.
2026-05-13 12:41:22 +02:00
9c0f7ce08d test: add tests for scheduled folder scan and startup NFO repair removal
Add comprehensive test coverage for Tasks 1.1–1.5 and 2.1:

- test_scheduler_config_model.py: folder_scan_enabled defaults, explicit
  values, backward compatibility with old configs, serialization roundtrip
- test_folder_scan_service.py (new): prerequisites, NFO repair integration,
  folder rename integration, poster check/download, semaphore values,
  NFO thumb URL extraction, full end-to-end scan flow
- test_scheduler_service.py: scheduler _perform_rescan integration with
  folder_scan_enabled (called when enabled, skipped when disabled, error
  handling and broadcasting), folder_scan_enabled in get_status output
- test_nfo_repair_startup.py: verify perform_nfo_repair_scan is NOT called
  during FastAPI lifespan startup and IS called from FolderScanService

All 90 tests pass.
2026-05-13 09:43:34 +02:00
756731cd5d feat: remove startup NFO repair, update docs and tests
- Remove NFO repair scan step from ARCHITECTURE.md startup sequence
- Update CHANGELOG.md: rephrase perform_nfo_repair_scan as scheduled scan
- Add test verifying perform_nfo_repair_scan is NOT called in lifespan
- Keep existing folder scan wiring tests and unit tests intact
- NFO_GUIDE.md already correctly describes scheduled scan behavior
2026-05-13 09:23:21 +02:00
eb0e6e8ccb fix: task 1.5 poster check + fix stuck tests
- Fix structlog format string in folder_scan_service (%(key)d -> kwargs)
- Add nfo_download_poster setting check before poster download
- Create missing NFO fixture files (tvshow.nfo.bad/good) for repair tests
- Fix test_context_used_in_logging to check all call args not format string
- Fix test_system_settings_integration isolation via reset_all_scans
2026-05-13 08:07:16 +02:00
eb2fc3c5ab feat: integrate NFO repair into scheduled folder scan
- Add FolderScanService.run_folder_scan() calling perform_nfo_repair_scan()
- Remove startup-time NFO repair from fastapi_app lifespan
- Update docs/NFO_GUIDE.md: repair now runs as part of daily scan
- Update tests to verify integration wiring
- Update ARCHITECTURE.md and scheduler_service for scan scheduling
2026-05-12 20:15:32 +02:00
c39ae9d0fc feat(scheduler): add folder_scan_enabled toggle to SchedulerConfig
- Add folder_scan_enabled boolean field (default false) to SchedulerConfig
- Update data/config.json example with new field
- Add checkbox to setup.html and include in JS payload
- Handle field in auth.py setup endpoint
- Expose field in scheduler API response
- Log and return field in scheduler_service.py
- Update docs/CONFIGURATION.md and docs/ARCHITECTURE.md
- Update index.html UI, app.js and scheduler-config.js handlers
- Verified backward compatibility: old configs load with default False
2026-05-11 21:02:05 +02:00
079f1f99e3 backup 2026-04-19 19:00:05 +02:00
9373f500d3 Commit remaining tracked changes 2026-04-19 18:57:26 +02:00
2274403899 Fix NFO plot fallback by using en-US search overview when German result is empty 2026-04-19 18:53:11 +02:00
6ad14c03b5 Task 3: remove non-reentrant TMDB context in NFOService and mark task done 2026-04-19 18:49:21 +02:00
b10cce0489 Task 2: guard SeriesApp NFOService init on NFOServiceFactory fallback and document config-only TMDB API key support 2026-04-19 18:46:30 +02:00
2aa184c870 Mark Task 1 done for NFOService per-task isolation 2026-04-19 18:41:04 +02:00
92bd55ada1 chore: apply pending code updates 2026-03-17 11:39:27 +01:00
e5fae0a0a2 docs: add logging instruction reference to tasks 2026-03-17 11:38:57 +01:00
151a08e033 fix: support missing/no-episodes library filters (API, UI, docs, tests) 2026-03-16 21:01:59 +01:00
e44a8190d0 chore: update gitignore and vscode settings for venv, clean up instructions 2026-03-14 09:37:30 +01:00
94720f2d61 fix: use worker_tasks list instead of non-existent worker_task attribute 2026-03-14 09:33:32 +01:00
0ec120e08f fix: reset queue progress flag after queue completes
- Reset _queue_progress_initialized after each queue run so the next
  run re-creates the 'download_queue' progress entry
- Handle 'already exists' ProgressServiceError in _init_queue_progress
  as a no-op success to cover concurrent-start edge cases
- Guard stop_downloads() progress update to avoid crashing when the
  entry was never created
2026-03-11 16:41:12 +01:00
db58ea9396 docs: mark NFO plot fix task as done 2026-03-06 21:20:37 +01:00
69b409f42d fix: ensure all NFO properties are written on creation
- Add showtitle and namedseason to mapper output
- Add multi-language fallback (en-US, ja-JP) for empty overview
- Use search result overview as last resort fallback
- Add tests for new NFO creation behavior
2026-03-06 21:20:17 +01:00
b34ee59bca fix: remove missing episode from DB and memory after download completes
- Fixed _remove_episode_from_missing_list to also update in-memory
  Serie.episodeDict and refresh series_list
- Added _remove_episode_from_memory helper method
- Enhanced logging for download completion and episode removal
- Added 5 unit tests for missing episode removal
2026-02-26 21:02:08 +01:00
624c0db16e Remove per-card NFO action buttons; add bulk NFO refresh for selected 2026-02-26 20:52:21 +01:00
e6d9f9f342 feat: add English fallback for empty German TMDB overview in NFO creation
When TMDB returns an empty German (de-DE) overview for anime (e.g.
Basilisk), the NFO plot tag was missing. Now both create and update
paths call _enrich_details_with_fallback() which fetches the English
(en-US) overview as a fallback.

Additionally, the <plot> XML element is always written (even when
empty) via the always_write parameter on _add_element(), ensuring
consistent NFO structure regardless of creation path.

Changes:
- nfo_service.py: add _enrich_details_with_fallback() method, call it
  in create_tvshow_nfo and update_tvshow_nfo
- nfo_generator.py: add always_write param to _add_element(), use it
  for <plot> tag
- test_nfo_service.py: add TestEnrichDetailsWithFallback with 4 tests
2026-02-26 20:48:47 +01:00
fc8cdc538d fix(docker): add missing Python deps, fix VPN routing and healthcheck
- Add missing packages to requirements.txt: requests, beautifulsoup4,
  fake-useragent, yt-dlp, urllib3
- Fix entrypoint.sh: replace grep -oP (GNU) with awk (BusyBox compat)
- Fix entrypoint.sh: add policy routing so LAN clients get responses
  via eth0 instead of through the WireGuard tunnel
- Change healthcheck from ping to curl (VPN provider blocks ICMP)
- Add start_period and increase retries for healthcheck
- Change external port mapping to 2000:8000
- Add podman-compose.prod.yml and push.sh to version control
2026-02-24 19:21:53 +01:00
182 changed files with 22188 additions and 4223 deletions

View File

@@ -17,6 +17,9 @@ __pycache__/
# Docker files (not needed inside the image)
Docker/
# Exception: VERSION is needed by Dockerfile.app
!Docker/VERSION
# Test and dev files
tests/
Temp/

4
.gitignore vendored
View File

@@ -4,6 +4,7 @@
/src/__pycache__/*
/src/__pycache__/
/.vs/*
/.venv/*
/src/Temp/*
/src/Loaders/__pycache__/*
/src/Loaders/provider/__pycache__/*
@@ -81,4 +82,5 @@ Temp/
temp/
tmp/
*.tmp
.coverage
.coverage
.venv/bin/dotenv

View File

@@ -1,8 +1,11 @@
{
"python.defaultInterpreterPath": "C:\\Users\\lukas\\anaconda3\\envs\\AniWorld\\python.exe",
"python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python",
"python.terminal.activateEnvironment": true,
"python.condaPath": "C:\\Users\\lukas\\anaconda3\\Scripts\\conda.exe",
"python.terminal.activateEnvInCurrentTerminal": true,
"terminal.integrated.env.linux": {
"VIRTUAL_ENV": "${workspaceFolder}/.venv",
"PATH": "${workspaceFolder}/.venv/bin:${env:PATH}"
},
"python.linting.enabled": true,
"python.linting.flake8Enabled": true,
"python.linting.pylintEnabled": true,

View File

@@ -13,12 +13,13 @@ RUN apk add --no-cache \
# Create wireguard config directory (config is mounted at runtime)
RUN mkdir -p /etc/wireguard
# Copy entrypoint
# Copy version file and entrypoint
COPY VERSION /etc/wireguard/VERSION
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
# Health check: can we reach the internet through the VPN?
HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
CMD ping -c 1 -W 5 1.1.1.1 || exit 1
HEALTHCHECK --interval=30s --timeout=10s --retries=5 \
CMD curl -sf --max-time 5 http://1.1.1.1 || exit 1
ENTRYPOINT ["/entrypoint.sh"]

View File

@@ -2,12 +2,13 @@ FROM python:3.12-slim
WORKDIR /app
# Install system dependencies for compiled Python packages
# Install system dependencies for compiled Python packages and ffmpeg for HLS support
RUN apt-get update && \
apt-get install -y --no-install-recommends \
gcc \
g++ \
libffi-dev \
ffmpeg \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies (cached layer)
@@ -19,6 +20,7 @@ COPY src/ ./src/
COPY run_server.py .
COPY pyproject.toml .
COPY data/config.json ./data/config.json
COPY Docker/VERSION ./Docker/VERSION
# Create runtime directories
RUN mkdir -p /app/data/config_backups /app/logs

1
Docker/VERSION Normal file
View File

@@ -0,0 +1 @@
v1.3.6

View File

@@ -1,6 +1,14 @@
#!/bin/bash
set -e
VERSION_FILE="/etc/wireguard/VERSION"
if [ -f "$VERSION_FILE" ]; then
VERSION=$(cat "$VERSION_FILE")
else
VERSION="unknown"
fi
echo "[init] VPN Container Entrypoint ${VERSION}"
INTERFACE="wg0"
MOUNT_CONFIG="/etc/wireguard/${INTERFACE}.conf"
CONFIG_DIR="/run/wireguard"
@@ -64,9 +72,11 @@ setup_killswitch() {
iptables -A INPUT -i "$INTERFACE" -j ACCEPT
iptables -A OUTPUT -o "$INTERFACE" -j ACCEPT
# Allow DNS to the VPN DNS server (through wg0)
iptables -A OUTPUT -o "$INTERFACE" -p udp --dport 53 -j ACCEPT
iptables -A OUTPUT -o "$INTERFACE" -p tcp --dport 53 -j ACCEPT
# Allow DNS (VPN DNS servers are routed through wg0; allow before routing decision)
iptables -A OUTPUT -p udp --dport 53 -j ACCEPT
iptables -A OUTPUT -p tcp --dport 53 -j ACCEPT
iptables -A INPUT -p udp --sport 53 -j ACCEPT
iptables -A INPUT -p tcp --sport 53 -j ACCEPT
# Allow DHCP (for container networking)
iptables -A OUTPUT -p udp --dport 67:68 -j ACCEPT
@@ -101,7 +111,9 @@ setup_killswitch() {
# ──────────────────────────────────────────────
enable_forwarding() {
echo "[init] Enabling IP forwarding..."
if echo 1 > /proc/sys/net/ipv4/ip_forward 2>/dev/null; then
if cat /proc/sys/net/ipv4/ip_forward 2>/dev/null | grep -q 1; then
echo "[init] IP forwarding already enabled."
elif echo 1 > /proc/sys/net/ipv4/ip_forward 2>/dev/null; then
echo "[init] IP forwarding enabled via /proc."
else
echo "[init] /proc read-only — relying on --sysctl net.ipv4.ip_forward=1"
@@ -118,35 +130,91 @@ start_vpn() {
ip link add "$INTERFACE" type wireguard
# Apply the WireGuard config (keys, peer, endpoint)
# Filter out wg-quick directives that wg setconf doesn't understand
wg setconf "$INTERFACE" <(grep -v -i '^\(Address\|DNS\|MTU\|Table\|PreUp\|PostUp\|PreDown\|PostDown\|SaveConfig\)' "$CONFIG_FILE")
# Log public key so it can be verified against the server's peer list
local PUBKEY
PUBKEY=$(wg show "$INTERFACE" public-key 2>/dev/null || echo "unknown")
echo "[vpn] Public key: ${PUBKEY}"
# Assign the address
ip -4 address add "$VPN_ADDRESS" dev "$INTERFACE"
# Set MTU
# Set MTU and bring up
ip link set mtu 1420 up dev "$INTERFACE"
# Find default gateway/interface for the endpoint route
# ── fwmark-based routing (mirrors wg-quick behavior) ──
# WireGuard marks its own encapsulated UDP packets with this fwmark.
# Policy rules then ensure:
# - Normal packets (no mark) → VPN routing table → wg0
# - WireGuard-encapsulated packets (marked) → main table → eth0
local FW_MARK=51820
local FW_TABLE=51820
wg set "$INTERFACE" fwmark "$FW_MARK"
# Remove any auto-created default route on wg0
ip route del default dev "$INTERFACE" 2>/dev/null || true
# VPN routing table: send everything through the tunnel
ip -4 route add default dev "$INTERFACE" table "$FW_TABLE"
# Policy rules:
# 1. Packets NOT marked by WireGuard use the VPN table (→ wg0)
# 2. suppress_prefixlength 0: ignore bare default routes in main table,
# but keep more-specific routes (e.g. LAN, endpoint) working
ip -4 rule add not fwmark "$FW_MARK" table "$FW_TABLE"
ip -4 rule add table main suppress_prefixlength 0
# Find default gateway/interface
DEFAULT_GW=$(ip route | grep '^default' | head -1 | awk '{print $3}')
DEFAULT_IF=$(ip route | grep '^default' | head -1 | awk '{print $5}')
# Route VPN endpoint through the container's default gateway
# ── Policy routing: ensure responses to incoming LAN traffic go back via eth0 ──
if [ -n "$DEFAULT_GW" ] && [ -n "$DEFAULT_IF" ]; then
ip route add "$VPN_ENDPOINT/32" via "$DEFAULT_GW" dev "$DEFAULT_IF" 2>/dev/null || true
# Get the container's eth0 IP address (BusyBox-compatible, no grep -P)
ETH0_IP=$(ip -4 addr show "$DEFAULT_IF" | awk '/inet / {split($2, a, "/"); print a[1]}' | head -1)
ETH0_SUBNET=$(ip -4 route show dev "$DEFAULT_IF" | grep -v default | head -1 | awk '{print $1}')
if [ -n "$ETH0_IP" ] && [ -n "$ETH0_SUBNET" ]; then
echo "[vpn] Setting up policy routing for incoming traffic (${ETH0_IP} on ${DEFAULT_IF})"
ip route add default via "$DEFAULT_GW" dev "$DEFAULT_IF" table 100 2>/dev/null || true
ip route add "$ETH0_SUBNET" dev "$DEFAULT_IF" table 100 2>/dev/null || true
ip rule add from "$ETH0_IP" table 100 priority 100 2>/dev/null || true
echo "[vpn] Policy routing active — incoming connections will be routed back via ${DEFAULT_IF}"
fi
fi
# Route all traffic through the WireGuard tunnel
ip route add 0.0.0.0/1 dev "$INTERFACE"
ip route add 128.0.0.0/1 dev "$INTERFACE"
# Set up DNS
# Set up DNS (handle comma-separated DNS servers)
VPN_DNS=$(grep -i '^DNS' "$CONFIG_FILE" | head -1 | sed 's/.*= *//;s/ //g')
if [ -n "$VPN_DNS" ]; then
echo "nameserver $VPN_DNS" > /etc/resolv.conf
echo "[vpn] DNS set to ${VPN_DNS}"
# Clear resolv.conf and add each DNS server on its own line
> /etc/resolv.conf
for dns in $(echo "$VPN_DNS" | tr ',' ' '); do
echo "nameserver $dns" >> /etc/resolv.conf
# Add explicit route to DNS server through wg0 so it's found in main table
# (suppress_prefixlength 0 ignores default routes but allows host routes)
ip -4 route add "$dns" dev "$INTERFACE" 2>/dev/null || true
done
echo "[vpn] DNS set to: ${VPN_DNS}"
fi
# Add explicit host route for the health-check target so it is picked up by
# the 'lookup main suppress_prefixlength 0' rule (same as DNS servers above).
# Without this, CHECK_HOST falls through to the VPN table default route whose
# source-address selection can be defeated by the priority-100 'from ETH0_IP'
# policy rule, causing pings to bypass wg0 and be dropped by the kill switch.
ip -4 route add "${CHECK_HOST}" dev "$INTERFACE" 2>/dev/null || true
echo "[vpn] Health-check route: ${CHECK_HOST}${INTERFACE}"
echo "[vpn] WireGuard interface ${INTERFACE} is up."
echo "[vpn] Main routes:"
ip route show | sed 's/^/[vpn] /'
echo "[vpn] VPN table ($FW_TABLE):"
ip route show table "$FW_TABLE" 2>/dev/null | sed 's/^/[vpn] /'
echo "[vpn] Policy rules:"
ip rule show | sed 's/^/[vpn] /'
echo "[vpn] WireGuard status:"
wg show "$INTERFACE" 2>/dev/null | sed 's/^/[vpn] /'
}
# ──────────────────────────────────────────────
@@ -154,6 +222,21 @@ start_vpn() {
# ──────────────────────────────────────────────
stop_vpn() {
echo "[vpn] Stopping WireGuard interface ${INTERFACE}..."
local FW_MARK=51820
local FW_TABLE=51820
# Remove fwmark-based policy rules
ip -4 rule del not fwmark "$FW_MARK" table "$FW_TABLE" 2>/dev/null || true
ip -4 rule del table main suppress_prefixlength 0 2>/dev/null || true
# Flush VPN routing table
ip -4 route flush table "$FW_TABLE" 2>/dev/null || true
# Remove LAN policy routing
ip -4 rule del table 100 2>/dev/null || true
ip -4 route flush table 100 2>/dev/null || true
ip link del "$INTERFACE" 2>/dev/null || true
}
@@ -174,9 +257,26 @@ health_loop() {
echo "[health] VPN recovered."
failures=0
fi
# Secondary DNS check
if ping -c 1 -W 5 "google.com" > /dev/null 2>&1; then
: # DNS OK — silent
else
echo "[health] WARN google.com unreachable — possible DNS issue"
fi
else
failures=$((failures + 1))
echo "[health] Ping failed ($failures/$max_failures)"
echo "[health] Check failed ($failures/$max_failures) — ping ${CHECK_HOST} failed"
# Secondary check: distinguish IP failure from DNS failure
if ping -c 1 -W 5 "google.com" > /dev/null 2>&1; then
echo "[health] INFO google.com reachable — DNS works, ${CHECK_HOST} may be filtered"
else
echo "[health] INFO google.com also unreachable — DNS or general routing failure"
fi
# Dump WireGuard stats to show if handshake is stale and how much data flows
echo "[health] wg stats:"
wg show "$INTERFACE" 2>/dev/null | grep -E 'latest handshake|transfer|endpoint' | sed 's/^/[health] /' || echo "[health] wg0 not found"
echo "[health] routes:"
ip route show | grep -E 'wg0|default' | sed 's/^/[health] /'
if [ "$failures" -ge "$max_failures" ]; then
echo "[health] VPN appears down. Restarting WireGuard..."
@@ -205,8 +305,83 @@ cleanup() {
trap cleanup SIGTERM SIGINT
# ──────────────────────────────────────────────
# Startup connectivity checks — diagnose issues early
# ──────────────────────────────────────────────
check_vpn_connectivity() {
echo "[check] ── Startup connectivity checks ──"
# 1. Wait for WireGuard handshake (up to 15s)
local elapsed=0
local handshake_ts=0
echo "[check] Waiting for WireGuard handshake (up to 15s)..."
while [ "$elapsed" -lt 15 ]; do
handshake_ts=$(wg show "$INTERFACE" latest-handshakes 2>/dev/null | awk '{print $2}' | head -1)
if [ -n "$handshake_ts" ] && [ "$handshake_ts" != "0" ]; then
local age=$(( $(date +%s) - handshake_ts ))
echo "[check] OK Handshake established ${age}s ago"
break
fi
sleep 1
elapsed=$((elapsed + 1))
done
if [ "$elapsed" -ge 15 ]; then
echo "[check] FAIL No WireGuard handshake after 15s — tunnel is not up"
echo "[check] This container's public key (must be on the server):"
echo "[check] PublicKey = $(wg show "$INTERFACE" public-key 2>/dev/null || echo 'unknown')"
echo "[check] AllowedIPs = ${VPN_ADDRESS}"
echo "[check] Verify on server: wg show"
fi
# 2. Check whether traffic actually flows through the tunnel
echo "[check] Testing traffic through tunnel (ping ${CHECK_HOST})..."
local rx_before
rx_before=$(wg show "$INTERFACE" transfer 2>/dev/null | awk '{print $2}' | head -1)
if ping -c 1 -W 8 "${CHECK_HOST}" > /dev/null 2>&1; then
echo "[check] OK Traffic flows — tunnel is fully working"
else
local rx_after
rx_after=$(wg show "$INTERFACE" transfer 2>/dev/null | awk '{print $2}' | head -1)
echo "[check] FAIL ping ${CHECK_HOST} unreachable through tunnel"
if [ -n "$rx_before" ] && [ -n "$rx_after" ]; then
if [ "$rx_after" -le "$rx_before" ]; then
echo "[check] RX bytes unchanged (${rx_before}${rx_after})"
echo "[check] Server receives packets but does NOT route them back"
echo "[check] Fix on VPN server (${VPN_ENDPOINT}):"
echo "[check] sysctl net.ipv4.ip_forward # must output 1"
echo "[check] iptables -t nat -L POSTROUTING -v -n # must have MASQUERADE"
echo "[check] wg show # check peer + AllowedIPs"
else
echo "[check] RX increased (${rx_before}${rx_after}) — tunnel passes data"
echo "[check] Issue may be specific to ${CHECK_HOST} or DNS"
fi
fi
local transfer
transfer=$(wg show "$INTERFACE" transfer 2>/dev/null | awk '{printf "rx=%s tx=%s", $2, $3}')
echo "[check] wg transfer: ${transfer}"
fi
# 3. DNS check
echo "[check] Testing DNS resolution..."
if nslookup 1.1.1.1 > /dev/null 2>&1 || nslookup google.com > /dev/null 2>&1; then
echo "[check] OK DNS resolves"
else
echo "[check] FAIL DNS resolution failed"
echo "[check] resolv.conf: $(tr '\n' ' ' < /etc/resolv.conf)"
echo "[check] Check that DNS servers are reachable through wg0"
echo "[check] ── End of checks ──"
exit 1
fi
echo "[check] ── End of checks ──"
}
# ── Main ──
enable_forwarding
setup_killswitch
start_vpn
check_vpn_connectivity
health_loop

View File

@@ -1,17 +1,16 @@
[Interface]
PrivateKey = iO5spIue/6ciwUoR95hYtuxdtQxV/Q9EOoQ/jHe18kM=
Address = 10.2.0.2/32
DNS = 10.2.0.1
PrivateKey = EPRa2f/v72LvIXAY4yqIRJifsSb+nCcYHqC2rwA94UI=
Address = 100.64.244.78/32
DNS = 198.18.0.1,198.18.0.2
# Route zum VPN-Server direkt über dein lokales Netz
PostUp = ip route add 185.183.34.149 via 192.168.178.1 dev wlp4s0f0
PostUp = ip route add 91.148.236.64 via 192.168.178.1 dev wlp4s0f0
PostUp = ip route add 192.168.178.0/24 via 192.168.178.1 dev wlp4s0f0
PostDown = ip route del 185.183.34.149 via 192.168.178.1 dev wlp4s0f0
PostDown = ip route del 91.148.236.64 via 192.168.178.1 dev wlp4s0f0
PostDown = ip route del 192.168.178.0/24 via 192.168.178.1 dev wlp4s0f0
[Peer]
PublicKey = J4XVdtoBVc/EoI2Yk673Oes97WMnQSH5KfamZNjtM2s=
AllowedIPs = 0.0.0.0/1, 128.0.0.0/1
Endpoint = 185.183.34.149:51820
PublicKey = KgTUh3KLijVluDvNpzDCJJfrJ7EyLzYLmdHCksG4sRg=
AllowedIPs = 0.0.0.0/0
Endpoint = 91.148.236.64:51820

View File

@@ -0,0 +1,56 @@
# Production compose — pulls pre-built images from Gitea registry.
#
# Usage:
# podman login git.lpl-mind.de
# podman-compose -f podman-compose.prod.yml pull
# podman-compose -f podman-compose.prod.yml up -d
#
# Required files:
# - wg0.conf (WireGuard configuration in the same directory)
services:
vpn:
image: git.lpl-mind.de/lukas.pupkalipinski/aniworld/vpn:latest
container_name: vpn-wireguard
cap_add:
- NET_ADMIN
- SYS_MODULE
- NET_RAW
sysctls:
- net.ipv4.ip_forward=1
- net.ipv4.conf.all.src_valid_mark=1
volumes:
- /server/server_aniworld/wg0.conf:/etc/wireguard/wg0.conf:ro
- /lib/modules:/lib/modules:ro
ports:
- "8000:8000"
environment:
- HEALTH_CHECK_INTERVAL=10
- HEALTH_CHECK_HOST=1.1.1.1
- LOCAL_PORTS=8000
- PUID=1013
- PGID=1001
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-sf", "--max-time", "5", "http://1.1.1.1"]
interval: 30s
timeout: 10s
retries: 5
start_period: 60s
app:
image: git.lpl-mind.de/lukas.pupkalipinski/aniworld/app:latest
container_name: aniworld-app
network_mode: "service:vpn"
depends_on:
vpn:
condition: service_healthy
environment:
- PYTHONUNBUFFERED=1
- PUID=1013
- PGID=1001
volumes:
- /server/server_aniworld/data:/app/data
- /server/server_aniworld/logs:/app/logs
- /media/serien/Serien:/data
restart: unless-stopped

View File

@@ -7,6 +7,7 @@ services:
cap_add:
- NET_ADMIN
- SYS_MODULE
- NET_RAW
sysctls:
- net.ipv4.ip_forward=1
- net.ipv4.conf.all.src_valid_mark=1

140
Docker/push.sh Normal file
View File

@@ -0,0 +1,140 @@
#!/usr/bin/env bash
#
# Build and push AniWorld container images to the Gitea registry.
#
# Usage:
# ./push.sh # builds & pushes app with tag "latest"
# ./push.sh app # builds & pushes app image
# ./push.sh vpn # builds & pushes vpn image
# ./push.sh all # builds & pushes both images
# ./push.sh app v1.2.3 # builds & pushes app with tag "v1.2.3"
# ./push.sh vpn v1.2.3 # builds & pushes vpn with tag "v1.2.3"
# ./push.sh all v1.2.3 # builds & pushes both images
# ./push.sh app v1.2.3 --no-build # pushes existing image only
#
# Prerequisites:
# podman login git.lpl-mind.de (or: docker login git.lpl-mind.de)
set -euo pipefail
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
REGISTRY="git.lpl-mind.de"
NAMESPACE="lukas.pupkalipinski"
PROJECT="aniworld"
APP_IMAGE="${REGISTRY}/${NAMESPACE}/${PROJECT}/app"
VPN_IMAGE="${REGISTRY}/${NAMESPACE}/${PROJECT}/vpn"
# Parse arguments
TARGET="${1:-app}"
TAG="${2:-latest}"
SKIP_BUILD=false
if [[ "${3:-}" == "--no-build" ]]; then
SKIP_BUILD=true
fi
# Validate target
if [[ "${TARGET}" != "app" && "${TARGET}" != "vpn" && "${TARGET}" != "all" ]]; then
echo "ERROR: Invalid target '${TARGET}'. Must be one of: app, vpn, all" >&2
exit 1
fi
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)"
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
log() { echo -e "\n>>> $*"; }
err() { echo -e "\nERROR: $*" >&2; exit 1; }
# Detect container engine (podman preferred, docker fallback)
if command -v podman &>/dev/null; then
ENGINE="podman"
elif command -v docker &>/dev/null; then
ENGINE="docker"
else
err "Neither podman nor docker is installed."
fi
# ---------------------------------------------------------------------------
# Pre-flight checks
# ---------------------------------------------------------------------------
echo "============================================"
echo " AniWorld — Build & Push"
echo " Engine : ${ENGINE}"
echo " Registry : ${REGISTRY}"
echo " Target : ${TARGET}"
echo " Tag : ${TAG}"
echo "============================================"
log "Logging in to ${REGISTRY}"
"${ENGINE}" login "${REGISTRY}"
# ---------------------------------------------------------------------------
# Build
# ---------------------------------------------------------------------------
build_app() {
log "Building app image → ${APP_IMAGE}:${TAG}"
"${ENGINE}" build \
-t "${APP_IMAGE}:${TAG}" \
-f "${SCRIPT_DIR}/Dockerfile.app" \
"${PROJECT_ROOT}"
}
build_vpn() {
log "Building vpn image → ${VPN_IMAGE}:${TAG}"
"${ENGINE}" build \
-t "${VPN_IMAGE}:${TAG}" \
-f "${SCRIPT_DIR}/Containerfile" \
"${SCRIPT_DIR}"
}
if [[ "${SKIP_BUILD}" == false ]]; then
case "${TARGET}" in
app) build_app ;;
vpn) build_vpn ;;
all) build_app; build_vpn ;;
esac
fi
# ---------------------------------------------------------------------------
# Push
# ---------------------------------------------------------------------------
push_app() {
log "Pushing ${APP_IMAGE}:${TAG}"
"${ENGINE}" push "${APP_IMAGE}:${TAG}"
}
push_vpn() {
log "Pushing ${VPN_IMAGE}:${TAG}"
"${ENGINE}" push "${VPN_IMAGE}:${TAG}"
}
case "${TARGET}" in
app) push_app ;;
vpn) push_vpn ;;
all) push_app; push_vpn ;;
esac
# ---------------------------------------------------------------------------
# Summary
# ---------------------------------------------------------------------------
echo ""
echo "============================================"
echo " Push complete!"
echo ""
echo " Images:"
case "${TARGET}" in
app) echo " ${APP_IMAGE}:${TAG}" ;;
vpn) echo " ${VPN_IMAGE}:${TAG}" ;;
all) echo " ${APP_IMAGE}:${TAG}"; echo " ${VPN_IMAGE}:${TAG}" ;;
esac
echo ""
echo " Deploy on server:"
echo " ${ENGINE} login ${REGISTRY}"
echo " ${ENGINE} compose -f Docker/podman-compose.prod.yml pull"
echo " ${ENGINE} compose -f Docker/podman-compose.prod.yml up -d"
echo "============================================"

129
Docker/release.sh Normal file
View File

@@ -0,0 +1,129 @@
#!/usr/bin/env bash
#
# Bump the project version and push images to the registry.
#
# Usage:
# ./release.sh
#
# The current version is stored in VERSION (next to this script).
# You will be asked whether to bump major, minor, or patch.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
VERSION_FILE="${SCRIPT_DIR}/VERSION"
# ---------------------------------------------------------------------------
# Read current version
# ---------------------------------------------------------------------------
if [[ ! -f "${VERSION_FILE}" ]]; then
echo "0.0.0" > "${VERSION_FILE}"
fi
CURRENT="$(cat "${VERSION_FILE}")"
# Strip leading 'v' for arithmetic
VERSION="${CURRENT#v}"
IFS='.' read -r MAJOR MINOR PATCH <<< "${VERSION}"
echo "============================================"
echo " AniWorld — Release"
echo " Current version: v${MAJOR}.${MINOR}.${PATCH}"
echo "============================================"
echo ""
echo "Which image(s) would you like to release?"
echo " 1) app (Dockerfile.app)"
echo " 2) vpn (Containerfile)"
echo " 3) all (both images)"
echo ""
read -rp "Enter choice [1/2/3]: " TARGET_CHOICE
case "${TARGET_CHOICE}" in
1) TARGET="app" ;;
2) TARGET="vpn" ;;
3) TARGET="all" ;;
*)
echo "Invalid choice. Aborting." >&2
exit 1
;;
esac
echo ""
echo "How would you like to bump the version?"
echo " 1) patch (v${MAJOR}.${MINOR}.${PATCH} → v${MAJOR}.${MINOR}.$((PATCH + 1)))"
echo " 2) minor (v${MAJOR}.${MINOR}.${PATCH} → v${MAJOR}.$((MINOR + 1)).0)"
echo " 3) major (v${MAJOR}.${MINOR}.${PATCH} → v$((MAJOR + 1)).0.0)"
echo ""
read -rp "Enter choice [1/2/3]: " CHOICE
case "${CHOICE}" in
1) NEW_TAG="v${MAJOR}.${MINOR}.$((PATCH + 1))" ;;
2) NEW_TAG="v${MAJOR}.$((MINOR + 1)).0" ;;
3) NEW_TAG="v$((MAJOR + 1)).0.0" ;;
*)
echo "Invalid choice. Aborting." >&2
exit 1
;;
esac
echo ""
echo "New version: ${NEW_TAG}"
echo "Target: ${TARGET}"
read -rp "Confirm? [y/N]: " CONFIRM
if [[ ! "${CONFIRM}" =~ ^[yY]$ ]]; then
echo "Aborted."
exit 0
fi
# ---------------------------------------------------------------------------
# Write new version
# ---------------------------------------------------------------------------
echo "${NEW_TAG}" > "${VERSION_FILE}"
echo "Version file updated → ${VERSION_FILE}"
# Keep root package.json in sync.
FRONT_VERSION="${NEW_TAG#v}"
FRONT_PKG="${SCRIPT_DIR}/../package.json"
if [[ -f "${FRONT_PKG}" ]]; then
sed -i "s/\"version\": \"[^\"]*\"/\"version\": \"${FRONT_VERSION}\"/" "${FRONT_PKG}"
echo "package.json version updated → ${FRONT_VERSION}"
else
echo "Warning: package.json not found, skipping package.json version sync" >&2
fi
# Keep root pyproject.toml in sync.
BACKEND_PYPROJECT="${SCRIPT_DIR}/../pyproject.toml"
if [[ -f "${BACKEND_PYPROJECT}" ]]; then
# Update version under [project] section if present
if grep -q '^\[project\]' "${BACKEND_PYPROJECT}"; then
sed -i "/^\[project\]/,/^\[/ s/^version = \".*\"/version = \"${FRONT_VERSION}\"/" "${BACKEND_PYPROJECT}"
else
sed -i "s/^version = \".*\"/version = \"${FRONT_VERSION}\"/" "${BACKEND_PYPROJECT}"
fi
echo "pyproject.toml version updated → ${FRONT_VERSION}"
else
echo "Warning: pyproject.toml not found, skipping pyproject.toml version sync" >&2
fi
# ---------------------------------------------------------------------------
# Push containers
# ---------------------------------------------------------------------------
bash "${SCRIPT_DIR}/push.sh" "${TARGET}" "${NEW_TAG}"
bash "${SCRIPT_DIR}/push.sh" "${TARGET}"
# ---------------------------------------------------------------------------
# Git tag (local only; push after container build)
# ---------------------------------------------------------------------------
cd "${SCRIPT_DIR}/.."
git add Docker/VERSION package.json pyproject.toml
git commit -m "chore: bump version"
git tag -a "${NEW_TAG}" -m "Release ${NEW_TAG}"
echo "Local git commit + tag ${NEW_TAG} created."
# ---------------------------------------------------------------------------
# Push git commits & tag
# ---------------------------------------------------------------------------
git push origin HEAD
git push origin "${NEW_TAG}"
echo "Git commit and tag ${NEW_TAG} pushed."

View File

@@ -6,22 +6,31 @@ Verifies:
2. The container starts and becomes healthy.
3. The public IP inside the VPN differs from the host IP.
4. Kill switch blocks traffic when WireGuard is down.
5. AllowedIPs routes are set dynamically from the config.
Requirements:
- podman installed
- Root/sudo (NET_ADMIN capability)
- Root/sudo (NET_ADMIN capability) for container runtime tests
- A valid WireGuard config at ./wg0.conf (or ./nl.conf)
Usage:
# Build-only test (no sudo needed):
python3 -m pytest test_vpn.py::TestVPNImage::test_00_build_image -v
# Full integration test (requires sudo):
sudo python3 -m pytest test_vpn.py -v
# or
sudo python3 test_vpn.py
"""
import logging
import os
import subprocess
import sys
import time
import unittest
import os
logger = logging.getLogger(__name__)
IMAGE_NAME = "vpn-wireguard-test"
CONTAINER_NAME = "vpn-test-container"
@@ -32,6 +41,11 @@ STARTUP_TIMEOUT = 30 # seconds to wait for VPN to come up
HEALTH_POLL_INTERVAL = 2 # seconds between health checks
def is_root() -> bool:
"""Check if running as root."""
return os.geteuid() == 0
def run(cmd: list[str], timeout: int = 30, check: bool = True) -> subprocess.CompletedProcess:
"""Run a command and return the result."""
return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout, check=check)
@@ -52,6 +66,7 @@ class TestVPNImage(unittest.TestCase):
"""Test suite for the WireGuard VPN container."""
host_ip: str = ""
container_id: str = ""
@classmethod
def setUpClass(cls):
@@ -63,23 +78,32 @@ class TestVPNImage(unittest.TestCase):
)
# ── 1. Get host public IP before VPN ──
print("\n[setup] Fetching host public IP...")
logger.info("Fetching host public IP...")
cls.host_ip = get_host_ip()
print(f"[setup] Host public IP: {cls.host_ip}")
logger.info("Host public IP: %s", cls.host_ip)
assert cls.host_ip, "Could not determine host public IP"
# ── 2. Build the image ──
print(f"[setup] Building image '{IMAGE_NAME}'...")
logger.info("Building image '%s'...", IMAGE_NAME)
result = run(
["podman", "build", "-t", IMAGE_NAME, BUILD_DIR],
timeout=180,
)
print(result.stdout[-500:] if len(result.stdout) > 500 else result.stdout)
logger.debug(
"Build output: %s",
result.stdout[-500:] if len(result.stdout) > 500 else result.stdout,
)
assert result.returncode == 0, f"Build failed:\n{result.stderr}"
print("[setup] Image built successfully.")
logger.info("Image built successfully.")
# Skip container runtime tests if not root
if not is_root():
logger.warning("Not running as root — skipping container runtime tests.")
cls.container_id = ""
return
# ── 3. Start the container ──
print(f"[setup] Starting container '{CONTAINER_NAME}'...")
logger.info("Starting container '%s'...", CONTAINER_NAME)
result = run(
[
"podman", "run", "-d",
@@ -96,7 +120,7 @@ class TestVPNImage(unittest.TestCase):
)
assert result.returncode == 0, f"Container failed to start:\n{result.stderr}"
cls.container_id = result.stdout.strip()
print(f"[setup] Container started: {cls.container_id[:12]}")
logger.info("Container started: %s", cls.container_id[:12])
# Verify it's running
inspect = run(
@@ -106,17 +130,19 @@ class TestVPNImage(unittest.TestCase):
assert inspect.stdout.strip() == "true", "Container is not running"
# ── 4. Wait for VPN to come up ──
print(f"[setup] Waiting up to {STARTUP_TIMEOUT}s for VPN tunnel...")
logger.info("Waiting up to %d seconds for VPN tunnel...", STARTUP_TIMEOUT)
vpn_up = cls._wait_for_vpn_cls(STARTUP_TIMEOUT)
assert vpn_up, f"VPN did not come up within {STARTUP_TIMEOUT}s"
print("[setup] VPN tunnel is up. Running tests.\n")
logger.info("VPN tunnel is up. Running tests.")
@classmethod
def tearDownClass(cls):
"""Stop and remove the container."""
print("\n[teardown] Cleaning up...")
if not is_root():
return
logger.info("Cleaning up test container...")
subprocess.run(["podman", "rm", "-f", CONTAINER_NAME], capture_output=True, check=False)
print("[teardown] Done.")
logger.info("Cleanup complete.")
@classmethod
def _wait_for_vpn_cls(cls, timeout: int = STARTUP_TIMEOUT) -> bool:
@@ -138,13 +164,25 @@ class TestVPNImage(unittest.TestCase):
)
return result.stdout.strip()
def _skip_if_not_root(self):
"""Skip test if not running as root."""
if not is_root():
self.skipTest("This test requires root/sudo privileges")
# ── Tests ────────────────────────────────────────────────
def test_00_build_image(self):
"""The image builds successfully."""
# This is already verified in setUpClass, just confirm here
result = run(["podman", "images", "--format", "{{.Repository}}:{{.Tag}}"])
self.assertIn(IMAGE_NAME, result.stdout, "Image was not built")
def test_01_ip_differs_from_host(self):
"""Public IP inside VPN is different from host IP."""
self._skip_if_not_root()
vpn_ip = self._get_vpn_ip()
print(f"\n[test] VPN public IP: {vpn_ip}")
print(f"[test] Host public IP: {self.host_ip}")
logger.info("VPN public IP: %s", vpn_ip)
logger.info("Host public IP: %s", self.host_ip)
self.assertTrue(vpn_ip, "Could not fetch IP from inside the container")
self.assertNotEqual(
@@ -155,12 +193,42 @@ class TestVPNImage(unittest.TestCase):
def test_02_wireguard_interface_exists(self):
"""The wg0 interface is present in the container."""
self._skip_if_not_root()
result = podman_exec(CONTAINER_NAME, ["wg", "show", "wg0"])
self.assertEqual(result.returncode, 0, f"wg show failed:\n{result.stderr}")
self.assertIn("peer", result.stdout.lower(), "No peer information in wg show output")
# AllowedIPs should be present in wg show output
self.assertIn("allowed ips", result.stdout.lower(), "AllowedIPs not found in wg show output")
def test_03_kill_switch_blocks_traffic(self):
def test_03_allowedips_routes_set(self):
"""Routes are set dynamically based on AllowedIPs from config."""
self._skip_if_not_root()
# Check that routes exist for the AllowedIPs
result = podman_exec(CONTAINER_NAME, ["ip", "route", "show", "dev", "wg0"])
self.assertEqual(result.returncode, 0, f"ip route show failed:\n{result.stderr}")
# The config has AllowedIPs = 0.0.0.0/0, which should result in:
# 0.0.0.0/1 dev wg0 and 128.0.0.0/1 dev wg0
self.assertIn("0.0.0.0/1", result.stdout, "Route 0.0.0.0/1 not found")
self.assertIn("128.0.0.0/1", result.stdout, "Route 128.0.0.0/1 not found")
# Make sure there is NO default route through wg0 (Table = off should prevent this)
self.assertNotIn("default dev wg0", result.stdout, "Default route through wg0 found — Table = off not working!")
logger.info("AllowedIPs routes verified: %s", result.stdout.strip())
def test_03b_dns_configured(self):
"""DNS is configured correctly with multiple nameserver lines."""
self._skip_if_not_root()
result = podman_exec(CONTAINER_NAME, ["cat", "/etc/resolv.conf"])
self.assertEqual(result.returncode, 0, f"cat /etc/resolv.conf failed:\n{result.stderr}")
# Should have two separate nameserver lines, not one with commas
self.assertIn("nameserver 198.18.0.1", result.stdout, "DNS 198.18.0.1 not found")
self.assertIn("nameserver 198.18.0.2", result.stdout, "DNS 198.18.0.2 not found")
# Make sure there are no commas in nameserver lines
self.assertNotIn("nameserver 198.18.0.1,198.18.0.2", result.stdout, "DNS servers written on one line with comma!")
logger.info("DNS config verified: %s", result.stdout.strip())
def test_04_kill_switch_blocks_traffic(self):
"""When WireGuard is down, traffic is blocked (kill switch)."""
self._skip_if_not_root()
# Bring down the WireGuard interface by deleting it
down_result = podman_exec(CONTAINER_NAME, ["ip", "link", "del", "wg0"], timeout=10)
self.assertEqual(down_result.returncode, 0, f"ip link del wg0 failed:\n{down_result.stderr}")
@@ -178,7 +246,7 @@ class TestVPNImage(unittest.TestCase):
result.returncode, 0,
"Traffic went through even with WireGuard down — kill switch is NOT working!",
)
print("\n[test] Kill switch confirmed: traffic blocked with VPN down")
logger.info("Kill switch confirmed: traffic blocked with VPN down")
if __name__ == "__main__":

View File

@@ -1,10 +1,18 @@
[Interface]
PrivateKey = iO5spIue/6ciwUoR95hYtuxdtQxV/Q9EOoQ/jHe18kM=
Address = 10.2.0.2/32
DNS = 10.2.0.1
PrivateKey = EPRa2f/v72LvIXAY4yqIRJifsSb+nCcYHqC2rwA94UI=
Address = 100.64.244.78/32
#DNS = 198.18.0.1,198.18.0.2
DNS = 8.8.8.8
# Route zum VPN-Server direkt über dein lokales Netz
PostUp = ip route add 91.148.236.64 via 192.168.178.1 dev wlp4s0f0
PostUp = ip route add 192.168.178.0/24 via 192.168.178.1 dev wlp4s0f0
PostDown = ip route del 91.148.236.64 via 192.168.178.1 dev wlp4s0f0
PostDown = ip route del 192.168.178.0/24 via 192.168.178.1 dev wlp4s0f0
[Peer]
PublicKey = J4XVdtoBVc/EoI2Yk673Oes97WMnQSH5KfamZNjtM2s=
PublicKey = KgTUh3KLijVluDvNpzDCJJfrJ7EyLzYLmdHCksG4sRg=
AllowedIPs = 0.0.0.0/0
Endpoint = 185.183.34.149:51820
Endpoint = 91.148.236.64:51820
PersistentKeepalive = 25

View File

@@ -203,14 +203,14 @@ List library series that have missing episodes.
| `page` | int | 1 | Page number (must be positive) |
| `per_page` | int | 20 | Items per page (max 1000) |
| `sort_by` | string | null | Sort field: `title`, `id`, `name`, `missing_episodes` |
| `filter` | string | null | Filter: `no_episodes` (shows only series with missing episodes - episodes in DB that haven't been downloaded yet) |
| `filter` | string | null | Filter: `missing_episodes` (shows series with any missing episodes), `no_episodes` (shows series with zero downloaded episodes) |
**Filter Details:**
- `no_episodes`: Returns series that have at least one episode in the database with `is_downloaded=False`
- `missing_episodes`: Returns series that have at least one missing episode recorded in the database (`is_downloaded=False`)
- `no_episodes`: Returns series that have missing episodes and no downloaded episodes (i.e., only missing episodes exist in the database)
- Episodes in the database represent MISSING episodes (from episodeDict during scanning)
- `is_downloaded=False` means the episode file was not found in the folder
- This effectively shows series where no video files were found for missing episodes
**Response (200 OK):**
@@ -1285,7 +1285,7 @@ Basic health check endpoint.
{
"status": "healthy",
"timestamp": "2025-12-13T10:30:00.000Z",
"version": "1.0.0"
"version": "1.0.1"
}
```
@@ -1303,7 +1303,7 @@ Comprehensive health check with database, filesystem, and system metrics.
{
"status": "healthy",
"timestamp": "2025-12-13T10:30:00.000Z",
"version": "1.0.0",
"version": "1.0.1",
"dependencies": {
"database": {
"status": "healthy",

View File

@@ -81,6 +81,7 @@ src/server/
| +-- websocket_service.py# WebSocket broadcasting
| +-- queue_repository.py # Database persistence
| +-- nfo_service.py # NFO metadata management
| +-- folder_scan_service.py # Daily folder maintenance scan
+-- models/ # Pydantic models
| +-- auth.py # Auth request/response models
| +-- config.py # Configuration models
@@ -290,8 +291,9 @@ The FastAPI lifespan function (`src/server/fastapi_app.py`) runs the following s
8. Background loader service started
9. Scheduler service started
10. NFO repair scan (queue incomplete tvshow.nfo files for background reload)
+-- Cron-based library rescans configured
+-- Optional: auto-download missing episodes after rescan
+-- Optional: folder maintenance (NFO repair, key resolution, renaming, poster checks) during scheduled runs
```
### 12.2 Temp Folder Guarantee

View File

@@ -41,6 +41,15 @@ This changelog follows [Keep a Changelog](https://keepachangelog.com/) principle
### Added
- **Encoding detection for HTML parsing** (`src/core/providers/aniworld_provider.py`):
Added `_decode_html_content()` function that uses `chardet` to detect the actual
encoding of HTML content before parsing. Falls back to UTF-8 with `errors='replace'`
to handle pages with mismatched encoding declarations. Applied to all BeautifulSoup
parsing calls to prevent "Some characters could not be decoded" warnings.
- **chardet dependency**: Added `chardet>=5.2.0` to `requirements.txt` for encoding detection.
### Added
- **Temp file cleanup after every download** (`src/core/providers/aniworld_provider.py`,
`src/core/providers/enhanced_provider.py`): Module-level helper
`_cleanup_temp_file()` removes the working temp file and any yt-dlp `.part`
@@ -73,17 +82,16 @@ This changelog follows [Keep a Changelog](https://keepachangelog.com/) principle
that detects incomplete `tvshow.nfo` files and triggers TMDB re-fetch.
Provides `parse_nfo_tags()`, `find_missing_tags()`, `nfo_needs_repair()`, and
`NfoRepairService.repair_series()`. 13 required tags are checked.
- **`perform_nfo_repair_scan()` startup hook
(`src/server/services/initialization_service.py`)**: New async function
called during application startup. Iterates every series directory, checks
whether `tvshow.nfo` is missing required tags using `nfo_needs_repair()`, and
either queues the series for background reload (when a `background_loader` is
provided) or calls `NfoRepairService.repair_series()` directly. Skips
gracefully when `tmdb_api_key` or `anime_directory` is not configured.
- **NFO repair wired into startup lifespan (`src/server/fastapi_app.py`)**:
`perform_nfo_repair_scan(background_loader)` is called at the end of the
FastAPI lifespan startup, after `perform_media_scan_if_needed`, ensuring
every existing series NFO is checked and repaired on each server start.
- **`perform_nfo_repair_scan()`
(`src/server/services/folder_scan_service.py`)**: New async function
that iterates every series directory, checks whether `tvshow.nfo` is missing
required tags using `nfo_needs_repair()`, and queues the series for background
reload via `asyncio.create_task`. Skips gracefully when `tmdb_api_key` or
`anime_directory` is not configured.
- **NFO repair wired into scheduled folder scan (`src/server/services/folder_scan_service.py`)**:
`perform_nfo_repair_scan(background_loader=None)` is called during the
scheduled daily folder scan, keeping startup fast while ensuring regular
maintenance.
### Changed
@@ -131,6 +139,10 @@ This changelog follows [Keep a Changelog](https://keepachangelog.com/) principle
- Modified `src/server/api/anime.py` to save scanned episodes to database
- Episodes table properly tracks missing episodes with automatic cleanup
### Deprecated
- **Legacy Series Files (key/data)**: File-based series storage is deprecated. `key` and `data` files in anime folders will be removed in v3.0.0. Database storage is now the primary method. See [docs/MIGRATION_GUIDE.md](docs/MIGRATION_GUIDE.md) for details.
---
## Sections for Each Release

View File

@@ -117,7 +117,8 @@ Location: `data/config.json`
"interval_minutes": 60,
"schedule_time": "03:00",
"schedule_days": ["mon", "tue", "wed", "thu", "fri", "sat", "sun"],
"auto_download_after_rescan": false
"auto_download_after_rescan": false,
"folder_scan_enabled": false
},
"logging": {
"level": "INFO",
@@ -143,7 +144,7 @@ Location: `data/config.json`
"master_password_hash": "$pbkdf2-sha256$...",
"anime_directory": "/path/to/anime"
},
"version": "1.0.0"
"version": "1.0.1"
}
```
@@ -173,6 +174,7 @@ Controls automatic cron-based library rescanning (powered by APScheduler).
| `scheduler.schedule_time` | string | `"03:00"` | Daily run time in 24-h `HH:MM` format. |
| `scheduler.schedule_days` | list[string] | `["mon","tue","wed","thu","fri","sat","sun"]` | Days of the week to run the scan. Empty list disables the cron job. |
| `scheduler.auto_download_after_rescan` | bool | `false` | Automatically queue missing episodes for download after each rescan. |
| `scheduler.folder_scan_enabled` | bool | `false` | Run folder maintenance (NFO repair, folder renaming, poster checks) during scheduled runs. **When enabled, series folders are automatically renamed to match the `<title> (<year>)` convention derived from their `tvshow.nfo` files.** |
Valid day abbreviations: `mon`, `tue`, `wed`, `thu`, `fri`, `sat`, `sun`.
@@ -216,7 +218,9 @@ Source: [src/server/models/config.py](../src/server/models/config.py#L15-L24)
- Obtain a TMDB API key from https://www.themoviedb.org/settings/api
- `auto_create` creates NFO files during the download process
- `update_on_scan` refreshes metadata when scanning existing anime
- `download_poster` also controls whether the scheduled folder scan checks for and re-downloads missing or corrupted `poster.jpg` files (see [NFO_GUIDE.md](NFO_GUIDE.md#6-poster-check))
- Image downloads require valid `tmdb_api_key`
- `TMDB_API_KEY` environment variable is optional when `nfo.tmdb_api_key` is configured in `data/config.json`
- Larger image sizes (`w780`, `original`) consume more storage space
Source: [src/server/models/config.py](../src/server/models/config.py#L109-L132)

View File

@@ -83,17 +83,23 @@ Source: [src/server/database/models.py](../src/server/database/models.py), [src/
### 3.2 anime_series
Stores anime series metadata.
Stores anime series metadata. Corresponds to the core `Serie` class.
| Column | Type | Constraints | Description |
| ------------ | ------------- | -------------------------- | ------------------------------------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID |
| `key` | VARCHAR(255) | UNIQUE, NOT NULL, INDEX | **Primary identifier** - provider-assigned URL-safe key |
| `name` | VARCHAR(500) | NOT NULL, INDEX | Display name of the series |
| `site` | VARCHAR(500) | NOT NULL | Provider site URL |
| `folder` | VARCHAR(1000) | NOT NULL | Filesystem folder name (metadata only) |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
| Column | Type | Constraints | Description |
| ---------------- | ------------- | -------------------------- | ------------------------------------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID |
| `key` | VARCHAR(255) | UNIQUE, NOT NULL, INDEX | **Primary identifier** - provider-assigned URL-safe key |
| `name` | VARCHAR(500) | NOT NULL, INDEX | Display name of the series |
| `site` | VARCHAR(500) | NOT NULL | Provider site URL |
| `folder` | VARCHAR(1000) | NOT NULL | Filesystem folder name (metadata only) |
| `year` | INTEGER | NULLABLE | Release year of the series |
| `nfo_path` | VARCHAR(1000) | NULLABLE | Path to tvshow.nfo metadata file |
| `tmdb_id` | INTEGER | NULLABLE, INDEX | TMDB (The Movie Database) ID for metadata |
| `tvdb_id` | INTEGER | NULLABLE, INDEX | TVDB (TheTVDB) ID for metadata |
| `has_nfo` | BOOLEAN | NOT NULL, DEFAULT FALSE | Whether tvshow.nfo exists |
| `loading_status` | VARCHAR(50) | NOT NULL, DEFAULT 'completed' | Status: pending, loading_episodes, loading_nfo, completed, failed |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Identifier Convention:**
@@ -101,7 +107,13 @@ Stores anime series metadata.
- `folder` is **metadata only** for filesystem operations (e.g., `"Attack on Titan (2013)"`)
- `id` is used only for database relationships
Source: [src/server/database/models.py](../src/server/database/models.py#L23-L87)
**EpisodeDict Mapping:**
The `episodeDict` (season → episode numbers mapping) is stored as individual `Episode` records:
- Each `Episode` has `season` and `episode_number` columns
- Relationship: `AnimeSeries.episodes` returns all Episode records for that series
Source: [src/server/database/models.py](../src/server/database/models.py#L23-L150)
### 3.3 episodes
@@ -441,7 +453,187 @@ items = await db.execute(
---
## 12. Database Location
## 12. Series Storage: Database vs Files (Deprecated)
### File-Based Storage (Removed in v2.0)
Prior to v2.0, series metadata was stored in two files per anime folder:
| File | Contents |
| -------- | ------------------------------------------------------- |
| `key` | Series provider key (e.g., `"attack-on-titan"`) |
| `data` | JSON serialization of `Serie` object |
File structure example:
```
/anime/Attack on Titan (2013)/
├── key # Contains: attack-on-titan
├── data # Contains: {"key": "...", "name": "...", "episodeDict": {...}}
├── Season 1/
│ └── ...
```
### Database Storage (Current)
Since v2.0, all series metadata is stored in the `anime_series` table with `Episode` records for episode tracking. This provides:
- **ACID transactions** for data consistency
- **Foreign key constraints** (cascade delete)
- **Indexed queries** for fast lookups
- **No filesystem dependency** for metadata
### Migration from Files to Database
The `Serie.save_to_file()` and `Serie.load_from_file()` methods are deprecated but still functional for backward compatibility during migration:
```python
from src.core.entities.series import Serie
# Old file-based loading (deprecated)
serie = Serie.load_from_file("/anime/Attack on Titan (2013)/data")
# New database-based loading
from src.server.database.service import AnimeSeriesService
serie = await AnimeSeriesService.get_by_key(db, "attack-on-titan")
```
### Removing File Dependencies
After verifying database schema supports all fields, file-based storage can be removed:
1. ✅ Schema verified: All `Serie` fields have corresponding DB columns
2. ✅ Migration complete: All existing series migrated to database
3. ❌ File cleanup: Remove `key` and `data` files (pending)
**Note:** The `save_to_file()` and `load_from_file()` methods will be removed in v3.0.0.
---
## 12. Series Persistence Flow
When a directory scan discovers or updates series, the scanner persists data to the database instead of writing to disk files.
### Scan Flow
```
Scan Directory
Find MP4 Files → Extract Serie Key
Check DB for Existing Series (by key)
├─── EXISTS ──────────────────────► Update Series Metadata
│ │
│ ▼
│ Sync Episodes to DB
│ │
│◄──────────────────────────────────────┘
└─── NEW ───────────────────────────► Create New Series Record
Create Episode Records
Return to Scan Loop
```
### Key Methods
**SerieScanner._persist_serie_to_db()**
- Called after `get_missing_episodes_and_season()` computes episodeDict
- Uses `AnimeSeriesService.get_by_key()` to check if series exists
- If exists: calls `AnimeSeriesService.update()` + `_sync_episodes_to_db()`
- If new: calls `AnimeSeriesService.create()` + creates episodes
**SerieScanner._sync_episodes_to_db()**
- Gets existing episodes from DB via `EpisodeService.get_by_series()`
- Compares with new episodeDict
- Removes episodes no longer missing (unless `is_downloaded=True`)
- Adds new missing episodes
- Preserves `is_downloaded=True` episodes when removing missing ones
**SerieList.add_to_db()**
- Used when adding a new discovered series via API
- Creates filesystem folder + database record + episode records
### Episode Sync Logic
```python
# For each episode in DB but not in new episodeDict:
if episode.is_downloaded:
# Keep - file exists, don't remove
pass
else:
# Remove - no longer missing
EpisodeService.delete()
# For each episode in new episodeDict but not in DB:
# Add as new missing episode
EpisodeService.create(is_downloaded=False)
```
### Transaction Handling
- DB operations use their own session with commit/rollback
- If DB write fails, error is logged and scan continues
- File-based `save_to_file()` no longer called during scan
### Migration Path
1. v2.x: Scanner writes to both DB (primary) and files (fallback)
2. v3.0: Scanner writes only to DB, file methods removed
---
## 13. Series Persistence
### Schema
**AnimeSeries Table**: Stores series metadata (key, name, site, folder, year)
| Column | Type | Constraints | Description |
|-----------|--------------|---------------------------|----------------------|
| `id` | INTEGER | PRIMARY KEY | Auto-increment |
| `key` | VARCHAR(255) | UNIQUE, NOT NULL | Series provider key |
| `name` | VARCHAR(500) | NOT NULL | Display name |
| `site` | VARCHAR(500) | | Provider site URL |
| `folder` | VARCHAR(1000)| | Filesystem folder |
**Episode Table**: Stores per-episode metadata (season, episode_number, is_downloaded)
| Column | Type | Constraints | Description |
|-----------------|--------------|---------------------------|----------------------|
| `id` | INTEGER | PRIMARY KEY | Auto-increment |
| `series_id` | INTEGER | FOREIGN KEY → anime_series| Parent series |
| `season` | INTEGER | NOT NULL | Season number |
| `episode_number`| INTEGER | NOT NULL | Episode number |
| `is_downloaded` | BOOLEAN | DEFAULT FALSE | Download status |
### Relationships
- `AnimeSeries.episodes` → List of Episode objects (one-to-many)
- `Episode.series` → Parent AnimeSeries (many-to-one)
- Cascade delete: Deleting a series removes all its episodes
### Queries
```python
# Get all series with episodes
AnimeSeriesService.get_all(db, with_episodes=True)
# Get by provider key
AnimeSeriesService.get_by_key(db, key)
# Get by folder path
AnimeSeriesService.get_by_folder(db, folder)
```
---
## 14. Database Location
| Environment | Default Location |
| ----------- | ------------------------------------------------- |

View File

@@ -61,4 +61,376 @@ This document provides guidance for developers working on the Aniworld project.
- Commit message format
- Pull request process
8. Common Development Tasks
### Adding Queue Deduplication
The download queue prevents duplicate entries at two levels:
**In-Memory Deduplication** (`src/server/services/download_service.py`):
- `_pending_by_episode` dict tracks pending episodes: key = `(serie_id, season, episode)`
- `_add_to_pending_queue()` updates the dict when adding items
- `add_to_queue()` checks this dict before adding episodes (includes batch-local dedup)
- `_remove_from_pending_queue()` cleans up the dict when items are removed
**Database Constraint** (`src/server/models.py`):
- `DownloadQueueItem` has a unique index on `episode_id` via `__table_args__`
- Prevents duplicate queue entries at the database level
- Unique constraint: `Index("ix_download_queue_episode_pending", "episode_id", unique=True)`
**Scheduler Cooldown** (`src/server/services/scheduler_service.py`):
- `_last_auto_download_time` tracks when auto-download last ran
- 5-minute cooldown prevents rapid re-triggers
- Checked at start of `_auto_download_missing()`
### Episode Lifecycle
Episodes transition through states stored in the `episodes` table:
| State | `is_downloaded` | `file_path` | Description |
|-------|----------------|-------------|-------------|
| Missing | `False` | `NULL` | Episode not yet downloaded |
| Downloaded | `True` | Set | Episode exists on disk |
**State Transitions:**
1. **Missing → Downloaded**: When download completes, `_remove_episode_from_missing_list()` calls `EpisodeService.mark_downloaded()` to set `is_downloaded=True` and populate `file_path`. The episode record is NOT deleted.
**Query Implications:**
- `get_series_with_missing_episodes()`: Filters for `is_downloaded=False` to find series with undownloaded episodes
- `get_series_with_no_episodes()`: Finds series with `is_downloaded=False` episodes but NO `is_downloaded=True` episodes (completely unwatched series)
### Mocking the Download Queue
When testing components that use the download queue:
```python
# Mock repository for unit tests
class MockQueueRepository:
def __init__(self):
self._items: Dict[str, DownloadItem] = {}
async def save_item(self, item: DownloadItem) -> DownloadItem:
self._items[item.id] = item
return item
async def get_all_items(self) -> List[DownloadItem]:
return list(self._items.values())
# Use in fixture
@pytest.fixture
def mock_queue_repository():
return MockQueueRepository()
@pytest.fixture
def download_service(mock_anime_service, mock_queue_repository):
return DownloadService(
anime_service=mock_anime_service,
queue_repository=mock_queue_repository,
max_retries=3,
)
```
9. Troubleshooting Development Issues
### Async Context Managers for aiohttp
All `aiohttp.ClientSession` usages must be wrapped in `async with`:
```python
# Correct — session properly closed on exit
async with TMDBClient(api_key="key") as client:
result = await client.search_tv_show("Show")
# Wrong — session may leak if exception occurs
client = TMDBClient(api_key="key")
result = await client.search_tv_show("Show")
await client.close() # May not be called if exception raised earlier
```
**Why:**
- `aiohttp.ClientSession` holds TCP connections that must be explicitly closed
- If exception occurs before `close()`, session leaks
- Context manager guarantees `__aexit__` runs even on exceptions
**Services that use aiohttp:**
- `TMDBClient` — has `__aenter__`/`__aexit__`, use `async with`
- `ImageDownloader` — has `__aenter__`/`__aexit__`, use `async with`
- `NFOService` — wraps both above, use `async with`
**Verification:**
- Missing context manager usage triggers `__del__` warning on garbage collection
- Integration tests verify no "Unclosed client session" errors in logs
### Scheduler Persistence and Recovery
The scheduler uses APScheduler's in-memory job store. Jobs are reconstructed from `config.json` on every startup — no separate database is needed.
```python
# Jobs are built from config on startup — no persistence DB required
scheduler = AsyncIOScheduler() # default MemoryJobStore
scheduler.add_job(..., replace_existing=True)
```
**Startup misfire recovery:** On `start()`, the scheduler checks `system_settings.last_scan_timestamp` in `aniworld.db`. If the last scan is overdue (>23h but <25h ago), an immediate rescan is triggered. This replaces APScheduler's built-in misfire handling which required a separate SQLite database.
**Grace period:** If the server was down for more than 25 hours, no automatic recovery occurs to avoid surprise rescans after long downtime.
**Health endpoint:** `GET /health` returns `scheduler_next_run` and `scheduler_last_run` for external monitors (Uptime Kuma, Prometheus, etc.).
**If server is down too long:** Manual trigger via `POST /api/scheduler/trigger-rescan` or wait for next scheduled run.
### Database Session Management
`get_async_session_factory()` returns a **new AsyncSession instance** directly (not a factory). The function name is historical — callers receive the session immediately:
```python
# Correct usage:
db = get_async_session_factory() # db IS the session
await db.execute(...)
await db.commit()
await db.close()
```
Do NOT call the result again with `()` — that tries to call an `AsyncSession` object, causing `'AsyncSession' object is not callable`.
For context manager usage, prefer `get_db_session()` (auto-commits) or `get_transactional_session()` (manual commit).
### Health Check Endpoints
The application provides health check endpoints for monitoring and container orchestration:
#### `GET /health`
Basic health check returning service status and startup health check results.
**Response fields:**
- `status`: "healthy", "degraded", or "unhealthy" based on startup checks
- `timestamp`: ISO timestamp of the check
- `series_app_initialized`: Whether the series app is loaded
- `anime_directory_configured`: Whether anime_directory is set
- `scheduler_next_run` / `scheduler_last_run`: Scheduler times
- `checks`: Detailed startup check results (ffmpeg, DNS, anime_directory)
#### `GET /health/ready`
Readiness check for container orchestrators (Kubernetes, Docker Swarm).
**Response when ready:**
```json
{
"status": "ready",
"ready": true,
"timestamp": "2024-01-01T00:00:00",
"checks": {...}
}
```
**Response when not ready (503):**
```json
{
"status": "not_ready",
"ready": false,
"timestamp": "2024-01-01T00:00:00",
"critical_failures": ["anime_directory: not configured"],
"checks": {...}
}
```
#### `GET /health/detailed`
Comprehensive health check including database, filesystem, and system metrics.
#### Startup Health Checks
On application startup, the following checks are performed:
| Check | Failure Status | Impact |
|-------|---------------|--------|
| `ffmpeg` | warning | HLS downloads may fail |
| `dns_aniworld` | warning | Provider requests may fail |
| `dns_tmdb` | warning | TMDB API calls may fail |
| `anime_directory` | error | Download service disabled |
DNS checks are warnings because failures can be transient. anime_directory errors disable the download service to prevent failures.
### Troubleshooting Development Issues
#### Scheduler missed a run
1. Server was down at scheduled time (03:00 UTC by default).
2. On restart, the scheduler checks `last_scan_timestamp` — if overdue by 23-25h, it triggers immediately.
3. If server was down >25 hours, missed job is skipped to avoid surprise rescans.
4. Trigger manually: `POST /api/scheduler/trigger-rescan`
5. Monitor next run: `GET /health``scheduler_next_run`
#### Scheduler not firing (no events at scheduled time)
If the scheduler appears configured but never triggers:
1. **Check application logs for scheduler startup:**
```
grep "Scheduler service started" fastapi_app.log
```
- If missing, the scheduler failed to start — check for errors above this line
- If present, scheduler started successfully
2. **Verify the job is registered:**
```
grep "Scheduler started with cron trigger" fastapi_app.log
```
3. **Verify APScheduler events in logs:**
```
grep "apscheduler.executors.default" fastapi_app.log
```
- `Running job` = job triggered
- `executed successfully` = job completed
- No output = job never fired
4. **Test manual trigger:**
```bash
curl -X POST http://localhost:8000/api/scheduler/trigger-rescan -H "Authorization: Bearer <token>"
```
- If manual trigger works but cron doesn't, the issue is APScheduler configuration
5. **Check next_run_time via health endpoint:**
```bash
curl http://localhost:8000/health | jq .scheduler_next_run
```
- If `null`, the job is not scheduled
- If set, the scheduler knows when to run next
6. **Check timezone handling:**
- APScheduler uses UTC internally
- The schedule_time config (e.g., "03:00") is interpreted as UTC
- If you expect local time, adjust the schedule_time accordingly
#### Startup health check failures
If `/health` returns `unhealthy` status:
1. **anime_directory error**: Directory not configured or not writable
- Check `ANIME_DIRECTORY` environment variable
- Verify directory exists and permissions allow write access
- Download service will not initialize until resolved
2. **ffmpeg warning**: ffmpeg not found in PATH
- HLS stream downloads will fail
- Install ffmpeg: `apt install ffmpeg` or `brew install ffmpeg`
3. **DNS warnings**: Domain resolution failed
- Check network connectivity
- DNS failures are transient — warnings don't block startup
- Retry later to verify: `GET /health`
### Provider Failure Handling
Download providers (VOE, Doodstream, Vidmoly, Vidoza, SpeedFiles, Streamtape,
Luluvdo) regularly break: URLs expire, sites change their player markup, geo
blocks appear, and `yt-dlp` extractors lag behind upstream changes. The
`AniworldLoader.download()` flow is designed to fail fast and rotate.
**Rotation order**
1. The episode page is scraped for the providers AniWorld actually advertises.
2. Results are ordered by the preference in `DEFAULT_PROVIDERS`
(`provider_config.py`); providers not listed run last.
3. For each candidate the loader:
1. Calls `_check_url_alive()` — HEAD probe with GET fallback. Any 4xx
response or connection error skips the provider immediately.
2. Resolves the redirect via `_resolve_direct_link()` to obtain a direct
stream URL plus headers. Provider-specific extractors (e.g. `VOE`) are
preferred; unknown providers fall back to the embed URL so `yt-dlp` can
attempt extraction.
3. Tries `_try_direct_stream()` — straight `requests.get(stream=True)` when
`Content-Type` is `video/*` or `application/octet-stream`. This avoids
`yt-dlp` entirely for direct MP4 links.
4. Falls back to `yt-dlp` with the ffmpeg downloader for HLS streams.
4. On any failure, temp files are cleaned and the loop moves to the next
provider. When the chain is exhausted, the loader logs
`All download providers failed for S{season}E{episode} ...; tried=[...]`
to both the application log and `logs/download_errors.log`.
**Do not hardcode provider URLs.** Provider domains shift constantly (e.g.
Doodstream alternates between `dood.li`, `dood.so`, `dood.la`). Only the
referer hints in `PROVIDER_HEADERS` are persisted — discovery still happens
at runtime through AniWorld's redirect endpoint.
### HLS Stream Handling
HLS (HTTP Live Streaming) manifests (`.m3u8`) require yt-dlp to use the
`ffmpeg` downloader with `--hls-use-mpegts`. Both providers configure this
automatically:
```python
ydl_opts = {
"downloader": "ffmpeg", # Use ffmpeg instead of native
"hls_use_mpegts": True, # Write transport stream (.ts) segments
}
```
**Why this matters**: Without ffmpeg, yt-dlp logs:
`"Live HLS streams are not supported by the native downloader"`
**Requirements**:
- ffmpeg must be installed and in PATH (`which ffmpeg`)
- Install: `apt install ffmpeg` (Debian/Ubuntu) or `brew install ffmpeg` (macOS)
- Startup health check (see Health Check Endpoints) verifies ffmpeg presence
**Trade-offs**:
- HLS downloads are slower than direct MP4 (reassembly of .ts segments)
- Requires more disk space during download
- May need post-processing if .ts format is not desired
**Detection**: VOE provider extracts HLS URLs via `HLS_PATTERN` regex. Other
providers let yt-dlp auto-detect from URL/content-type.
### Updating yt-dlp
When extractors break (typical symptoms: every provider HEAD probe succeeds
but `yt-dlp` raises `Unable to extract` or `HTTP Error 404`):
1. Check the upstream tracker first: https://github.com/yt-dlp/yt-dlp/issues
2. Upgrade in the conda environment:
```bash
conda run -n AniWorld pip install --upgrade yt-dlp
```
3. Smoke-test against a known-good episode before pinning a new floor in
`requirements.txt` (`yt-dlp>=YYYY.MM.DD`).
4. Re-run the provider test suite:
```bash
conda run -n AniWorld python -m pytest tests/unit/test_aniworld_provider.py -v
```
5. If a specific extractor is removed upstream, drop the provider from
`DEFAULT_PROVIDERS` rather than patching `yt-dlp` in tree.
### User Notification on Total Failure
`SeriesApp.download_episode()` already emits a `download_status="failed"`
WebSocket event when `loader.download()` returns `False`. Operators should
forward this to `notification_service.notify_download_failed()` so users see
a HIGH-priority alert. The loader keeps the failure detail in
`logs/download_errors.log` for post-mortem.
## Series Storage
### Overview
Series metadata now stored in the database (SQLAlchemy ORM).
Legacy files (`key` and `data` per folder) are deprecated but preserved
for backward compatibility.
### Architecture
- **Database**: Single source of truth for all series metadata
- **In-Memory Cache**: SeriesApp maintains a cache for performance
- **Filesystem**: Only used for episode files themselves, not metadata
### Migration
First startup after upgrade automatically imports any legacy
series files into the database.
### Legacy Files
- `key` file: Contains series provider key (deprecated)
- `data` file: Contains Serie JSON object (deprecated)
Both are safe to delete after migration; not needed for normal operation.

View File

@@ -0,0 +1,94 @@
# Logging Instructions
This document describes how to write and refactor logging across the AniWorld codebase to make logs **human-readable**, **debug-friendly**, and **noise-free**.
> ✅ Goal: Logs should help a developer understand what happened, why it happened, and what to inspect next — without overwhelming them with duplicates or irrelevant details.
---
## 1. Principles for Great Logs
### 1.1 Use the Right Log Level
- `DEBUG`: Detailed internal state useful when debugging a specific issue (e.g., decision points, returned values, request/response payloads). Not for normal operation.
- `INFO`: High-level events that represent what the system is doing (e.g., "Import started", "New series added", "Config reloaded"). Use sparingly.
- `WARNING`: Something unexpected happened, but the system can continue (e.g., missing optional file, fallback behavior).
- `ERROR`: An operation failed and needs attention (e.g., exception caught, failed database write).
- `CRITICAL`: The system is in an unusable state (e.g., config corruption, failed startup).
### 1.2 Keep Logs Human-Readable
- Write messages in a clear, descriptive sentence-style format.
- Avoid cryptic codes or single-word log messages.
- Prefer `logger.debug("... %s", value)`-style formatting over f-strings to avoid unnecessary work when the log level is disabled.
### 1.3 Avoid Log Spam
- Dont log inside hot loops unless you explicitly aggregate and log a summary (e.g., "Processed 124 files, 3 failures").
- Avoid repeated/logging the same event at the same level (e.g., do not log "Retrying" 10 times at INFO; log once at INFO and then use DEBUG for each retry).
- Use rate limiting or debounce patterns for logs that can fire rapidly (e.g., external service health checks).
- Prefer a single higher-level log with context rather than many low-level logs that clutter output.
### 1.4 Log Objects Usefully
- When logging objects, log the minimal useful representation (e.g., ID, name, status) rather than the full object or its memory address.
- If an object has a `.dict()`, `.to_dict()`, or `.as_dict()` helper (common in Pydantic models), log that rather than relying on `repr()`.
- Add a `__repr__` or `__str__` implementation to domain models that returns a helpful, concise string with key identifiers.
- Use structured logging (e.g., `logger.info("Series added", extra={"series_id": series.id, "title": series.title})`) where supported.
- For exceptions, prefer `logger.exception("Failed to ...")` to capture stack traces.
---
## 2. Refactoring Existing Logs
When improving or refactoring existing log statements, aim to make them:
- **Actionable**: A developer reading the log should know what happened and what to check next.
- **Non-redundant**: Remove duplicates and ensure only one log records the same high-level event at a given level.
- **Context-rich**: Include identifiers (e.g., `series_id`, `file_path`, `user_id`) and key state that explains why a decision was made.
- **Level-appropriate**: Downgrade noisy INFO logs to DEBUG, and elevate critical failures to ERROR/CRITICAL.
### 2.1 Refactor Checklist
1. **Locate noisy logs**: Search for repeated messages (e.g., "Start", "Done") and determine whether they should be DEBUG or removed.
2. **Replace ad-hoc prints**: Remove `print()` statements or `print(obj)` and replace with `logger.*` calls.
3. **Use structured context**: If a function logs multiple related messages, include the same context in each (e.g., `extra={"series_id": series.id}`) or use a context manager that attaches it.
4. **Validate object output**: Ensure any logged object produces a useful representation (add methods or translate to dict). If not, log the key fields explicitly.
5. **Batch repetitive events**: If a loop logs per item, consider collecting stats and logging a summary at the end.
## 3. Adding New Logs
When adding logs to new code paths:
- Log **important state transitions** (e.g., "Queue started", "Download completed", "Config reloaded").
- For error paths, include what failed and why (e.g., "Could not load config from X: {exc}").
- Prefer logging at the boundaries of operations, not deep inside utility functions unless it aids debugging.
- Write logs in full sentences, with a clear subject, verb, and object.
---
## 4. Example Patterns
```python
logger.info("Import completed", extra={"series_id": series.id, "count": len(imported)})
logger.debug(
"Fetched feed items",
extra={"feed_url": feed.url, "item_count": len(items)},
)
try:
result = download_episode(episode)
except Exception:
logger.exception("Failed to download episode %s", episode.id)
```
> 💡 When in doubt, favor **fewer, richer logs** over many noisy logs.
---
## 5. Logging Audit Task List
For a guided checklist of files and logging improvements, see **`docs/tasks.md`**. This is where we track which files have been reviewed and which logging items still need attention.
> ✅ After applying the guidelines above, update `docs/tasks.md` to indicate which tasks are complete.

111
docs/MIGRATION_GUIDE.md Normal file
View File

@@ -0,0 +1,111 @@
# Migration Guide: File-Based to Database Storage
## Overview
This guide covers the transition from file-based series metadata storage to the new database-backed system introduced in v2.0.
## What Changed
**Before v2.0**: Series metadata stored in `key` and `data` files alongside anime folders.
**After v2.0**: All metadata stored in SQLite database (`aniworld.db`). Files are deprecated but still supported for backward compatibility during migration.
## Automated Migration
The application automatically migrates on first startup:
1. Scans anime directory for `key` and `data` files
2. Parses legacy files into `AnimeSeries` and `Episode` records
3. Loads series into in-memory cache
4. Logs migration results
**No manual action required.**
## Manual Verification
After first startup with the new version:
1. **Check logs** for: `"Migrated X series from files to DB"`
2. **Verify series count**: UI shows same number of series as before
3. **Confirm episodes**: Episode counts match expected totals
```bash
# Check migration log
grep "Migrated" logs/app.log
# Verify series via API
curl http://localhost:8000/api/anime | jq '.total'
```
## After Migration
### Safe to Delete
Once verified, these files can be removed:
```
<anime_folder>/
├── Attack on Titan (2013)/
│ ├── key # ❌ Can delete
│ ├── data # ❌ Can delete
│ └── Season 1/
│ └── ...
```
**Deleting these files does not affect the database.** The metadata now lives in `aniworld.db`.
### Backup (Recommended)
Before deleting, backup the files:
```bash
# Create backup directory
mkdir -p backup/legacy_series_files
# Copy all key and data files
find /path/to/anime -name "key" -o -name "data" | while read f; do
cp "$f" "backup/legacy_series_files/"
done
```
## Reverting (Not Recommended)
If you must revert to file-based storage:
1. **Restore from database backup** (if available)
2. **Export manually** (no export script exists)
**Warning**: File-based storage is deprecated and will be removed in v3.0.0.
## Troubleshooting
### Series Not Appearing After Migration
1. Check logs for migration errors: `grep -i error logs/app.log`
2. Verify `key` and `data` files exist and are readable
3. Manually trigger rescan: `POST /api/scheduler/trigger-rescan`
### Duplicate Series
1. Check for duplicate `key` files (same series in multiple folders)
2. Verify series key uniqueness in database:
```bash
sqlite3 aniworld.db "SELECT key, COUNT(*) FROM anime_series GROUP BY key HAVING COUNT(*) > 1;"
```
### Missing Episodes
1. Trigger targeted scan for affected series
2. Check episode sync logs
3. Verify file permissions on anime directory
## Deprecation Timeline
| Version | Status |
|---------|--------|
| v2.0.x | Legacy files supported, migration automated |
| v2.1.x | Legacy files still supported, warnings in logs |
| v3.0.0 | **Legacy files removed** - database only |
Upgrade to v3.0.0 before legacy file support ends.

View File

@@ -171,6 +171,35 @@ Response:
}
```
### 3.6 Fallback Behavior When TMDB is Unavailable
When TMDB lookup fails (network issues, API errors, or no match found), the system creates a **minimal NFO** to ensure the series is still tracked. This behavior applies to:
- Manual NFO creation via API
- Batch NFO creation operations
- Automatic NFO creation during downloads
**What a minimal NFO contains:**
```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tvshow>
<title>Series Name</title>
<year>2024</year>
<plot>No metadata available for Series Name. TMDB lookup failed.</plot>
</tvshow>
```
**Limitations of minimal NFOs:**
- No poster, logo, or fanart images
- No rating, genre, or studio information
- No TMDB or other provider IDs
- May not display correctly in some media servers
**To upgrade a minimal NFO:**
1. Use the Update endpoint (`PUT /api/nfo/{serie_id}/update`) when TMDB is available
2. Or delete the NFO and recreate it with full metadata
---
## 4. File Structure
@@ -217,6 +246,7 @@ NFO files are created in the anime directory:
<genre>Action</genre>
<genre>Sci-Fi & Fantasy</genre>
<uniqueid type="tmdb">1429</uniqueid>
<tmdbid>1429</tmdbid>
<thumb aspect="poster">https://image.tmdb.org/t/p/w500/...</thumb>
<fanart>
<thumb>https://image.tmdb.org/t/p/original/...</thumb>
@@ -224,6 +254,13 @@ NFO files are created in the anime directory:
</tvshow>
```
**Manual TMDB ID Override**: To skip TMDB search and use a specific ID directly, include `<tmdbid>YOUR_ID</tmdbid>` in the NFO. This is useful when:
- TMDB search fails for your series (e.g., new or obscure anime)
- You already know the correct TMDB ID
- You want to avoid rate limiting from repeated searches
Aniworld reads `<tmdbid>` element and `<uniqueid type="tmdb">` first. If found, it uses the ID directly instead of searching.
### 4.3 Episode NFO Format
```xml
@@ -246,7 +283,84 @@ NFO files are created in the anime directory:
---
## 5. API Reference
## 5. Folder Naming Convention
### 5.1 Expected Format
After the daily folder scan (when **Update on library scan** is enabled), Aniworld validates every series folder against its `tvshow.nfo` metadata. If the folder name does not match the expected convention, it is automatically renamed.
**Format:**
```
{title} ({year})
```
**Examples:**
| NFO `<title>` | NFO `<year>` | Expected Folder Name |
|---------------|--------------|----------------------|
| `Attack on Titan` | `2013` | `Attack on Titan (2013)` |
| `One Piece` | `1999` | `One Piece (1999)` |
| `Demon Slayer: Kimetsu no Yaiba` | `2019` | `Demon Slayer Kimetsu no Yaiba (2019)` |
### 5.2 Sanitization Rules
Illegal filesystem characters are removed or replaced to ensure cross-platform compatibility:
- Removed: `< > : " / \ | ? *` and null bytes
- Control characters stripped
- Multiple spaces collapsed to one
- Leading/trailing dots and whitespace trimmed
- Maximum length: 200 characters (truncated at word boundary if possible)
### 5.3 Skip Conditions
A folder is **not** renamed when any of the following apply:
- `tvshow.nfo` is missing `<title>` or `<year>` (or they are empty)
- The series has an **active or pending download**
- The target folder name already exists (duplicate)
- The resulting path would exceed the OS path-length limit
- The app lacks write permission to the anime directory
All skipped and renamed actions are logged.
---
## 6. Poster Check
### 6.1 Overview
During the daily folder scan, Aniworld checks every series folder for a valid `poster.jpg`. If the file is missing or smaller than 1 KB, the application attempts to re-download it from the URL stored in the series' `tvshow.nfo` file.
### 6.2 How It Works
1. **Scan** — After folder renaming, the scan iterates over all series folders that contain a `tvshow.nfo`.
2. **Validate** — For each folder, it checks whether `poster.jpg` exists and is at least 1 KB.
3. **Parse NFO** — If the poster is missing or too small, the scan reads `tvshow.nfo` and looks for a `<thumb aspect="poster">` (or any `<thumb>`) URL.
4. **Download** — If a URL is found, the poster is downloaded using `ImageDownloader` with a concurrency limit of 3 simultaneous downloads.
5. **Validate Download** — The downloaded image is validated with PIL to ensure it is not corrupted.
### 6.3 Skip Conditions
A folder is **not** processed for poster download when any of the following apply:
- `tvshow.nfo` does not exist in the folder.
- `poster.jpg` already exists and is ≥ 1 KB.
- No `<thumb>` URL is found in the NFO (the NFO may have been created before thumb tags were added).
- The `nfo.download_poster` setting is `false` (poster checks are still performed, but downloads are skipped if the setting is disabled; see [CONFIGURATION.md](CONFIGURATION.md)).
### 6.4 Logging
Every poster check action is logged:
- **INFO** — When a poster is successfully downloaded.
- **WARNING** — When a download fails or no URL is found.
- **ERROR** — When an unexpected exception occurs during download.
---
## 7. API Reference
### 5.1 Check NFO Status
@@ -523,6 +637,36 @@ NFO files are created in the anime directory:
4. Check network speed to TMDB servers
5. Verify disk I/O performance
### 6.7 TMDB Lookup Fails for My Series
**Problem**: TMDB search fails with "No results found" for a valid series.
**Solutions**:
1. **Check if series exists on TMDB**: Visit https://www.themoviedb.org and search for your series
2. **Use manual ID override**: Add TMDB ID directly to `tvshow.nfo`:
```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<tvshow>
<title>Your Series Name</title>
<tmdbid>12345</tmdbid>
<uniqueid type="tmdb">12345</uniqueid>
</tvshow>
```
Aniworld will use this ID directly instead of searching.
3. **Try alternative titles**: Some anime have different titles (Japanese, romaji, English). If you have access to the folder, rename it to match the TMDB title.
4. **Add to existing NFO**: If `tvshow.nfo` exists but has no TMDB ID, edit it to add:
```xml
<tmdbid>YOUR_TMDB_ID</tmdbid>
```
Then use the Update endpoint to refresh metadata.
5. **Check for rate limiting**: If many lookups fail at once, you may be hitting TMDB rate limits. Wait and retry later.
6. **Verify API key**: Ensure your TMDB API key is valid and has not exceeded usage limits.
---
## 7. Best Practices
@@ -675,21 +819,25 @@ The XML serialisation lives in `src/core/utils/nfo_generator.py`
## 11. Automatic NFO Repair
Every time the server starts, Aniworld scans all existing `tvshow.nfo` files and
automatically repairs any that are missing required tags.
NFO repair now runs as part of the scheduled daily folder scan rather than on every
startup. When the scheduler triggers `FolderScanService.run_folder_scan()`, the first
step is `perform_nfo_repair_scan(background_loader=None)`. Each incomplete NFO is
queued as a background `asyncio` task, so the scan returns quickly while repairs
continue asynchronously.
### How It Works
1. **Scan** — `perform_nfo_repair_scan()` in
`src/server/services/initialization_service.py` is called from the FastAPI
lifespan after `perform_media_scan_if_needed()`.
`src/server/services/initialization_service.py` is called from
`FolderScanService.run_folder_scan()` (`src/server/services/folder_scan_service.py`).
2. **Detect** — `nfo_needs_repair(nfo_path)` from
`src/core/services/nfo_repair_service.py` parses each `tvshow.nfo` with
`lxml` and checks for the 13 required tags listed below.
3. **Repair** — Series whose NFO is incomplete are queued for background reload
via `BackgroundLoaderService.add_series_loading_task()`. The background
loader re-fetches metadata from TMDB and rewrites the NFO with all tags
populated.
via `asyncio.create_task`. Each task creates its own isolated
:class:`NFOService` / :class:`TMDBClient` so concurrent tasks never share an
``aiohttp`` session — this prevents "Connector is closed" errors when many repairs
run in parallel. A semaphore caps TMDB concurrency at 3 to stay within rate limits.
### Tags Checked (13 required)
@@ -734,8 +882,7 @@ This calls `NFOService.update_tvshow_nfo()` directly and overwrites the existing
| File | Purpose |
| ----------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| `src/core/services/nfo_repair_service.py` | `REQUIRED_TAGS`, `parse_nfo_tags`, `find_missing_tags`, `nfo_needs_repair`, `NfoRepairService` |
| `src/server/services/initialization_service.py` | `perform_nfo_repair_scan` startup hook |
| `src/server/fastapi_app.py` | Wires `perform_nfo_repair_scan` into the lifespan |
| `src/server/services/folder_scan_service.py` | `perform_nfo_repair_scan` — invoked during the scheduled daily folder scan |
---

View File

@@ -62,6 +62,90 @@ This document describes the testing strategy, guidelines, and practices for the
- What to mock
- Mock patterns
- External service mocks
### Mocking the Download Queue
Use `MockQueueRepository` for testing download queue functionality:
```python
from src.server.models.download import DownloadItem, EpisodeIdentifier
class MockQueueRepository:
def __init__(self):
self._items: Dict[str, DownloadItem] = {}
async def save_item(self, item: DownloadItem) -> DownloadItem:
self._items[item.id] = item
return item
async def get_item(self, item_id: str) -> Optional[DownloadItem]:
return self._items.get(item_id)
async def get_all_items(self) -> List[DownloadItem]:
return list(self._items.values())
async def set_error(self, item_id: str, error: str) -> bool:
if item_id in self._items:
self._items[item_id].error = error
return True
return False
async def delete_item(self, item_id: str) -> bool:
if item_id in self._items:
del self._items[item_id]
return True
return False
async def clear_all(self) -> int:
count = len(self._items)
self._items.clear()
return count
```
**Key points:**
- The mock uses in-memory storage, no database required
- All async methods are implemented (even if just pass-through)
- `save_item` uses `item.id` as key (must be set before calling)
- Suitable for unit tests only (no persistence)
### Mocking aiohttp Sessions
When testing code that uses `aiohttp.ClientSession`:
```python
from unittest.mock import AsyncMock, MagicMock, patch
from aiohttp import ClientSession
# Mock aiohttp session for testing
class MockAiohttpSession:
def __init__(self):
self.closed = False
async def close(self):
self.closed = True
def get(self, url, **kwargs):
mock_response = AsyncMock()
mock_response.status = 200
mock_response.json = AsyncMock(return_value={"data": "test"})
mock_response.__aenter__ = AsyncMock(return_value=mock_response)
mock_response.__aexit__ = AsyncMock(return_value=None)
return mock_response
# Use in fixture
@pytest.fixture
async def mock_tmdb_session():
session = MockAiohttpSession()
yield session
# Cleanup verification
assert session.closed, "Session was not closed"
```
**Key points:**
- Always verify `session.closed` is `True` after context manager exits
- Mock `__aenter__` and `__aexit__` for response context managers
- Set `closed = False` on mock session for unclosed warning tests
7. Coverage Requirements
8. CI/CD Integration
9. Writing Good Tests

View File

@@ -31,14 +31,16 @@ flowchart TB
subgraph Core["Core Layer"]
SeriesApp["SeriesApp"]
SeriesCache["SeriesCache<br/>(In-Memory)"]
SerieScanner["SerieScanner"]
SerieList["SerieList"]
end
subgraph Data["Data Layer"]
SQLite[(SQLite<br/>aniworld.db)]
SQLite[("SQLite<br/>aniworld.db")]
ConfigJSON[(config.json)]
FileSystem[(File System<br/>Anime Directory)]
FileSystem[(File System<br/>Anime Episodes)]
LegacyFiles[("Legacy Files<br/>key/data<br/>(Deprecated)")]
end
subgraph External["External"]
@@ -71,9 +73,13 @@ flowchart TB
AnimeService --> SQLite
%% Core to Data
SeriesApp --> SeriesCache
SeriesCache -.->|Cached Series| SQLite
SeriesApp --> SerieScanner
SeriesApp --> SerieList
SerieScanner --> FileSystem
SerieScanner -->|Scan Episodes| FileSystem
SerieScanner -->|Detect Series| SQLite
SerieScanner -->|Migrate Legacy| LegacyFiles
SerieScanner --> Provider
%% Event flow

View File

@@ -59,6 +59,10 @@ The application now features a comprehensive configuration system that allows us
## Anime Management
- **Anime Library Page**: Display list of anime series with missing episodes
- **Library Filters**:
- "Missing Episodes Only" (shows only series with missing episodes, including series that currently have no downloaded episodes)
- "No Episodes" (shows series that are present in the library but have zero downloaded episodes)
- "Show All Series" (overrides other filters to show every series)
- **Database-Backed Series Storage**: All series metadata and missing episodes stored in SQLite database
- **Automatic Database Synchronization**: Series loaded from database on startup, stays in sync with filesystem
- **Series Selection**: Select individual anime series and add episodes to download queue

0
docs/helper Normal file
View File

View File

@@ -117,6 +117,3 @@ For each task completed:
---
## TODO List:
---

View File

@@ -2,3 +2,50 @@ API key : 299ae8f630a31bda814263c551361448
/mnt/server/serien/Serien/
{
"name": "Aniworld",
"data_dir": "data",
"scheduler": {
"enabled": true,
"interval_minutes": 60,
"schedule_time": "03:00",
"schedule_days": [
"mon",
"tue",
"wed",
"thu",
"fri",
"sat",
"sun"
],
"auto_download_after_rescan": true,
"folder_scan_enabled": true
},
"logging": {
"level": "INFO",
"file": null,
"max_bytes": null,
"backup_count": 3
},
"backup": {
"enabled": false,
"path": "data/backups",
"keep_days": 30
},
"nfo": {
"tmdb_api_key": "9bc3e547caff878615cbdba2cc421d37",
"auto_create": true,
"update_on_scan": true,
"download_poster": true,
"download_logo": true,
"download_fanart": true,
"image_size": "original"
},
"other": {
"master_password_hash": "$pbkdf2-sha256$29000$HQNASKk1xpgTAgAgJGRMaQ$73TOCCM0UEZONyNXQEPa3SmIoXeG6C1l5mMFDNgYfMQ",
"anime_directory": "/data"
},
"version": "1.0.0"
}

154
docs/runner.csx Normal file
View File

@@ -0,0 +1,154 @@
#!/usr/bin/env dotnet-script
#nullable enable
using System;
using System.IO;
using System.Diagnostics;
using System.Threading;
using System.Text;
using System.Text.RegularExpressions;
using System.Linq;
using System.Collections.Generic;
// ── Ctrl+C: kill active process and exit cleanly ──────────────────────────────
var cts = new CancellationTokenSource();
Process? activeProcess = null;
Console.CancelKeyPress += (_, e) =>
{
e.Cancel = true;
Console.WriteLine("\n[runner] Interrupted — shutting down...");
cts.Cancel();
try { activeProcess?.Kill(entireProcessTree: true); } catch { }
};
// ── Paths ─────────────────────────────────────────────────────────────────────
var repoRoot = Directory.GetCurrentDirectory();
var tasksFile = Path.Combine(repoRoot, "Docs", "Tasks.md");
if (!File.Exists(tasksFile))
{
Console.Error.WriteLine($"[runner] ERROR: Tasks.md not found at {tasksFile}");
Console.Error.WriteLine("[runner] Run this script from the repository root.");
Environment.Exit(1);
}
// ── Read & split by "---" separator lines ────────────────────────────────────
var content = File.ReadAllText(tasksFile);
var items = Regex
.Split(content, @"\r?\n---\r?\n")
.Select(s => s.Trim())
.Where(s => s.Length > 0)
.ToList();
Console.WriteLine($"[runner] Found {items.Count} section(s) in Tasks.md");
// ── Helper: run copilot and stream output, return full output ─────────────────
async Task<string> RunCopilot(IEnumerable<string> extraArgs, string prompt)
{
var output = new StringBuilder();
var argList = new List<string> { "launch", "copilot", "--model", "minimax-m2.7:cloud", "--yes", "--", "--allow-all-tools" };
argList.AddRange(extraArgs);
argList.Add("-p");
argList.Add(prompt);
var psi = new ProcessStartInfo("ollama")
{
WorkingDirectory = repoRoot,
RedirectStandardOutput = true,
RedirectStandardError = true,
UseShellExecute = false,
};
foreach (var a in argList)
psi.ArgumentList.Add(a);
activeProcess = new Process { StartInfo = psi };
activeProcess.OutputDataReceived += (_, e) =>
{
if (e.Data is null) return;
Console.WriteLine(e.Data);
output.AppendLine(e.Data);
};
activeProcess.ErrorDataReceived += (_, e) =>
{
if (e.Data is null) return;
Console.Error.WriteLine(e.Data);
output.AppendLine(e.Data);
};
activeProcess.Start();
activeProcess.BeginOutputReadLine();
activeProcess.BeginErrorReadLine();
await activeProcess.WaitForExitAsync(cts.Token);
activeProcess = null;
return output.ToString();
}
// ── Main loop ─────────────────────────────────────────────────────────────────
for (int i = 0; i < items.Count; i++)
{
var item = items[i];
if (cts.IsCancellationRequested) break;
Console.WriteLine();
Console.WriteLine("[runner] ══════════════════════════════════════════════");
Console.WriteLine($"[runner] Task:\n{item}");
Console.WriteLine("[runner] ══════════════════════════════════════════════");
Console.WriteLine();
// Step 1 — run the task prompt
await RunCopilot(Enumerable.Empty<string>(), $"/caveman full");
await RunCopilot(new[] { "--continue" }, $"read ./Docs/instructions.md. {item}");
if (cts.IsCancellationRequested) break;
// Step 2 — confirm completion in the same chat session
Console.WriteLine("\n[runner] Asking for task confirmation...\n");
var confirmation = await RunCopilot(
new[] { "--continue" },
"are you sure tasks is done. reply with yes"
);
if (cts.IsCancellationRequested) break;
// Step 3 — check for "yes" in the reply, with retry logic for issue resolution
int maxRetries = 3;
int retryCount = 0;
bool taskConfirmed = confirmation.Contains("yes", StringComparison.OrdinalIgnoreCase);
while (!taskConfirmed && retryCount < maxRetries)
{
retryCount++;
Console.WriteLine($"\n[runner] Attempt {retryCount}/{maxRetries}: Resolving remaining issues and running tests...\n");
confirmation = await RunCopilot(
new[] { "--continue" },
"resolve any remaining issues, make sure all tests are running and pass. then confirm with yes if done"
);
if (cts.IsCancellationRequested) break;
taskConfirmed = confirmation.Contains("yes", StringComparison.OrdinalIgnoreCase);
}
if (!taskConfirmed)
{
Console.WriteLine($"\n[runner] Task not confirmed as done after {maxRetries} attempts. Stopping.");
break;
}
// Step 4 — commit the work
Console.WriteLine("\n[runner] Task confirmed. Making git commit...\n");
await RunCopilot(Enumerable.Empty<string>(), $"/caveman full");
await RunCopilot(new[] { "--continue" }, "make git commit");
if (cts.IsCancellationRequested) break;
// Step 5 — remove completed task from Tasks.md
var remaining = items.Skip(i + 1).ToList();
File.WriteAllText(tasksFile, string.Join("\n\n---\n\n", remaining));
Console.WriteLine("[runner] Removed completed task from Tasks.md");
}
Console.WriteLine("\n[runner] Finished.");

178
docs/tasks.md Normal file
View File

@@ -0,0 +1,178 @@
# Tasks
## 1. Scheduled Folder Scan
### Task 1.1: Add folder scan scheduler configuration
**Where is that found**
- `src/server/models/config.py` (`SchedulerConfig`)
- `data/config.json` (example/default config)
- `src/server/web/templates/setup.html` (setup UI)
- `src/server/api/auth.py` (config save endpoint, if it validates scheduler fields)
**Goal. How it should be**
Add a new boolean field `folder_scan_enabled` (default `false`) to `SchedulerConfig`. When `true`, the scheduler will execute the folder maintenance routine during its scheduled run. Add the field to the setup page as a checkbox. Ensure existing configs without this field load successfully (Pydantic default handles this).
**Possible traps and issues**
- Backward compatibility: old `data/config.json` files must load without errors. Pydantic defaults solve this, but verify by loading an old config.
- The setup page JavaScript must include the new field in the payload sent to `/api/config`.
- Do not confuse this with `auto_download_after_rescan` — this is a separate toggle.
**Docs changes needed**
- `docs/CONFIGURATION.md`: Document the new `scheduler.folder_scan_enabled` option.
- `docs/ARCHITECTURE.md`: Mention folder scan in the scheduler section.
**Why this is needed**
Users need an opt-in toggle to enable automatic daily folder maintenance (NFO repair, folder renaming, poster checks) without forcing it on everyone.
---
### Task 1.2: Create FolderScanService skeleton
**Where is that found**
- New file: `src/server/services/folder_scan_service.py`
- `src/server/services/scheduler_service.py` (to call it)
**Goal. How it should be**
Create a new `FolderScanService` class with a single async entry point `async def run_folder_scan(self) -> None`. The method should:
1. Log start/completion with structlog.
2. Check prerequisites (`settings.anime_directory` exists, `settings.tmdb_api_key` is set).
3. Skip gracefully with a warning log if prerequisites are missing.
4. Use a module-level semaphore (similar to `_NFO_REPAIR_SEMAPHORE`) to limit concurrent TMDB operations to 3.
Keep the implementation empty for the sub-tasks (1.31.5) to fill in. Just add the skeleton and the semaphore.
**Possible traps and issues**
- Circular imports: `folder_scan_service.py` will import from `initialization_service`, `config.settings`, etc. Keep imports inside methods or at the bottom if circular issues arise.
- The service should follow the singleton pattern like `SchedulerService` and `DownloadService` if it holds state, or be stateless. For simplicity, make it a plain class instantiated per call or a module-level function set.
- Exception handling: any unhandled exception in the scheduled task should be caught and logged so it doesn't crash the scheduler.
**Docs changes needed**
- `docs/ARCHITECTURE.md`: Add `folder_scan_service.py` to the services list.
**Why this is needed**
Encapsulates the new daily maintenance logic in its own module, keeping `scheduler_service.py` clean and allowing the folder scan to be tested independently.
---
### Task 1.3: Integrate NFO repair into folder scan
**Where is that found**
- `src/server/services/folder_scan_service.py`
- `src/server/services/initialization_service.py` (`perform_nfo_repair_scan`)
**Goal. How it should be**
Inside `FolderScanService.run_folder_scan()`, call `perform_nfo_repair_scan(background_loader=None)` as the first step. Reuse the existing function exactly — do not copy its logic. Log a message before and after the call.
**Possible traps and issues**
- `perform_nfo_repair_scan` spawns `asyncio.create_task` for each repair. When called from the scheduler, these background tasks will still run after `run_folder_scan` returns. This is fine, but log that repairs are queued.
- The function already handles missing `tmdb_api_key` and `anime_directory`, so the caller doesn't need to double-check, but the skeleton from Task 1.2 already checks prerequisites.
- `perform_nfo_repair_scan` imports `nfo_needs_repair` and `NfoRepairService` inside the function, so no heavy import-time dependencies.
**Docs changes needed**
- `docs/NFO_GUIDE.md`: Update the "Automatic NFO Repair" section to state that repair now runs as part of the scheduled folder scan instead of every startup.
**Why this is needed**
Reuses the existing, tested NFO repair logic. Moves NFO repair from startup blocking to scheduled background maintenance.
---
### Task 1.4: Validate and rename series folders
**Where is that found**
- `src/server/services/folder_scan_service.py`
- `src/core/services/nfo_repair_service.py` (for `parse_nfo_tags` or similar NFO parsing)
- `src/server/database/models.py` / `src/server/database/system_settings_service.py` (if folder paths are stored in DB)
**Goal. How it should be**
After NFO repair, iterate over every subfolder in `settings.anime_directory` that contains a `tvshow.nfo`. For each folder:
1. Parse the NFO to extract `<title>` and `<year>` text values.
2. Compute the expected folder name: `f"{title} ({year})"`.
3. Sanitize the expected name for filesystem safety (remove/replace illegal characters like `/`, `\`, `:`, etc.).
4. Compare with the current folder name (`series_dir.name`).
5. If different, rename the folder using `series_dir.rename(expected_path)`.
6. If the series path is stored in the database (check `anime_service` or DB models), update the database record to point to the new path.
Skip folders where title or year is missing/empty. Log every rename action.
**Possible traps and issues**
- **Database path consistency**: If `Series` or `Episode` models store absolute or relative paths, renaming the folder on disk without updating the DB will break downloads, NFO updates, and the web UI. Must verify whether paths are stored in the DB and update them.
- **Active downloads**: A series currently being downloaded should not be renamed. Check the download queue or lock status before renaming. If no lock mechanism exists, this is a major trap — document it.
- **Filesystem permissions**: The app may not have write permission to the anime directory. Catch `PermissionError` and `OSError` and log gracefully.
- **Special characters**: Titles like `"A / B"` or `"Show: Subtitle"` contain characters illegal in folder names. Define a sanitization function (e.g., replace `/` with `-`, remove trailing dots on Windows, etc.).
- **Duplicate names**: Two different series could sanitize to the same name. Check if target path already exists before renaming.
- **Path length limits**: Very long titles might exceed OS path limits.
**Docs changes needed**
- `docs/NFO_GUIDE.md`: Add a section "Folder Naming Convention" explaining the `<title> (<year>)` format.
- `docs/CONFIGURATION.md`: Mention that enabling folder scan will rename folders.
**Why this is needed**
Enforces a consistent, predictable folder naming scheme across the library, making it easier for media center apps (Kodi, Jellyfin, Plex) to match metadata.
---
### Task 1.5: Check and download missing poster.jpg
**Where is that found**
- `src/server/services/folder_scan_service.py`
- `src/core/utils/image_downloader.py` (`ImageDownloader`)
- `src/core/services/nfo_service.py` or `src/core/services/nfo_repair_service.py` (to get poster URL from NFO or TMDB)
**Goal. How it should be**
After folder renaming, iterate over series folders again (or combine with Task 1.4 loop). For each folder:
1. Check if `poster.jpg` exists and has a size ≥ `ImageDownloader.min_file_size` (1 KB by default).
2. If missing or too small:
a. Parse `tvshow.nfo` for `<thumb aspect="poster">` or `<thumb>` URL.
b. If no URL in NFO, skip (do not query TMDB again to keep tasks small; the NFO should already have it after repair).
c. Use `ImageDownloader` (with context manager) to download the image to `series_dir / "poster.jpg"`.
d. Validate the downloaded image with `ImageDownloader._validate_image` (or similar existing validation).
3. Use the existing `_NFO_REPAIR_SEMAPHORE` or a new `POSTER_DOWNLOAD_SEMAPHORE` to limit concurrent downloads to 3.
**Possible traps and issues**
- **TMDB rate limiting**: Even downloading images hits TMDB CDN. The semaphore limits concurrency.
- **Invalid images**: A download might produce a 0-byte or corrupted file. `ImageDownloader` already validates with PIL; reuse that.
- **NFO without thumb URL**: If the NFO was created before thumb tags were added, there may be no URL. In that case, skip and log. A future task could query TMDB directly.
- **Write permissions**: Same as Task 1.4.
- **Async session sharing**: `ImageDownloader` manages its own `aiohttp` session. Use `async with ImageDownloader() as downloader:` to ensure cleanup.
**Docs changes needed**
- `docs/NFO_GUIDE.md`: Add "Poster Check" subsection under folder scan.
- `docs/CONFIGURATION.md`: Mention that `nfo.download_poster` setting also affects scheduled poster checks.
**Why this is needed**
Ensures every series has artwork, which is required by most media center front-ends for a polished library view.
---
## 2. Remove startup NFO repair
### Task 2.1: Remove perform_nfo_repair_scan from startup lifespan
**Where is that found**
- `src/server/fastapi_app.py` (lifespan startup block, lines ~245 and ~319)
- `src/server/services/initialization_service.py` (keep the function, just remove the call site)
- `tests/integration/test_nfo_repair_startup.py`
- `tests/unit/test_initialization_service.py` (tests that call `perform_nfo_repair_scan` directly can stay, but integration tests verifying startup wiring must change)
**Goal. How it should be**
1. In `src/server/fastapi_app.py`, remove the import of `perform_nfo_repair_scan` from the `initialization_service` import block.
2. Remove the line `await perform_nfo_repair_scan(background_loader)` from the lifespan startup sequence.
3. Update `tests/integration/test_nfo_repair_startup.py`:
- Remove or modify `test_perform_nfo_repair_scan_imported_in_lifespan` and `test_perform_nfo_repair_scan_called_after_media_scan` since the startup wiring is gone.
- Replace with a test that verifies `perform_nfo_repair_scan` is NOT called during startup (or simply delete the file if it has no other purpose).
4. `tests/unit/test_initialization_service.py` tests for `perform_nfo_repair_scan` can remain because they test the function itself, not the startup wiring.
**Possible traps and issues**
- **Test failures**: `test_nfo_repair_startup.py` will fail immediately after the code change. It must be updated in the same PR.
- **Documentation drift**: `docs/NFO_GUIDE.md`, `docs/CHANGELOG.md`, and `docs/ARCHITECTURE.md` all describe the startup NFO repair behavior. If docs are not updated, users will expect repair on every start.
- **Background loader parameter**: The `background_loader` variable was created partly for `perform_nfo_repair_scan`. After removal, check if `background_loader` is still needed for other startup steps (yes — `perform_media_scan_if_needed` uses it). Do not remove `background_loader` entirely.
- **Import cleanup**: Ensure no unused imports remain in `fastapi_app.py` after removal.
**Docs changes needed**
- `docs/NFO_GUIDE.md`: Update section 11 "Automatic NFO Repair" to remove startup references and state it runs via scheduler.
- `docs/CHANGELOG.md`: Add an entry under "Changed" or "Removed" noting that startup NFO repair is replaced by scheduled folder scan.
- `docs/ARCHITECTURE.md`: Update the startup sequence description.
**Why this is needed**
Running `perform_nfo_repair_scan` on every startup slows down server restarts, especially for large libraries. Moving it to a scheduled task keeps startup fast while still ensuring regular maintenance.

View File

@@ -1,6 +1,6 @@
{
"name": "aniworld-web",
"version": "1.0.0",
"version": "1.3.6",
"description": "Aniworld Anime Download Manager - Web Frontend",
"type": "module",
"scripts": {

View File

@@ -18,4 +18,11 @@ aiosqlite>=0.19.0
aiohttp>=3.9.0
lxml>=5.0.0
pillow>=10.0.0
APScheduler>=3.10.4
APScheduler>=3.10.4
Events>=0.5
requests>=2.31.0
beautifulsoup4>=4.12.0
chardet>=5.2.0
fake-useragent>=1.4.0
yt-dlp>=2024.1.0
urllib3>=2.0.0

View File

@@ -5,6 +5,7 @@ and checking NFO metadata files.
"""
import asyncio
import logging
import sys
from pathlib import Path
@@ -14,48 +15,50 @@ sys.path.insert(0, str(Path(__file__).parent.parent.parent))
from src.config.settings import settings
from src.core.services.series_manager_service import SeriesManagerService
logger = logging.getLogger(__name__)
async def scan_and_create_nfo():
"""Scan all series and create missing NFO files."""
print("=" * 70)
print("NFO Auto-Creation Tool")
print("=" * 70)
logger.info("%s", "=" * 70)
logger.info("NFO Auto-Creation Tool")
logger.info("%s", "=" * 70)
if not settings.tmdb_api_key:
print("\n❌ Error: TMDB_API_KEY not configured")
print(" Set TMDB_API_KEY in .env file or environment")
print(" Get API key from: https://www.themoviedb.org/settings/api")
logger.error("TMDB_API_KEY not configured")
logger.error("Set TMDB_API_KEY in .env file or environment")
logger.error("Get API key from: https://www.themoviedb.org/settings/api")
return 1
if not settings.anime_directory:
print("\n❌ Error: ANIME_DIRECTORY not configured")
logger.error("ANIME_DIRECTORY not configured")
return 1
print(f"\nAnime Directory: {settings.anime_directory}")
print(f"Auto-create NFO: {settings.nfo_auto_create}")
print(f"Update on scan: {settings.nfo_update_on_scan}")
print(f"Download poster: {settings.nfo_download_poster}")
print(f"Download logo: {settings.nfo_download_logo}")
print(f"Download fanart: {settings.nfo_download_fanart}")
logger.info("Anime Directory: %s", settings.anime_directory)
logger.info("Auto-create NFO: %s", settings.nfo_auto_create)
logger.info("Update on scan: %s", settings.nfo_update_on_scan)
logger.info("Download poster: %s", settings.nfo_download_poster)
logger.info("Download logo: %s", settings.nfo_download_logo)
logger.info("Download fanart: %s", settings.nfo_download_fanart)
if not settings.nfo_auto_create:
print("\n⚠️ Warning: NFO_AUTO_CREATE is set to False")
print(" Enable it in .env to auto-create NFO files")
print("\n Continuing anyway to demonstrate functionality...")
logger.warning("NFO_AUTO_CREATE is set to False")
logger.warning("Enable it in .env to auto-create NFO files")
logger.info("Continuing anyway to demonstrate functionality...")
# Override for demonstration
settings.nfo_auto_create = True
print("\nInitializing series manager...")
logger.info("Initializing series manager...")
manager = SeriesManagerService.from_settings()
# Get series list first
serie_list = manager.get_serie_list()
all_series = serie_list.get_all()
print(f"Found {len(all_series)} series in directory")
logger.info("Found %d series in directory", len(all_series))
if not all_series:
print("\n⚠️ No series found. Add some anime series first.")
logger.warning("No series found. Add some anime series first.")
return 0
# Show series without NFO
@@ -65,25 +68,25 @@ async def scan_and_create_nfo():
series_without_nfo.append(serie)
if series_without_nfo:
print(f"\nSeries without NFO: {len(series_without_nfo)}")
logger.info("Series without NFO: %d", len(series_without_nfo))
for serie in series_without_nfo[:5]: # Show first 5
print(f" - {serie.name} ({serie.folder})")
logger.debug("Missing NFO: %s (%s)", serie.name, serie.folder)
if len(series_without_nfo) > 5:
print(f" ... and {len(series_without_nfo) - 5} more")
logger.info("... and %d more", len(series_without_nfo) - 5)
else:
print("\nAll series already have NFO files!")
logger.info("All series already have NFO files")
if not settings.nfo_update_on_scan:
print("\nNothing to do. Enable NFO_UPDATE_ON_SCAN to update existing NFOs.")
logger.info("Nothing to do. Enable NFO_UPDATE_ON_SCAN to update existing NFOs.")
return 0
print("\nProcessing NFO files...")
print("(This may take a while depending on the number of series)")
logger.info("Processing NFO files...")
logger.info("This may take a while depending on the number of series")
try:
await manager.scan_and_process_nfo()
print("\nNFO processing complete!")
logger.info("NFO processing complete")
# Show updated stats
serie_list.load_series() # Reload to get updated stats
all_series = serie_list.get_all()
@@ -91,17 +94,17 @@ async def scan_and_create_nfo():
series_with_poster = [s for s in all_series if s.has_poster()]
series_with_logo = [s for s in all_series if s.has_logo()]
series_with_fanart = [s for s in all_series if s.has_fanart()]
print("\nFinal Statistics:")
print(f" Series with NFO: {len(series_with_nfo)}/{len(all_series)}")
print(f" Series with poster: {len(series_with_poster)}/{len(all_series)}")
print(f" Series with logo: {len(series_with_logo)}/{len(all_series)}")
print(f" Series with fanart: {len(series_with_fanart)}/{len(all_series)}")
except Exception as e:
print(f"\n❌ Error: {e}")
import traceback
traceback.print_exc()
logger.info("Final statistics", extra={
"total_series": len(all_series),
"with_nfo": len(series_with_nfo),
"with_poster": len(series_with_poster),
"with_logo": len(series_with_logo),
"with_fanart": len(series_with_fanart),
})
except Exception:
logger.exception("Failed to process NFO files")
return 1
finally:
await manager.close()
@@ -111,78 +114,92 @@ async def scan_and_create_nfo():
async def check_nfo_status():
"""Check NFO status for all series."""
print("=" * 70)
print("NFO Status Check")
print("=" * 70)
logger.info("%s", "=" * 70)
logger.info("NFO Status Check")
logger.info("%s", "=" * 70)
if not settings.anime_directory:
print("\n❌ Error: ANIME_DIRECTORY not configured")
logger.error("ANIME_DIRECTORY not configured")
return 1
print(f"\nAnime Directory: {settings.anime_directory}")
logger.info("Anime Directory: %s", settings.anime_directory)
# Create series list (no NFO service needed for status check)
from src.core.entities.SerieList import SerieList
serie_list = SerieList(settings.anime_directory)
all_series = serie_list.get_all()
if not all_series:
print("\n⚠️ No series found")
logger.warning("No series found")
return 0
print(f"\nTotal series: {len(all_series)}")
logger.info("Total series: %d", len(all_series))
# Categorize series
with_nfo = []
without_nfo = []
for serie in all_series:
if serie.has_nfo():
with_nfo.append(serie)
else:
without_nfo.append(serie)
print(f"\nWith NFO: {len(with_nfo)} ({len(with_nfo) * 100 // len(all_series)}%)")
print(f"Without NFO: {len(without_nfo)} ({len(without_nfo) * 100 // len(all_series)}%)")
logger.info(
"Series NFO coverage",
extra={
"with_nfo": len(with_nfo),
"without_nfo": len(without_nfo),
"total": len(all_series),
},
)
if without_nfo:
print("\nSeries missing NFO:")
logger.info("Series missing NFO: %d", len(without_nfo))
for serie in without_nfo[:10]:
print(f"{serie.name} ({serie.folder})")
logger.debug("Missing NFO: %s (%s)", serie.name, serie.folder)
if len(without_nfo) > 10:
print(f" ... and {len(without_nfo) - 10} more")
logger.info("... and %d more", len(without_nfo) - 10)
# Media file statistics
with_poster = sum(1 for s in all_series if s.has_poster())
with_logo = sum(1 for s in all_series if s.has_logo())
with_fanart = sum(1 for s in all_series if s.has_fanart())
print("\nMedia Files:")
print(f" Posters: {with_poster}/{len(all_series)} ({with_poster * 100 // len(all_series)}%)")
print(f" Logos: {with_logo}/{len(all_series)} ({with_logo * 100 // len(all_series)}%)")
print(f" Fanart: {with_fanart}/{len(all_series)} ({with_fanart * 100 // len(all_series)}%)")
logger.info(
"Media file coverage",
extra={
"posters": with_poster,
"logos": with_logo,
"fanart": with_fanart,
"total": len(all_series),
},
)
return 0
async def update_nfo_files():
"""Update existing NFO files with fresh data from TMDB."""
print("=" * 70)
print("NFO Update Tool")
print("=" * 70)
logger.info("%s", "=" * 70)
logger.info("NFO Update Tool")
logger.info("%s", "=" * 70)
if not settings.tmdb_api_key:
print("\n❌ Error: TMDB_API_KEY not configured")
print(" Set TMDB_API_KEY in .env file or environment")
print(" Get API key from: https://www.themoviedb.org/settings/api")
logger.error("TMDB_API_KEY not configured")
logger.error("Set TMDB_API_KEY in .env file or environment")
logger.error("Get API key from: https://www.themoviedb.org/settings/api")
return 1
if not settings.anime_directory:
print("\n❌ Error: ANIME_DIRECTORY not configured")
logger.error("ANIME_DIRECTORY not configured")
return 1
print(f"\nAnime Directory: {settings.anime_directory}")
print(f"Download media: {settings.nfo_download_poster or settings.nfo_download_logo or settings.nfo_download_fanart}")
logger.info("Anime Directory: %s", settings.anime_directory)
logger.info(
"Download media: %s",
settings.nfo_download_poster or settings.nfo_download_logo or settings.nfo_download_fanart,
)
# Get series with NFO
from src.core.entities.SerieList import SerieList
@@ -191,57 +208,55 @@ async def update_nfo_files():
series_with_nfo = [s for s in all_series if s.has_nfo()]
if not series_with_nfo:
print("\n⚠️ No series with NFO files found")
print(" Run 'scan' command first to create NFO files")
logger.warning("No series with NFO files found")
logger.info("Run 'scan' command first to create NFO files")
return 0
print(f"\nFound {len(series_with_nfo)} series with NFO files")
print("Updating NFO files with fresh data from TMDB...")
print("(This may take a while)")
logger.info("Found %d series with NFO files", len(series_with_nfo))
logger.info("Updating NFO files with fresh data from TMDB...")
logger.info("This may take a while")
# Initialize NFO service using factory
from src.core.services.nfo_factory import create_nfo_service
try:
nfo_service = create_nfo_service()
except ValueError as e:
print(f"\nError: {e}")
logger.error("Error creating NFO service: %s", e)
return 1
success_count = 0
error_count = 0
try:
for i, serie in enumerate(series_with_nfo, 1):
print(f"\n[{i}/{len(series_with_nfo)}] Updating: {serie.name}")
logger.info("[%d/%d] Updating: %s", i, len(series_with_nfo), serie.name)
try:
await nfo_service.update_tvshow_nfo(
serie_folder=serie.folder,
download_media=(
settings.nfo_download_poster or
settings.nfo_download_logo or
settings.nfo_download_poster or
settings.nfo_download_logo or
settings.nfo_download_fanart
)
),
)
print(f"Updated successfully")
logger.info("Updated successfully: %s", serie.name)
success_count += 1
# Small delay to respect API rate limits
await asyncio.sleep(0.5)
except Exception as e:
print(f" ❌ Error: {e}")
logger.exception("Failed to update NFO for %s", serie.name)
error_count += 1
print("\n" + "=" * 70)
print(f"Update complete!")
print(f" Success: {success_count}")
print(f" Errors: {error_count}")
except Exception as e:
print(f"\nFatal error: {e}")
import traceback
traceback.print_exc()
logger.info("%s", "=" * 70)
logger.info("Update complete")
logger.info("Success: %d", success_count)
logger.info("Errors: %d", error_count)
except Exception:
logger.exception("Fatal error during NFO update")
return 1
finally:
await nfo_service.close()
@@ -251,20 +266,22 @@ async def update_nfo_files():
def main():
"""Main CLI entry point."""
logging.basicConfig(level=logging.INFO, format="%(message)s")
if len(sys.argv) < 2:
print("NFO Management Tool")
print("\nUsage:")
print(" python -m src.cli.nfo_cli scan # Scan and create missing NFO files")
print(" python -m src.cli.nfo_cli status # Check NFO status for all series")
print(" python -m src.cli.nfo_cli update # Update existing NFO files with fresh data")
print("\nConfiguration:")
print(" Set TMDB_API_KEY in .env file")
print(" Set NFO_AUTO_CREATE=true to enable auto-creation")
print(" Set NFO_UPDATE_ON_SCAN=true to update existing NFOs during scan")
logger.info("NFO Management Tool")
logger.info("\nUsage:")
logger.info(" python -m src.cli.nfo_cli scan # Scan and create missing NFO files")
logger.info(" python -m src.cli.nfo_cli status # Check NFO status for all series")
logger.info(" python -m src.cli.nfo_cli update # Update existing NFO files with fresh data")
logger.info("\nConfiguration:")
logger.info(" Set TMDB_API_KEY in .env file")
logger.info(" Set NFO_AUTO_CREATE=true to enable auto-creation")
logger.info(" Set NFO_UPDATE_ON_SCAN=true to update existing NFOs during scan")
return 1
command = sys.argv[1].lower()
if command == "scan":
return asyncio.run(scan_and_create_nfo())
elif command == "status":
@@ -272,8 +289,8 @@ def main():
elif command == "update":
return asyncio.run(update_nfo_files())
else:
print(f"Unknown command: {command}")
print("Use 'scan', 'status', or 'update'")
logger.error("Unknown command: %s", command)
logger.info("Use 'scan', 'status', or 'update'")
return 1

View File

@@ -1,3 +1,4 @@
import re
import secrets
from typing import Optional
@@ -114,6 +115,40 @@ class Settings(BaseSettings):
validation_alias="NFO_PREFER_FSK_RATING",
description="Prefer German FSK rating over MPAA rating in NFO files"
)
nfo_folder_ignore_patterns: str = Field(
default="The Last of Us|Loki|Chernobyl|Star Trek Discovery|Marvel|Matrix|Fast & Furious|Jurassic|James Bond|Mission: Impossible|Bourne|Hunger Games|Die Hard|John Wick|Pacific Rim|Guardians of the Galaxy|Avengers|Batman|Superman|Wonder Woman|Spider-Man|X-Men|Fantastic Four|Terminator|Predator|Rambo|Rocky|Expendables|Tomb Raider|Jumanji|Jurassic Park|Pirates of the Caribbean|Harry Potter|Lord of the Rings|Hobbit|Game of Thrones|Westworld|Stranger Things|Breaking Bad|Better Call Saul|Sherlock|Downton Abbey|The Crown|Bridgerton|Sex Education|Normal People|Emily in Paris|The Witcher|Servant|Lucifer|Dark|Shadow and Bone|Grimm|Fairytale",
validation_alias="NFO_FOLDER_IGNORE_PATTERNS",
description="Regex patterns for folder names to skip during scan (pipe-separated)"
)
@property
def folder_ignore_patterns(self) -> list[str]:
"""Parse ignore patterns from comma-separated string into list.
Returns:
List of regex patterns to skip during folder scanning.
"""
if not self.nfo_folder_ignore_patterns:
return []
return [
pattern.strip()
for pattern in self.nfo_folder_ignore_patterns.split("|")
if pattern.strip()
]
def should_ignore_folder(self, folder_name: str) -> bool:
"""Check if folder should be ignored based on ignore patterns.
Args:
folder_name: Name of folder to check.
Returns:
True if folder matches any ignore pattern, False otherwise.
"""
for pattern in self.folder_ignore_patterns:
if re.search(pattern, folder_name, re.IGNORECASE):
return True
return False
@property
def allowed_origins(self) -> list[str]:
@@ -134,5 +169,23 @@ class Settings(BaseSettings):
]
return [origin.strip() for origin in raw.split(",") if origin.strip()]
@property
def scan_key_overrides(self) -> dict[str, str]:
"""Return scan key overrides from config.json.
Maps folder names to provider keys for cases where auto-generated
keys from folder names are incorrect.
Returns:
Dict mapping folder names to provider keys.
"""
from src.server.services.config_service import ConfigService
try:
config_service = ConfigService()
config = config_service.load_config()
return config.scan_key_overrides or {}
except Exception:
return {}
settings = Settings()

File diff suppressed because it is too large Load Diff

View File

@@ -14,7 +14,7 @@ import asyncio
import logging
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Dict, List, Optional
from typing import Any, Callable, Dict, List, Optional
from events import Events
@@ -143,12 +143,16 @@ class SeriesApp:
def __init__(
self,
directory_to_search: str,
db_lookup: Optional[Callable[[str], Optional["Serie"]]] = None,
):
"""
Initialize SeriesApp.
Args:
directory_to_search: Base directory for anime series
db_lookup: Optional callable ``(folder_name) -> Serie | None``
passed through to ``SerieScanner`` as a fallback key source
when no local ``key`` or ``data`` file exists.
"""
self.directory_to_search = directory_to_search
@@ -162,7 +166,10 @@ class SeriesApp:
self.loaders = Loaders()
self.loader = self.loaders.GetLoader(key="aniworld.to")
self.serie_scanner = SerieScanner(
directory_to_search, self.loader
directory_to_search,
self.loader,
db_lookup=db_lookup,
scan_key_overrides=settings.scan_key_overrides,
)
# Skip automatic loading from data files - series will be loaded
# from database by the service layer during application setup
@@ -171,23 +178,26 @@ class SeriesApp:
# Initialize empty list - series loaded later via load_series_from_list()
# No need to call _init_list_sync() anymore
# Initialize NFO service if TMDB API key is configured
# Initialize NFO service if a TMDB API key is configured
self.nfo_service: Optional[NFOService] = None
if settings.tmdb_api_key:
try:
from src.core.services.nfo_factory import get_nfo_factory
factory = get_nfo_factory()
self.nfo_service = factory.create()
logger.info("NFO service initialized successfully")
except (ValueError, Exception) as e: # pylint: disable=broad-except
logger.warning(
"Failed to initialize NFO service: %s", str(e)
)
self.nfo_service = None
try:
from src.core.services.nfo_factory import get_nfo_factory
factory = get_nfo_factory()
self.nfo_service = factory.create()
logger.info("NFO service initialized successfully")
except ValueError:
logger.info(
"NFO service not available — TMDB API key not configured"
)
self.nfo_service = None
except Exception as e: # pylint: disable=broad-except
logger.warning("Failed to initialize NFO service: %s", str(e))
self.nfo_service = None
logger.info(
"SeriesApp initialized for directory: %s",
directory_to_search
directory_to_search,
)
@property
@@ -438,9 +448,12 @@ class SeriesApp:
try:
def download_progress_handler(progress_info):
"""Handle download progress events from loader."""
logger.debug(
"download_progress_handler called with: %s", progress_info
)
# Throttle progress logging to avoid spam
status = progress_info.get("status", "")
if status in ("downloading", "finished"):
logger.debug(
"download_progress_handler called with: %s", progress_info
)
downloaded = progress_info.get('downloaded_bytes', 0)
total_bytes = (

View File

@@ -1,320 +1,531 @@
"""Utilities for loading and managing stored anime series metadata.
This module provides the SerieList class for managing collections of anime
series metadata. It uses file-based storage only.
Note:
This module is part of the core domain layer and has no database
dependencies. All database operations are handled by the service layer.
"""
from __future__ import annotations
import logging
import os
import warnings
from json import JSONDecodeError
from typing import Dict, Iterable, List, Optional
from src.core.entities.series import Serie
logger = logging.getLogger(__name__)
class SerieList:
"""
Represents the collection of cached series stored on disk.
Series are identified by their unique 'key' (provider identifier).
The 'folder' is metadata only and not used for lookups.
This class manages in-memory series data loaded from filesystem.
It has no database dependencies - all persistence is handled by
the service layer.
Example:
# File-based mode
serie_list = SerieList("/path/to/anime")
series = serie_list.get_all()
Attributes:
directory: Path to the anime directory
keyDict: Internal dictionary mapping serie.key to Serie objects
"""
def __init__(
self,
base_path: str,
skip_load: bool = False
) -> None:
"""Initialize the SerieList.
Args:
base_path: Path to the anime directory
skip_load: If True, skip automatic loading of series from files.
Useful when planning to load from database instead.
"""
self.directory: str = base_path
# Internal storage using serie.key as the dictionary key
self.keyDict: Dict[str, Serie] = {}
# Only auto-load from files if not skipping
if not skip_load:
self.load_series()
def add(self, serie: Serie, use_sanitized_folder: bool = True) -> str:
"""
Persist a new series if it is not already present (file-based mode).
Uses serie.key for identification. Creates the filesystem folder
using either the sanitized display name (default) or the existing
folder property.
Args:
serie: The Serie instance to add
use_sanitized_folder: If True (default), use serie.sanitized_folder
for the filesystem folder name based on display name.
If False, use serie.folder as-is for backward compatibility.
Returns:
str: The folder path that was created/used
Note:
This method creates data files on disk. For database storage,
use add_to_db() instead.
"""
if self.contains(serie.key):
# Return existing folder path
existing = self.keyDict[serie.key]
return os.path.join(self.directory, existing.folder)
# Determine folder name to use
if use_sanitized_folder:
folder_name = serie.sanitized_folder
# Update the serie's folder property to match what we create
serie.folder = folder_name
else:
folder_name = serie.folder
data_path = os.path.join(self.directory, folder_name, "data")
anime_path = os.path.join(self.directory, folder_name)
os.makedirs(anime_path, exist_ok=True)
if not os.path.isfile(data_path):
serie.save_to_file(data_path)
# Store by key, not folder
self.keyDict[serie.key] = serie
return anime_path
def contains(self, key: str) -> bool:
"""
Return True when a series identified by ``key`` already exists.
Args:
key: The unique provider identifier for the series
Returns:
True if the series exists in the collection
"""
return key in self.keyDict
def load_series(self) -> None:
"""Populate the in-memory map with metadata discovered on disk."""
logging.info("Scanning anime folders in %s", self.directory)
try:
entries: Iterable[str] = os.listdir(self.directory)
except OSError as error:
logging.error(
"Unable to scan directory %s: %s",
self.directory,
error,
)
return
nfo_stats = {"total": 0, "with_nfo": 0, "without_nfo": 0}
media_stats = {
"with_poster": 0,
"without_poster": 0,
"with_logo": 0,
"without_logo": 0,
"with_fanart": 0,
"without_fanart": 0
}
for anime_folder in entries:
anime_path = os.path.join(self.directory, anime_folder, "data")
if os.path.isfile(anime_path):
logging.debug("Found data file for folder %s", anime_folder)
serie = self._load_data(anime_folder, anime_path)
if serie:
nfo_stats["total"] += 1
# Check for NFO file
nfo_file_path = os.path.join(
self.directory, anime_folder, "tvshow.nfo"
)
if os.path.isfile(nfo_file_path):
serie.nfo_path = nfo_file_path
nfo_stats["with_nfo"] += 1
else:
nfo_stats["without_nfo"] += 1
logging.debug(
"Series '%s' (key: %s) is missing tvshow.nfo",
serie.name,
serie.key
)
# Check for media files
folder_path = os.path.join(self.directory, anime_folder)
poster_path = os.path.join(folder_path, "poster.jpg")
if os.path.isfile(poster_path):
media_stats["with_poster"] += 1
else:
media_stats["without_poster"] += 1
logging.debug(
"Series '%s' (key: %s) is missing poster.jpg",
serie.name,
serie.key
)
logo_path = os.path.join(folder_path, "logo.png")
if os.path.isfile(logo_path):
media_stats["with_logo"] += 1
else:
media_stats["without_logo"] += 1
logging.debug(
"Series '%s' (key: %s) is missing logo.png",
serie.name,
serie.key
)
fanart_path = os.path.join(folder_path, "fanart.jpg")
if os.path.isfile(fanart_path):
media_stats["with_fanart"] += 1
else:
media_stats["without_fanart"] += 1
logging.debug(
"Series '%s' (key: %s) is missing fanart.jpg",
serie.name,
serie.key
)
continue
logging.warning(
"Skipping folder %s because no metadata file was found",
anime_folder,
)
# Log summary statistics
if nfo_stats["total"] > 0:
logging.info(
"NFO scan complete: %d series total, %d with NFO, %d without NFO",
nfo_stats["total"],
nfo_stats["with_nfo"],
nfo_stats["without_nfo"]
)
logging.info(
"Media scan complete: Poster (%d/%d), Logo (%d/%d), Fanart (%d/%d)",
media_stats["with_poster"],
nfo_stats["total"],
media_stats["with_logo"],
nfo_stats["total"],
media_stats["with_fanart"],
nfo_stats["total"]
)
def _load_data(self, anime_folder: str, data_path: str) -> Optional[Serie]:
"""
Load a single series metadata file into the in-memory collection.
Args:
anime_folder: The folder name (for logging only)
data_path: Path to the metadata file
Returns:
Serie: The loaded Serie object, or None if loading failed
"""
try:
serie = Serie.load_from_file(data_path)
# Store by key, not folder
self.keyDict[serie.key] = serie
logging.debug(
"Successfully loaded metadata for %s (key: %s)",
anime_folder,
serie.key
)
return serie
except (OSError, JSONDecodeError, KeyError, ValueError) as error:
logging.error(
"Failed to load metadata for folder %s from %s: %s",
anime_folder,
data_path,
error,
)
return None
def GetMissingEpisode(self) -> List[Serie]:
"""Return all series that still contain missing episodes."""
return [
serie
for serie in self.keyDict.values()
if serie.episodeDict
]
def get_missing_episodes(self) -> List[Serie]:
"""PEP8-friendly alias for :meth:`GetMissingEpisode`."""
return self.GetMissingEpisode()
def GetList(self) -> List[Serie]:
"""Return all series instances stored in the list."""
return list(self.keyDict.values())
def get_all(self) -> List[Serie]:
"""PEP8-friendly alias for :meth:`GetList`."""
return self.GetList()
def get_by_key(self, key: str) -> Optional[Serie]:
"""
Get a series by its unique provider key.
This is the primary method for series lookup.
Args:
key: The unique provider identifier (e.g., "attack-on-titan")
Returns:
The Serie instance if found, None otherwise
"""
return self.keyDict.get(key)
def get_by_folder(self, folder: str) -> Optional[Serie]:
"""
Get a series by its folder name.
.. deprecated:: 2.0.0
Use :meth:`get_by_key` instead. Folder-based lookups will be
removed in version 3.0.0. The `folder` field is metadata only
and should not be used for identification.
This method is provided for backward compatibility only.
Prefer using get_by_key() for new code.
Args:
folder: The filesystem folder name (e.g., "Attack on Titan (2013)")
Returns:
The Serie instance if found, None otherwise
"""
warnings.warn(
"get_by_folder() is deprecated and will be removed in v3.0.0. "
"Use get_by_key() instead. The 'folder' field is metadata only.",
DeprecationWarning,
stacklevel=2
)
for serie in self.keyDict.values():
if serie.folder == folder:
return serie
return None
"""Utilities for loading and managing stored anime series metadata.
This module provides the SerieList class for managing collections of anime
series metadata. It supports loading from both filesystem (legacy) and
database (primary).
Note:
This module is part of the core domain layer. Database operations
are handled by the service layer via add_to_db().
"""
from __future__ import annotations
import logging
import os
import warnings
from json import JSONDecodeError
from typing import Dict, Iterable, List, Optional
from src.config.settings import settings
from src.core.entities.series import Serie
logger = logging.getLogger(__name__)
class SerieList:
"""
Represents the collection of cached series stored on disk.
Series are identified by their unique 'key' (provider identifier).
The 'folder' is metadata only and not used for lookups.
This class manages in-memory series data loaded from filesystem.
It has no database dependencies - all persistence is handled by
the service layer.
Example:
# File-based mode
serie_list = SerieList("/path/to/anime")
series = serie_list.get_all()
Attributes:
directory: Path to the anime directory
keyDict: Internal dictionary mapping serie.key to Serie objects
"""
def __init__(
self,
base_path: str,
skip_load: bool = False
) -> None:
"""Initialize the SerieList.
Args:
base_path: Path to the anime directory
skip_load: If True, skip automatic loading of series from files.
Useful when planning to load from database instead.
"""
self.directory: str = base_path
# Internal storage using serie.key as the dictionary key
self.keyDict: Dict[str, Serie] = {}
# Only auto-load from files if not skipping
if not skip_load:
self.load_series()
def add(self, serie: Serie, use_sanitized_folder: bool = True) -> str:
"""
Persist a new series if it is not already present (file-based mode).
Uses serie.key for identification. Creates the filesystem folder
using either the sanitized display name (default) or the existing
folder property.
Args:
serie: The Serie instance to add
use_sanitized_folder: If True (default), use serie.sanitized_folder
for the filesystem folder name based on display name.
If False, use serie.folder as-is for backward compatibility.
Returns:
str: The folder path that was created/used
Note:
This method creates data files on disk. For database storage,
use add_to_db() instead.
"""
if self.contains(serie.key):
# Return existing folder path
existing = self.keyDict[serie.key]
return os.path.join(self.directory, existing.folder)
# Determine folder name to use
if use_sanitized_folder:
folder_name = serie.sanitized_folder
# Update the serie's folder property to match what we create
serie.folder = folder_name
else:
folder_name = serie.folder
data_path = os.path.join(self.directory, folder_name, "data")
anime_path = os.path.join(self.directory, folder_name)
os.makedirs(anime_path, exist_ok=True)
if not os.path.isfile(data_path):
serie.save_to_file(data_path)
# Store by key, not folder
self.keyDict[serie.key] = serie
return anime_path
async def add_to_db(self, serie: Serie) -> bool:
"""Persist a new series to the database.
Creates the filesystem folder using serie.folder, then persists
the series metadata to the database.
Args:
serie: The Serie instance to add
Returns:
True if successful, False otherwise
"""
try:
from src.server.database.connection import get_async_session_factory
from src.server.database.service import AnimeSeriesService, EpisodeService
folder_name = serie.folder
anime_path = os.path.join(self.directory, folder_name)
os.makedirs(anime_path, exist_ok=True)
session_factory = get_async_session_factory()
db = session_factory()
try:
existing = await AnimeSeriesService.get_by_key(db, serie.key)
if existing:
logger.debug(
"Series '%s' (key=%s) already exists in DB, skipping",
serie.name, serie.key
)
return True
anime_series = await AnimeSeriesService.create(
db=db,
key=serie.key,
name=serie.name,
site=serie.site,
folder=folder_name,
year=serie.year
)
for season, eps in serie.episodeDict.items():
for ep in eps:
await EpisodeService.create(
db=db,
series_id=anime_series.id,
season=season,
episode_number=ep
)
await db.commit()
self.keyDict[serie.key] = serie
logger.info(
"Persisted series '%s' to database",
serie.name
)
return True
except Exception as e:
await db.rollback()
logger.error(
"Failed to persist series '%s' to DB: %s",
serie.key, e, exc_info=True
)
return False
finally:
await db.close()
except Exception as e:
logger.error(
"Could not add series '%s' to DB (DB unavailable?): %s",
serie.key, e
)
return False
def contains(self, key: str) -> bool:
"""
Return True when a series identified by ``key`` already exists.
Args:
key: The unique provider identifier for the series
Returns:
True if the series exists in the collection
"""
return key in self.keyDict
def load_series(self) -> None:
"""Populate the in-memory map with metadata discovered on disk."""
logger.info("Scanning anime folders in %s", self.directory)
try:
entries: Iterable[str] = os.listdir(self.directory)
except OSError as error:
logger.error(
"Unable to scan directory %s: %s",
self.directory,
error,
)
return
nfo_stats = {"total": 0, "with_nfo": 0, "without_nfo": 0}
media_stats = {
"with_poster": 0,
"without_poster": 0,
"with_logo": 0,
"without_logo": 0,
"with_fanart": 0,
"without_fanart": 0
}
for anime_folder in entries:
if settings.should_ignore_folder(anime_folder):
logger.debug("Skipping ignored folder: %s", anime_folder)
continue
anime_path = os.path.join(self.directory, anime_folder, "data")
if os.path.isfile(anime_path):
logger.debug("Found data file for folder %s", anime_folder)
serie = self._load_data(anime_folder, anime_path)
if serie:
nfo_stats["total"] += 1
# Check for NFO file
nfo_file_path = os.path.join(
self.directory, anime_folder, "tvshow.nfo"
)
if os.path.isfile(nfo_file_path):
serie.nfo_path = nfo_file_path
nfo_stats["with_nfo"] += 1
else:
nfo_stats["without_nfo"] += 1
logger.debug(
"Series '%s' (key: %s) is missing tvshow.nfo",
serie.name,
serie.key
)
# Check for media files
folder_path = os.path.join(self.directory, anime_folder)
poster_path = os.path.join(folder_path, "poster.jpg")
if os.path.isfile(poster_path):
media_stats["with_poster"] += 1
else:
media_stats["without_poster"] += 1
logger.debug(
"Series '%s' (key: %s) is missing poster.jpg",
serie.name,
serie.key
)
logo_path = os.path.join(folder_path, "logo.png")
if os.path.isfile(logo_path):
media_stats["with_logo"] += 1
else:
media_stats["without_logo"] += 1
logger.debug(
"Series '%s' (key: %s) is missing logo.png",
serie.name,
serie.key
)
fanart_path = os.path.join(folder_path, "fanart.jpg")
if os.path.isfile(fanart_path):
media_stats["with_fanart"] += 1
else:
media_stats["without_fanart"] += 1
logger.debug(
"Series '%s' (key: %s) is missing fanart.jpg",
serie.name,
serie.key
)
continue
logger.warning(
"Skipping folder %s because no metadata file was found",
anime_folder,
)
# Log summary statistics
if nfo_stats["total"] > 0:
logger.info(
"NFO scan complete: %d series total, %d with NFO, %d without NFO",
nfo_stats["total"],
nfo_stats["with_nfo"],
nfo_stats["without_nfo"]
)
logger.info(
"Media scan complete: Poster (%d/%d), Logo (%d/%d), Fanart (%d/%d)",
media_stats["with_poster"],
nfo_stats["total"],
media_stats["with_logo"],
nfo_stats["total"],
media_stats["with_fanart"],
nfo_stats["total"]
)
def _load_data(self, anime_folder: str, data_path: str) -> Optional[Serie]:
"""
Load a single series metadata file into the in-memory collection.
Args:
anime_folder: The folder name (for logging only)
data_path: Path to the metadata file
Returns:
Serie: The loaded Serie object, or None if loading failed
"""
try:
serie = Serie.load_from_file(data_path)
# Store by key, not folder
self.keyDict[serie.key] = serie
logger.debug(
"Successfully loaded metadata for %s (key: %s)",
anime_folder,
serie.key
)
return serie
except (OSError, JSONDecodeError, KeyError, ValueError) as error:
logger.error(
"Failed to load metadata for folder %s from %s: %s",
anime_folder,
data_path,
error,
)
return None
def GetMissingEpisode(self) -> List[Serie]:
"""Return all series that still contain missing episodes."""
return [
serie
for serie in self.keyDict.values()
if serie.episodeDict
]
def get_missing_episodes(self) -> List[Serie]:
"""PEP8-friendly alias for :meth:`GetMissingEpisode`."""
return self.GetMissingEpisode()
def GetList(self) -> List[Serie]:
"""Return all series instances stored in the list."""
return list(self.keyDict.values())
def get_all(self) -> List[Serie]:
"""PEP8-friendly alias for :meth:`GetList`."""
return self.GetList()
def get_by_key(self, key: str) -> Optional[Serie]:
"""
Get a series by its unique provider key.
This is the primary method for series lookup.
Args:
key: The unique provider identifier (e.g., "attack-on-titan")
Returns:
The Serie instance if found, None otherwise
"""
return self.keyDict.get(key)
def get_by_folder(self, folder: str) -> Optional[Serie]:
"""
Get a series by its folder name.
.. deprecated:: 2.0.0
Use :meth:`get_by_key` instead. Folder-based lookups will be
removed in version 3.0.0. The `folder` field is metadata only
and should not be used for identification.
This method is provided for backward compatibility only.
Prefer using get_by_key() for new code.
Args:
folder: The filesystem folder name (e.g., "Attack on Titan (2013)")
Returns:
The Serie instance if found, None otherwise
"""
warnings.warn(
"get_by_folder() is deprecated and will be removed in v3.0.0. "
"Use get_by_key() instead. The 'folder' field is metadata only.",
DeprecationWarning,
stacklevel=2
)
for serie in self.keyDict.values():
if serie.folder == folder:
return serie
return None
async def load_all_from_db(self) -> int:
"""Load all series from database into in-memory cache.
Retrieves all anime series from the database with their episodes
and populates the in-memory keyDict for fast access.
This method replaces file-based loading. Use after initialization
when database is ready.
Returns:
int: Number of series loaded into cache
"""
from src.server.database.connection import get_async_session_factory
from src.server.database.service import AnimeSeriesService
try:
session_factory = get_async_session_factory()
db = session_factory()
try:
anime_series_list = await AnimeSeriesService.get_all(
db, with_episodes=True
)
count = 0
for anime_series in anime_series_list:
episode_dict: Dict[int, List[int]] = {}
if anime_series.episodes:
for ep in anime_series.episodes:
if ep.season not in episode_dict:
episode_dict[ep.season] = []
episode_dict[ep.season].append(ep.episode_number)
serie = Serie(
key=anime_series.key,
name=anime_series.name,
site=anime_series.site,
folder=anime_series.folder,
episodeDict=episode_dict,
year=anime_series.year
)
self.keyDict[serie.key] = serie
count += 1
logger.info(
"Loaded %d series from database into in-memory cache",
count
)
return count
finally:
await db.close()
except RuntimeError:
logger.warning(
"Database not available, skipping DB load"
)
return 0
async def _load_single_series_from_db(
self,
anime_folder: str
) -> Optional[Serie]:
"""Load a single series from database by folder name.
Looks up a series in the database by its folder name and adds
it to the in-memory cache.
Args:
anime_folder: The filesystem folder name to look up
Returns:
Serie if found and loaded, None otherwise
"""
from src.server.database.connection import get_async_session_factory
from src.server.database.service import AnimeSeriesService
try:
session_factory = get_async_session_factory()
db = session_factory()
try:
anime_series = await AnimeSeriesService.get_by_folder(
db, anime_folder
)
if not anime_series:
logger.debug(
"Series with folder '%s' not found in DB",
anime_folder
)
return None
episode_dict: Dict[int, List[int]] = {}
if anime_series.episodes:
for ep in anime_series.episodes:
if ep.season not in episode_dict:
episode_dict[ep.season] = []
episode_dict[ep.season].append(ep.episode_number)
serie = Serie(
key=anime_series.key,
name=anime_series.name,
site=anime_series.site,
folder=anime_series.folder,
episodeDict=episode_dict,
year=anime_series.year
)
self.keyDict[serie.key] = serie
logger.debug(
"Loaded series '%s' (key=%s) from DB",
serie.name, serie.key
)
return serie
finally:
await db.close()
except RuntimeError:
logger.warning(
"Database not available, cannot load series '%s'",
anime_folder
)
return None
def invalidate_cache(self) -> None:
"""Clear the in-memory cache.
Use after database modifications to force reload from DB
on next access.
"""
self.keyDict.clear()
logger.debug("SerieList in-memory cache invalidated")
def reload(self) -> None:
"""Reload series from filesystem (legacy mode).
Warning:
This method uses file-based loading and should only be
used as fallback when database is not available.
"""
self.load_series()

View File

@@ -64,6 +64,16 @@ class Serie:
f"episodeDict={self.episodeDict}{year_str})"
)
def __repr__(self):
"""Concise developer representation of Serie object."""
season_count = len(self.episodeDict)
episode_count = sum(len(eps) for eps in self.episodeDict.values())
year_str = f", year={self.year}" if self.year else ""
return (
f"Serie(key={self.key!r}, name={self.name!r}"
f"{year_str}, seasons={season_count}, episodes={episode_count})"
)
@property
def key(self) -> str:
"""
@@ -261,7 +271,11 @@ class Serie:
'Dororo (2025)'
"""
if self._year:
return f"{self._name} ({self._year})"
import re
year_suffix = f" ({self._year})"
# Strip ALL trailing year suffixes before appending to prevent duplication
clean_name = re.sub(r'(\s*\(\d{4}\))+\s*$', '', self._name).strip()
return f"{clean_name}{year_suffix}"
return self._name
@property

View File

@@ -7,7 +7,7 @@ errors in provider operations with automatic retry mechanisms.
import functools
import logging
from typing import Any, Callable, TypeVar
from typing import Any, Callable, Optional, TypeVar
logger = logging.getLogger(__name__)
@@ -42,39 +42,85 @@ class DownloadError(Exception):
class RecoveryStrategies:
"""Strategies for handling errors and recovering from failures."""
@staticmethod
def handle_network_failure(
func: Callable, *args: Any, **kwargs: Any
) -> Any:
"""Handle network failures with basic retry logic."""
max_retries = 3
for attempt in range(max_retries):
try:
return func(*args, **kwargs)
except (NetworkError, ConnectionError):
if attempt == max_retries - 1:
raise
logger.warning(
f"Network error on attempt {attempt + 1}, retrying..."
)
continue
def __init__(
self,
max_retries: int = 3,
base_delay: float = 1.0,
max_delay: float = 60.0,
exponential_base: float = 2.0,
) -> None:
"""Initialize recovery strategies.
@staticmethod
def handle_download_failure(
Args:
max_retries: Maximum number of retry attempts.
base_delay: Initial delay between retries in seconds.
max_delay: Maximum delay between retries in seconds.
exponential_base: Base for exponential backoff multiplier.
"""
self.max_retries = max_retries
self.base_delay = base_delay
self.max_delay = max_delay
self.exponential_base = exponential_base
def _calculate_delay(self, attempt: int) -> float:
"""Calculate delay for given retry attempt using exponential backoff.
Args:
attempt: Zero-based retry attempt number.
Returns:
Delay in seconds before next retry.
"""
delay = self.base_delay * (self.exponential_base ** attempt)
return min(delay, self.max_delay)
def handle_network_failure(
self,
func: Callable, *args: Any, **kwargs: Any
) -> Any:
"""Handle download failures with retry logic."""
max_retries = 2
for attempt in range(max_retries):
"""Handle network failures with exponential backoff retry logic."""
last_error: Optional[Exception] = None
for attempt in range(self.max_retries):
try:
return func(*args, **kwargs)
except DownloadError:
if attempt == max_retries - 1:
raise
logger.warning(
f"Download error on attempt {attempt + 1}, retrying..."
)
continue
except (NetworkError, ConnectionError, TimeoutError) as exc:
last_error = exc
if attempt < self.max_retries - 1:
delay = self._calculate_delay(attempt)
logger.warning(
"Network error on attempt %d/%d, retrying in %.1fs: %s",
attempt + 1, self.max_retries, delay, exc
)
import time
time.sleep(delay)
continue
if last_error:
raise last_error
raise NetworkError("Network failure after retries")
def handle_download_failure(
self,
func: Callable, *args: Any, **kwargs: Any
) -> Any:
"""Handle download failures with exponential backoff retry logic."""
last_error: Optional[Exception] = None
for attempt in range(self.max_retries):
try:
return func(*args, **kwargs)
except DownloadError as exc:
last_error = exc
if attempt < self.max_retries - 1:
delay = self._calculate_delay(attempt)
logger.warning(
"Download error on attempt %d/%d, retrying in %.1fs: %s",
attempt + 1, self.max_retries, delay, exc
)
import time
time.sleep(delay)
continue
if last_error:
raise last_error
raise DownloadError("Download failed after retries")
class FileCorruptionDetector:
@@ -92,7 +138,7 @@ class FileCorruptionDetector:
# Video files should be at least 1MB
return file_size > 1024 * 1024
except Exception as e:
logger.error(f"Error checking file validity: {e}")
logger.error("Error checking file validity: %s", e)
return False
@@ -123,13 +169,18 @@ def with_error_recovery(
last_error = e
if attempt < max_retries - 1:
logger.warning(
f"Error in {context} (attempt {attempt + 1}/"
f"{max_retries}): {e}, retrying..."
"Error in %s (attempt %d/%d): %s, retrying...",
context,
attempt + 1,
max_retries,
e,
)
else:
logger.error(
f"Error in {context} failed after {max_retries} "
f"attempts: {e}"
"Error in %s failed after %d attempts: %s",
context,
max_retries,
e,
)
if last_error:

View File

@@ -12,6 +12,8 @@ from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional
logger = logging.getLogger(__name__)
class OperationType(str, Enum):
"""Types of operations that can report progress."""
@@ -313,7 +315,7 @@ class CallbackManager:
callback.on_progress(context)
except Exception as e:
# Log but don't let callback errors break the operation
logging.error(
logger.error(
"Error in progress callback %s: %s",
callback,
e,
@@ -332,7 +334,7 @@ class CallbackManager:
callback.on_error(context)
except Exception as e:
# Log but don't let callback errors break the operation
logging.error(
logger.error(
"Error in error callback %s: %s",
callback,
e,
@@ -351,7 +353,7 @@ class CallbackManager:
callback.on_completion(context)
except Exception as e:
# Log but don't let callback errors break the operation
logging.error(
logger.error(
"Error in completion callback %s: %s",
callback,
e,

File diff suppressed because it is too large Load Diff

View File

@@ -87,7 +87,7 @@ class ProviderConfigManager:
settings: Provider settings to apply.
"""
self._provider_settings[provider_name] = settings
logger.info(f"Updated settings for provider: {provider_name}")
logger.info("Updated settings for provider: %s", provider_name)
def update_provider_settings(
self, provider_name: str, **kwargs
@@ -106,7 +106,7 @@ class ProviderConfigManager:
self._provider_settings[provider_name] = ProviderSettings(
name=provider_name, **kwargs
)
logger.info(f"Created new settings for provider: {provider_name}") # noqa: E501
logger.info("Created new settings for provider: %s", provider_name) # noqa: E501
return True
settings = self._provider_settings[provider_name]
@@ -152,7 +152,7 @@ class ProviderConfigManager:
"""
if provider_name in self._provider_settings:
self._provider_settings[provider_name].enabled = True
logger.info(f"Enabled provider: {provider_name}")
logger.info("Enabled provider: %s", provider_name)
return True
return False
@@ -167,7 +167,7 @@ class ProviderConfigManager:
"""
if provider_name in self._provider_settings:
self._provider_settings[provider_name].enabled = False
logger.info(f"Disabled provider: {provider_name}")
logger.info("Disabled provider: %s", provider_name)
return True
return False
@@ -224,7 +224,7 @@ class ProviderConfigManager:
value: Setting value.
"""
self._global_settings[key] = value
logger.info(f"Updated global setting {key}: {value}")
logger.info("Updated global setting %s: %s", key, value)
def get_all_global_settings(self) -> Dict[str, Any]:
"""Get all global settings.
@@ -307,7 +307,7 @@ class ProviderConfigManager:
with open(config_path, "w", encoding="utf-8") as f:
json.dump(data, f, indent=2)
logger.info(f"Saved configuration to {config_path}")
logger.info("Saved configuration to %s", config_path)
return True
except Exception as e:

File diff suppressed because it is too large Load Diff

View File

@@ -207,7 +207,7 @@ class ProviderFailover:
"""
if provider_name not in self._providers:
self._providers.append(provider_name)
logger.info(f"Added provider to failover chain: {provider_name}")
logger.info("Added provider to failover chain: %s", provider_name)
def remove_provider(self, provider_name: str) -> bool:
"""Remove a provider from the failover chain.

View File

@@ -151,7 +151,7 @@ class ProviderHealthMonitor:
except asyncio.CancelledError:
break
except Exception as e:
logger.error(f"Error in health check loop: {e}", exc_info=True)
logger.exception("Error in health check loop: %s", e)
await asyncio.sleep(self._health_check_interval)
async def _perform_health_checks(self) -> None:
@@ -314,7 +314,7 @@ class ProviderHealthMonitor:
)
best_provider = available[0][0]
logger.debug(f"Best provider selected: {best_provider}")
logger.debug("Best provider selected: %s", best_provider)
return best_provider
def _get_recent_metrics(
@@ -355,7 +355,7 @@ class ProviderHealthMonitor:
provider_name=provider_name
)
self._request_history[provider_name].clear()
logger.info(f"Reset metrics for provider: {provider_name}")
logger.info("Reset metrics for provider: %s", provider_name)
return True
def get_health_summary(self) -> Dict[str, Any]:

View File

@@ -16,6 +16,7 @@ from typing import Dict, List
from lxml import etree
from src.core.services.nfo_service import NFOService
from src.core.services.tmdb_client import TMDBAPIError
logger = logging.getLogger(__name__)
@@ -120,6 +121,37 @@ def nfo_needs_repair(nfo_path: Path) -> bool:
return bool(find_missing_tags(nfo_path))
def _read_tmdb_id(nfo_path: Path) -> int | None:
"""Return the TMDB ID stored in an existing NFO, or ``None``.
Checks both ``<tmdbid>`` and ``<uniqueid type="tmdb">`` elements.
Args:
nfo_path: Absolute path to the ``tvshow.nfo`` file.
Returns:
Integer TMDB ID, or ``None`` if not found or not parseable.
"""
if not nfo_path.exists():
return None
try:
root = etree.parse(str(nfo_path)).getroot()
for uniqueid in root.findall(".//uniqueid"):
if uniqueid.get("type") == "tmdb" and uniqueid.text:
return int(uniqueid.text)
tmdbid_elem = root.find(".//tmdbid")
if tmdbid_elem is not None and tmdbid_elem.text:
return int(tmdbid_elem.text)
except (etree.XMLSyntaxError, ValueError):
pass
except Exception: # pylint: disable=broad-except
pass
return None
class NfoRepairService:
"""Service that detects and repairs incomplete tvshow.nfo files.
@@ -171,10 +203,26 @@ class NfoRepairService:
", ".join(missing),
)
await self._nfo_service.update_tvshow_nfo(
series_name,
download_media=False,
)
try:
await self._nfo_service.update_tvshow_nfo(
series_name,
download_media=False,
)
except TMDBAPIError as e:
if "No TMDB ID found" in str(e):
# No TMDB ID in existing NFO — create new one via search
logger.info(
"NFO has no TMDB ID, creating new NFO via TMDB search"
)
await self._nfo_service.create_tvshow_nfo(
serie_name=series_name,
serie_folder=series_name,
download_poster=False,
download_logo=False,
download_fanart=False,
)
else:
raise
logger.info("NFO repair completed: %s", series_name)
return True

View File

@@ -10,6 +10,7 @@ Example:
import logging
import re
import unicodedata
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
@@ -19,6 +20,7 @@ from src.core.services.tmdb_client import TMDBAPIError, TMDBClient
from src.core.utils.image_downloader import ImageDownloader
from src.core.utils.nfo_generator import generate_tvshow_nfo
from src.core.utils.nfo_mapper import tmdb_to_nfo_model
from src.core.entities.nfo_models import TVShowNFO
logger = logging.getLogger(__name__)
@@ -53,6 +55,18 @@ class NFOService:
self.image_size = image_size
self.auto_create = auto_create
async def __aenter__(self) -> "NFOService":
"""Enter async context manager."""
await self.tmdb_client.__aenter__()
await self.image_downloader.__aenter__()
return self
async def __aexit__(self, exc_type, exc_val, exc_tb):
"""Exit async context manager and cleanup resources."""
await self.tmdb_client.close()
await self.image_downloader.close()
return False
def has_nfo(self, serie_folder: str) -> bool:
"""Check if tvshow.nfo exists for a series.
@@ -83,11 +97,12 @@ class NFOService:
>>> _extract_year_from_name("Attack on Titan")
("Attack on Titan", None)
"""
# Match year in parentheses at the end: (YYYY)
# Match the last year in parentheses at the end: (YYYY)
match = re.search(r'\((\d{4})\)\s*$', serie_name)
if match:
year = int(match.group(1))
clean_name = serie_name[:match.start()].strip()
# Strip ALL trailing year suffixes to get a fully clean name
clean_name = re.sub(r'(\s*\(\d{4}\))+\s*$', '', serie_name).strip()
return clean_name, year
return serie_name, None
@@ -110,7 +125,8 @@ class NFOService:
year: Optional[int] = None,
download_poster: bool = True,
download_logo: bool = True,
download_fanart: bool = True
download_fanart: bool = True,
alt_titles: Optional[List[str]] = None
) -> Path:
"""Create tvshow.nfo by scraping TMDB.
@@ -122,6 +138,7 @@ class NFOService:
download_poster: Whether to download poster.jpg
download_logo: Whether to download logo.png
download_fanart: Whether to download fanart.jpg
alt_titles: Alternative titles (e.g., Japanese title) for fallback search
Returns:
Path to created NFO file
@@ -134,41 +151,102 @@ class NFOService:
clean_name, extracted_year = self._extract_year_from_name(serie_name)
if year is None and extracted_year is not None:
year = extracted_year
logger.info(f"Extracted year {year} from series name")
logger.info("Extracted year %s from series name", year)
# Use clean name for search
search_name = clean_name
logger.info(f"Creating NFO for {search_name} (year: {year})")
logger.info("Creating NFO for %s (year: %s)", search_name, year)
folder_path = self.anime_directory / serie_folder
if not folder_path.exists():
logger.info(f"Creating series folder: {folder_path}")
logger.info("Creating series folder: %s", folder_path)
folder_path.mkdir(parents=True, exist_ok=True)
async with self.tmdb_client:
# Search for TV show with clean name (without year)
logger.debug(f"Searching TMDB for: {search_name}")
search_results = await self.tmdb_client.search_tv_show(search_name)
if not search_results.get("results"):
raise TMDBAPIError(f"No results found for: {search_name}")
# Find best match (consider year if provided)
tv_show = self._find_best_match(search_results["results"], search_name, year)
tv_id = tv_show["id"]
logger.info(f"Found match: {tv_show['name']} (ID: {tv_id})")
# Get detailed information
details = await self.tmdb_client.get_tv_show_details(
tv_id,
append_to_response="credits,external_ids,images"
)
# Get content ratings for FSK
content_ratings = await self.tmdb_client.get_tv_show_content_ratings(tv_id)
# Check for existing NFO with TMDB ID to skip search
nfo_path = folder_path / "tvshow.nfo"
existing_ids = None
if nfo_path.exists():
try:
existing_ids = self.parse_nfo_ids(nfo_path)
if existing_ids.get("tmdb_id"):
logger.info(
"Found existing TMDB ID %s in NFO, using directly",
existing_ids["tmdb_id"]
)
except Exception as e:
logger.debug("Could not parse existing NFO IDs: %s", e)
try:
await self.tmdb_client._ensure_session()
# Use existing TMDB ID if found, otherwise search
if existing_ids and existing_ids.get("tmdb_id"):
tv_id = existing_ids["tmdb_id"]
logger.info("Fetching details directly for TMDB ID: %s", tv_id)
details = await self.tmdb_client.get_tv_show_details(
tv_id,
append_to_response="credits,external_ids,images"
)
content_ratings = await self.tmdb_client.get_tv_show_content_ratings(tv_id)
tv_show = {"id": tv_id, "name": details.get("name", serie_name)}
search_source = "nfo_override"
else:
# Search for TV show - try multiple strategies
tv_show, search_source = await self._search_with_fallback(
search_name, year, alt_titles
)
tv_id = tv_show["id"]
logger.info("Found match: %s (ID: %s)", tv_show['name'], tv_id)
# Get detailed information with multi-language image support
# Skip if we already fetched details via nfo_override
if search_source != "nfo_override":
details = await self.tmdb_client.get_tv_show_details(
tv_id,
append_to_response="credits,external_ids,images"
)
# Get content ratings for FSK
content_ratings = await self.tmdb_client.get_tv_show_content_ratings(tv_id)
# Enrich with fallback languages for empty overview/tagline
# Pass search result overview as last resort fallback
search_overview = tv_show.get("overview") or None
if not search_overview:
try:
logger.debug(
"No overview in German search result, trying en-US search fallback for: %s",
search_name,
)
en_search_results = await self.tmdb_client.search_tv_show(
search_name,
language="en-US",
)
if en_search_results.get("results"):
en_match = self._find_best_match(
en_search_results["results"], search_name, year
)
search_overview = en_match.get("overview") or None
if search_overview:
logger.info(
"Using en-US search overview fallback for %s",
search_name,
)
except (TMDBAPIError, Exception) as exc:
logger.warning(
"Failed en-US search fallback for overview: %s",
exc,
)
details = await self._enrich_details_with_fallback(
details, search_overview=search_overview
)
else:
# When using nfo_override, content_ratings already fetched
pass
# Convert TMDB data to TVShowNFO model
nfo_model = tmdb_to_nfo_model(
details,
@@ -176,15 +254,15 @@ class NFOService:
self.tmdb_client.get_image_url,
self.image_size,
)
# Generate XML
nfo_xml = generate_tvshow_nfo(nfo_model)
# Save NFO file
nfo_path = folder_path / "tvshow.nfo"
nfo_path.write_text(nfo_xml, encoding="utf-8")
logger.info(f"Created NFO: {nfo_path}")
logger.info("Created NFO: %s", nfo_path)
# Download media files
await self._download_media_files(
details,
@@ -193,8 +271,10 @@ class NFOService:
download_logo=download_logo,
download_fanart=download_fanart
)
return nfo_path
finally:
await self.tmdb_client.close()
async def update_tvshow_nfo(
self,
@@ -220,7 +300,7 @@ class NFOService:
if not nfo_path.exists():
raise FileNotFoundError(f"NFO file not found: {nfo_path}")
logger.info(f"Updating NFO for {serie_folder}")
logger.info("Updating NFO for %s", serie_folder)
# Parse existing NFO to extract TMDB ID
try:
@@ -246,24 +326,26 @@ class NFOService:
f"Delete the NFO and create a new one instead."
)
logger.debug(f"Found TMDB ID: {tmdb_id}")
logger.debug("Found TMDB ID: %s", tmdb_id)
except etree.XMLSyntaxError as e:
raise TMDBAPIError(f"Invalid XML in NFO file: {e}")
except ValueError as e:
raise TMDBAPIError(f"Invalid TMDB ID format in NFO: {e}")
# Fetch fresh data from TMDB
async with self.tmdb_client:
logger.debug(f"Fetching fresh data for TMDB ID: {tmdb_id}")
try:
await self.tmdb_client._ensure_session()
logger.debug("Fetching fresh data for TMDB ID: %s", tmdb_id)
details = await self.tmdb_client.get_tv_show_details(
tmdb_id,
append_to_response="credits,external_ids,images"
)
# Get content ratings for FSK
content_ratings = await self.tmdb_client.get_tv_show_content_ratings(tmdb_id)
# Enrich with fallback languages for empty overview/tagline
details = await self._enrich_details_with_fallback(details)
# Convert TMDB data to TVShowNFO model
nfo_model = tmdb_to_nfo_model(
details,
@@ -271,14 +353,14 @@ class NFOService:
self.tmdb_client.get_image_url,
self.image_size,
)
# Generate XML
nfo_xml = generate_tvshow_nfo(nfo_model)
# Save updated NFO file
nfo_path.write_text(nfo_xml, encoding="utf-8")
logger.info(f"Updated NFO: {nfo_path}")
logger.info("Updated NFO: %s", nfo_path)
# Re-download media files if requested
if download_media:
await self._download_media_files(
@@ -288,8 +370,10 @@ class NFOService:
download_logo=True,
download_fanart=True
)
return nfo_path
finally:
await self.tmdb_client.close()
def parse_nfo_ids(self, nfo_path: Path) -> Dict[str, Optional[int]]:
"""Parse TMDB ID and TVDB ID from an existing NFO file.
@@ -309,7 +393,7 @@ class NFOService:
result = {"tmdb_id": None, "tvdb_id": None}
if not nfo_path.exists():
logger.debug(f"NFO file not found: {nfo_path}")
logger.debug("NFO file not found: %s", nfo_path)
return result
try:
@@ -366,12 +450,143 @@ class NFOService:
)
except etree.XMLSyntaxError as e:
logger.error(f"Invalid XML in NFO file {nfo_path}: {e}")
logger.error("Invalid XML in NFO file %s: %s", nfo_path, e)
except Exception as e: # pylint: disable=broad-except
logger.error(f"Error parsing NFO file {nfo_path}: {e}")
logger.error("Error parsing NFO file %s: %s", nfo_path, e)
return result
def parse_nfo_year(self, nfo_path: Path) -> Optional[int]:
"""Parse year from an existing NFO file.
Extracts year from <year> or <premiered> elements.
Args:
nfo_path: Path to tvshow.nfo file
Returns:
Year as integer if found, None otherwise.
Example:
>>> year = nfo_service.parse_nfo_year(Path("/anime/series/tvshow.nfo"))
>>> print(year)
2013
"""
if not nfo_path.exists():
logger.debug("NFO file not found: %s", nfo_path)
return None
try:
tree = etree.parse(str(nfo_path))
root = tree.getroot()
# Try <year> element first
year_elem = root.find(".//year")
if year_elem is not None and year_elem.text:
try:
year = int(year_elem.text)
if 1900 <= year <= 2100:
logger.debug("Found year in NFO: %d", year)
return year
except ValueError:
pass
# Fallback: try <premiered> element (format: YYYY-MM-DD)
premiered_elem = root.find(".//premiered")
if premiered_elem is not None and premiered_elem.text:
if premiered_elem.text and len(premiered_elem.text) >= 4:
try:
year = int(premiered_elem.text[:4])
if 1900 <= year <= 2100:
logger.debug("Found year from premiered in NFO: %d", year)
return year
except ValueError:
pass
logger.debug("No year found in NFO: %s", nfo_path)
except etree.XMLSyntaxError as e:
logger.error("Invalid XML in NFO file %s: %s", nfo_path, e)
except Exception as e: # pylint: disable=broad-except
logger.error("Error parsing year from NFO file %s: %s", nfo_path, e)
return None
async def _enrich_details_with_fallback(
self,
details: Dict[str, Any],
search_overview: Optional[str] = None,
) -> Dict[str, Any]:
"""Enrich TMDB details with fallback languages for empty fields.
When requesting details in ``de-DE``, some anime have an empty
``overview`` (and potentially other translatable fields). This
method detects empty values and fills them from alternative
languages (``en-US``, then ``ja-JP``) so that NFO files always
contain a ``plot`` regardless of whether the German translation
exists. As a last resort, the overview from the search result
is used.
Args:
details: TMDB TV show details (language ``de-DE``).
search_overview: Overview text from the TMDB search result,
used as a final fallback if all language-specific
requests fail or return empty overviews.
Returns:
The *same* dict, mutated in-place with fallback values
where needed.
"""
overview = details.get("overview") or ""
if overview:
# Overview already populated nothing to do.
return details
tmdb_id = details.get("id")
fallback_languages = ["en-US", "ja-JP"]
for lang in fallback_languages:
if details.get("overview"):
break
logger.debug(
"Trying %s fallback for TMDB ID %s",
lang, tmdb_id,
)
try:
lang_details = await self.tmdb_client.get_tv_show_details(
tmdb_id,
language=lang,
)
if not details.get("overview") and lang_details.get("overview"):
details["overview"] = lang_details["overview"]
logger.info(
"Used %s overview fallback for TMDB ID %s",
lang, tmdb_id,
)
# Also fill tagline if missing
if not details.get("tagline") and lang_details.get("tagline"):
details["tagline"] = lang_details["tagline"]
except Exception as exc: # pylint: disable=broad-except
logger.warning(
"Failed to fetch %s fallback for TMDB ID %s: %s",
lang, tmdb_id, exc,
)
# Last resort: use search result overview
if not details.get("overview") and search_overview:
details["overview"] = search_overview
logger.info(
"Used search result overview fallback for TMDB ID %s",
tmdb_id,
)
return details
def _find_best_match(
self,
results: List[Dict[str, Any]],
@@ -396,12 +611,167 @@ class NFOService:
for result in results:
first_air_date = result.get("first_air_date", "")
if first_air_date.startswith(str(year)):
logger.debug(f"Found year match: {result['name']} ({first_air_date})")
logger.debug("Found year match: %s (%s)", result['name'], first_air_date)
return result
# Return first result (usually best match)
return results[0]
async def _search_with_fallback(
self,
primary_query: str,
year: Optional[int],
alt_titles: Optional[List[str]] = None
) -> Tuple[Dict[str, Any], str]:
"""Search TMDB with fallback strategies.
Tries multiple search strategies in order:
1. Primary query with year filter
2. Alternative titles (e.g., Japanese name)
3. Multi-language search (en-US)
4. Search without year constraint
5. Punctuation-normalized search
Args:
primary_query: Primary search term
year: Release year for filtering
alt_titles: Alternative titles to try if primary fails
Returns:
Tuple of (matched TV show dict, source description string)
Raises:
TMDBAPIError: If all search strategies fail
"""
search_strategies = [
# Strategy 1: Primary query as-is
{"query": primary_query, "year": year, "lang": "de-DE", "desc": "primary"},
]
# Strategy 2: Try alt titles (typically Japanese)
if alt_titles:
for alt in alt_titles:
if alt != primary_query:
search_strategies.append(
{"query": alt, "year": year, "lang": "ja-JP", "desc": f"alt_title:{alt}"}
)
search_strategies.append(
{"query": alt, "year": year, "lang": "en-US", "desc": f"alt_title:{alt}"}
)
# Strategy 3: Try English search
search_strategies.append(
{"query": primary_query, "year": year, "lang": "en-US", "desc": "english"}
)
# Strategy 4: Try without year constraint
if year:
search_strategies.append(
{"query": primary_query, "year": None, "lang": "de-DE", "desc": "no_year"}
)
# Strategy 5: Normalize punctuation
normalized = self._normalize_query_for_search(primary_query)
if normalized != primary_query:
search_strategies.append(
{"query": normalized, "year": year, "lang": "de-DE", "desc": f"normalized:{normalized}"}
)
# Strategy 6: Try search/multi for series indexed as movies
search_strategies.append(
{"query": primary_query, "year": year, "lang": "en-US", "desc": "multi_search", "use_multi": True}
)
last_error = None
for strategy in search_strategies:
query = strategy["query"]
lang = strategy["lang"]
desc = strategy["desc"]
use_multi = strategy.get("use_multi", False)
try:
logger.debug(
"TMDB search attempt: query='%s', lang=%s, year=%s, strategy=%s",
query, lang, strategy["year"], desc
)
# Use search/multi for multi_search strategy
if use_multi:
search_results = await self.tmdb_client.search_multi(
query,
language=lang
)
# Filter for TV shows only
if search_results.get("results"):
tv_results = [
r for r in search_results["results"]
if r.get("media_type") == "tv"
]
if tv_results:
search_results["results"] = tv_results
else:
search_results["results"] = []
else:
search_results = await self.tmdb_client.search_tv_show(
query,
language=lang
)
if search_results.get("results"):
# Apply year filter if we have one
results = search_results["results"]
if strategy["year"]:
year_filtered = [
r for r in results
if r.get("first_air_date", "").startswith(str(strategy["year"]))
]
if year_filtered:
match = year_filtered[0]
else:
# Year didn't match, still use first result but log it
match = results[0]
logger.debug(
"Year %s not found in results for '%s', using: %s",
strategy["year"], query, match["name"]
)
else:
match = results[0]
logger.info(
"TMDB search succeeded: '%s' found via strategy '%s' (ID: %s)",
match["name"], desc, match["id"]
)
return match, desc
else:
logger.debug("No results for '%s' via %s", query, desc)
except TMDBAPIError as e:
last_error = e
logger.debug("Search strategy '%s' failed: %s", desc, e)
continue
# All strategies exhausted
raise TMDBAPIError(
f"No results found for: {primary_query} (tried {len(search_strategies)} strategies)"
)
def _normalize_query_for_search(self, query: str) -> str:
"""Normalize query by removing punctuation and special chars.
Args:
query: Original search query
Returns:
Query with punctuation removed
"""
# Remove common punctuation but keep CJK characters
normalized = unicodedata.normalize('NFKC', query)
# Remove punctuation but not CJK
normalized = re.sub(r'[^\w\s\u3000-\u9fff\u4e00-\u9faf]', '', normalized)
# Collapse multiple spaces
normalized = re.sub(r'\s+', ' ', normalized).strip()
return normalized
async def _download_media_files(
@@ -461,7 +831,7 @@ class NFOService:
skip_existing=True
)
logger.info(f"Media download results: {results}")
logger.info("Media download results: %s", results)
return results
@@ -469,3 +839,53 @@ class NFOService:
async def close(self):
"""Clean up resources."""
await self.tmdb_client.close()
await self.image_downloader.close()
async def create_minimal_nfo(
self,
serie_name: str,
serie_folder: str,
year: Optional[int] = None
) -> Path:
"""Create minimal tvshow.nfo when TMDB lookup fails.
Creates a basic NFO with just the title (and year if available)
so the series is tracked even without TMDB metadata.
Args:
serie_name: Name of the series (may include year in parentheses)
serie_folder: Series folder name
year: Optional release year
Returns:
Path to created NFO file
Raises:
FileNotFoundError: If series folder doesn't exist
"""
# Extract year from name if not provided
clean_name, extracted_year = self._extract_year_from_name(serie_name)
if year is None and extracted_year is not None:
year = extracted_year
folder_path = self.anime_directory / serie_folder
if not folder_path.exists():
logger.info("Creating series folder: %s", folder_path)
folder_path.mkdir(parents=True, exist_ok=True)
# Create minimal NFO model with just title and year
nfo_model = TVShowNFO(
title=clean_name,
year=year,
plot=f"No metadata available for {clean_name}. TMDB lookup failed."
)
# Generate XML
nfo_xml = generate_tvshow_nfo(nfo_model)
# Save NFO file
nfo_path = folder_path / "tvshow.nfo"
nfo_path.write_text(nfo_xml, encoding="utf-8")
logger.info("Created minimal NFO (no TMDB): %s", nfo_path)
return nfo_path

View File

@@ -129,6 +129,9 @@ class SeriesManagerService:
if not self.nfo_service:
return
nfo_exists = False
ids = {}
try:
folder_path = Path(self.anime_directory) / serie_folder
nfo_path = folder_path / "tvshow.nfo"
@@ -136,7 +139,7 @@ class SeriesManagerService:
# If NFO exists, parse IDs and update database
if nfo_exists:
logger.debug(f"Parsing IDs from existing NFO for '{serie_name}'")
logger.debug("Parsing IDs from existing NFO for '%s'", serie_name)
ids = self.nfo_service.parse_nfo_ids(nfo_path)
if ids["tmdb_id"] or ids["tvdb_id"]:
@@ -195,22 +198,49 @@ class SeriesManagerService:
logger.info(
f"Creating NFO for '{serie_name}' ({serie_folder})"
)
await self.nfo_service.create_tvshow_nfo(
serie_name=serie_name,
serie_folder=serie_folder,
year=year,
download_poster=self.download_poster,
download_logo=self.download_logo,
download_fanart=self.download_fanart
)
logger.info(f"Successfully created NFO for '{serie_name}'")
try:
await self.nfo_service.create_tvshow_nfo(
serie_name=serie_name,
serie_folder=serie_folder,
year=year,
download_poster=self.download_poster,
download_logo=self.download_logo,
download_fanart=self.download_fanart
)
logger.info("Successfully created NFO for '%s'", serie_name)
except TMDBAPIError as create_error:
# TMDB lookup failed, create minimal NFO to track the series
logger.warning(
"TMDB lookup failed for '%s', creating minimal NFO: %s",
serie_name, create_error
)
try:
await self.nfo_service.create_minimal_nfo(
serie_name=serie_name,
serie_folder=serie_folder,
year=year
)
logger.info("Created minimal NFO for '%s'", serie_name)
except Exception as minimal_error:
logger.error(
"Failed to create minimal NFO for '%s': %s",
serie_name, minimal_error
)
elif nfo_exists:
logger.debug(
f"NFO exists for '{serie_name}', skipping download"
)
except TMDBAPIError as e:
logger.error(f"TMDB API error processing '{serie_name}': {e}")
# Only log at ERROR if no NFO exists and we have no IDs
# If NFO exists with IDs, this is just a lookup failure, log at DEBUG
if nfo_exists and (ids.get("tmdb_id") or ids.get("tvdb_id")):
logger.debug(
"TMDB API lookup failed for '%s' (has NFO with IDs): %s",
serie_name, e
)
else:
logger.error("TMDB API error processing '%s': %s", serie_name, e)
except Exception as e:
logger.error(
f"Unexpected error processing NFO for '{serie_name}': {e}",
@@ -246,7 +276,7 @@ class SeriesManagerService:
logger.info("No series found in database to process")
return
logger.info(f"Processing NFO for {len(anime_series_list)} series...")
logger.info("Processing NFO for %s series...", len(anime_series_list))
# Create tasks for concurrent processing
# Each task creates its own database session

View File

@@ -12,6 +12,7 @@ Example:
import asyncio
import logging
import time
from pathlib import Path
from typing import Any, Dict, List, Optional
@@ -38,6 +39,7 @@ class TMDBClient:
DEFAULT_BASE_URL = "https://api.themoviedb.org/3"
DEFAULT_IMAGE_BASE_URL = "https://image.tmdb.org/t/p"
NEGATIVE_CACHE_TTL = 86400 # 24 hours
def __init__(
self,
@@ -63,6 +65,12 @@ class TMDBClient:
self.max_connections = max_connections
self.session: Optional[aiohttp.ClientSession] = None
self._cache: Dict[str, Any] = {}
self._negative_cache: Dict[str, float] = {} # query -> timestamp when cached
# TMDB allows ~40 req/s; use 30 concurrent + per-second throttle to stay safe
self._semaphore = asyncio.Semaphore(30)
self._rate_limit_lock = asyncio.Lock()
self._request_timestamps: List[float] = []
self._max_requests_per_second = 35 # Stay under TMDB's ~40/s limit
async def __aenter__(self):
"""Async context manager entry."""
@@ -83,7 +91,7 @@ class TMDBClient:
self,
endpoint: str,
params: Optional[Dict[str, Any]] = None,
max_retries: int = 3
max_retries: int = 5
) -> Dict[str, Any]:
"""Make an async request to TMDB API with retries.
@@ -107,61 +115,103 @@ class TMDBClient:
# Cache key for deduplication
cache_key = f"{endpoint}:{str(sorted(params.items()))}"
if cache_key in self._cache:
logger.debug(f"Cache hit for {endpoint}")
logger.debug("Cache hit for %s", endpoint)
return self._cache[cache_key]
# Check negative cache (cached empty results)
negative_cache_key = f"{endpoint}:{str(sorted(params.items()))}"
if negative_cache_key in self._negative_cache:
if time.monotonic() - self._negative_cache[negative_cache_key] < self.NEGATIVE_CACHE_TTL:
logger.debug("Negative cache hit for %s (cached empty result)", endpoint)
return {"results": []}
else:
# Expired negative cache entry
del self._negative_cache[negative_cache_key]
delay = 1
last_error = None
for attempt in range(max_retries):
try:
# Re-ensure session before each attempt in case it was closed
await self._ensure_session()
if self.session is None:
raise TMDBAPIError("Session is not available")
logger.debug(f"TMDB API request: {endpoint} (attempt {attempt + 1})")
async with self.session.get(url, params=params, timeout=aiohttp.ClientTimeout(total=60)) as resp:
if resp.status == 401:
raise TMDBAPIError("Invalid TMDB API key")
elif resp.status == 404:
raise TMDBAPIError(f"Resource not found: {endpoint}")
elif resp.status == 429:
# Rate limit - wait longer
retry_after = int(resp.headers.get('Retry-After', delay * 2))
logger.warning(f"Rate limited, waiting {retry_after}s")
await asyncio.sleep(retry_after)
continue
resp.raise_for_status()
data = await resp.json()
self._cache[cache_key] = data
return data
except asyncio.TimeoutError as e:
last_error = e
if attempt < max_retries - 1:
logger.warning(f"Request timeout (attempt {attempt + 1}), retrying in {delay}s")
await asyncio.sleep(delay)
delay *= 2
else:
logger.error(f"Request timed out after {max_retries} attempts")
except (aiohttp.ClientError, AttributeError) as e:
last_error = e
# If connector/session was closed, try to recreate it
if "Connector is closed" in str(e) or isinstance(e, AttributeError):
logger.warning(f"Session issue detected, recreating session: {e}")
self.session = None
# Rate limiting: ensure we don't exceed ~35 requests/second
async with self._rate_limit_lock:
now = time.monotonic()
# Remove timestamps older than 1 second
self._request_timestamps = [
ts for ts in self._request_timestamps if now - ts < 1.0
]
if len(self._request_timestamps) >= self._max_requests_per_second:
sleep_time = 1.0 - (now - self._request_timestamps[0])
if sleep_time > 0:
logger.debug("Rate throttling: waiting %.2fs", sleep_time)
await asyncio.sleep(sleep_time)
self._request_timestamps.append(time.monotonic())
async with self._semaphore:
for attempt in range(max_retries):
try:
# Re-ensure session before each attempt in case it was closed
await self._ensure_session()
if attempt < max_retries - 1:
logger.warning(f"Request failed (attempt {attempt + 1}): {e}, retrying in {delay}s")
await asyncio.sleep(delay)
delay *= 2
else:
logger.error(f"Request failed after {max_retries} attempts: {e}")
if self.session is None:
raise TMDBAPIError("Session is not available")
logger.debug("TMDB API request: %s (attempt %s)", endpoint, attempt + 1)
async with self.session.get(url, params=params, timeout=aiohttp.ClientTimeout(total=60)) as resp:
if resp.status == 401:
raise TMDBAPIError("Invalid TMDB API key")
elif resp.status == 404:
raise TMDBAPIError(f"Resource not found: {endpoint}")
elif resp.status == 429:
# Rate limit - wait longer with exponential backoff
retry_after = int(resp.headers.get('Retry-After', max(delay * 2, 2)))
logger.warning("Rate limited, waiting %ss", retry_after)
await asyncio.sleep(retry_after)
continue
resp.raise_for_status()
data = await resp.json()
self._cache[cache_key] = data
# Cache negative result if empty
if endpoint.startswith("search/") and not data.get("results"):
self._negative_cache[negative_cache_key] = time.monotonic()
logger.debug("Cached negative result for %s", endpoint)
return data
except asyncio.TimeoutError as e:
last_error = e
if attempt < max_retries - 1:
logger.warning("Request timeout (attempt %s), retrying in %ss", attempt + 1, delay)
await asyncio.sleep(delay)
delay *= 2
else:
logger.error("Request timed out after %s attempts", max_retries)
except (aiohttp.ClientError, AttributeError) as e:
last_error = e
# If connector/session was closed, try to recreate it
if "Connector is closed" in str(e) or isinstance(e, AttributeError):
logger.warning(
"Session issue detected, recreating session: %s",
e,
exc_info=True,
)
self.session = None
await self._ensure_session()
# DNS / host-unreachable errors are not transient — abort immediately
error_str = str(e)
if "name resolution" in error_str.lower() or (
isinstance(e, aiohttp.ClientConnectorError) and
"Cannot connect to host" in error_str
):
logger.error("Non-transient connection error, aborting retries: %s", e)
raise TMDBAPIError(f"Request failed after {attempt + 1} attempts: {e}") from e
if attempt < max_retries - 1:
logger.warning("Request failed (attempt %s): %s, retrying in %ss", attempt + 1, e, delay)
await asyncio.sleep(delay)
delay *= 2
else:
logger.error("Request failed after %s attempts: %s", max_retries, e)
raise TMDBAPIError(f"Request failed after {max_retries} attempts: {last_error}")
@@ -190,6 +240,34 @@ class TMDBClient:
{"query": query, "language": language, "page": page}
)
async def search_multi(
self,
query: str,
language: str = "en-US",
page: int = 1
) -> Dict[str, Any]:
"""Search for movies and TV shows by name using TMDB multi search.
Multi search returns both movies and TV shows, useful for anime
that might be indexed as movies on TMDB.
Args:
query: Search query (show name)
language: Language for results (default: English)
page: Page number for pagination
Returns:
Search results with list of movies and TV shows
Example:
>>> results = await client.search_multi("Suzume no Tojimari")
>>> shows = [r for r in results["results"] if r["media_type"] == "tv"]
"""
return await self._request(
"search/multi",
{"query": query, "language": language, "page": page}
)
async def get_tv_show_details(
self,
tv_id: int,
@@ -275,7 +353,7 @@ class TMDBClient:
url = f"{self.image_base_url}/{size}{image_path}"
try:
logger.debug(f"Downloading image from {url}")
logger.debug("Downloading image from %s", url)
async with self.session.get(url, timeout=aiohttp.ClientTimeout(total=60)) as resp:
resp.raise_for_status()
@@ -286,7 +364,7 @@ class TMDBClient:
with open(local_path, "wb") as f:
f.write(await resp.read())
logger.info(f"Downloaded image to {local_path}")
logger.info("Downloaded image to %s", local_path)
except aiohttp.ClientError as e:
raise TMDBAPIError(f"Failed to download image: {e}")
@@ -309,8 +387,38 @@ class TMDBClient:
await self.session.close()
self.session = None
logger.debug("TMDB client session closed")
def __del__(self):
"""Warn if session is unclosed during garbage collection."""
if self.session is not None and not self.session.closed:
logger.warning(
"TMDBClient: unclosed session detected. "
"Use 'async with TMDBClient(...)' or call close() explicitly."
)
def clear_cache(self):
"""Clear the request cache."""
self._cache.clear()
logger.debug("TMDB client cache cleared")
def clear_negative_cache(self):
"""Clear the negative result cache."""
self._negative_cache.clear()
logger.debug("TMDB negative cache cleared")
def cleanup_expired_negative_cache(self) -> int:
"""Remove expired entries from negative cache.
Returns:
Number of entries removed
"""
now = time.monotonic()
expired_keys = [
key for key, timestamp in self._negative_cache.items()
if now - timestamp >= self.NEGATIVE_CACHE_TTL
]
for key in expired_keys:
del self._negative_cache[key]
if expired_keys:
logger.debug("Removed %d expired negative cache entries", len(expired_keys))
return len(expired_keys)

View File

@@ -125,7 +125,7 @@ class ImageDownloader:
# Check if file already exists
if skip_existing and local_path.exists():
if local_path.stat().st_size >= self.min_file_size:
logger.debug(f"Image already exists: {local_path}")
logger.debug("Image already exists: %s", local_path)
return True
# Ensure parent directory exists
@@ -137,15 +137,16 @@ class ImageDownloader:
for attempt in range(self.max_retries):
try:
logger.debug(
f"Downloading image from {url} "
f"(attempt {attempt + 1})"
"Downloading image from %s (attempt %d)",
url,
attempt + 1,
)
# Use persistent session
session = self._get_session()
async with session.get(url) as resp:
if resp.status == 404:
logger.warning(f"Image not found: {url}")
logger.warning("Image not found: %s", url)
return False
resp.raise_for_status()
@@ -168,21 +169,25 @@ class ImageDownloader:
local_path.unlink(missing_ok=True)
raise ImageDownloadError("Image validation failed")
logger.info(f"Downloaded image to {local_path}")
logger.info("Downloaded image to %s", local_path)
return True
except (aiohttp.ClientError, IOError, ImageDownloadError) as e:
last_error = e
if attempt < self.max_retries - 1:
logger.warning(
f"Download failed (attempt {attempt + 1}): {e}, "
f"retrying in {delay}s"
"Download failed (attempt %d): %s, retrying in %s",
attempt + 1,
e,
delay,
)
await asyncio.sleep(delay)
delay *= 2
else:
logger.error(
f"Download failed after {self.max_retries} attempts: {e}"
"Download failed after %d attempts: %s",
self.max_retries,
e,
)
raise ImageDownloadError(
@@ -211,7 +216,7 @@ class ImageDownloader:
try:
return await self.download_image(url, local_path, skip_existing)
except ImageDownloadError as e:
logger.warning(f"Failed to download poster: {e}")
logger.warning("Failed to download poster: %s", e)
return False
async def download_logo(
@@ -236,7 +241,7 @@ class ImageDownloader:
try:
return await self.download_image(url, local_path, skip_existing)
except ImageDownloadError as e:
logger.warning(f"Failed to download logo: {e}")
logger.warning("Failed to download logo: %s", e)
return False
async def download_fanart(
@@ -261,7 +266,7 @@ class ImageDownloader:
try:
return await self.download_image(url, local_path, skip_existing)
except ImageDownloadError as e:
logger.warning(f"Failed to download fanart: {e}")
logger.warning("Failed to download fanart: %s", e)
return False
def validate_image(self, image_path: Path) -> bool:
@@ -280,13 +285,13 @@ class ImageDownloader:
# Check file size
if image_path.stat().st_size < self.min_file_size:
logger.warning(f"Image file too small: {image_path}")
logger.warning("Image file too small: %s", image_path)
return False
return True
except Exception as e:
logger.warning(f"Image validation failed for {image_path}: {e}")
logger.warning("Image validation failed for %s: %s", image_path, e)
return False
async def download_all_media(
@@ -341,7 +346,7 @@ class ImageDownloader:
for (media_type, _), result in zip(tasks, task_results):
if isinstance(result, Exception):
logger.error(f"Error downloading {media_type}: {result}")
logger.error("Error downloading %s: %s", media_type, result)
results[media_type] = False
else:
results[media_type] = result

248
src/core/utils/key_utils.py Normal file
View File

@@ -0,0 +1,248 @@
"""Utility functions for generating URL-safe keys from folder names.
This module provides key generation and normalization for anime series,
handling edge cases like non-Latin characters and special symbols.
"""
from __future__ import annotations
import re
import unicodedata
import uuid
from typing import Optional
# Valid key pattern: alphanumeric, hyphens, underscores
# Must be at least 1 char, URL-safe
VALID_KEY_PATTERN = re.compile(r'^[a-zA-Z0-9][a-zA-Z0-9_-]*$')
def normalize_key(key: str) -> str:
"""Normalize a key to a URL-safe format.
Args:
key: The key to normalize
Returns:
Normalized lowercase key with spaces replaced by hyphens
"""
if not key:
return ""
# Convert to lowercase
normalized = key.lower()
# Replace spaces and underscores with hyphens
normalized = re.sub(r'[\s_]+', '-', normalized)
# Remove any characters that aren't alphanumeric or hyphens
normalized = re.sub(r'[^a-z0-9-]', '', normalized)
# Collapse multiple consecutive hyphens
normalized = re.sub(r'-+', '-', normalized)
# Remove leading/trailing hyphens
normalized = normalized.strip('-')
return normalized
def is_valid_key(key: str) -> bool:
"""Check if a key is valid for URL-safe use.
Args:
key: The key to validate
Returns:
True if key is valid (non-empty, URL-safe, alphanumeric start/end, min 2 chars)
"""
if not key or not key.strip():
return False
if len(key) < 2:
return False
return bool(VALID_KEY_PATTERN.match(key))
def generate_key_from_folder(folder_name: str) -> str:
"""Generate a URL-safe key from a folder name.
Handles edge cases:
- Non-Latin characters (Japanese, Chinese, etc.)
- Special characters
- All-invalid names that normalize to empty
Args:
folder_name: The folder name to convert to a key
Returns:
A URL-safe key string. Never returns empty string.
Examples:
>>> generate_key_from_folder("Attack on Titan (2013)")
'attack-on-titan-2013'
>>> generate_key_from_folder("A Time Called You (2023)")
'a-time-called-you-2023'
>>> generate_key_from_folder("25-sai no Joshikousei (2018)")
'25-sai-no-joshikousei-2018'
"""
if not folder_name or not folder_name.strip():
raise ValueError("Folder name cannot be empty")
# Step 1: Unicode NFC normalization (preserves international chars)
normalized = unicodedata.normalize('NFC', folder_name.strip())
# Step 2: Extract alphanumeric parts, preserving international chars
# This keeps Japanese/Chinese characters but removes special symbols
parts = []
for char in normalized:
# Keep Unicode alphanumeric characters (letters/numbers from any script)
if char.isalnum():
parts.append(char)
elif char.isspace():
parts.append(' ')
# Handle apostrophes - treat as part of word (remove, don't replace with space)
# This normalizes e.g., "Hell's" -> "Hells"
# Includes: ' (0x27), ' (0x2018), ' (0x2019), ' (0x02BC), ` (0x0060)
elif char in ("'", "'", "'", "'", "`", """, """):
pass # Skip - drop the apostrophe
else:
parts.append(' ')
working = ''.join(parts)
# Step 3: Split into words and normalize each
words = working.split()
# Step 4: Convert to lowercase and create hyphenated key
key = '-'.join(word.lower() for word in words if word)
# Step 5: If we got a valid key, return it
if key and is_valid_key(key):
return key
# Step 6: Try just alphanumeric characters
alphanumeric_only = re.sub(r'[^a-zA-Z0-9\s]', '', working)
words = alphanumeric_only.split()
key = '-'.join(word.lower() for word in words if word)
if key and is_valid_key(key):
return key
# Step 7: Last resort - use folder name directly with transliteration
# Try to convert non-ASCII to ASCII equivalents
try:
# Use NFD normalization and strip combining characters
# This effectively Latinizes some characters
nfd_form = unicodedata.normalize('NFD', folder_name)
latinized = ''.join(
char for char in nfd_form
if unicodedata.category(char) != 'Mn' # Strip combining marks
)
# Remove non-ASCII letters
latinized = re.sub(r'[^a-zA-Z0-9\s]', ' ', latinized)
words = latinized.split()
key = '-'.join(word.lower() for word in words if word)
if key and is_valid_key(key):
return key
except Exception:
pass
# Step 8: Absolute fallback - generate UUID-based key
# Use first 8 chars of UUID for brevity
uuid_key = uuid.uuid4().hex[:8]
# Try to extract any meaningful words from the original name
meaningful_parts = []
for char in folder_name:
if char.isalnum():
meaningful_parts.append(char.lower())
elif len(meaningful_parts) > 0:
meaningful_parts.append('-')
fallback_base = ''.join(meaningful_parts).strip('-')
if fallback_base and len(fallback_base) >= 2:
# Combine meaningful parts with UUID for uniqueness
# Truncate meaningful parts if too long
if len(fallback_base) > 20:
fallback_base = fallback_base[:20]
return f"{fallback_base}-{uuid_key}"
return f"series-{uuid_key}"
def validate_key_uniqueness(
key: str,
existing_keys: set[str],
) -> tuple[bool, str]:
"""Validate that a key is unique among existing keys.
Args:
key: The key to validate
existing_keys: Set of keys that already exist
Returns:
Tuple of (is_valid, error_message)
"""
if not key or not key.strip():
return False, "Key cannot be empty"
stripped = key.strip()
if len(stripped) < 2:
return False, "Key must be at least 2 characters"
if not is_valid_key(stripped):
return False, "Key must be URL-safe (alphanumeric, hyphens, underscores only)"
if stripped in existing_keys:
return False, f"Key '{stripped}' is already in use"
return True, ""
def sanitize_key_for_url(key: str) -> str:
"""Sanitize a key for safe URL usage.
Args:
key: The key to sanitize
Returns:
URL-safe version of the key
"""
if not key:
return ""
# Replace spaces with hyphens first
sanitized = key.replace(' ', '-')
# Remove any characters that could cause URL issues (keep alphanumerics, hyphens, underscores)
sanitized = re.sub(r'[^\w\-]', '', sanitized)
# Collapse multiple hyphens
sanitized = re.sub(r'-+', '-', sanitized)
return sanitized.strip('-')
def sanitize_url_for_logging(url: str, max_length: int = 100) -> str:
"""Sanitize a URL for safe logging by removing sensitive query parameters.
Removes or truncates query parameters that may contain tokens, keys,
or other sensitive data while preserving enough structure for debugging.
Args:
url: The URL to sanitize
max_length: Maximum length of the returned URL string
Returns:
Sanitized URL safe for logging
"""
if not url:
return ""
# Truncate if too long
if len(url) > max_length:
return url[:max_length] + "..."
return url

View File

@@ -43,8 +43,10 @@ def generate_tvshow_nfo(tvshow: TVShowNFO, pretty_print: bool = True) -> str:
_add_element(root, "sorttitle", tvshow.sorttitle)
_add_element(root, "year", str(tvshow.year) if tvshow.year else None)
# Plot and description
_add_element(root, "plot", tvshow.plot)
# Plot and description always write <plot> even when empty so that
# all NFO files have a consistent set of tags regardless of whether they
# were produced by create or update.
_add_element(root, "plot", tvshow.plot, always_write=True)
_add_element(root, "outline", tvshow.outline)
_add_element(root, "tagline", tvshow.tagline)
@@ -164,13 +166,23 @@ def generate_tvshow_nfo(tvshow: TVShowNFO, pretty_print: bool = True) -> str:
return xml_declaration + xml_str
def _add_element(parent: etree.Element, tag: str, text: Optional[str]) -> Optional[etree.Element]:
def _add_element(
parent: etree.Element,
tag: str,
text: Optional[str],
always_write: bool = False,
) -> Optional[etree.Element]:
"""Add a child element to parent if text is not None or empty.
Args:
parent: Parent XML element
tag: Tag name for child element
text: Text content (None or empty strings are skipped)
text: Text content (None or empty strings are skipped
unless *always_write* is True)
always_write: When True the element is created even when
*text* is None/empty (the element will have
no text content). Useful for tags like
``<plot>`` that should always be present.
Returns:
Created element or None if skipped
@@ -179,6 +191,8 @@ def _add_element(parent: etree.Element, tag: str, text: Optional[str]) -> Option
elem = etree.SubElement(parent, tag)
elem.text = text
return elem
if always_write:
return etree.SubElement(parent, tag)
return None
@@ -195,5 +209,5 @@ def validate_nfo_xml(xml_string: str) -> bool:
etree.fromstring(xml_string.encode('utf-8'))
return True
except etree.XMLSyntaxError as e:
logger.error(f"Invalid NFO XML: {e}")
logger.error("Invalid NFO XML: %s", e)
return False

View File

@@ -14,6 +14,7 @@ from typing import Any, Callable, Dict, List, Optional
from src.core.entities.nfo_models import (
ActorInfo,
ImageInfo,
NamedSeason,
RatingInfo,
TVShowNFO,
UniqueID,
@@ -167,6 +168,17 @@ def tmdb_to_nfo_model(
tmdbid=member["id"],
))
# --- Named seasons ---
named_seasons: List[NamedSeason] = []
for season_info in tmdb_data.get("seasons", []):
season_name = season_info.get("name")
season_number = season_info.get("season_number")
if season_name and season_number is not None:
named_seasons.append(NamedSeason(
number=season_number,
name=season_name,
))
# --- Unique IDs ---
unique_ids: List[UniqueID] = []
if tmdb_data.get("id"):
@@ -194,6 +206,7 @@ def tmdb_to_nfo_model(
return TVShowNFO(
title=title,
originaltitle=original_title,
showtitle=title,
sorttitle=title,
year=year,
plot=tmdb_data.get("overview") or None,
@@ -215,6 +228,7 @@ def tmdb_to_nfo_model(
thumb=thumb_images,
fanart=fanart_images,
actors=actors,
namedseason=named_seasons,
watched=False,
dateadded=datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
)

View File

@@ -36,10 +36,10 @@ class ConfigEncryption:
def _ensure_key_exists(self) -> None:
"""Ensure encryption key exists or create one."""
if not self.key_file.exists():
logger.info(f"Creating new encryption key at {self.key_file}")
logger.info("Creating new encryption key at %s", self.key_file)
self._generate_new_key()
else:
logger.info(f"Using existing encryption key from {self.key_file}")
logger.info("Using existing encryption key from %s", self.key_file)
def _generate_new_key(self) -> None:
"""Generate and store a new encryption key."""
@@ -56,7 +56,7 @@ class ConfigEncryption:
logger.info("Generated new encryption key")
except IOError as e:
logger.error(f"Failed to generate encryption key: {e}")
logger.error("Failed to generate encryption key: %s", e)
raise
def _load_key(self) -> bytes:
@@ -77,7 +77,7 @@ class ConfigEncryption:
key = self.key_file.read_bytes()
return key
except IOError as e:
logger.error(f"Failed to load encryption key: {e}")
logger.error("Failed to load encryption key: %s", e)
raise
def _get_cipher(self) -> Fernet:
@@ -117,7 +117,7 @@ class ConfigEncryption:
return encrypted_str
except Exception as e:
logger.error(f"Failed to encrypt value: {e}")
logger.error("Failed to encrypt value: %s", e)
raise
def decrypt_value(self, encrypted_value: str) -> str:
@@ -149,7 +149,7 @@ class ConfigEncryption:
return decrypted_str
except Exception as e:
logger.error(f"Failed to decrypt value: {e}")
logger.error("Failed to decrypt value: %s", e)
raise
def encrypt_config(self, config: Dict[str, Any]) -> Dict[str, Any]:
@@ -191,9 +191,9 @@ class ConfigEncryption:
'encrypted': True,
'value': self.encrypt_value(value)
}
logger.debug(f"Encrypted config field: {key}")
logger.debug("Encrypted config field: %s", key)
except Exception as e:
logger.warning(f"Failed to encrypt {key}: {e}")
logger.warning("Failed to encrypt %s: %s", key, e)
encrypted_config[key] = value
else:
encrypted_config[key] = value
@@ -222,9 +222,9 @@ class ConfigEncryption:
decrypted_config[key] = self.decrypt_value(
value['value']
)
logger.debug(f"Decrypted config field: {key}")
logger.debug("Decrypted config field: %s", key)
except Exception as e:
logger.error(f"Failed to decrypt {key}: {e}")
logger.error("Failed to decrypt %s: %s", key, e)
decrypted_config[key] = None
else:
decrypted_config[key] = value
@@ -248,7 +248,7 @@ class ConfigEncryption:
if self.key_file.exists():
backup_path = self.key_file.with_suffix('.key.bak')
self.key_file.rename(backup_path)
logger.info(f"Backed up old key to {backup_path}")
logger.info("Backed up old key to %s", backup_path)
# Generate new key
if new_key_file:

View File

@@ -276,13 +276,13 @@ class DatabaseIntegrityChecker:
removed += 1
self.session.commit()
logger.info(f"Removed {removed} orphaned records")
logger.info("Removed %s orphaned records", removed)
return removed
except Exception as e:
self.session.rollback()
logger.error(f"Error removing orphaned records: {e}")
logger.error("Error removing orphaned records: %s", e)
raise

View File

@@ -39,13 +39,15 @@ class FileIntegrityManager:
self.checksums = json.load(f)
count = len(self.checksums)
logger.info(
f"Loaded {count} checksums from {self.checksum_file}"
"Loaded %d checksums from %s",
count,
self.checksum_file,
)
except (json.JSONDecodeError, IOError) as e:
logger.error(f"Failed to load checksums: {e}")
logger.error("Failed to load checksums: %s", e)
self.checksums = {}
else:
logger.info(f"Checksum file does not exist: {self.checksum_file}")
logger.info("Checksum file does not exist: %s", self.checksum_file)
self.checksums = {}
def _save_checksums(self) -> None:
@@ -56,10 +58,12 @@ class FileIntegrityManager:
json.dump(self.checksums, f, indent=2)
count = len(self.checksums)
logger.debug(
f"Saved {count} checksums to {self.checksum_file}"
"Saved %d checksums to %s",
count,
self.checksum_file,
)
except IOError as e:
logger.error(f"Failed to save checksums: {e}")
logger.error("Failed to save checksums: %s", e)
def calculate_checksum(
self, file_path: Path, algorithm: str = "sha256"
@@ -94,12 +98,15 @@ class FileIntegrityManager:
checksum = hash_obj.hexdigest()
filename = file_path.name
logger.debug(
f"Calculated {algorithm} checksum for {filename}: {checksum}"
"Calculated %s checksum for %s: %s",
algorithm,
filename,
checksum,
)
return checksum
except IOError as e:
logger.error(f"Failed to read file {file_path}: {e}")
logger.error("Failed to read file %s: %s", file_path, e)
raise
def store_checksum(
@@ -126,7 +133,7 @@ class FileIntegrityManager:
self.checksums[key] = checksum
self._save_checksums()
logger.info(f"Stored checksum for {file_path.name}")
logger.info("Stored checksum for %s", file_path.name)
return checksum
def verify_checksum(
@@ -197,10 +204,10 @@ class FileIntegrityManager:
if key in self.checksums:
del self.checksums[key]
self._save_checksums()
logger.info(f"Removed checksum for {file_path.name}")
logger.info("Removed checksum for %s", file_path.name)
return True
else:
logger.debug(f"No checksum found to remove for {file_path.name}")
logger.debug("No checksum found to remove for %s", file_path.name)
return False
def has_checksum(self, file_path: Path) -> bool:

View File

@@ -1,12 +1,15 @@
import logging
import warnings
from pathlib import Path
from typing import Any, List, Optional
from fastapi import APIRouter, Depends, HTTPException, status
from pydantic import BaseModel, Field
from sqlalchemy.ext.asyncio import AsyncSession
from src.config.settings import settings
from src.core.entities.series import Serie
from src.core.utils.key_utils import generate_key_from_folder, is_valid_key
from src.server.database.service import AnimeSeriesService
from src.server.exceptions import (
BadRequestError,
@@ -14,11 +17,16 @@ from src.server.exceptions import (
ServerError,
ValidationError,
)
from src.server.models.anime import AnimeMetadataUpdate
from src.server.services.anime_service import AnimeService, AnimeServiceError
from src.server.services.background_loader_service import BackgroundLoaderService
from src.server.services.scheduler.folder_rename_service import (
_scan_for_pre_existing_duplicates,
)
from src.server.utils.dependencies import (
get_anime_service,
get_background_loader_service,
get_database_session,
get_optional_database_session,
get_series_app,
require_auth,
@@ -70,6 +78,100 @@ async def get_anime_status(
) from exc
class DuplicateFolderGroup(BaseModel):
"""A group of duplicate folders for the same series.
Attributes:
key: Series key (provider-assigned unique identifier)
folders: List of folder names that are duplicates
folder_count: Number of duplicate folders
"""
key: str = Field(..., description="Series key (unique identifier)")
folders: List[str] = Field(..., description="List of duplicate folder names")
folder_count: int = Field(..., description="Number of duplicate folders")
class DuplicateFoldersResponse(BaseModel):
"""Response model for duplicate folders listing.
Attributes:
total_groups: Total number of duplicate groups found
duplicate_groups: List of duplicate folder groups
message: Human-readable summary
"""
total_groups: int = Field(..., description="Total number of duplicate groups")
duplicate_groups: List[DuplicateFolderGroup] = Field(
..., description="List of duplicate folder groups"
)
message: str = Field(..., description="Human-readable summary")
@router.get("/duplicate-folders", response_model=DuplicateFoldersResponse)
async def get_duplicate_folders(
_auth: dict = Depends(require_auth),
) -> DuplicateFoldersResponse:
"""List all pre-existing duplicate folder groups.
Scans the anime directory for folders with tvshow.nfo files that
map to the same series key. Returns groups of duplicates for
manual review and cleanup.
Returns:
DuplicateFoldersResponse with groups of duplicate folders
Note:
Not all duplicate folders are safe to merge - some may belong
to different releases (e.g., dubbed vs. subbed). Review carefully
before taking action.
"""
try:
if not settings.anime_directory:
return DuplicateFoldersResponse(
total_groups=0,
duplicate_groups=[],
message="Anime directory not configured",
)
anime_dir = Path(settings.anime_directory)
if not anime_dir.is_dir():
return DuplicateFoldersResponse(
total_groups=0,
duplicate_groups=[],
message=f"Anime directory not found: {anime_dir}",
)
duplicates = _scan_for_pre_existing_duplicates(anime_dir)
groups = [
DuplicateFolderGroup(
key=dup.key,
folders=dup.folders,
folder_count=dup.count,
)
for dup in duplicates
]
if groups:
message = (
f"Found {len(groups)} duplicate group(s). "
"Review carefully - some duplicates may be different releases "
"(e.g., dubbed vs. subbed)."
)
else:
message = "No duplicate folders found."
return DuplicateFoldersResponse(
total_groups=len(groups),
duplicate_groups=groups,
message=message,
)
except Exception as exc:
logger.error("Failed to scan for duplicate folders: %s", str(exc))
raise ServerError(
message=f"Failed to scan for duplicates: {str(exc)}"
) from exc
class AnimeSummary(BaseModel):
"""Summary of an anime series with missing episodes.
@@ -236,8 +338,8 @@ async def list_anime(
sort_by: Optional sorting parameter. Allowed: title, id, name,
missing_episodes
filter: Optional filter parameter. Allowed values:
- "no_episodes": Show only series with no downloaded
episodes in folder
- "missing_episodes": Show only series that have any missing episodes
- "no_episodes": Show only series that have no downloaded episodes
_auth: Ensures the caller is authenticated (value unused)
anime_service: AnimeService instance provided via dependency
@@ -298,7 +400,7 @@ async def list_anime(
# Validate filter parameter
if filter:
try:
allowed_filters = ["no_episodes"]
allowed_filters = ["missing_episodes", "no_episodes"]
validate_filter_value(filter, allowed_filters)
except ValueError as e:
raise ValidationError(message=str(e))
@@ -724,13 +826,17 @@ async def add_series(
if series_app and hasattr(series_app, 'loader'):
try:
year = series_app.loader.get_year(key)
logger.info(f"Fetched year for {key}: {year}")
logger.info("Fetched year for %s: %s", key, year)
except Exception as e:
logger.warning(f"Could not fetch year for {key}: {e}")
logger.warning("Could not fetch year for %s: %s", key, e)
# Create folder name with year if available
if year:
folder_name_with_year = f"{name} ({year})"
year_suffix = f" ({year})"
if name.endswith(year_suffix):
folder_name_with_year = name
else:
folder_name_with_year = f"{name}{year_suffix}"
else:
folder_name_with_year = name
@@ -834,9 +940,9 @@ async def add_series(
# Step F: Scan missing episodes immediately if background loader is not running
# Uses existing SerieScanner and AnimeService sync to avoid duplicates
try:
loader_running = (
background_loader.worker_task is not None
and not background_loader.worker_task.done()
loader_running = bool(
background_loader.worker_tasks
and any(not t.done() for t in background_loader.worker_tasks)
)
if (
not loader_running
@@ -1084,3 +1190,75 @@ async def get_anime(
# Maximum allowed input size for security
MAX_INPUT_LENGTH = 100000 # 100KB
@router.put("/{anime_key}")
async def update_anime_metadata(
anime_key: str,
body: AnimeMetadataUpdate,
_auth: dict = Depends(require_auth),
db: AsyncSession = Depends(get_database_session),
) -> dict:
"""Update anime metadata (key, tmdb_id, tvdb_id).
Args:
anime_key: Current series key to update
body: Fields to update (all optional)
_auth: Authentication dependency
db: Database session
Returns:
Updated series metadata
Raises:
HTTPException 404: Series not found
HTTPException 409: Key conflict (new key already exists)
HTTPException 422: Validation error
"""
series = await AnimeSeriesService.get_by_key(db, anime_key)
if not series:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Series with key '{anime_key}' not found",
)
updates = {}
if body.key is not None and body.key != anime_key:
existing = await AnimeSeriesService.get_by_key(db, body.key)
if existing:
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"A series with key '{body.key}' already exists",
)
updates["key"] = body.key
if body.tmdb_id is not None:
updates["tmdb_id"] = body.tmdb_id
if body.tvdb_id is not None:
updates["tvdb_id"] = body.tvdb_id
if not updates:
return {
"key": series.key,
"tmdb_id": series.tmdb_id,
"tvdb_id": series.tvdb_id,
"message": "No changes",
}
updated = await AnimeSeriesService.update(db, series.id, **updates)
await db.commit()
logger.info(
"Updated metadata for '%s': %s",
anime_key,
updates,
)
return {
"key": updated.key,
"tmdb_id": updated.tmdb_id,
"tvdb_id": updated.tvdb_id,
"message": "Metadata updated successfully",
}

View File

@@ -76,6 +76,8 @@ async def setup_auth(req: SetupRequest):
config.scheduler.schedule_days = req.scheduler_schedule_days
if req.scheduler_auto_download_after_rescan is not None:
config.scheduler.auto_download_after_rescan = req.scheduler_auto_download_after_rescan
if req.scheduler_folder_scan_enabled is not None:
config.scheduler.folder_scan_enabled = req.scheduler_folder_scan_enabled
# Update logging configuration
if req.logging_level:
@@ -161,6 +163,22 @@ async def setup_auth(req: SetupRequest):
# Perform NFO scan if configured
await perform_nfo_scan_if_needed(progress_service)
# Start scheduler if anime_directory is now set
try:
from src.server.services.scheduler.scheduler_service import (
get_scheduler_service,
)
scheduler_svc = get_scheduler_service()
logger.info("Starting scheduler after initialization")
await scheduler_svc.ensure_started()
logger.info("Scheduler started successfully during setup")
except Exception as sched_exc:
logger.warning(
"Failed to start scheduler during setup: %s", sched_exc
)
# Continue — scheduler failure should not break initialization
# Send completion event
from src.server.services.progress_service import ProgressType
await progress_service.start_progress(

View File

@@ -1,7 +1,10 @@
import logging
from typing import Any, Dict, List, Optional
from fastapi import APIRouter, Depends, HTTPException, status
logger = logging.getLogger(__name__)
from src.server.models.config import AppConfig, ConfigUpdate, ValidationResult
from src.server.services.config_service import (
ConfigBackupError,
@@ -28,16 +31,53 @@ def get_config(auth: Optional[dict] = Depends(require_auth)) -> AppConfig:
@router.put("", response_model=AppConfig)
def update_config(
async def update_config(
update: ConfigUpdate, auth: dict = Depends(require_auth)
) -> AppConfig:
"""Apply an update to the configuration and persist it.
Creates automatic backup before applying changes.
Creates automatic backup before applying changes. If anime_directory
is configured, starts the scheduler service.
"""
try:
config_service = get_config_service()
return config_service.update_config(update)
updated_config = config_service.update_config(update)
# Sync anime_directory to settings if it was updated
from src.config.settings import settings as app_settings
anime_dir_changed = False
if update.other and update.other.get("anime_directory"):
anime_dir = update.other.get("anime_directory")
if anime_dir and not app_settings.anime_directory:
app_settings.anime_directory = str(anime_dir)
anime_dir_changed = True
logger.info("Synced anime_directory from config: %s", anime_dir)
# Start scheduler if anime_directory was just configured
if anime_dir_changed:
try:
from src.server.services.scheduler.scheduler_service import (
get_scheduler_service,
)
scheduler_svc = get_scheduler_service()
logger.info(
"Starting scheduler after anime_directory configuration"
)
await scheduler_svc.ensure_started()
logger.info(
"Scheduler started successfully after config update"
)
except Exception as sched_exc:
logger.warning(
"Failed to start scheduler after config update: %s",
sched_exc,
)
# Config was already saved, don't fail the request
return updated_config
except ConfigValidationError as e:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
@@ -244,9 +284,9 @@ async def update_directory(
try:
import structlog
from src.server.services.anime_service import sync_series_from_data_files
from src.server.services.anime_service import sync_legacy_series_to_db
logger = structlog.get_logger(__name__)
sync_count = await sync_series_from_data_files(directory, logger)
sync_count = await sync_legacy_series_to_db(directory, logger)
logger.info(
"Directory updated: synced series from data files",
directory=directory,

View File

@@ -5,7 +5,7 @@ from datetime import datetime
from typing import Any, Dict, Optional
import psutil
from fastapi import APIRouter, Depends, HTTPException
from fastapi import APIRouter, Depends, HTTPException, Request
from pydantic import BaseModel
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession
@@ -17,15 +17,21 @@ logger = logging.getLogger(__name__)
router = APIRouter(prefix="/health", tags=["health"])
from src.server.utils.version import APP_VERSION
class HealthStatus(BaseModel):
"""Basic health status response."""
status: str
timestamp: str
version: str = "1.0.0"
version: str = APP_VERSION
service: str = "aniworld-api"
series_app_initialized: bool = False
anime_directory_configured: bool = False
scheduler_next_run: Optional[str] = None
scheduler_last_run: Optional[str] = None
checks: Optional[Dict[str, Any]] = None
class DatabaseHealth(BaseModel):
@@ -60,7 +66,7 @@ class DetailedHealthStatus(BaseModel):
status: str
timestamp: str
version: str = "1.0.0"
version: str = APP_VERSION
dependencies: DependencyHealth
startup_time: datetime
@@ -91,7 +97,7 @@ async def check_database_health(db: AsyncSession) -> DatabaseHealth:
message="Database connection successful",
)
except Exception as e:
logger.error(f"Database health check failed: {e}")
logger.error("Database health check failed: %s", e)
return DatabaseHealth(
status="unhealthy",
connection_time_ms=0,
@@ -121,7 +127,7 @@ async def check_filesystem_health() -> Dict[str, Any]:
"message": "Filesystem check completed",
}
except Exception as e:
logger.error(f"Filesystem health check failed: {e}")
logger.error("Filesystem health check failed: %s", e)
return {
"status": "unhealthy",
"message": f"Filesystem check failed: {str(e)}",
@@ -164,36 +170,99 @@ def get_system_metrics() -> SystemMetrics:
uptime_seconds=uptime_seconds,
)
except Exception as e:
logger.error(f"System metrics collection failed: {e}")
logger.error("System metrics collection failed: %s", e)
raise HTTPException(
status_code=500, detail=f"Failed to collect system metrics: {str(e)}"
)
@router.get("", response_model=HealthStatus)
async def basic_health_check() -> HealthStatus:
async def basic_health_check(request: Request) -> HealthStatus:
"""Basic health check endpoint.
This endpoint does not depend on anime_directory configuration
and should always return 200 OK for basic health monitoring.
Includes service information for identification.
Includes scheduler next/last run times for monitoring tools.
Includes startup health check results.
Returns:
HealthStatus: Simple health status with timestamp and service info.
"""
from src.config.settings import settings
from src.server.utils.dependencies import _series_app
# Get scheduler status for health monitoring
scheduler_status: dict = {}
try:
from src.server.services.scheduler.scheduler_service import (
get_scheduler_service,
)
scheduler_status = get_scheduler_service().get_status()
except Exception:
pass
# Get startup checks from app state
checks = getattr(request.app.state, "startup_checks", None)
# Determine overall status based on checks
overall_status = "healthy"
if checks:
for check_name, check_data in checks.items():
if check_data.get("status") == "error":
overall_status = "unhealthy"
break
elif check_data.get("status") == "warning":
overall_status = "degraded"
logger.debug("Basic health check requested")
return HealthStatus(
status="healthy",
status=overall_status,
timestamp=datetime.now().isoformat(),
service="aniworld-api",
series_app_initialized=_series_app is not None,
anime_directory_configured=bool(settings.anime_directory),
scheduler_next_run=scheduler_status.get("next_run"),
scheduler_last_run=scheduler_status.get("last_run"),
checks=checks,
)
@router.get("/ready")
async def ready_check(request: Request) -> Dict[str, Any]:
"""Readiness check endpoint for container orchestrators.
Returns 503 if critical dependencies are not available.
This endpoint is used by Kubernetes, Docker Swarm, etc. to determine
if the container should receive traffic.
Returns:
dict: Readiness status with checks details.
"""
checks = getattr(request.app.state, "startup_checks", {})
critical_failures = []
for check_name, check_data in checks.items():
if check_data.get("status") == "error":
critical_failures.append(f"{check_name}: {check_data.get('message')}")
if critical_failures:
return {
"status": "not_ready",
"ready": False,
"timestamp": datetime.now().isoformat(),
"critical_failures": critical_failures,
"checks": checks,
}
return {
"status": "ready",
"ready": True,
"timestamp": datetime.now().isoformat(),
"checks": checks,
}
@router.get("/detailed", response_model=DetailedHealthStatus)
async def detailed_health_check(
db: AsyncSession = Depends(get_database_session),
@@ -236,7 +305,7 @@ async def detailed_health_check(
startup_time=startup_time,
)
except Exception as e:
logger.error(f"Detailed health check failed: {e}")
logger.error("Detailed health check failed: %s", e)
raise HTTPException(status_code=500, detail="Health check failed")

View File

@@ -16,6 +16,11 @@ from src.core.entities.series import Serie
from src.core.SeriesApp import SeriesApp
from src.core.services.nfo_factory import get_nfo_factory
from src.core.services.nfo_service import NFOService
from src.core.services.nfo_repair_service import (
REQUIRED_TAGS,
NfoRepairService,
find_missing_tags,
)
from src.core.services.tmdb_client import TMDBAPIError
from src.server.models.nfo import (
MediaDownloadRequest,
@@ -27,8 +32,10 @@ from src.server.models.nfo import (
NFOContentResponse,
NFOCreateRequest,
NFOCreateResponse,
NfoDiagnosticsResponse,
NFOMissingResponse,
NFOMissingSeries,
NfoRepairResponse,
)
from src.server.utils.dependencies import get_series_app, require_auth
from src.server.utils.media import check_media_files, get_media_file_paths
@@ -144,6 +151,27 @@ async def batch_create_nfo(
nfo_path=str(nfo_path)
)
except TMDBAPIError as e:
logger.warning("TMDB API error for %s, creating minimal fallback: %s", serie_id, e)
# TMDB failed, create minimal NFO
try:
serie_folder = serie.ensure_folder_with_year()
except Exception:
serie_folder = serie_folder
serie_name = serie.name or serie_folder
nfo_path = await nfo_service.create_minimal_nfo(
serie_name=serie_name,
serie_folder=serie_folder
)
return NFOBatchResult(
serie_id=serie_id,
serie_folder=serie_folder,
success=True,
message="Created minimal NFO (TMDB lookup failed)",
nfo_path=str(nfo_path)
)
except Exception as e:
logger.error(
f"Error creating NFO for {serie_id}: {e}",
@@ -243,7 +271,7 @@ async def get_missing_nfo(
)
except Exception as e:
logger.error(f"Error getting missing NFOs: {e}", exc_info=True)
logger.exception("Error getting missing NFOs: %s", e)
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"Failed to get missing NFOs: {str(e)}"
@@ -334,7 +362,7 @@ async def check_nfo(
except HTTPException:
raise
except Exception as e:
logger.error(f"Error checking NFO for {serie_id}: {e}", exc_info=True)
logger.exception("Error checking NFO for %s: %s", serie_id, e)
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"Failed to check NFO: {str(e)}"
@@ -429,11 +457,42 @@ async def create_nfo(
except HTTPException:
raise
except TMDBAPIError as e:
logger.warning(f"TMDB API error creating NFO for {serie_id}: {e}")
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail=f"TMDB API error: {str(e)}"
) from e
logger.warning("TMDB API error for %s, creating minimal fallback: %s", serie_id, e)
# TMDB failed, create minimal NFO with just folder name
try:
serie_folder = serie.ensure_folder_with_year()
except Exception:
serie_folder = serie_folder
folder_path = Path(settings.anime_directory) / serie_folder
serie_name_fallback = request.serie_name or serie.name or serie_folder
nfo_path = await nfo_service.create_minimal_nfo(
serie_name=serie_name_fallback,
serie_folder=serie_folder,
year=year
)
# Check media files (will likely be empty)
media_status = check_media_files(folder_path)
file_paths = get_media_file_paths(folder_path)
media_files = MediaFilesStatus(
has_poster=media_status.get("poster", False),
has_logo=media_status.get("logo", False),
has_fanart=media_status.get("fanart", False),
poster_path=str(file_paths["poster"]) if file_paths.get("poster") else None,
logo_path=str(file_paths["logo"]) if file_paths.get("logo") else None,
fanart_path=str(file_paths["fanart"]) if file_paths.get("fanart") else None
)
return NFOCreateResponse(
serie_id=serie_id,
serie_folder=serie_folder,
nfo_path=str(nfo_path),
media_files=media_files,
message="Created minimal NFO (TMDB lookup failed)"
)
except Exception as e:
logger.error(
f"Error creating NFO for {serie_id}: {e}",
@@ -524,7 +583,7 @@ async def update_nfo(
except HTTPException:
raise
except TMDBAPIError as e:
logger.warning(f"TMDB API error updating NFO for {serie_id}: {e}")
logger.warning("TMDB API error updating NFO for %s: %s", serie_id, e)
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail=f"TMDB API error: {str(e)}"
@@ -756,3 +815,142 @@ async def download_media(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"Failed to download media: {str(e)}"
) from e
@router.get("/{serie_key}/diagnostics", response_model=NfoDiagnosticsResponse)
async def get_nfo_diagnostics(
serie_key: str,
_auth: dict = Depends(require_auth),
series_app: SeriesApp = Depends(get_series_app),
) -> NfoDiagnosticsResponse:
"""Get NFO diagnostics showing missing required tags.
Args:
serie_key: Series key identifier
_auth: Authentication dependency
series_app: SeriesApp instance
Returns:
NfoDiagnosticsResponse with has_nfo, missing_tags, required_tags
Raises:
HTTPException 404: Series not found
"""
serie = None
for s in series_app.list.GetList():
if getattr(s, "key", None) == serie_key:
serie = s
break
if not serie:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Series with key '{serie_key}' not found",
)
serie_folder = serie.ensure_folder_with_year()
folder_path = Path(settings.anime_directory) / serie_folder
nfo_path = folder_path / "tvshow.nfo"
required_tag_names = list(REQUIRED_TAGS.values())
if not nfo_path.exists():
return NfoDiagnosticsResponse(
has_nfo=False,
nfo_path=None,
missing_tags=required_tag_names,
required_tags=required_tag_names,
)
missing = find_missing_tags(nfo_path)
return NfoDiagnosticsResponse(
has_nfo=True,
nfo_path=str(nfo_path),
missing_tags=missing,
required_tags=required_tag_names,
)
@router.post("/{serie_key}/repair", response_model=NfoRepairResponse)
async def repair_nfo(
serie_key: str,
_auth: dict = Depends(require_auth),
series_app: SeriesApp = Depends(get_series_app),
nfo_service: NFOService = Depends(get_nfo_service),
) -> NfoRepairResponse:
"""Repair or recreate NFO file for a series.
Detects missing required tags and re-fetches metadata from TMDB.
Args:
serie_key: Series key identifier
_auth: Authentication dependency
series_app: SeriesApp instance
nfo_service: NFO service for TMDB operations
Returns:
NfoRepairResponse with success status and details
Raises:
HTTPException 404: Series not found
HTTPException 400: Cannot repair (e.g., no TMDB data available)
"""
serie = None
for s in series_app.list.GetList():
if getattr(s, "key", None) == serie_key:
serie = s
break
if not serie:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=f"Series with key '{serie_key}' not found",
)
serie_folder = serie.ensure_folder_with_year()
folder_path = Path(settings.anime_directory) / serie_folder
nfo_path = folder_path / "tvshow.nfo"
# Get missing tags before repair for reporting
missing_before = find_missing_tags(nfo_path) if nfo_path.exists() else list(REQUIRED_TAGS.values())
try:
repair_service = NfoRepairService(nfo_service)
if nfo_path.exists():
repaired = await repair_service.repair_series(folder_path, serie_folder)
if not repaired:
return NfoRepairResponse(
success=True,
message="NFO is already complete, no repair needed",
repaired_tags=[],
)
else:
# No NFO exists — create new one
await nfo_service.create_tvshow_nfo(
serie_name=serie.name,
serie_folder=serie_folder,
download_poster=True,
download_logo=True,
download_fanart=True,
)
return NfoRepairResponse(
success=True,
message=f"NFO repaired successfully. Fixed {len(missing_before)} missing tags.",
repaired_tags=missing_before,
)
except TMDBAPIError as e:
logger.warning("NFO repair failed for '%s': %s", serie_key, e)
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=f"Cannot repair NFO: {str(e)}. Ensure TMDB ID is set.",
) from e
except Exception as e:
logger.error("NFO repair error for '%s': %s", serie_key, e, exc_info=True)
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"Failed to repair NFO: {str(e)}",
) from e

View File

@@ -10,7 +10,7 @@ from fastapi import APIRouter, Depends, HTTPException, status
from src.server.models.config import SchedulerConfig
from src.server.services.config_service import ConfigServiceError, get_config_service
from src.server.services.scheduler_service import get_scheduler_service
from src.server.services.scheduler.scheduler_service import get_scheduler_service
from src.server.utils.dependencies import require_auth
logger = logging.getLogger(__name__)
@@ -31,6 +31,7 @@ def _build_response(config: SchedulerConfig) -> Dict[str, Any]:
"schedule_time": config.schedule_time,
"schedule_days": config.schedule_days,
"auto_download_after_rescan": config.auto_download_after_rescan,
"folder_scan_enabled": config.folder_scan_enabled,
},
"status": {
"is_running": runtime.get("is_running", False),

View File

@@ -39,7 +39,7 @@ def get_settings() -> Union[DevelopmentSettings, ProductionSettings]:
Example:
>>> settings = get_settings()
>>> print(settings.log_level)
DEBUG
INFO
"""
if ENVIRONMENT in {"development", "testing"}:
return get_development_settings()

View File

@@ -215,7 +215,7 @@ class DevelopmentSettings(BaseSettings):
@property
def debug_enabled(self) -> bool:
"""Check if debug mode is enabled."""
return True
return False
@property
def reload_enabled(self) -> bool:

View File

@@ -95,10 +95,10 @@ def setup_logging() -> Dict[str, logging.Logger]:
# Log initial setup
root_logger.info("=" * 80)
root_logger.info("FastAPI Server Logging Initialized")
root_logger.info(f"Log Level: {settings.log_level.upper()}")
root_logger.info(f"Server Log: {server_log_file.absolute()}")
root_logger.info(f"Error Log: {error_log_file.absolute()}")
root_logger.info(f"Access Log: {access_log_file.absolute()}")
root_logger.info("Log Level: %s", settings.log_level.upper())
root_logger.info("Server Log: %s", server_log_file.absolute())
root_logger.info("Error Log: %s", error_log_file.absolute())
root_logger.info("Access Log: %s", access_log_file.absolute())
root_logger.info("=" * 80)
return {

View File

@@ -88,7 +88,7 @@ async def init_db() -> None:
try:
# Get database URL
db_url = _get_database_url()
logger.info(f"Initializing database: {db_url}")
logger.info("Initializing database: %s", db_url)
# Build engine kwargs based on database type
is_sqlite = "sqlite" in db_url
@@ -143,7 +143,7 @@ async def init_db() -> None:
logger.info("Database initialization complete")
except Exception as e:
logger.error(f"Failed to initialize database: {e}")
logger.error("Failed to initialize database: %s", e)
raise
@@ -171,7 +171,7 @@ async def close_db() -> None:
conn.commit()
logger.info("SQLite WAL checkpoint completed")
except Exception as e:
logger.warning(f"WAL checkpoint failed (non-critical): {e}")
logger.warning("WAL checkpoint failed (non-critical): %s", e)
if _engine:
logger.info("Closing async database engine...")
@@ -188,7 +188,7 @@ async def close_db() -> None:
logger.info("Database connections closed")
except Exception as e:
logger.error(f"Error closing database: {e}")
logger.error("Error closing database: %s", e)
def get_engine() -> AsyncEngine:

View File

@@ -27,7 +27,7 @@ logger = logging.getLogger(__name__)
# Schema Version Constants
# =============================================================================
CURRENT_SCHEMA_VERSION = "1.0.0"
CURRENT_SCHEMA_VERSION = "1.0.1"
SCHEMA_VERSION_TABLE = "schema_version"
# Expected tables in the current schema
@@ -98,7 +98,7 @@ async def initialize_database(
seed_data=True
)
if result["success"]:
logger.info(f"Database initialized: {result['schema_version']}")
logger.info("Database initialized: %s", result['schema_version'])
"""
if engine is None:
engine = get_engine()
@@ -117,7 +117,12 @@ async def initialize_database(
if create_schema:
tables = await create_database_schema(engine)
result["tables_created"] = tables
logger.info(f"Created {len(tables)} tables")
logger.info("Created %s tables", len(tables))
# Migrate schema if needed (add missing columns to existing tables)
migrations = await migrate_schema_if_needed(engine)
if migrations:
logger.info("Applied %s schema migrations", len(migrations))
# Validate schema if requested
if validate_schema:
@@ -148,7 +153,7 @@ async def initialize_database(
return result
except Exception as e:
logger.error(f"Database initialization failed: {e}", exc_info=True)
logger.exception("Database initialization failed: %s", e)
raise RuntimeError(f"Failed to initialize database: {e}") from e
@@ -194,14 +199,14 @@ async def create_database_schema(
created_tables = [t for t in new_tables if t not in existing_tables]
if created_tables:
logger.info(f"Created tables: {', '.join(created_tables)}")
logger.info("Created tables: %s", ', '.join(created_tables))
else:
logger.info("All tables already exist")
return new_tables
except Exception as e:
logger.error(f"Failed to create schema: {e}", exc_info=True)
logger.exception("Failed to create schema: %s", e)
raise RuntimeError(f"Schema creation failed: {e}") from e
@@ -295,7 +300,7 @@ async def validate_database_schema(
return result
except Exception as e:
logger.error(f"Schema validation failed: {e}", exc_info=True)
logger.exception("Schema validation failed: %s", e)
return {
"valid": False,
"missing_tables": [],
@@ -305,6 +310,66 @@ async def validate_database_schema(
}
# =============================================================================
# Schema Migration
# =============================================================================
async def migrate_schema_if_needed(
engine: Optional[AsyncEngine] = None
) -> List[str]:
"""Migrate database schema to current version if needed.
Handles adding missing columns to existing tables for backward
compatibility with older database schemas.
Args:
engine: Optional database engine (uses default if not provided)
Returns:
List of migration operations performed
"""
if engine is None:
engine = get_engine()
migrations_applied = []
try:
async with engine.connect() as conn:
# Get existing columns in system_settings table
existing_columns = await conn.run_sync(
lambda sync_conn: [
col["name"]
for col in inspect(sync_conn).get_columns("system_settings")
]
)
# Migration: Add legacy_key_cleanup_completed column if missing
if "legacy_key_cleanup_completed" not in existing_columns:
logger.info(
"Migrating system_settings table: "
"adding legacy_key_cleanup_completed column"
)
await conn.execute(
text("""
ALTER TABLE system_settings
ADD COLUMN legacy_key_cleanup_completed BOOLEAN
NOT NULL DEFAULT 0
""")
)
migrations_applied.append("added legacy_key_cleanup_completed")
logger.info(
"Migration complete: added legacy_key_cleanup_completed column"
)
except Exception as e:
logger.error("Schema migration failed: %s", e)
# Don't raise - migration failures shouldn't block startup
# The missing column will be handled gracefully by the application
return migrations_applied
# =============================================================================
# Schema Version Management
# =============================================================================
@@ -319,7 +384,7 @@ async def get_schema_version(engine: Optional[AsyncEngine] = None) -> str:
engine: Optional database engine (uses default if not provided)
Returns:
Schema version string (e.g., "1.0.0", "empty", "unknown")
Schema version string (e.g., "1.0.1", "empty", "unknown")
"""
if engine is None:
engine = get_engine()
@@ -342,7 +407,7 @@ async def get_schema_version(engine: Optional[AsyncEngine] = None) -> str:
return "unknown"
except Exception as e:
logger.error(f"Failed to get schema version: {e}")
logger.error("Failed to get schema version: %s", e)
return "error"
@@ -409,7 +474,7 @@ async def seed_initial_data(engine: Optional[AsyncEngine] = None) -> None:
logger.info("Data will be populated via normal application usage")
except Exception as e:
logger.error(f"Failed to seed initial data: {e}", exc_info=True)
logger.exception("Failed to seed initial data: %s", e)
raise
@@ -484,12 +549,12 @@ async def check_database_health(
f"(connectivity: {result['connectivity_ms']}ms)"
)
else:
logger.warning(f"Database health issues: {result['issues']}")
logger.warning("Database health issues: %s", result['issues'])
return result
except Exception as e:
logger.error(f"Database health check failed: {e}")
logger.error("Database health check failed: %s", e)
return {
"healthy": False,
"accessible": False,
@@ -547,13 +612,13 @@ async def create_database_backup(
backup_path = backup_dir / f"aniworld_{timestamp}.db"
try:
logger.info(f"Creating database backup: {backup_path}")
logger.info("Creating database backup: %s", backup_path)
shutil.copy2(db_path, backup_path)
logger.info(f"Backup created successfully: {backup_path}")
logger.info("Backup created successfully: %s", backup_path)
return backup_path
except Exception as e:
logger.error(f"Failed to create backup: {e}", exc_info=True)
logger.exception("Failed to create backup: %s", e)
raise RuntimeError(f"Backup creation failed: {e}") from e

View File

@@ -15,7 +15,7 @@ from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional
from sqlalchemy import Boolean, DateTime, ForeignKey, Integer, String, Text, func
from sqlalchemy import Boolean, DateTime, ForeignKey, Index, Integer, String, Text, func
from sqlalchemy.orm import Mapped, mapped_column, relationship, validates
from src.server.database.base import Base, TimestampMixin
@@ -83,6 +83,10 @@ class AnimeSeries(Base, TimestampMixin):
Boolean, nullable=False, default=False, server_default="0",
doc="Whether tvshow.nfo file exists for this series"
)
nfo_path: Mapped[Optional[str]] = mapped_column(
String(1000), nullable=True,
doc="Path to the tvshow.nfo metadata file"
)
nfo_created_at: Mapped[Optional[datetime]] = mapped_column(
DateTime(timezone=True), nullable=True,
doc="Timestamp when NFO was first created"
@@ -91,6 +95,7 @@ class AnimeSeries(Base, TimestampMixin):
DateTime(timezone=True), nullable=True,
doc="Timestamp when NFO was last updated"
)
# TMDB (The Movie Database) ID for series metadata
tmdb_id: Mapped[Optional[int]] = mapped_column(
Integer, nullable=True, index=True,
doc="TMDB (The Movie Database) ID for series metadata"
@@ -316,6 +321,7 @@ class DownloadQueueItem(Base, TimestampMixin):
id: Primary key
series_id: Foreign key to AnimeSeries
episode_id: Foreign key to Episode
status: Queue status (pending/downloading/completed/failed/permanently_failed)
error_message: Error description if failed
download_url: Provider download URL
file_destination: Target file path
@@ -347,6 +353,33 @@ class DownloadQueueItem(Base, TimestampMixin):
index=True
)
# Status column to track queue item state
# Allows distinguishing pending items from permanently failed ones
status: Mapped[str] = mapped_column(
String(50), nullable=False, default="pending",
doc="Queue item status: pending, downloading, completed, failed, permanently_failed"
)
# Retry count to track failed download attempts
# Used to determine when to move item to permanently_failed
retry_count: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,
doc="Number of retry attempts for this download"
)
# Unique constraint to prevent duplicate pending queue items per episode
# An episode can only have one PENDING entry at a time
# The status column allows failed items to remain in DB while new
# pending items can be added (application-level dedup still required)
__table_args__ = (
Index(
"ix_download_queue_episode_status",
"episode_id",
"status",
unique=True,
),
)
# Error handling
error_message: Mapped[Optional[str]] = mapped_column(
Text, nullable=True,
@@ -580,6 +613,14 @@ class SystemSettings(Base, TimestampMixin):
Boolean, nullable=False, default=False, server_default="0",
doc="Whether the initial media scan has been completed"
)
migration_legacy_files_completed: Mapped[bool] = mapped_column(
Boolean, nullable=False, default=False, server_default="0",
doc="Whether legacy key/data file migration has been completed"
)
legacy_key_cleanup_completed: Mapped[bool] = mapped_column(
Boolean, nullable=False, default=False, server_default="0",
doc="Whether legacy key file cleanup has been completed"
)
last_scan_timestamp: Mapped[Optional[datetime]] = mapped_column(
DateTime(timezone=True), nullable=True,
doc="Timestamp of the last completed scan"

View File

@@ -107,7 +107,7 @@ class AnimeSeriesService:
db.add(series)
await db.flush()
await db.refresh(series)
logger.info(f"Created anime series: {series.name} (key={series.key}, year={year})")
logger.info("Created anime series: %s (key=%s, year=%s)", series.name, series.key, year)
return series
@staticmethod
@@ -148,7 +148,47 @@ class AnimeSeriesService:
select(AnimeSeries).where(AnimeSeries.key == key)
)
return result.scalar_one_or_none()
@staticmethod
def get_by_folder_sync(db: Session, folder: str) -> Optional[AnimeSeries]:
"""Look up an anime series by its filesystem folder name (sync).
Intended as a fallback for ``SerieScanner`` when neither a ``key``
file nor a ``data`` file exists on disk for a given folder.
Args:
db: Synchronous database session (from ``get_sync_session``).
folder: Filesystem folder name to match (e.g.
``"Rooster Fighter (2026)"``).
Returns:
``AnimeSeries`` instance or ``None`` if not found.
"""
result = db.execute(
select(AnimeSeries).where(AnimeSeries.folder == folder)
)
return result.scalar_one_or_none()
@staticmethod
async def get_by_folder(db: AsyncSession, folder: str) -> Optional[AnimeSeries]:
"""Look up an anime series by its filesystem folder name (async).
Intended as primary lookup for ``SerieScanner`` when scanning
directories, replacing the legacy file-based lookups (key/data files).
Args:
db: Async database session.
folder: Filesystem folder name to match (e.g.
``"Rooster Fighter (2026)"``).
Returns:
``AnimeSeries`` instance or ``None`` if not found.
"""
result = await db.execute(
select(AnimeSeries).where(AnimeSeries.folder == folder)
)
return result.scalar_one_or_none()
@staticmethod
async def get_all(
db: AsyncSession,
@@ -205,7 +245,7 @@ class AnimeSeriesService:
await db.flush()
await db.refresh(series)
logger.info(f"Updated anime series: {series.name} (id={series_id})")
logger.info("Updated anime series: %s (id=%s)", series.name, series_id)
return series
@staticmethod
@@ -226,7 +266,7 @@ class AnimeSeriesService:
)
deleted = result.rowcount > 0
if deleted:
logger.info(f"Deleted anime series with id={series_id}")
logger.info("Deleted anime series with id=%s", series_id)
return deleted
@staticmethod
@@ -253,48 +293,92 @@ class AnimeSeriesService:
return list(result.scalars().all())
@staticmethod
async def get_series_with_no_episodes(
async def get_series_with_missing_episodes(
db: AsyncSession,
limit: Optional[int] = None,
offset: int = 0,
) -> List[AnimeSeries]:
"""Get anime series that have no episodes found in folder.
Since episodes in the database represent MISSING episodes
(from episodeDict), this returns series that have episodes
in the DB with is_downloaded=False, meaning they have missing
episodes and no files were found in the folder for those episodes.
Returns series where:
- At least one episode exists in database with is_downloaded=False
"""Get anime series that currently have missing episodes.
Episodes in the database represent missing episodes (from episodeDict).
This returns series that have at least one missing episode recorded in
the database (is_downloaded=False).
Args:
db: Database session
limit: Optional limit for results
offset: Offset for pagination
Returns:
List of AnimeSeries with missing episodes (not in folder)
List of AnimeSeries that have missing episodes.
"""
# Subquery to find series IDs with at least one undownloaded episode
undownloaded_series_ids = (
# Subquery to find series IDs with at least one missing episode
missing_series_ids = (
select(Episode.series_id)
.where(Episode.is_downloaded == False)
.distinct()
.subquery()
)
# Select series that have undownloaded episodes
query = (
select(AnimeSeries)
.where(AnimeSeries.id.in_(select(undownloaded_series_ids.c.series_id)))
.where(AnimeSeries.id.in_(select(missing_series_ids.c.series_id)))
.order_by(AnimeSeries.name)
.offset(offset)
)
if limit:
query = query.limit(limit)
result = await db.execute(query)
return list(result.scalars().all())
@staticmethod
async def get_series_with_no_episodes(
db: AsyncSession,
limit: Optional[int] = None,
offset: int = 0,
) -> List[AnimeSeries]:
"""Get anime series that have no downloaded episodes.
A series has "no episodes" if it has at least one missing episode
(is_downloaded=False) and no downloaded episodes (is_downloaded=True).
Args:
db: Database session
limit: Optional limit for results
offset: Offset for pagination
Returns:
List of AnimeSeries where no episodes are downloaded.
"""
# Series with missing episodes
missing_series_ids = (
select(Episode.series_id)
.where(Episode.is_downloaded == False)
.distinct()
.subquery()
)
# Series with any downloaded episodes
downloaded_series_ids = (
select(Episode.series_id)
.where(Episode.is_downloaded == True)
.distinct()
.subquery()
)
query = (
select(AnimeSeries)
.where(AnimeSeries.id.in_(select(missing_series_ids.c.series_id)))
.where(~AnimeSeries.id.in_(select(downloaded_series_ids.c.series_id)))
.order_by(AnimeSeries.name)
.offset(offset)
)
if limit:
query = query.limit(limit)
result = await db.execute(query)
return list(result.scalars().all())
@@ -477,6 +561,7 @@ class EpisodeService:
db: AsyncSession,
series_id: int,
season: Optional[int] = None,
only_missing: bool = False,
) -> List[Episode]:
"""Get episodes for a series.
@@ -484,6 +569,9 @@ class EpisodeService:
db: Database session
series_id: Foreign key to AnimeSeries
season: Optional season filter
only_missing: If True, only return episodes where
is_downloaded is False (i.e., missing episodes).
Default False returns all episodes.
Returns:
List of Episode instances
@@ -493,6 +581,9 @@ class EpisodeService:
if season is not None:
query = query.where(Episode.season == season)
if only_missing:
query = query.where(Episode.is_downloaded == False)
query = query.order_by(Episode.season, Episode.episode_number)
result = await db.execute(query)
return list(result.scalars().all())
@@ -558,11 +649,11 @@ class EpisodeService:
@staticmethod
async def delete(db: AsyncSession, episode_id: int) -> bool:
"""Delete episode.
Args:
db: Database session
episode_id: Episode primary key
Returns:
True if deleted, False if not found
"""
@@ -571,6 +662,33 @@ class EpisodeService:
)
return result.rowcount > 0
@staticmethod
async def delete_by_series(
db: AsyncSession,
series_id: int,
season: int,
episode_number: int,
) -> bool:
"""Delete episode by series ID, season, and episode number.
Args:
db: Database session
series_id: Foreign key to AnimeSeries
season: Season number
episode_number: Episode number within season
Returns:
True if deleted, False if not found
"""
result = await db.execute(
delete(Episode).where(
Episode.series_id == series_id,
Episode.season == season,
Episode.episode_number == episode_number,
)
)
return result.rowcount > 0
@staticmethod
async def delete_by_series_and_episode(
db: AsyncSession,
@@ -657,7 +775,7 @@ class EpisodeService:
updated_count += 1
await db.flush()
logger.info(f"Bulk marked {updated_count} episodes as downloaded")
logger.info("Bulk marked %s episodes as downloaded", updated_count)
return updated_count
@@ -684,6 +802,8 @@ class DownloadQueueService:
episode_id: int,
download_url: Optional[str] = None,
file_destination: Optional[str] = None,
status: str = "pending",
retry_count: int = 0,
) -> DownloadQueueItem:
"""Add item to download queue.
@@ -693,6 +813,8 @@ class DownloadQueueService:
episode_id: Foreign key to Episode
download_url: Optional provider download URL
file_destination: Optional target file path
status: Queue item status (default: "pending")
retry_count: Number of retry attempts (default: 0)
Returns:
Created DownloadQueueItem instance
@@ -702,13 +824,15 @@ class DownloadQueueService:
episode_id=episode_id,
download_url=download_url,
file_destination=file_destination,
status=status,
retry_count=retry_count,
)
db.add(item)
await db.flush()
await db.refresh(item)
logger.info(
f"Added to download queue: episode_id={episode_id} "
f"for series_id={series_id}"
f"for series_id={series_id}, status={status}"
)
return item
@@ -735,21 +859,24 @@ class DownloadQueueService:
async def get_by_episode(
db: AsyncSession,
episode_id: int,
status_filter: Optional[str] = None,
) -> Optional[DownloadQueueItem]:
"""Get download queue item by episode ID.
Args:
db: Database session
episode_id: Foreign key to Episode
status_filter: Optional status to filter by (e.g., "pending")
Returns:
DownloadQueueItem instance or None if not found
"""
result = await db.execute(
select(DownloadQueueItem).where(
DownloadQueueItem.episode_id == episode_id
)
query = select(DownloadQueueItem).where(
DownloadQueueItem.episode_id == episode_id
)
if status_filter:
query = query.where(DownloadQueueItem.status == status_filter)
result = await db.execute(query)
return result.scalar_one_or_none()
@staticmethod
@@ -806,7 +933,96 @@ class DownloadQueueService:
await db.flush()
await db.refresh(item)
logger.debug(f"Set error on download queue item {item_id}")
logger.debug("Set error on download queue item %s", item_id)
return item
@staticmethod
async def set_status(
db: AsyncSession,
item_id: int,
status: str,
) -> Optional[DownloadQueueItem]:
"""Set status on download queue item.
Args:
db: Database session
item_id: Item primary key
status: New status value
Returns:
Updated DownloadQueueItem instance or None if not found
"""
item = await DownloadQueueService.get_by_id(db, item_id)
if not item:
return None
item.status = status
await db.flush()
await db.refresh(item)
logger.debug("Set status on download queue item %s to %s", item_id, status)
return item
@staticmethod
async def increment_retry_count(
db: AsyncSession,
item_id: int,
) -> Optional[DownloadQueueItem]:
"""Increment retry count on download queue item.
Args:
db: Database session
item_id: Item primary key
Returns:
Updated DownloadQueueItem instance or None if not found
"""
item = await DownloadQueueService.get_by_id(db, item_id)
if not item:
return None
item.retry_count += 1
await db.flush()
await db.refresh(item)
logger.debug(
"Incremented retry count on download queue item %s to %s",
item_id, item.retry_count
)
return item
@staticmethod
async def set_status_and_error(
db: AsyncSession,
item_id: int,
status: str,
error_message: Optional[str] = None,
) -> Optional[DownloadQueueItem]:
"""Set status and error message on download queue item atomically.
Args:
db: Database session
item_id: Item primary key
status: New status value
error_message: Optional error description
Returns:
Updated DownloadQueueItem instance or None if not found
"""
item = await DownloadQueueService.get_by_id(db, item_id)
if not item:
return None
item.status = status
if error_message is not None:
item.error_message = error_message
await db.flush()
await db.refresh(item)
logger.debug(
"Set status=%s on download queue item %s, error=%s",
status, item_id, error_message
)
return item
@staticmethod
@@ -825,7 +1041,7 @@ class DownloadQueueService:
)
deleted = result.rowcount > 0
if deleted:
logger.info(f"Deleted download queue item with id={item_id}")
logger.info("Deleted download queue item with id=%s", item_id)
return deleted
@staticmethod
@@ -887,7 +1103,7 @@ class DownloadQueueService:
)
count = result.rowcount
logger.info(f"Bulk deleted {count} download queue items")
logger.info("Bulk deleted %s download queue items", count)
return count
@@ -908,7 +1124,7 @@ class DownloadQueueService:
"""
result = await db.execute(delete(DownloadQueueItem))
count = result.rowcount
logger.info(f"Cleared all {count} download queue items")
logger.info("Cleared all %s download queue items", count)
return count
@@ -962,7 +1178,7 @@ class UserSessionService:
db.add(session)
await db.flush()
await db.refresh(session)
logger.info(f"Created user session: {session_id}")
logger.info("Created user session: %s", session_id)
return session
@staticmethod
@@ -1049,7 +1265,7 @@ class UserSessionService:
session.revoke()
await db.flush()
logger.info(f"Revoked user session: {session_id}")
logger.info("Revoked user session: %s", session_id)
return True
@staticmethod
@@ -1071,7 +1287,7 @@ class UserSessionService:
)
)
count = result.rowcount
logger.info(f"Cleaned up {count} expired sessions")
logger.info("Cleaned up %s expired sessions", count)
return count
@staticmethod

View File

@@ -125,6 +125,66 @@ class SystemSettingsService:
settings = await SystemSettingsService.get_or_create(db)
return settings.initial_media_scan_completed
@staticmethod
async def is_migration_legacy_files_completed(db: AsyncSession) -> bool:
"""Check if legacy key/data file migration has been completed.
Args:
db: Database session
Returns:
True if legacy migration is completed, False otherwise
"""
settings = await SystemSettingsService.get_or_create(db)
return settings.migration_legacy_files_completed
@staticmethod
async def mark_migration_legacy_files_completed(
db: AsyncSession,
timestamp: Optional[datetime] = None
) -> None:
"""Mark the legacy key/data file migration as completed.
Args:
db: Database session
timestamp: Optional timestamp to set, defaults to current time
"""
settings = await SystemSettingsService.get_or_create(db)
settings.migration_legacy_files_completed = True
settings.last_scan_timestamp = timestamp or datetime.now(timezone.utc)
await db.commit()
logger.info("Marked legacy files migration as completed")
@staticmethod
async def is_legacy_key_cleanup_completed(db: AsyncSession) -> bool:
"""Check if legacy key file cleanup has been completed.
Args:
db: Database session
Returns:
True if cleanup is completed, False otherwise
"""
settings = await SystemSettingsService.get_or_create(db)
return settings.legacy_key_cleanup_completed
@staticmethod
async def mark_legacy_key_cleanup_completed(
db: AsyncSession,
timestamp: Optional[datetime] = None
) -> None:
"""Mark the legacy key file cleanup as completed.
Args:
db: Database session
timestamp: Optional timestamp to set, defaults to current time
"""
settings = await SystemSettingsService.get_or_create(db)
settings.legacy_key_cleanup_completed = True
settings.last_scan_timestamp = timestamp or datetime.now(timezone.utc)
await db.commit()
logger.info("Marked legacy key file cleanup as completed")
@staticmethod
async def mark_initial_media_scan_completed(
db: AsyncSession,
@@ -154,6 +214,8 @@ class SystemSettingsService:
settings.initial_scan_completed = False
settings.initial_nfo_scan_completed = False
settings.initial_media_scan_completed = False
settings.migration_legacy_files_completed = False
settings.legacy_key_cleanup_completed = False
settings.last_scan_timestamp = None
await db.commit()
logger.info("Reset all scan completion flags")

View File

@@ -6,6 +6,7 @@ configuration, middleware setup, static file serving, and Jinja2 template
integration.
"""
import asyncio
import logging
from contextlib import asynccontextmanager
from pathlib import Path
@@ -37,7 +38,6 @@ from src.server.controllers.page_controller import router as page_router
from src.server.middleware.auth import AuthMiddleware
from src.server.middleware.error_handler import register_exception_handlers
from src.server.middleware.setup_redirect import SetupRedirectMiddleware
from src.server.services.anime_service import sync_series_from_data_files
from src.server.services.progress_service import get_progress_service
from src.server.services.websocket_service import get_websocket_service
@@ -51,7 +51,7 @@ async def _check_incomplete_series_on_startup(background_loader) -> None:
Args:
background_loader: BackgroundLoaderService instance
"""
logger = setup_logging(log_level="INFO")
logger = logging.getLogger("aniworld")
try:
from src.server.database.connection import get_db_session
@@ -96,11 +96,112 @@ async def _check_incomplete_series_on_startup(background_loader) -> None:
else:
logger.info("All series data is complete. No background loading needed.")
except Exception as e:
logger.error(f"Error checking incomplete series: {e}", exc_info=True)
except Exception:
logger.exception("Error checking incomplete series")
except Exception:
logger.exception("Failed to check incomplete series on startup")
async def _run_startup_health_checks(logger) -> dict:
"""Run startup health checks for critical dependencies.
Checks:
- ffmpeg availability
- DNS resolution for aniworld.to and api.themoviedb.org
- anime_directory configuration and writability
Args:
logger: Logger instance for recording check results.
Returns:
dict: Health check results with status and details for each check.
"""
import asyncio
import shutil
import socket
from typing import Any, Dict
checks: Dict[str, Any] = {
"ffmpeg": {"status": "unknown", "message": None},
"dns_aniworld": {"status": "unknown", "message": None},
"dns_tmdb": {"status": "unknown", "message": None},
"anime_directory": {"status": "unknown", "message": None, "path": None},
}
# Check ffmpeg availability
try:
ffmpeg_path = shutil.which("ffmpeg")
if ffmpeg_path:
checks["ffmpeg"]["status"] = "ok"
checks["ffmpeg"]["message"] = f"Found at {ffmpeg_path}"
logger.debug("ffmpeg health check passed: %s", ffmpeg_path)
else:
checks["ffmpeg"]["status"] = "warning"
checks["ffmpeg"]["message"] = "ffmpeg not found in PATH"
logger.warning("ffmpeg health check failed: not in PATH")
except Exception as e:
logger.error(f"Failed to check incomplete series on startup: {e}", exc_info=True)
checks["ffmpeg"]["status"] = "error"
checks["ffmpeg"]["message"] = str(e)
logger.warning("Could not check ffmpeg: %s", e)
# Check DNS resolution for aniworld.to
try:
socket.gethostbyname("aniworld.to")
checks["dns_aniworld"]["status"] = "ok"
checks["dns_aniworld"]["message"] = "Resolved successfully"
logger.debug("DNS health check passed for aniworld.to")
except socket.gaierror as e:
checks["dns_aniworld"]["status"] = "warning"
checks["dns_aniworld"]["message"] = f"DNS resolution failed: {e}"
logger.warning("DNS health check failed for aniworld.to: %s", e)
except Exception as e:
checks["dns_aniworld"]["status"] = "warning"
checks["dns_aniworld"]["message"] = f"Unexpected error: {e}"
logger.warning("Unexpected DNS error for aniworld.to: %s", e)
# Check DNS resolution for api.themoviedb.org
try:
socket.gethostbyname("api.themoviedb.org")
checks["dns_tmdb"]["status"] = "ok"
checks["dns_tmdb"]["message"] = "Resolved successfully"
logger.debug("DNS health check passed for api.themoviedb.org")
except socket.gaierror as e:
checks["dns_tmdb"]["status"] = "warning"
checks["dns_tmdb"]["message"] = f"DNS resolution failed: {e}"
logger.warning("DNS health check failed for api.themoviedb.org: %s", e)
except Exception as e:
checks["dns_tmdb"]["status"] = "warning"
checks["dns_tmdb"]["message"] = f"Unexpected error: {e}"
logger.warning("Unexpected DNS error for api.themoviedb.org: %s", e)
# Check anime_directory configuration and writability
from src.config.settings import settings
anime_dir = settings.anime_directory
if not anime_dir:
checks["anime_directory"]["status"] = "error"
checks["anime_directory"]["message"] = "anime_directory not configured"
checks["anime_directory"]["path"] = None
logger.error("anime_directory health check failed: not configured")
else:
import os
checks["anime_directory"]["path"] = anime_dir
if not os.path.isdir(anime_dir):
checks["anime_directory"]["status"] = "error"
checks["anime_directory"]["message"] = f"Directory does not exist: {anime_dir}"
logger.error("anime_directory health check failed: %s does not exist", anime_dir)
elif not os.access(anime_dir, os.W_OK):
checks["anime_directory"]["status"] = "error"
checks["anime_directory"]["message"] = f"Directory not writable: {anime_dir}"
logger.error("anime_directory health check failed: %s not writable", anime_dir)
else:
checks["anime_directory"]["status"] = "ok"
checks["anime_directory"]["message"] = f"Directory exists and is writable: {anime_dir}"
logger.debug("anime_directory health check passed: %s", anime_dir)
return checks
@asynccontextmanager
@@ -113,6 +214,7 @@ async def lifespan(_application: FastAPI):
"""
# Setup logging first with INFO level
logger = setup_logging(log_level="INFO")
logger.info("Starting FastAPI application v%s", APP_VERSION)
# Track successful initialization steps
initialized = {
@@ -241,7 +343,6 @@ async def lifespan(_application: FastAPI):
from src.server.services.initialization_service import (
perform_initial_setup,
perform_media_scan_if_needed,
perform_nfo_repair_scan,
perform_nfo_scan_if_needed,
)
@@ -297,30 +398,29 @@ async def lifespan(_application: FastAPI):
except Exception as e:
logger.warning("Failed to start background loader service: %s", e)
# Initialize and start scheduler service
try:
from src.server.services.scheduler_service import (
get_scheduler_service,
)
scheduler_service = get_scheduler_service()
await scheduler_service.start()
initialized['scheduler'] = True
logger.info("Scheduler service started")
except Exception as e:
logger.warning("Failed to start scheduler service: %s", e)
# Continue - scheduler is optional
# Run media scan only on first run
await perform_media_scan_if_needed(background_loader)
# Scan every series NFO on startup and repair any that are
# missing required tags by queuing them for background reload
await perform_nfo_repair_scan(background_loader)
else:
logger.info(
"Download service initialization skipped - "
"anime directory not configured"
)
# Initialize and start scheduler service (independent of anime_directory)
# The scheduler loads its own config from config.json and the
# anime_directory may be configured there even if the env var is empty.
try:
logger.info("Initializing scheduler service...")
from src.server.services.scheduler.scheduler_service import (
get_scheduler_service,
)
scheduler_service = get_scheduler_service()
logger.info("Scheduler service instance obtained, starting...")
await scheduler_service.start()
initialized['scheduler'] = True
logger.info("Scheduler service started successfully")
except Exception as e:
logger.warning("Failed to start scheduler service: %s", e)
except (OSError, RuntimeError, ValueError) as e:
logger.warning("Failed to initialize services: %s", e)
# Continue startup - services can be initialized later
@@ -333,6 +433,27 @@ async def lifespan(_application: FastAPI):
logger.info(
"API documentation available at http://127.0.0.1:8000/api/docs"
)
# Check for ffmpeg availability and warn if missing
try:
import shutil as _shutil
if _shutil.which("ffmpeg") is None:
logger.warning(
"ffmpeg not found in PATH. HLS streams may fail to download. "
"Install ffmpeg to enable HLS support."
)
else:
logger.debug("ffmpeg found at: %s", _shutil.which("ffmpeg"))
except Exception as _exc:
logger.warning("Could not check for ffmpeg: %s", _exc)
# Run startup health checks and store results for /health endpoint
try:
startup_checks = await _run_startup_health_checks(logger)
app.state.startup_checks = startup_checks
except Exception as _exc:
logger.warning("Could not run startup health checks: %s", _exc)
app.state.startup_checks = {}
except Exception as e:
logger.error("Error during startup: %s", e, exc_info=True)
startup_error = e
@@ -377,7 +498,9 @@ async def lifespan(_application: FastAPI):
# 1. Stop scheduler service (only if initialized)
if initialized['scheduler']:
try:
from src.server.services.scheduler_service import get_scheduler_service
from src.server.services.scheduler.scheduler_service import (
get_scheduler_service,
)
scheduler_service = get_scheduler_service()
logger.info("Stopping scheduler service...")
await asyncio.wait_for(
@@ -422,8 +545,8 @@ async def lifespan(_application: FastAPI):
# 4. Shutdown download service and persist active downloads
try:
from src.server.services.download_service import ( # noqa: E501
_download_service_instance,
from src.server.services.download_service import (
_download_service_instance, # noqa: E501
)
if _download_service_instance is not None:
logger.info("Stopping download service...")
@@ -480,11 +603,13 @@ async def lifespan(_application: FastAPI):
raise startup_error
from src.server.utils.version import APP_VERSION
# Initialize FastAPI app with lifespan
app = FastAPI(
title="Aniworld Download Manager",
description="Modern web interface for Aniworld anime download management",
version="1.0.0",
version=APP_VERSION,
docs_url="/api/docs",
redoc_url="/api/redoc",
lifespan=lifespan

View File

@@ -74,7 +74,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle authentication errors (401)."""
logger.warning(
f"Authentication error: {exc.message}",
"Authentication error: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -94,7 +95,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle authorization errors (403)."""
logger.warning(
f"Authorization error: {exc.message}",
"Authorization error: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -114,7 +116,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle validation errors (422)."""
logger.info(
f"Validation error: {exc.message}",
"Validation error: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -134,7 +137,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle bad request errors (400)."""
logger.info(
f"Bad request error: {exc.message}",
"Bad request error: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -154,7 +158,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle not found errors (404)."""
logger.info(
f"Not found error: {exc.message}",
"Not found error: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -174,7 +179,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle conflict errors (409)."""
logger.info(
f"Conflict error: {exc.message}",
"Conflict error: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -194,7 +200,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle rate limit errors (429)."""
logger.warning(
f"Rate limit exceeded: {exc.message}",
"Rate limit exceeded: %s",
exc.message,
extra={"details": exc.details, "path": str(request.url.path)},
)
return JSONResponse(
@@ -214,7 +221,8 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle generic API exceptions."""
logger.error(
f"API error: {exc.message}",
"API error: %s",
exc.message,
extra={
"error_code": exc.error_code,
"details": exc.details,
@@ -238,12 +246,13 @@ def register_exception_handlers(app: FastAPI) -> None:
) -> JSONResponse:
"""Handle unexpected exceptions."""
logger.exception(
f"Unexpected error: {str(exc)}",
"Unexpected error: %s",
str(exc),
extra={"path": str(request.url.path)},
)
# Log full traceback for debugging
logger.debug(f"Traceback: {traceback.format_exc()}")
logger.debug("Traceback: %s", traceback.format_exc())
# Return generic error response for security
return JSONResponse(

View File

@@ -315,11 +315,11 @@ class RequestSanitizationMiddleware(BaseHTTPMiddleware):
None if malicious content detected, sanitized value otherwise
"""
if self.check_sql_injection and self._check_sql_injection(value):
logger.warning(f"Potential SQL injection detected: {value[:100]}")
logger.warning("Potential SQL injection detected: %s", value[:100])
return None
if self.check_xss and self._check_xss(value):
logger.warning(f"Potential XSS detected: {value[:100]}")
logger.warning("Potential XSS detected: %s", value[:100])
return None
return value
@@ -341,7 +341,7 @@ class RequestSanitizationMiddleware(BaseHTTPMiddleware):
content_type
and not any(ct in content_type for ct in self.allowed_content_types)
):
logger.warning(f"Unsupported content type: {content_type}")
logger.warning("Unsupported content type: %s", content_type)
return JSONResponse(
status_code=415,
content={"detail": "Unsupported Media Type"},
@@ -350,7 +350,7 @@ class RequestSanitizationMiddleware(BaseHTTPMiddleware):
# Check request size
content_length = request.headers.get("content-length")
if content_length and int(content_length) > self.max_request_size:
logger.warning(f"Request too large: {content_length} bytes")
logger.warning("Request too large: %s bytes", content_length)
return JSONResponse(
status_code=413,
content={"detail": "Request Entity Too Large"},
@@ -361,7 +361,7 @@ class RequestSanitizationMiddleware(BaseHTTPMiddleware):
if isinstance(value, str):
sanitized = self._sanitize_value(value)
if sanitized is None:
logger.warning(f"Malicious query parameter detected: {key}")
logger.warning("Malicious query parameter detected: %s", key)
return JSONResponse(
status_code=400,
content={"detail": "Malicious request detected"},
@@ -372,7 +372,7 @@ class RequestSanitizationMiddleware(BaseHTTPMiddleware):
if isinstance(value, str):
sanitized = self._sanitize_value(value)
if sanitized is None:
logger.warning(f"Malicious path parameter detected: {key}")
logger.warning("Malicious path parameter detected: %s", key)
return JSONResponse(
status_code=400,
content={"detail": "Malicious request detected"},

View File

@@ -83,6 +83,30 @@ class AnimeSeriesResponse(BaseModel):
return v
class AnimeMetadataUpdate(BaseModel):
"""Request model for updating anime metadata (key, tmdb_id, tvdb_id)."""
key: Optional[str] = Field(None, description="New series key (URL-safe, lowercase)")
tmdb_id: Optional[int] = Field(None, ge=1, description="TMDB ID (positive integer)")
tvdb_id: Optional[int] = Field(None, ge=1, description="TVDB ID (positive integer)")
@field_validator('key', mode='before')
@classmethod
def validate_key_format(cls, v: Optional[str]) -> Optional[str]:
"""Validate key is URL-safe lowercase with hyphens only."""
if v is None:
return v
v = v.strip().lower()
if not v:
raise ValueError("Key cannot be empty")
if not KEY_PATTERN.match(v):
raise ValueError(
"Key must contain only lowercase letters, numbers, and hyphens. "
"Cannot start or end with a hyphen."
)
return v
class SearchRequest(BaseModel):
"""Request payload for searching series."""

View File

@@ -73,6 +73,9 @@ class SetupRequest(BaseModel):
scheduler_auto_download_after_rescan: Optional[bool] = Field(
default=False, description="Auto-download missing episodes after rescan"
)
scheduler_folder_scan_enabled: Optional[bool] = Field(
default=False, description="Run folder maintenance during scheduled run"
)
# Logging configuration
logging_level: Optional[str] = Field(

View File

@@ -39,6 +39,23 @@ class SchedulerConfig(BaseModel):
description="Automatically queue and start downloads for all missing "
"episodes after a scheduled rescan completes.",
)
folder_scan_enabled: bool = Field(
default=False,
description="Run folder maintenance (NFO repair, folder renaming, "
"poster checks) during the scheduled run.",
)
# Legacy alias fields — read via Pydantic alias
auto_download: Optional[bool] = Field(default=None, alias="auto_download")
folder_scan: Optional[bool] = Field(default=None, alias="folder_scan")
def __init__(self, **data):
super().__init__(**data)
# Map legacy keys to primary fields only when primary key absent from data.
# "key in data" checks for explicit presence (even False/None), not just truthiness.
if self.auto_download is not None and "auto_download_after_rescan" not in data:
object.__setattr__(self, "auto_download_after_rescan", self.auto_download)
if self.folder_scan is not None and "folder_scan_enabled" not in data:
object.__setattr__(self, "folder_scan_enabled", self.folder_scan)
@field_validator("schedule_time")
@classmethod
@@ -64,6 +81,22 @@ class SchedulerConfig(BaseModel):
)
return v
def model_dump(self, **kwargs) -> Dict[str, object]:
"""Serialize, excluding legacy alias fields when they are None.
The alias fields (auto_download, folder_scan) must not be written to
config.json as null entries, otherwise a roundtrip load sees the key
present (哪怕 value is None) and skips the alias-to-primary mapping.
"""
data = super().model_dump(**kwargs)
# Drop None alias fields so they don't pollute config.json.
# They are still settable via the constructor for backward compatibility.
if data.get("auto_download") is None:
data.pop("auto_download", None)
if data.get("folder_scan") is None:
data.pop("folder_scan", None)
return data
class BackupConfig(BaseModel):
"""Configuration for automatic backups of application data."""
@@ -166,6 +199,12 @@ class AppConfig(BaseModel):
logging: LoggingConfig = Field(default_factory=LoggingConfig)
backup: BackupConfig = Field(default_factory=BackupConfig)
nfo: NFOConfig = Field(default_factory=NFOConfig)
scan_key_overrides: Dict[str, str] = Field(
default_factory=dict,
description="Map of folder names to provider keys for scan overrides. "
"Used when auto-generated keys from folder names are incorrect. "
"Format: {\"Folder Name\": \"actual-provider-key\"}"
)
other: Dict[str, object] = Field(
default_factory=dict, description="Arbitrary other settings"
)
@@ -204,6 +243,7 @@ class ConfigUpdate(BaseModel):
logging: Optional[LoggingConfig] = None
backup: Optional[BackupConfig] = None
nfo: Optional[NFOConfig] = None
scan_key_overrides: Optional[Dict[str, str]] = None
other: Optional[Dict[str, object]] = None
def apply_to(self, current: AppConfig) -> AppConfig:
@@ -220,6 +260,8 @@ class ConfigUpdate(BaseModel):
data["backup"] = self.backup.model_dump()
if self.nfo is not None:
data["nfo"] = self.nfo.model_dump()
if self.scan_key_overrides is not None:
data["scan_key_overrides"] = self.scan_key_overrides
if self.other is not None:
merged = dict(current.other or {})
merged.update(self.other)

View File

@@ -22,6 +22,7 @@ class DownloadStatus(str, Enum):
COMPLETED = "completed"
FAILED = "failed"
CANCELLED = "cancelled"
PERMANENTLY_FAILED = "permanently_failed"
class DownloadPriority(str, Enum):

View File

@@ -355,3 +355,29 @@ class NFOMissingResponse(BaseModel):
...,
description="List of series missing NFO"
)
class NfoDiagnosticsResponse(BaseModel):
"""Response for NFO diagnostics showing missing required tags."""
has_nfo: bool = Field(..., description="Whether tvshow.nfo exists")
nfo_path: Optional[str] = Field(None, description="Path to NFO file if exists")
missing_tags: List[str] = Field(
default_factory=list,
description="List of missing required tag names"
)
required_tags: List[str] = Field(
default_factory=list,
description="All required tag names for reference"
)
class NfoRepairResponse(BaseModel):
"""Response after NFO repair attempt."""
success: bool = Field(..., description="Whether repair succeeded")
message: str = Field(..., description="Human-readable result message")
repaired_tags: List[str] = Field(
default_factory=list,
description="Tags that were missing before repair"
)

View File

@@ -498,13 +498,19 @@ class AnimeService:
logger.info("No series found in SeriesApp")
return []
# Build NFO metadata map and filter data from database
nfo_map = {}
series_with_no_episodes = set()
# Build NFO metadata map, episode dict, and filter data from database.
# Using DB as authoritative source for episodeDict ensures that
# episodes marked is_downloaded=True are never shown as missing,
# even if the in-memory state is stale.
nfo_map: dict = {}
db_episode_dict_map: dict[str, dict[int, list[int]]] = {}
series_with_no_episodes: set = set()
async with get_db_session() as db:
# Get all series NFO metadata using service layer
db_series_list = await AnimeSeriesService.get_all(db)
# Single query: load all series with their episodes eagerly
db_series_list = await AnimeSeriesService.get_all(
db, with_episodes=True
)
for db_series in db_series_list:
nfo_created = (
@@ -522,21 +528,38 @@ class AnimeService:
"tmdb_id": db_series.tmdb_id,
"tvdb_id": db_series.tvdb_id,
"series_id": db_series.id,
"loading_status": db_series.loading_status,
"loading_error": db_series.loading_error,
}
# Build episodeDict from DB, skipping is_downloaded=True
# episodes so they are never shown as missing in the UI.
ep_dict: dict[int, list[int]] = {}
if db_series.episodes:
for ep in db_series.episodes:
if ep.is_downloaded:
continue
if ep.season not in ep_dict:
ep_dict[ep.season] = []
ep_dict[ep.season].append(ep.episode_number)
for s in ep_dict:
ep_dict[s].sort()
db_episode_dict_map[db_series.folder] = ep_dict
# If filter is "no_episodes", get series with no
# downloaded episodes
if filter_type == "no_episodes":
# Use service method to get series with
# undownloaded episodes
series_no_downloads = (
await AnimeSeriesService
.get_series_with_no_episodes(db)
# If filter is "missing_episodes", get series with any missing episodes
if filter_type == "missing_episodes":
series_missing = (
await AnimeSeriesService.get_series_with_missing_episodes(db)
)
series_with_no_episodes = {
s.folder for s in series_no_downloads
}
series_with_missing_episodes = {s.folder for s in series_missing}
# If filter is "no_episodes", get series with no downloaded episodes
if filter_type == "no_episodes":
series_no_downloads = (
await AnimeSeriesService.get_series_with_no_episodes(db)
)
series_with_no_episodes = {s.folder for s in series_no_downloads}
# Build result list with enriched metadata
result_list = []
for serie in series:
@@ -544,9 +567,17 @@ class AnimeService:
name = getattr(serie, "name", "")
site = getattr(serie, "site", "")
folder = getattr(serie, "folder", "")
episode_dict = getattr(serie, "episodeDict", {}) or {}
# Use DB-backed episodeDict (is_downloaded=True already filtered out)
# with in-memory episodeDict as fallback if the series isn't in DB yet.
episode_dict = db_episode_dict_map.get(
folder,
getattr(serie, "episodeDict", {}) or {}
)
# Apply filter if specified
if filter_type == "missing_episodes":
if folder not in series_with_missing_episodes:
continue
if filter_type == "no_episodes":
if folder not in series_with_no_episodes:
continue
@@ -567,6 +598,8 @@ class AnimeService:
"tmdb_id": nfo_data.get("tmdb_id"),
"tvdb_id": nfo_data.get("tvdb_id"),
"series_id": nfo_data.get("series_id"),
"loading_status": nfo_data.get("loading_status"),
"loading_error": nfo_data.get("loading_error"),
}
result_list.append(series_dict)
@@ -811,18 +844,24 @@ class AnimeService:
- Adds new missing episodes that are not in the database
- Removes episodes from database that are no longer missing
(i.e., the file has been added to the filesystem)
- Preserves episodes marked as downloaded (is_downloaded=True)
so download history is not lost
"""
from src.server.database.service import AnimeSeriesService, EpisodeService
# Get existing episodes from database
# Get existing episodes from database (all episodes, including downloaded)
existing_episodes = await EpisodeService.get_by_series(db, existing.id)
# Build dict of existing episodes: {season: {ep_num: episode_id}}
# and track which ones are already downloaded
existing_dict: dict[int, dict[int, int]] = {}
downloaded_set: set[tuple[int, int]] = set()
for ep in existing_episodes:
if ep.season not in existing_dict:
existing_dict[ep.season] = {}
existing_dict[ep.season][ep.episode_number] = ep.id
if ep.is_downloaded:
downloaded_set.add((ep.season, ep.episode_number))
# Get new missing episodes from scan
new_dict = serie.episodeDict or {}
@@ -853,9 +892,22 @@ class AnimeService:
# Remove episodes from database that are no longer missing
# (i.e., the episode file now exists on the filesystem)
# BUT: preserve episodes that are already downloaded (is_downloaded=True)
# so we don't lose download history
for season, eps_dict in existing_dict.items():
for ep_num, episode_id in eps_dict.items():
if (season, ep_num) not in new_missing_set:
# Skip already-downloaded episodes — they should stay in DB
# with is_downloaded=True to preserve download history
if (season, ep_num) in downloaded_set:
logger.debug(
"Preserving downloaded episode in database: "
"%s S%02dE%02d",
serie.key,
season,
ep_num
)
continue
await EpisodeService.delete(db, episode_id)
logger.info(
"Removed episode from database (no longer missing): "
@@ -885,6 +937,10 @@ class AnimeService:
This method is called during initialization and after rescans
to ensure the in-memory series list is in sync with the database.
Only episodes where is_downloaded=False are loaded into the
in-memory episodeDict, so downloaded episodes are not shown
as missing.
"""
from src.core.entities.series import Serie
from src.server.database.connection import get_db_session
@@ -899,9 +955,14 @@ class AnimeService:
series_list = []
for anime_series in anime_series_list:
# Build episode_dict from episodes relationship
# Only include episodes that are NOT downloaded (is_downloaded=False)
# so the missing-episode list stays accurate
episode_dict: dict[int, list[int]] = {}
if anime_series.episodes:
for episode in anime_series.episodes:
# Skip downloaded episodes — they are not missing
if episode.is_downloaded:
continue
season = episode.season
if season not in episode_dict:
episode_dict[season] = []
@@ -915,7 +976,8 @@ class AnimeService:
name=anime_series.name,
site=anime_series.site,
folder=anime_series.folder,
episodeDict=episode_dict
episodeDict=episode_dict,
year=anime_series.year
)
series_list.append(serie)
@@ -941,12 +1003,12 @@ class AnimeService:
# Get the serie from in-memory cache
if not hasattr(self._app, 'list') or not hasattr(self._app.list, 'keyDict'):
logger.warning(f"Series list not available for episode sync: {series_key}")
logger.warning("Series list not available for episode sync: %s", series_key)
return 0
serie = self._app.list.keyDict.get(series_key)
if not serie:
logger.warning(f"Series not found in memory for episode sync: {series_key}")
logger.warning("Series not found in memory for episode sync: %s", series_key)
return 0
episodes_added = 0
@@ -955,26 +1017,42 @@ class AnimeService:
# Get series from database
series_db = await AnimeSeriesService.get_by_key(db, series_key)
if not series_db:
logger.warning(f"Series not found in database: {series_key}")
logger.warning("Series not found in database: %s", series_key)
return 0
# Get existing episodes from database
# Get existing episodes from database (all, including downloaded)
existing_episodes = await EpisodeService.get_by_series(db, series_db.id)
# Build dict of existing episodes: {season: {ep_num: episode_id}}
# and track which ones are already downloaded
existing_dict: dict[int, dict[int, int]] = {}
downloaded_set: set[tuple[int, int]] = set()
for ep in existing_episodes:
if ep.season not in existing_dict:
existing_dict[ep.season] = {}
existing_dict[ep.season][ep.episode_number] = ep.id
if ep.is_downloaded:
downloaded_set.add((ep.season, ep.episode_number))
# Get new missing episodes from in-memory serie
new_dict = serie.episodeDict or {}
# Add new missing episodes that are not in the database
# Skip episodes that are already downloaded (is_downloaded=True)
# so we don't re-add them as missing after they've been downloaded
for season, episode_numbers in new_dict.items():
existing_season_eps = existing_dict.get(season, {})
for ep_num in episode_numbers:
# Skip if already downloaded — don't re-add as missing
if (season, ep_num) in downloaded_set:
logger.debug(
"Skipping already-downloaded episode: "
"%s S%02dE%02d",
series_key,
season,
ep_num,
)
continue
if ep_num not in existing_season_eps:
await EpisodeService.create(
db=db,
@@ -996,7 +1074,7 @@ class AnimeService:
try:
await self._broadcast_series_updated(series_key)
except Exception as e:
logger.warning(f"Failed to broadcast series update: {e}")
logger.warning("Failed to broadcast series update: %s", e)
return episodes_added
@@ -1010,20 +1088,23 @@ class AnimeService:
if hasattr(self._app, 'list') and hasattr(self._app.list, 'keyDict'):
serie = self._app.list.keyDict.get(series_key)
if serie:
# Convert episode dict keys to strings for JSON
missing_episodes = {str(k): v for k, v in (serie.episodeDict or {}).items()}
total_missing = sum(len(eps) for eps in missing_episodes.values())
# Fetch NFO metadata from database
# Fetch NFO metadata and episodes from database.
# Using DB as the authoritative source for missing_episodes
# ensures that episodes marked is_downloaded=True are never
# broadcast as missing, even if in-memory state is stale.
has_nfo = False
nfo_created_at = None
nfo_updated_at = None
tmdb_id = None
tvdb_id = None
missing_episodes: dict[str, list] = {}
try:
from src.server.database.connection import get_db_session
from src.server.database.service import AnimeSeriesService
from src.server.database.service import (
AnimeSeriesService,
EpisodeService,
)
async with get_db_session() as db:
db_series = await AnimeSeriesService.get_by_key(db, series_key)
@@ -1039,12 +1120,31 @@ class AnimeService:
)
tmdb_id = db_series.tmdb_id
tvdb_id = db_series.tvdb_id
# Build missing_episodes from DB, skipping is_downloaded=True
db_episodes = await EpisodeService.get_by_series(
db, db_series.id, only_missing=True
)
for ep in db_episodes:
key_str = str(ep.season)
if key_str not in missing_episodes:
missing_episodes[key_str] = []
missing_episodes[key_str].append(ep.episode_number)
for s in missing_episodes:
missing_episodes[s].sort()
except Exception as e:
logger.warning(
"Could not fetch NFO data for %s: %s",
"Could not fetch series data for %s from DB: %s",
series_key,
str(e)
)
# Fallback to in-memory state
missing_episodes = {
str(k): v
for k, v in (serie.episodeDict or {}).items()
}
total_missing = sum(len(eps) for eps in missing_episodes.values())
series_data = {
"key": serie.key,
@@ -1458,19 +1558,17 @@ def get_anime_service(series_app: SeriesApp) -> AnimeService:
return AnimeService(series_app)
async def sync_series_from_data_files(
async def sync_legacy_series_to_db(
anime_directory: str,
log_instance=None # pylint: disable=unused-argument
) -> int:
"""
Sync series from data files to the database.
One-time legacy sync: import any series from 'data' files
not already in the database.
Scans the anime directory for data files and adds any new series
to the database. Existing series are skipped (no duplicates).
This function is typically called during application startup to ensure
series metadata stored in filesystem data files is available in the
database.
Deprecated: Series are now loaded directly from the database.
This function remains for backwards compatibility with legacy
file-based data during migration.
Args:
anime_directory: Path to the anime directory with data files
@@ -1482,6 +1580,11 @@ async def sync_series_from_data_files(
"""
# Always use structlog for structured logging with keyword arguments
log = structlog.get_logger(__name__)
log.warning(
"sync_legacy_series_to_db is deprecated. "
"Series are now loaded directly from database."
)
try:
from src.server.database.connection import get_db_session
@@ -1546,6 +1649,7 @@ async def sync_series_from_data_files(
name=serie.name,
site=serie.site,
folder=serie.folder,
year=serie.year if hasattr(serie, 'year') else None,
)
# Create Episode records for each episode in episodeDict

View File

@@ -22,6 +22,7 @@ from typing import Any, Dict, List, Optional
import structlog
from src.core.services.nfo_factory import get_nfo_factory
from src.server.services.websocket_service import WebSocketService
logger = structlog.get_logger(__name__)
@@ -188,7 +189,7 @@ class BackgroundLoaderService:
"""
# Check if task already exists
if key in self.active_tasks:
logger.debug(f"Task for series {key} already exists, skipping")
logger.debug("Task for series %s already exists, skipping", key)
return
task = SeriesLoadingTask(
@@ -202,7 +203,7 @@ class BackgroundLoaderService:
self.active_tasks[key] = task
await self.task_queue.put(task)
logger.info(f"Added loading task for series: {key}")
logger.info("Added loading task for series: %s", key)
# Broadcast initial status
await self._broadcast_status(task)
@@ -277,7 +278,7 @@ class BackgroundLoaderService:
Args:
worker_id: Unique identifier for this worker instance
"""
logger.info(f"Background worker {worker_id} started processing tasks")
logger.info("Background worker %s started processing tasks", worker_id)
while not self._shutdown:
try:
@@ -301,14 +302,14 @@ class BackgroundLoaderService:
# No task available, continue loop
continue
except asyncio.CancelledError:
logger.info(f"Worker {worker_id} task cancelled")
logger.info("Worker %s task cancelled", worker_id)
break
except Exception as e:
logger.exception(f"Error in background worker {worker_id}: {e}")
logger.exception("Error in background worker %s: %s", worker_id, e)
# Continue processing other tasks
continue
logger.info(f"Background worker {worker_id} stopped")
logger.info("Background worker %s stopped", worker_id)
async def _load_series_data(self, task: SeriesLoadingTask) -> None:
"""Load all missing data for a series.
@@ -362,10 +363,10 @@ class BackgroundLoaderService:
# Broadcast completion
await self._broadcast_status(task)
logger.info(f"Successfully loaded all data for series: {task.key}")
logger.info("Successfully loaded all data for series: %s", task.key)
except Exception as e:
logger.exception(f"Error loading series data: {e}")
logger.exception("Error loading series data: %s", e)
task.status = LoadingStatus.FAILED
task.error = str(e)
task.completed_at = datetime.now(timezone.utc)
@@ -400,14 +401,14 @@ class BackgroundLoaderService:
# Check if directory exists
if series_dir.exists() and series_dir.is_dir():
logger.debug(f"Found series directory: {series_dir}")
logger.debug("Found series directory: %s", series_dir)
return series_dir
else:
logger.warning(f"Series directory not found: {series_dir}")
logger.warning("Series directory not found: %s", series_dir)
return None
except Exception as e:
logger.error(f"Error finding series directory for {task.key}: {e}")
logger.error("Error finding series directory for %s: %s", task.key, e)
return None
async def _scan_series_episodes(self, series_dir: Path, task: SeriesLoadingTask) -> Dict[str, List[str]]:
@@ -440,13 +441,13 @@ class BackgroundLoaderService:
if episodes:
episodes_by_season[season_name] = episodes
logger.debug(f"Found {len(episodes)} episodes in {season_name}")
logger.debug("Found %s episodes in %s", len(episodes), season_name)
logger.info(f"Scanned {len(episodes_by_season)} seasons for {task.key}")
logger.info("Scanned %s seasons for %s", len(episodes_by_season), task.key)
return episodes_by_season
except Exception as e:
logger.error(f"Error scanning episodes for {task.key}: {e}")
logger.error("Error scanning episodes for %s: %s", task.key, e)
return {}
async def _load_episodes(self, task: SeriesLoadingTask, db: Any) -> None:
@@ -466,7 +467,7 @@ class BackgroundLoaderService:
# Find series directory without full rescan
series_dir = await self._find_series_directory(task)
if not series_dir:
logger.error(f"Cannot load episodes - directory not found for {task.key}")
logger.error("Cannot load episodes - directory not found for %s", task.key)
task.progress["episodes"] = False
return
@@ -474,7 +475,7 @@ class BackgroundLoaderService:
episodes_by_season = await self._scan_series_episodes(series_dir, task)
if not episodes_by_season:
logger.warning(f"No episodes found for {task.key}")
logger.warning("No episodes found for %s", task.key)
task.progress["episodes"] = False
return
@@ -489,10 +490,10 @@ class BackgroundLoaderService:
series_db.loading_status = "loading_episodes"
await db.commit()
logger.info(f"Episodes loaded for series: {task.key} ({len(episodes_by_season)} seasons)")
logger.info("Episodes loaded for series: %s (%s seasons)", task.key, len(episodes_by_season))
except Exception as e:
logger.exception(f"Failed to load episodes for {task.key}: {e}")
logger.exception("Failed to load episodes for %s: %s", task.key, e)
raise
async def _load_nfo_and_images(self, task: SeriesLoadingTask, db: Any) -> bool:
@@ -521,7 +522,7 @@ class BackgroundLoaderService:
# Check if NFO already exists
if self.series_app.nfo_service.has_nfo(task.folder):
logger.info(f"NFO already exists for {task.key}, skipping creation")
logger.info("NFO already exists for %s, skipping creation", task.key)
# Update task progress
task.progress["nfo"] = True
@@ -536,31 +537,46 @@ class BackgroundLoaderService:
if not series_db.has_nfo:
series_db.has_nfo = True
series_db.nfo_created_at = datetime.now(timezone.utc)
logger.info(f"Updated database with existing NFO for {task.key}")
logger.info("Updated database with existing NFO for %s", task.key)
if not series_db.logo_loaded:
series_db.logo_loaded = True
if not series_db.images_loaded:
series_db.images_loaded = True
await db.commit()
logger.info(f"Existing NFO found and database updated for series: {task.key}")
logger.info("Existing NFO found and database updated for series: %s", task.key)
return False
# NFO doesn't exist, create it
await self._broadcast_status(task, "Generating NFO file...")
logger.info(f"Creating new NFO for {task.key}")
# Use existing NFOService to create NFO with all images
# This reuses all existing TMDB API logic and image downloading
nfo_path = await self.series_app.nfo_service.create_tvshow_nfo(
serie_name=task.name,
serie_folder=task.folder,
year=task.year,
download_poster=True,
download_logo=True,
download_fanart=True
)
logger.info("Creating new NFO for %s", task.key)
# Create a fresh NFOService for this task to avoid shared TMDB session closure
try:
factory = get_nfo_factory()
nfo_service = factory.create()
except ValueError:
logger.warning(
"NFOService unavailable for %s, skipping NFO/images",
task.key
)
task.progress["nfo"] = False
task.progress["logo"] = False
task.progress["images"] = False
return False
try:
nfo_path = await nfo_service.create_tvshow_nfo(
serie_name=task.name,
serie_folder=task.folder,
year=task.year,
download_poster=True,
download_logo=True,
download_fanart=True
)
finally:
await nfo_service.close()
# Update task progress
task.progress["nfo"] = True
task.progress["logo"] = True
@@ -577,11 +593,11 @@ class BackgroundLoaderService:
series_db.loading_status = "loading_nfo"
await db.commit()
logger.info(f"NFO and images created and loaded for series: {task.key}")
logger.info("NFO and images created and loaded for series: %s", task.key)
return True
except Exception as e:
logger.exception(f"Failed to load NFO/images for {task.key}: {e}")
logger.exception("Failed to load NFO/images for %s: %s", task.key, e)
# Don't fail the entire task if NFO fails
task.progress["nfo"] = False
task.progress["logo"] = False
@@ -611,7 +627,7 @@ class BackgroundLoaderService:
# Scan for missing episodes using the targeted scan method
# This populates the episodeDict without triggering a full rescan
logger.info(f"Scanning missing episodes for {task.key}")
logger.info("Scanning missing episodes for %s", task.key)
missing_episodes = self.series_app.serie_scanner.scan_single_series(
key=task.key,
folder=task.folder
@@ -628,12 +644,12 @@ class BackgroundLoaderService:
# Notify anime_service to sync episodes to database
# Use sync_single_series_after_scan which gets data from serie_scanner.keyDict
if self.anime_service:
logger.debug(f"Calling anime_service.sync_single_series_after_scan for {task.key}")
logger.debug("Calling anime_service.sync_single_series_after_scan for %s", task.key)
await self.anime_service.sync_single_series_after_scan(task.key)
else:
logger.warning(f"anime_service not available, episodes will not be synced to DB for {task.key}")
logger.warning("anime_service not available, episodes will not be synced to DB for %s", task.key)
else:
logger.info(f"No missing episodes found for {task.key}")
logger.info("No missing episodes found for %s", task.key)
# Update series status in database
from src.server.database.service import AnimeSeriesService
@@ -648,7 +664,7 @@ class BackgroundLoaderService:
task.progress["episodes"] = True
except Exception as e:
logger.exception(f"Failed to scan missing episodes for {task.key}: {e}")
logger.exception("Failed to scan missing episodes for %s: %s", task.key, e)
task.progress["episodes"] = False
async def _broadcast_status(

View File

@@ -170,14 +170,17 @@ class InMemoryCacheBackend(CacheBackend):
"""Get value from cache."""
async with self._lock:
if key not in self.cache:
logger.debug("Cache miss for key: %s", key)
return None
item = self.cache[key]
if self._is_expired(item):
logger.debug("Cache expired for key: %s", key)
del self.cache[key]
return None
logger.debug("Cache hit for key: %s", key)
return item["value"]
async def set(
@@ -196,6 +199,7 @@ class InMemoryCacheBackend(CacheBackend):
"expiry": expiry,
"created": datetime.utcnow(),
}
logger.debug("Cached key: %s (ttl=%s)", key, ttl)
return True
async def delete(self, key: str) -> bool:
@@ -203,7 +207,9 @@ class InMemoryCacheBackend(CacheBackend):
async with self._lock:
if key in self.cache:
del self.cache[key]
logger.debug("Deleted cache key: %s", key)
return True
logger.debug("Cache delete skipped; key not found: %s", key)
return False
async def exists(self, key: str) -> bool:
@@ -223,6 +229,7 @@ class InMemoryCacheBackend(CacheBackend):
"""Clear all cached values."""
async with self._lock:
self.cache.clear()
logger.debug("Cleared in-memory cache")
return True
async def get_many(self, keys: List[str]) -> Dict[str, Any]:
@@ -281,13 +288,14 @@ class RedisCacheBackend(CacheBackend):
import aioredis
self._redis = await aioredis.create_redis_pool(self.redis_url)
logger.debug("Connected to Redis at %s", self.redis_url)
except ImportError:
logger.error(
"aioredis not installed. Install with: pip install aioredis"
)
raise
except Exception as e:
logger.error(f"Failed to connect to Redis: {e}")
logger.error("Failed to connect to Redis: %s", e)
raise
return self._redis
@@ -308,7 +316,7 @@ class RedisCacheBackend(CacheBackend):
return pickle.loads(data)
except Exception as e:
logger.error(f"Redis get error: {e}")
logger.error("Redis get error: %s", e)
return None
async def set(
@@ -327,7 +335,7 @@ class RedisCacheBackend(CacheBackend):
return True
except Exception as e:
logger.error(f"Redis set error: {e}")
logger.error("Redis set error: %s", e)
return False
async def delete(self, key: str) -> bool:
@@ -338,7 +346,7 @@ class RedisCacheBackend(CacheBackend):
return result > 0
except Exception as e:
logger.error(f"Redis delete error: {e}")
logger.error("Redis delete error: %s", e)
return False
async def exists(self, key: str) -> bool:
@@ -348,7 +356,7 @@ class RedisCacheBackend(CacheBackend):
return await redis.exists(self._make_key(key))
except Exception as e:
logger.error(f"Redis exists error: {e}")
logger.error("Redis exists error: %s", e)
return False
async def clear(self) -> bool:
@@ -361,7 +369,7 @@ class RedisCacheBackend(CacheBackend):
return True
except Exception as e:
logger.error(f"Redis clear error: {e}")
logger.error("Redis clear error: %s", e)
return False
async def get_many(self, keys: List[str]) -> Dict[str, Any]:
@@ -379,7 +387,7 @@ class RedisCacheBackend(CacheBackend):
return result
except Exception as e:
logger.error(f"Redis get_many error: {e}")
logger.error("Redis get_many error: %s", e)
return {}
async def set_many(
@@ -392,7 +400,7 @@ class RedisCacheBackend(CacheBackend):
return True
except Exception as e:
logger.error(f"Redis set_many error: {e}")
logger.error("Redis set_many error: %s", e)
return False
async def delete_pattern(self, pattern: str) -> int:
@@ -409,7 +417,7 @@ class RedisCacheBackend(CacheBackend):
return 0
except Exception as e:
logger.error(f"Redis delete_pattern error: {e}")
logger.error("Redis delete_pattern error: %s", e)
return 0
async def close(self) -> None:

View File

@@ -8,6 +8,7 @@ This service handles:
"""
import json
import logging
import shutil
from datetime import datetime
from pathlib import Path
@@ -15,6 +16,8 @@ from typing import Dict, List, Optional
from src.server.models.config import AppConfig, ConfigUpdate, ValidationResult
logger = logging.getLogger(__name__)
class ConfigServiceError(Exception):
"""Base exception for configuration service errors."""
@@ -41,7 +44,7 @@ class ConfigService:
"""
# Current configuration schema version
CONFIG_VERSION = "1.0.0"
CONFIG_VERSION = "1.0.1"
def __init__(
self,
@@ -136,12 +139,18 @@ class ConfigService:
self.create_backup()
except ConfigBackupError as e:
# Log but don't fail save operation
print(f"Warning: Failed to create backup: {e}")
logger.warning("Failed to create backup: %s", e)
# Save configuration with version
data = config.model_dump()
data["version"] = self.CONFIG_VERSION
# Re-serialize SchedulerConfig through its overridden model_dump so
# that None legacy alias fields are stripped before writing to disk.
# Pydantic converts nested models to plain dicts in model_dump() output,
# so we call the override explicitly on the scheduler field.
data["scheduler"] = config.scheduler.model_dump()
# Write to temporary file first for atomic operation
temp_path = self.config_path.with_suffix(".tmp")
try:

View File

@@ -14,6 +14,7 @@ import uuid
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone
from pathlib import Path
from typing import TYPE_CHECKING, Dict, List, Optional
import structlog
@@ -68,6 +69,7 @@ class DownloadService:
progress_service: Optional progress service for tracking
"""
self._anime_service = anime_service
self._directory = anime_service._directory
self._max_retries = max_retries
self._progress_service = progress_service or get_progress_service()
@@ -79,6 +81,9 @@ class DownloadService:
self._pending_queue: deque[DownloadItem] = deque()
# Helper dict for O(1) lookup of pending items by ID
self._pending_items_by_id: Dict[str, DownloadItem] = {}
# Helper dict for O(1) lookup of pending items by episode identity
# Key: (serie_id, season, episode), Value: item ID
self._pending_by_episode: Dict[tuple, str] = {}
self._active_download: Optional[DownloadItem] = None
self._completed_items: deque[DownloadItem] = deque(maxlen=100)
self._failed_items: deque[DownloadItem] = deque(maxlen=50)
@@ -165,6 +170,27 @@ class DownloadService:
logger.error("Failed to save item to database: %s", e)
return item
async def _set_status_in_database(
self,
item_id: str,
status: str,
) -> bool:
"""Set status on an item in the database.
Args:
item_id: Download item ID
status: New status value
Returns:
True if update succeeded
"""
try:
repository = self._get_repository()
return await repository.set_status(item_id, status)
except Exception as e:
logger.error("Failed to set status in database: %s", e)
return False
async def _set_error_in_database(
self,
item_id: str,
@@ -186,6 +212,25 @@ class DownloadService:
logger.error("Failed to set error in database: %s", e)
return False
async def _increment_retry_in_database(
self,
item_id: str,
) -> bool:
"""Increment retry count on an item in the database.
Args:
item_id: Download item ID
Returns:
True if update succeeded
"""
try:
repository = self._get_repository()
return await repository.increment_retry(item_id)
except Exception as e:
logger.error("Failed to increment retry in database: %s", e)
return False
async def _delete_from_database(self, item_id: str) -> bool:
"""Delete an item from the database.
@@ -207,52 +252,259 @@ class DownloadService:
series_key: str,
season: int,
episode: int,
serie_folder: Optional[str] = None,
) -> bool:
"""Remove a downloaded episode from the missing episodes list.
"""Mark a downloaded episode as downloaded instead of deleting it.
Called when a download completes successfully to update the
database so the episode no longer appears as missing.
Called when a download completes successfully to update both:
1. The database (Episode record marked is_downloaded=True)
2. The in-memory Serie.episodeDict and series_list cache
This ensures the episode no longer appears as missing in both
the API responses and the UI immediately after download,
while preserving the download history.
Args:
series_key: Unique provider key for the series
season: Season number
episode: Episode number within season
serie_folder: Series folder name (required for file_path)
Returns:
True if episode was removed, False otherwise
True if episode was updated, False otherwise
"""
try:
from src.server.database.connection import get_db_session
from src.server.database.service import EpisodeService
from src.server.database.service import AnimeSeriesService, EpisodeService
logger.info(
"Attempting to mark episode as downloaded in DB: "
"%s S%02dE%02d",
series_key,
season,
episode,
)
async with get_db_session() as db:
deleted = await EpisodeService.delete_by_series_and_episode(
# Get series by key to find series_id
series = await AnimeSeriesService.get_by_key(db, series_key)
if not series:
logger.warning(
"Series not found for key: %s", series_key
)
return False
# Get episode by series_id, season, episode_number
ep = await EpisodeService.get_by_episode(
db=db,
series_key=series_key,
series_id=series.id,
season=season,
episode_number=episode,
)
if deleted:
if not ep:
logger.warning(
"Episode not found in DB: %s S%02dE%02d",
series_key,
season,
episode,
)
return False
# Construct file_path if serie_folder provided
file_path = None
if serie_folder:
season_folder = f"Season {season}"
file_path = str(
Path(self._directory) / serie_folder / season_folder
)
# Mark episode as downloaded instead of deleting
updated = await EpisodeService.mark_downloaded(
db=db,
episode_id=ep.id,
file_path=file_path or "",
)
if updated:
logger.info(
"Removed episode from missing list: "
"Marked episode as downloaded in DB: "
"%s S%02dE%02d, file_path=%s",
series_key,
season,
episode,
file_path,
)
else:
logger.warning(
"Failed to mark episode as downloaded: "
"%s S%02dE%02d",
series_key,
season,
episode,
)
# Clear the anime service cache so list_missing
# returns updated data
try:
self._anime_service._cached_list_missing.cache_clear()
except Exception:
pass
return deleted
return False
# Update in-memory Serie.episodeDict so list_missing is
# immediately consistent without a full DB reload
self._remove_episode_from_memory(series_key, season, episode)
# Clear the anime service cache so list_missing
# re-reads from the (now updated) in-memory state
try:
self._anime_service._cached_list_missing.cache_clear()
logger.debug(
"Cleared list_missing cache after marking "
"%s S%02dE%02d as downloaded",
series_key,
season,
episode,
)
except Exception:
pass
# Broadcast real-time update to frontend so the series card
# immediately reflects the new downloaded state (no longer
# shows the episode as missing) without waiting for a full
# reload on DOWNLOAD_COMPLETED.
try:
await self._anime_service._broadcast_series_updated(
series_key
)
logger.debug(
"Broadcast series_updated after marking "
"%s S%02dE%02d as downloaded",
series_key,
season,
episode,
)
except Exception as broadcast_exc:
logger.warning(
"Failed to broadcast series update after marking "
"%s S%02dE%02d as downloaded: %s",
series_key,
season,
episode,
broadcast_exc,
)
return True
except Exception as e:
logger.error(
"Failed to remove episode from missing list: %s", e
"Failed to mark episode as downloaded: "
"%s S%02dE%02d - %s",
series_key,
season,
episode,
e,
)
return False
def _remove_episode_from_memory(
self,
series_key: str,
season: int,
episode: int,
) -> None:
"""Remove an episode from the in-memory Serie.episodeDict.
Updates the SeriesApp's keyDict so that list_missing and
series_list reflect the removal immediately without needing
a full database reload.
Args:
series_key: Unique provider key for the series
season: Season number
episode: Episode number within season
"""
try:
app = self._anime_service._app
serie = app.list.keyDict.get(series_key)
if not serie:
logger.debug(
"Series %s not found in keyDict, skipping "
"in-memory removal",
series_key,
)
return
ep_dict = serie.episodeDict
if season not in ep_dict:
logger.debug(
"Season %d not in episodeDict for %s, "
"skipping in-memory removal",
season,
series_key,
)
return
if episode in ep_dict[season]:
ep_dict[season].remove(episode)
logger.info(
"Removed episode from in-memory episodeDict: "
"%s S%02dE%02d (remaining in season: %s)",
series_key,
season,
episode,
ep_dict[season],
)
# Remove the season key if no episodes remain
if not ep_dict[season]:
del ep_dict[season]
logger.info(
"Removed empty season %d from episodeDict "
"for %s",
season,
series_key,
)
# Refresh series_list so GetMissingEpisode()
# reflects the change
app.series_list = app.list.GetMissingEpisode()
logger.info(
"Refreshed series_list: %d series with "
"missing episodes remaining",
len(app.series_list),
)
# Update deprecated data file if it exists
# DB is authoritative; data file is optional backup
serie_folder = serie.folder
data_path = Path(self._directory) / serie_folder / "data"
if data_path.exists():
try:
import warnings
with warnings.catch_warnings():
warnings.simplefilter("ignore", DeprecationWarning)
serie.save_to_file(str(data_path))
logger.debug(
"Updated data file after download: %s",
data_path,
)
except Exception as e:
logger.warning(
"Failed to update data file %s: %s",
data_path,
e,
)
else:
logger.debug(
"Episode %d not in season %d for %s, "
"already removed from memory",
episode,
season,
series_key,
)
except Exception as e:
logger.warning(
"Failed to remove episode from in-memory state: "
"%s S%02dE%02d - %s",
series_key,
season,
episode,
e,
)
async def _init_queue_progress(self) -> None:
"""Initialize the download queue progress tracking.
@@ -272,12 +524,21 @@ class DownloadService:
)
self._queue_progress_initialized = True
except Exception as e:
logger.error("Failed to initialize queue progress: %s", e)
# If the entry already exists (e.g. from a concurrent task),
# treat that as success — the progress is usable.
from src.server.services.progress_service import ProgressServiceError
if isinstance(e, ProgressServiceError) and "already exists" in str(e):
logger.debug(
"Queue progress already initialized by concurrent task"
)
self._queue_progress_initialized = True
else:
logger.error("Failed to initialize queue progress: %s", e)
def _add_to_pending_queue(
self, item: DownloadItem, front: bool = False
) -> None:
"""Add item to pending queue and update helper dict.
"""Add item to pending queue and update helper dicts.
Args:
item: Download item to add
@@ -288,9 +549,12 @@ class DownloadService:
else:
self._pending_queue.append(item)
self._pending_items_by_id[item.id] = item
# Track by episode identity for deduplication
ep_key = (item.serie_id, item.episode.season, item.episode.episode)
self._pending_by_episode[ep_key] = item.id
def _remove_from_pending_queue(self, item_or_id: str) -> Optional[DownloadItem]: # noqa: E501
"""Remove item from pending queue and update helper dict.
"""Remove item from pending queue and update helper dicts.
Args:
item_or_id: Item ID to remove
@@ -310,6 +574,10 @@ class DownloadService:
try:
self._pending_queue.remove(item)
del self._pending_items_by_id[item_id]
# Clean up episode tracking
ep_key = (item.serie_id, item.episode.season, item.episode.episode)
if self._pending_by_episode.get(ep_key) == item_id:
del self._pending_by_episode[ep_key]
return item
except (ValueError, KeyError):
return None
@@ -349,10 +617,35 @@ class DownloadService:
# Initialize queue progress tracking if not already done
await self._init_queue_progress()
# Filter out episodes already in pending queue
episodes_to_add = []
skipped_count = 0
seen_in_batch: set = set() # Track duplicates within this batch
for ep in episodes:
ep_key = (serie_id, ep.season, ep.episode)
if ep_key in self._pending_by_episode or ep_key in seen_in_batch:
logger.debug(
"Skipping duplicate episode in queue",
serie_key=serie_id,
season=ep.season,
episode=ep.episode,
)
skipped_count += 1
continue
seen_in_batch.add(ep_key)
episodes_to_add.append(ep)
if skipped_count > 0:
logger.info(
"Skipped %d duplicate episodes in queue",
skipped_count,
serie_key=serie_id,
)
created_ids = []
try:
for episode in episodes:
for episode in episodes_to_add:
item = DownloadItem(
id=self._generate_item_id(),
serie_id=serie_id,
@@ -636,8 +929,12 @@ class DownloadService:
"queue_status": queue_status.model_dump(mode="json")
},
)
# Reset flag so next queue run re-creates the progress entry
self._queue_progress_initialized = False
else:
logger.info("Queue processing stopped by user")
# Reset flag so next queue run re-creates the progress entry
self._queue_progress_initialized = False
async def start_next_download(self) -> Optional[str]:
"""Legacy method - redirects to start_queue_processing.
@@ -658,18 +955,21 @@ class DownloadService:
self._is_stopped = True
logger.info("Download processing stopped")
# Notify via progress service
queue_status = await self.get_queue_status()
await self._progress_service.update_progress(
progress_id="download_queue",
message="Queue processing stopped",
metadata={
"action": "queue_stopped",
"is_stopped": True,
"queue_status": queue_status.model_dump(mode="json"),
},
force_broadcast=True,
)
# Notify via progress service (guard against entry not existing)
try:
queue_status = await self.get_queue_status()
await self._progress_service.update_progress(
progress_id="download_queue",
message="Queue processing stopped",
metadata={
"action": "queue_stopped",
"is_stopped": True,
"queue_status": queue_status.model_dump(mode="json"),
},
force_broadcast=True,
)
except Exception as e:
logger.warning("Could not update queue progress on stop: %s", e)
async def get_queue_status(self) -> QueueStatus:
"""Get current status of all queues.
@@ -837,17 +1137,16 @@ class DownloadService:
if item.retry_count >= self._max_retries:
continue
# Move back to pending
# Move back to pending (retry_count will be incremented
# by _process_download when the item fails again)
self._failed_items.remove(item)
item.status = DownloadStatus.PENDING
item.retry_count += 1
item.error = None
item.progress = None
item.retry_count += 1
self._add_to_pending_queue(item)
retried_ids.append(item.id)
# Status is now managed in-memory only
logger.info(
"Retrying failed item: item_id=%s, retry_count=%d",
item.id,
@@ -855,18 +1154,23 @@ class DownloadService:
)
if retried_ids:
# Notify via progress service
queue_status = await self.get_queue_status()
await self._progress_service.update_progress(
progress_id="download_queue",
message=f"Retried {len(retried_ids)} failed items",
metadata={
"action": "items_retried",
"retried_ids": retried_ids,
"queue_status": queue_status.model_dump(mode="json"),
},
force_broadcast=True,
)
# Notify via progress service if available
try:
queue_status = await self.get_queue_status()
await self._progress_service.update_progress(
progress_id="download_queue",
message=f"Retried {len(retried_ids)} failed items",
metadata={
"action": "items_retried",
"retried_ids": retried_ids,
"queue_status": queue_status.model_dump(mode="json"),
},
force_broadcast=True,
)
except Exception as e:
logger.warning(
"Failed to broadcast retry progress: %s", e
)
return retried_ids
@@ -933,18 +1237,36 @@ class DownloadService:
self._completed_items.append(item)
# Delete completed item from database (status is in-memory)
logger.info(
"Download succeeded, cleaning up: item_id=%s, "
"serie_key=%s, S%02dE%02d",
item.id,
item.serie_id,
item.episode.season,
item.episode.episode,
)
# Delete completed item from download queue database
await self._delete_from_database(item.id)
# Remove episode from missing episodes list in database
await self._remove_episode_from_missing_list(
# Mark episode as downloaded in missing episodes list
# (both database and in-memory)
removed = await self._remove_episode_from_missing_list(
series_key=item.serie_id,
season=item.episode.season,
episode=item.episode.episode,
serie_folder=item.serie_folder,
)
logger.info(
"Download completed successfully: item_id=%s", item.id
"Download completed successfully: item_id=%s, "
"serie_key=%s, S%02dE%02d, "
"missing_episode_removed=%s",
item.id,
item.serie_id,
item.episode.season,
item.episode.episode,
removed,
)
else:
raise AnimeServiceError("Download returned False")
@@ -988,17 +1310,35 @@ class DownloadService:
item.status = DownloadStatus.FAILED
item.completed_at = datetime.now(timezone.utc)
item.error = str(e)
# Increment retry count in memory and database
item.retry_count += 1
await self._increment_retry_in_database(item.id)
self._failed_items.append(item)
# Set error in database
await self._set_error_in_database(item.id, str(e))
logger.error(
"Download failed: item_id=%s, error=%s, retry_count=%d",
item.id,
str(e),
item.retry_count,
)
# Check if max retries exceeded - move to dead-letter
if item.retry_count >= self._max_retries:
await self._set_status_in_database(
item.id, DownloadStatus.PERMANENTLY_FAILED.value
)
logger.error(
"Download permanently failed after max retries: "
"item_id=%s, error=%s, retry_count=%d",
item.id,
str(e),
item.retry_count,
)
else:
logger.error(
"Download failed: item_id=%s, error=%s, retry_count=%d",
item.id,
str(e),
item.retry_count,
)
# Note: Failure is already broadcast by AnimeService
# via ProgressService when SeriesApp fires failed event

View File

@@ -1,12 +1,14 @@
"""Centralized initialization service for application startup and setup."""
import asyncio
import os
from pathlib import Path
from typing import Callable, Optional
import structlog
from src.config.settings import settings
from src.server.services.anime_service import sync_series_from_data_files
from src.server.services.anime_service import sync_legacy_series_to_db
from src.server.services.legacy_file_migration import migrate_series_from_files_to_db
logger = structlog.get_logger(__name__)
@@ -99,6 +101,151 @@ async def _mark_initial_scan_completed() -> None:
)
async def _check_legacy_migration_status() -> bool:
"""Check if legacy key/data file migration has been completed.
Returns:
bool: True if migration was completed, False otherwise
"""
return await _check_scan_status(
check_method=lambda svc, db: svc.is_migration_legacy_files_completed(db),
scan_type="legacy_migration",
log_completed_msg="Legacy file migration already completed, skipping",
log_not_completed_msg="Legacy file migration not yet run, will check for files"
)
async def _mark_legacy_migration_completed() -> None:
"""Mark the legacy file migration as completed in system settings."""
await _mark_scan_completed(
mark_method=lambda svc, db: svc.mark_migration_legacy_files_completed(db),
scan_type="legacy_migration"
)
async def _check_legacy_key_cleanup_status() -> bool:
"""Check if legacy key file cleanup has been completed.
Returns:
bool: True if cleanup was completed, False otherwise
"""
return await _check_scan_status(
check_method=lambda svc, db: svc.is_legacy_key_cleanup_completed(db),
scan_type="legacy_key_cleanup",
log_completed_msg="Legacy key file cleanup already completed, skipping",
log_not_completed_msg="Legacy key file cleanup not yet run, will clean up key files"
)
async def _mark_legacy_key_cleanup_completed() -> None:
"""Mark the legacy key file cleanup as completed in system settings."""
await _mark_scan_completed(
mark_method=lambda svc, db: svc.mark_legacy_key_cleanup_completed(db),
scan_type="legacy_key_cleanup"
)
async def _migrate_legacy_files() -> int:
"""Migrate series from legacy key/data files to database.
Returns:
int: Number of series migrated
"""
from src.server.database.connection import get_db_session
logger.info("Checking for legacy key/data files to migrate...")
try:
async with get_db_session() as db:
migrated_count = await migrate_series_from_files_to_db(
settings.anime_directory,
db
)
if migrated_count > 0:
logger.info("Migrated %d series from legacy files", migrated_count)
else:
logger.info("No series found in legacy files to migrate")
return migrated_count
except Exception as e:
logger.warning("Failed to migrate legacy files: %s", e)
return 0
async def _cleanup_legacy_key_files() -> int:
"""Remove legacy key files from folders that already have DB entries.
This is a one-time cleanup task that runs at startup after legacy migration.
It removes deprecated 'key' files that cause duplicate key errors when
folders are renamed, since the DB is now the source of truth.
Returns:
int: Number of key files deleted
"""
from src.server.database.connection import get_db_session
from src.server.database.service import AnimeSeriesService
logger.info("Checking for legacy key files to clean up...")
if not settings.anime_directory or not os.path.isdir(settings.anime_directory):
logger.warning(
"Anime directory not configured or does not exist, skipping legacy key cleanup"
)
return 0
deleted_count = 0
scanned_count = 0
try:
async with get_db_session() as db:
# Get all series from DB to know which folders should have key files removed
all_series = await AnimeSeriesService.get_all(db)
# Build a set of known folder names from DB
db_folders: set[str] = {series.folder for series in all_series if series.folder}
for folder_name in db_folders:
folder_path = settings.anime_directory / folder_name
key_file = folder_path / "key"
if not key_file.exists():
continue
scanned_count += 1
try:
key_file.unlink()
deleted_count += 1
logger.info(
"Removed legacy key file",
folder=folder_name,
key_file=str(key_file)
)
except OSError as exc:
logger.warning(
"Could not remove legacy key file",
folder=folder_name,
key_file=str(key_file),
error=str(exc)
)
except Exception as e:
logger.error(
"Legacy key file cleanup failed",
error=str(e),
exc_info=True
)
return deleted_count
logger.info(
"Legacy key file cleanup complete",
scanned=scanned_count,
deleted=deleted_count
)
return deleted_count
async def _sync_anime_folders(progress_service=None) -> int:
"""Scan anime folders and sync series to database.
@@ -118,7 +265,7 @@ async def _sync_anime_folders(progress_service=None) -> int:
metadata={"step_id": "series_sync"}
)
sync_count = await sync_series_from_data_files(settings.anime_directory)
sync_count = await sync_legacy_series_to_db(settings.anime_directory)
logger.info("Data file sync complete. Added %d series.", sync_count)
if progress_service:
@@ -181,18 +328,19 @@ async def _validate_anime_directory(progress_service=None) -> bool:
async def perform_initial_setup(progress_service=None):
"""Perform initial setup including series sync and scan completion marking.
This function is called both during application lifespan startup
and when the setup endpoint is completed. It ensures that:
1. Series are synced from data files to database
2. Initial scan is marked as completed
3. Series are loaded into memory
4. NFO scan is performed if configured
5. Media scan is performed
1. Legacy key/data files are migrated to database (one-time)
2. Series are synced from data files to database
3. Initial scan is marked as completed
4. Series are loaded into memory
5. NFO scan is performed if configured
6. Media scan is performed
Args:
progress_service: Optional ProgressService for emitting updates
Returns:
bool: True if initialization was performed, False if skipped
"""
@@ -225,17 +373,30 @@ async def perform_initial_setup(progress_service=None):
# Perform the actual initialization
try:
# First, run legacy file migration if needed (independent of initial scan)
is_legacy_migration_done = await _check_legacy_migration_status()
if not is_legacy_migration_done:
await _migrate_legacy_files()
await _mark_legacy_migration_completed()
# Sync series from anime folders to database
await _sync_anime_folders(progress_service)
# Clean up legacy key files from folders that now have DB entries
# This runs after migration/sync to ensure DB entries exist before deletion
is_key_cleanup_done = await _check_legacy_key_cleanup_status()
if not is_key_cleanup_done:
await _cleanup_legacy_key_files()
await _mark_legacy_key_cleanup_completed()
# Mark the initial scan as completed
await _mark_initial_scan_completed()
# Load series into memory from database
await _load_series_into_memory(progress_service)
return True
except (OSError, RuntimeError, ValueError) as e:
logger.warning("Failed to perform initial setup: %s", e)
return False
@@ -342,7 +503,7 @@ async def perform_nfo_scan_if_needed(progress_service=None):
if not settings.tmdb_api_key
else "Skipped - NFO features disabled"
)
logger.info(f"NFO scan skipped: {message}")
logger.info("NFO scan skipped: %s", message)
if progress_service:
await progress_service.complete_progress(
@@ -377,111 +538,15 @@ async def perform_nfo_scan_if_needed(progress_service=None):
)
_NFO_REPAIR_SEMAPHORE: asyncio.Semaphore = asyncio.Semaphore(3)
async def _repair_one_series(series_dir: Path, series_name: str) -> None:
"""Repair a single series NFO in isolation.
Creates a fresh :class:`NFOService` and :class:`NfoRepairService` per
invocation so that each repair owns its own ``aiohttp`` session/connector
and concurrent tasks cannot interfere with each other.
A module-level semaphore (``_NFO_REPAIR_SEMAPHORE``) limits the number of
simultaneous TMDB requests to avoid rate-limiting.
Any exception is caught and logged so the asyncio task never silently
drops an unhandled error.
Args:
series_dir: Absolute path to the series folder.
series_name: Human-readable series name for log messages.
"""
from src.core.services.nfo_factory import NFOServiceFactory
from src.core.services.nfo_repair_service import NfoRepairService
async with _NFO_REPAIR_SEMAPHORE:
try:
factory = NFOServiceFactory()
nfo_service = factory.create()
repair_service = NfoRepairService(nfo_service)
await repair_service.repair_series(series_dir, series_name)
except Exception as exc: # pylint: disable=broad-except
logger.error(
"NFO repair failed for %s: %s",
series_name,
exc,
)
async def perform_nfo_repair_scan(background_loader=None) -> None:
"""Scan all series folders and repair incomplete tvshow.nfo files.
Runs on every application startup (not guarded by a run-once DB flag).
Checks each subfolder of ``settings.anime_directory`` for a ``tvshow.nfo``
and calls ``_repair_one_series`` for every file with absent or empty
required tags.
Each repair task creates its own isolated :class:`NFOService` /
:class:`TMDBClient` so concurrent tasks never share an ``aiohttp``
session — this prevents "Connector is closed" errors when many repairs
run in parallel. A semaphore caps TMDB concurrency at 3 to stay within
rate limits.
The ``background_loader`` parameter is accepted for backwards-compatibility
but is no longer used.
Args:
background_loader: Unused. Kept to avoid breaking call-sites.
"""
from src.core.services.nfo_repair_service import nfo_needs_repair
if not settings.tmdb_api_key:
logger.warning("NFO repair scan skipped — TMDB API key not configured")
return
if not settings.anime_directory:
logger.warning("NFO repair scan skipped — anime directory not configured")
return
anime_dir = Path(settings.anime_directory)
if not anime_dir.is_dir():
logger.warning("NFO repair scan skipped — anime directory not found: %s", anime_dir)
return
queued = 0
total = 0
for series_dir in sorted(anime_dir.iterdir()):
if not series_dir.is_dir():
continue
nfo_path = series_dir / "tvshow.nfo"
if not nfo_path.exists():
continue
total += 1
series_name = series_dir.name
if nfo_needs_repair(nfo_path):
queued += 1
# Each task creates its own NFOService so connectors are isolated.
asyncio.create_task(
_repair_one_series(series_dir, series_name),
name=f"nfo_repair:{series_name}",
)
logger.info(
"NFO repair scan complete: %d of %d series queued for repair",
queued,
total,
)
async def _check_media_scan_status() -> bool:
"""Check if initial media scan has been completed.
Returns:
bool: True if media scan was completed, False otherwise
"""
return await _check_scan_status(
check_method=lambda svc, db: svc.is_initial_media_scan_completed(db),
scan_type="media"
)
# DISABLED: Always return True to skip startup scan
# To re-enable, change to: return await _check_scan_status(...)
return True
async def _mark_media_scan_completed() -> None:

View File

@@ -0,0 +1,233 @@
"""One-time migration service for legacy key and data files.
This module provides functionality to migrate series data from legacy
file-based storage (key/data files) to the database. The migration is
designed to be idempotent and run only once per environment.
"""
from __future__ import annotations
import json
import os
from pathlib import Path
from typing import Optional
import structlog
from sqlalchemy.ext.asyncio import AsyncSession
logger = structlog.get_logger(__name__)
async def migrate_series_from_files_to_db(
anime_dir: str,
db: AsyncSession,
) -> int:
"""Migrate series from legacy key/data files to database.
Scans for folders containing legacy 'key' or 'data' files and imports
any series not already in the database. The DB version wins if a series
exists in both places.
Args:
anime_dir: Path to the anime directory
db: Database session
Returns:
Number of series imported
"""
from src.server.database.service import AnimeSeriesService, EpisodeService
if not anime_dir or not os.path.isdir(anime_dir):
logger.warning(
"Anime directory does not exist, skipping legacy migration",
anime_dir=anime_dir
)
return 0
migrated_count = 0
scanned_count = 0
try:
for folder_name in os.listdir(anime_dir):
folder_path = os.path.join(anime_dir, folder_name)
if not os.path.isdir(folder_path):
continue
scanned_count += 1
# Check for 'key' file (single line with series key)
key_file = os.path.join(folder_path, "key")
# Check for 'data' file (JSON with series metadata)
data_file = os.path.join(folder_path, "data")
series_data: Optional[dict] = None
# Try to load from 'data' file first (more complete)
if os.path.isfile(data_file):
series_data = _load_data_file(data_file)
elif os.path.isfile(key_file):
# Fall back to 'key' file - just the key, need to infer other data
series_data = _load_key_file(key_file, folder_name)
if series_data is None:
continue
key = series_data.get("key")
if not key:
logger.warning(
"Skipping folder with no valid key",
folder=folder_name
)
continue
# Check if already in DB
existing = await AnimeSeriesService.get_by_key(db, key)
if existing:
logger.debug(
"Series already in database, skipping",
key=key,
folder=folder_name
)
continue
# Create the series in DB
try:
name = series_data.get("name") or folder_name
site = series_data.get("site", "https://aniworld.to")
folder = series_data.get("folder", folder_name)
year = series_data.get("year")
anime_series = await AnimeSeriesService.create(
db=db,
key=key,
name=name,
site=site,
folder=folder,
year=year,
)
# Create episodes if present
episode_dict = series_data.get("episodeDict", {})
if episode_dict:
for season, episode_numbers in episode_dict.items():
for episode_number in episode_numbers:
await EpisodeService.create(
db=db,
series_id=anime_series.id,
season=season,
episode_number=episode_number,
)
migrated_count += 1
logger.info(
"Migrated series from legacy file",
key=key,
name=name,
folder=folder_name
)
except Exception as e:
logger.warning(
"Failed to migrate series from legacy file",
key=key,
folder=folder_name,
error=str(e)
)
except Exception as e:
logger.error(
"Legacy migration failed",
anime_dir=anime_dir,
error=str(e),
exc_info=True
)
logger.info(
"Legacy file migration complete",
scanned_folders=scanned_count,
migrated=migrated_count
)
return migrated_count
def _load_data_file(data_file_path: str) -> Optional[dict]:
"""Load and parse a legacy 'data' file (JSON).
Args:
data_file_path: Path to the data file
Returns:
Parsed data dict or None if parsing fails
"""
try:
with open(data_file_path, "r", encoding="utf-8") as f:
data = json.load(f)
if not isinstance(data, dict):
logger.warning(
"Data file is not a dictionary",
file=data_file_path
)
return None
# Ensure episodeDict has int keys
if "episodeDict" in data and isinstance(data["episodeDict"], dict):
data["episodeDict"] = {
int(k): v for k, v in data["episodeDict"].items()
}
return data
except json.JSONDecodeError as e:
logger.warning(
"Failed to parse legacy data file (JSON error)",
file=data_file_path,
error=str(e)
)
return None
except Exception as e:
logger.warning(
"Failed to read legacy data file",
file=data_file_path,
error=str(e)
)
return None
def _load_key_file(key_file_path: str, folder_name: str) -> Optional[dict]:
"""Load a legacy 'key' file (single line with series key).
Args:
key_file_path: Path to the key file
folder_name: Folder name to use as fallback name
Returns:
Data dict with key and inferred fields, or None if loading fails
"""
try:
with open(key_file_path, "r", encoding="utf-8") as f:
key = f.read().strip()
if not key:
logger.warning(
"Key file is empty",
file=key_file_path
)
return None
# Infer basic data from key file
return {
"key": key,
"name": folder_name,
"site": "https://aniworld.to",
"folder": folder_name,
"episodeDict": {},
}
except Exception as e:
logger.warning(
"Failed to read legacy key file",
file=key_file_path,
error=str(e)
)
return None

View File

@@ -151,7 +151,7 @@ class EmailNotificationService:
start_tls=True,
)
logger.info(f"Email notification sent to {to_address}")
logger.info("Email notification sent to %s", to_address)
return True
except ImportError:
@@ -160,7 +160,7 @@ class EmailNotificationService:
)
return False
except Exception as e:
logger.error(f"Failed to send email notification: {e}")
logger.error("Failed to send email notification: %s", e)
return False
@@ -205,7 +205,7 @@ class WebhookNotificationService:
timeout=aiohttp.ClientTimeout(total=self.timeout),
) as response:
if response.status < 400:
logger.info(f"Webhook notification sent to {url}")
logger.info("Webhook notification sent to %s", url)
return True
else:
logger.warning(
@@ -213,9 +213,9 @@ class WebhookNotificationService:
)
except asyncio.TimeoutError:
logger.warning(f"Webhook timeout (attempt {attempt + 1}/{self.max_retries}): {url}")
logger.warning("Webhook timeout (attempt %s/%s): %s", attempt + 1, self.max_retries, url)
except Exception as e:
logger.error(f"Failed to send webhook (attempt {attempt + 1}/{self.max_retries}): {e}")
logger.error("Failed to send webhook (attempt %s/%s): %s", attempt + 1, self.max_retries, e)
if attempt < self.max_retries - 1:
await asyncio.sleep(2 ** attempt) # Exponential backoff
@@ -436,7 +436,7 @@ class NotificationService:
await self.in_app_service.add_notification(notification)
results["in_app"] = True
except Exception as e:
logger.error(f"Failed to send in-app notification: {e}")
logger.error("Failed to send in-app notification: %s", e)
results["in_app"] = False
# Send email notification
@@ -452,7 +452,7 @@ class NotificationService:
)
results["email"] = success
except Exception as e:
logger.error(f"Failed to send email notification: {e}")
logger.error("Failed to send email notification: %s", e)
results["email"] = False
# Send webhook notifications
@@ -476,7 +476,7 @@ class NotificationService:
success = await self.webhook_service.send_webhook(str(url), payload)
webhook_results.append(success)
except Exception as e:
logger.error(f"Failed to send webhook notification to {url}: {e}")
logger.error("Failed to send webhook notification to %s: %s", url, e)
webhook_results.append(False)
results["webhook"] = all(webhook_results) if webhook_results else False

View File

@@ -83,15 +83,12 @@ class QueueRepository:
) -> DownloadItem:
"""Convert database model to DownloadItem.
Note: Since the database model is simplified, status, priority,
progress, and retry_count default to initial values.
Args:
db_item: SQLAlchemy download queue item
item_id: Optional override for item ID
Returns:
Pydantic download item with default status/priority
Pydantic download item with status/retry_count from database
"""
# Get episode info from the related Episode object
episode = db_item.episode
@@ -109,14 +106,14 @@ class QueueRepository:
serie_folder=series.folder if series else "",
serie_name=series.name if series else "",
episode=episode_identifier,
status=DownloadStatus.PENDING, # Default - managed in-memory
priority=DownloadPriority.NORMAL, # Default - managed in-memory
status=DownloadStatus(db_item.status), # From database
priority=DownloadPriority.NORMAL, # Managed in-memory
added_at=db_item.created_at or datetime.now(timezone.utc),
started_at=db_item.started_at,
completed_at=db_item.completed_at,
progress=None, # Managed in-memory
error=db_item.error_message,
retry_count=0, # Managed in-memory
retry_count=db_item.retry_count, # From database
source_url=db_item.download_url,
)
@@ -350,6 +347,110 @@ class QueueRepository:
finally:
if manage_session:
await session.close()
async def set_status(
self,
item_id: str,
status: str,
db: Optional[AsyncSession] = None,
) -> bool:
"""Set status on a download item.
Args:
item_id: Download item ID
status: New status value
db: Optional existing database session
Returns:
True if update succeeded, False if item not found
Raises:
QueueRepositoryError: If update fails
"""
session = db or self._db_session_factory()
manage_session = db is None
try:
result = await DownloadQueueService.set_status(
session,
int(item_id),
status,
)
if manage_session:
await session.commit()
success = result is not None
if success:
logger.debug(
"Set status on queue item: item_id=%s, status=%s",
item_id,
status,
)
return success
except ValueError:
return False
except Exception as e:
if manage_session:
await session.rollback()
logger.error("Failed to set status: %s", e)
raise QueueRepositoryError(f"Failed to set status: {e}") from e
finally:
if manage_session:
await session.close()
async def increment_retry(
self,
item_id: str,
db: Optional[AsyncSession] = None,
) -> bool:
"""Increment retry count on a download item.
Args:
item_id: Download item ID
db: Optional existing database session
Returns:
True if update succeeded, False if item not found
Raises:
QueueRepositoryError: If update fails
"""
session = db or self._db_session_factory()
manage_session = db is None
try:
result = await DownloadQueueService.increment_retry_count(
session,
int(item_id),
)
if manage_session:
await session.commit()
success = result is not None
if success:
logger.debug(
"Incremented retry count on queue item: item_id=%s",
item_id,
)
return success
except ValueError:
return False
except Exception as e:
if manage_session:
await session.rollback()
logger.error("Failed to increment retry: %s", e)
raise QueueRepositoryError(f"Failed to increment retry: {e}") from e
finally:
if manage_session:
await session.close()
async def delete_item(
self,

View File

@@ -0,0 +1,291 @@
"""Rescan service — orchestrates library rescans.
This service handles the actual scan/rescan logic:
- Library rescan via anime_service
- Auto-download of missing episodes (if enabled)
- Folder maintenance scan (if enabled)
- Orphaned folder key resolution
SchedulerService only calls RescanService.execute() — it does not
know about the internal steps.
"""
from __future__ import annotations
import logging
from datetime import datetime, timedelta, timezone
from typing import List, Optional
from src.server.models.config import SchedulerConfig
logger = logging.getLogger(__name__)
_AUTO_DOWNLOAD_COOLDOWN_SECONDS = 300 # 5 minutes
class RescanService:
"""Orchestrates all rescan-related operations.
Encapsulates the full post-rescan workflow so SchedulerService
only needs to call a single execute() method.
"""
def __init__(self, config: Optional[SchedulerConfig] = None) -> None:
"""Initialize the rescan service.
Args:
config: Optional scheduler config. If None, operations that depend
on config flags (auto_download, folder_scan) will be skipped.
"""
self._config = config
self._last_scan_time: Optional[datetime] = None
self._last_auto_download_time: Optional[datetime] = None
@property
def last_scan_time(self) -> Optional[datetime]:
return self._last_scan_time
# ------------------------------------------------------------------
# Public entry point
# ------------------------------------------------------------------
async def execute(self) -> dict:
"""Execute the full rescan workflow.
Runs in order:
1. anime_service.rescan()
2. auto-download (if enabled)
3. folder scan (if enabled)
4. key resolution scan (always, if anime_directory configured)
Returns:
Dict with duration and counts for each step.
"""
from src.server.services.websocket_service import get_websocket_service
scan_start = datetime.now(timezone.utc)
results = {
"started_at": scan_start.isoformat(),
"duration_seconds": 0.0,
"rescan_completed": False,
"auto_download_queued": 0,
"folder_scan_completed": False,
"key_resolution": {"resolved": 0, "skipped": 0, "errors": 0},
}
await self._broadcast("scheduled_rescan_started", {"timestamp": scan_start.isoformat()})
try:
# 1. Main library rescan
await self._run_rescan()
results["rescan_completed"] = True
# 2. Auto-download
if self._config and self._config.auto_download_after_rescan:
try:
queued = await self._run_auto_download()
results["auto_download_queued"] = queued
await self._broadcast("auto_download_started", {"queued_count": queued})
except Exception as exc:
logger.error("Auto-download failed: %s", exc, exc_info=True)
await self._broadcast("auto_download_error", {"error": str(exc)})
# 3. Folder scan
if self._config and self._config.folder_scan_enabled:
try:
await self._run_folder_scan()
results["folder_scan_completed"] = True
except Exception as exc:
logger.error("Folder scan failed: %s", exc, exc_info=True)
await self._broadcast("folder_scan_error", {"error": str(exc)})
# 4. Key resolution scan
try:
key_stats = await self._run_key_resolution()
results["key_resolution"] = key_stats
except Exception as exc:
logger.error("Key resolution scan failed: %s", exc, exc_info=True)
self._last_scan_time = datetime.now(timezone.utc)
results["duration_seconds"] = (self._last_scan_time - scan_start).total_seconds()
await self._broadcast(
"scheduled_rescan_completed",
{
"timestamp": self._last_scan_time.isoformat(),
"duration_seconds": results["duration_seconds"],
},
)
logger.info(
"Scheduled library rescan completed: duration=%.2fs",
results["duration_seconds"],
)
except Exception as exc:
logger.error("Scheduled rescan failed: %s", exc, exc_info=True)
await self._broadcast(
"scheduled_rescan_error",
{"error": str(exc), "timestamp": datetime.now(timezone.utc).isoformat()},
)
raise
return results
# ------------------------------------------------------------------
# Step 1: Library rescan
# ------------------------------------------------------------------
async def _run_rescan(self) -> None:
"""Run the anime service rescan."""
from src.server.utils.dependencies import get_anime_service
anime_service = get_anime_service()
logger.info("Anime service obtained, calling anime_service.rescan()...")
await anime_service.rescan()
logger.info("anime_service.rescan() completed")
# ------------------------------------------------------------------
# Step 2: Auto-download
# ------------------------------------------------------------------
async def _run_auto_download(self) -> int:
"""Queue and start downloads for all series with missing episodes.
Returns:
Number of episodes queued.
"""
from src.server.models.download import EpisodeIdentifier
from src.server.utils.dependencies import (
get_anime_service,
get_download_service,
)
# Cooldown check to prevent rapid re-triggers
now = datetime.now(timezone.utc)
if self._last_auto_download_time is not None:
elapsed = now - self._last_auto_download_time
if elapsed < timedelta(seconds=_AUTO_DOWNLOAD_COOLDOWN_SECONDS):
logger.debug(
"Auto-download skipped: cooldown active (elapsed=%.1fs cooldown=%ds)",
elapsed.total_seconds(),
_AUTO_DOWNLOAD_COOLDOWN_SECONDS,
)
return 0
anime_service = get_anime_service()
download_service = get_download_service()
series_list = anime_service._cached_list_missing()
queued_count = 0
for series in series_list:
episode_dict: dict = series.get("episodeDict") or {}
if not episode_dict:
continue
episodes: List[EpisodeIdentifier] = []
for season_str, ep_numbers in episode_dict.items():
for ep_num in ep_numbers:
episodes.append(
EpisodeIdentifier(season=int(season_str), episode=int(ep_num))
)
if not episodes:
continue
await download_service.add_to_queue(
serie_id=series.get("key", ""),
serie_folder=series.get("folder", series.get("name", "")),
serie_name=series.get("name", ""),
episodes=episodes,
)
queued_count += len(episodes)
logger.info(
"Auto-download queued episodes for series=%s count=%d",
series.get("key"),
len(episodes),
)
if queued_count:
await download_service.start_queue_processing()
logger.info("Auto-download queue processing started: queued=%d", queued_count)
self._last_auto_download_time = datetime.now(timezone.utc)
logger.info("Auto-download completed: queued_count=%d", queued_count)
return queued_count
# ------------------------------------------------------------------
# Step 3: Folder scan
# ------------------------------------------------------------------
async def _run_folder_scan(self) -> None:
"""Run the folder scan maintenance task."""
from src.server.services.scheduler.folder_scan_service import FolderScanService
folder_scan_service = FolderScanService()
await folder_scan_service.run_folder_scan()
logger.info("Folder scan completed successfully")
# ------------------------------------------------------------------
# Step 4: Key resolution
# ------------------------------------------------------------------
async def _run_key_resolution(self) -> dict:
"""Run the orphaned folder key resolution scan.
Returns:
Dict with resolved/skipped/errors counts.
"""
from src.server.services.scheduler.key_resolution_service import (
perform_key_resolution_scan,
)
key_stats = await perform_key_resolution_scan()
logger.info(
"Key resolution scan completed: resolved=%d, skipped=%d, errors=%d",
key_stats["resolved"],
key_stats["skipped"],
key_stats["errors"],
)
return key_stats
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
async def _broadcast(self, event_type: str, data: dict) -> None:
"""Broadcast a WebSocket event to all connected clients."""
try:
from src.server.services.websocket_service import get_websocket_service
ws_service = get_websocket_service()
await ws_service.manager.broadcast({"type": event_type, "data": data})
except Exception as exc:
logger.warning(
"WebSocket broadcast failed: event=%s error=%s", event_type, exc
)
# ---------------------------------------------------------------------------
# Module-level singleton
# ---------------------------------------------------------------------------
_rescan_service: Optional[RescanService] = None
def get_rescan_service(config: Optional[SchedulerConfig] = None) -> RescanService:
"""Return a RescanService singleton (or create with optional config)."""
global _rescan_service
if _rescan_service is None or config is not None:
_rescan_service = RescanService(config=config)
logger.debug("Created new RescanService singleton")
else:
logger.debug("Returning existing RescanService singleton")
return _rescan_service
def reset_rescan_service() -> None:
"""Reset the singleton (used in tests)."""
global _rescan_service
_rescan_service = None

View File

@@ -0,0 +1,45 @@
"""Scheduler services package.
Contains scheduler orchestration and rescan coordination:
- scheduler_service: Cron-based scheduler using APScheduler
- rescan_orchestrator: Legacy alias for RescanService (for backward compatibility)
"""
from __future__ import annotations
from src.server.services.rescan_service import (
RescanService,
get_rescan_service,
reset_rescan_service,
)
# Backward compatibility alias
from src.server.services.scheduler.rescan_orchestrator import (
RescanOrchestrator,
get_rescan_orchestrator,
reset_rescan_orchestrator,
)
from src.server.services.scheduler.scheduler_service import (
SchedulerService,
SchedulerServiceError,
get_scheduler_service,
reset_scheduler_service,
)
__all__ = [
# RescanService (new location)
"RescanService",
"get_rescan_service",
"reset_rescan_service",
# Scheduler
"SchedulerService",
"SchedulerServiceError",
"get_scheduler_service",
"reset_scheduler_service",
# Backward compatibility
"RescanOrchestrator",
"get_rescan_orchestrator",
"reset_rescan_orchestrator",
# Sub-services (still in scheduler folder)
"folder_rename_service",
]

View File

@@ -0,0 +1,739 @@
"""Folder rename service for validating and renaming series folders.
After NFO repair, this service iterates over every subfolder in
``settings.anime_directory`` that contains a ``tvshow.nfo``. For each
folder it parses the NFO to extract ``<title>`` and ``<year>``, computes
the expected folder name ``f"{title} ({year})"``, sanitises it for
filesystem safety, and renames the folder if the current name differs.
Database records (``AnimeSeries.folder``, ``Episode.file_path``,
``DownloadQueueItem.file_destination``) are updated atomically to
reflect the new paths.
"""
from __future__ import annotations
import logging
import re
import shutil
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
from lxml import etree
from src.config.settings import settings
from src.server.database.connection import get_db_session
from src.server.database.service import (
AnimeSeriesService,
DownloadQueueService,
EpisodeService,
)
from src.server.utils.dependencies import get_download_service
from src.server.utils.filesystem import sanitize_folder_name
logger = logging.getLogger(__name__)
# Pre-compiled pattern for stripping existing year suffixes
_YEAR_SUFFIX_PATTERN = re.compile(r'(\s*\(\d{4}\))+\s*$')
@dataclass
class DuplicateGroup:
"""Represents a group of duplicate folders for the same series.
Attributes:
key: The series key (folder name before rename).
folders: List of folder paths that map to this series.
nfo_paths: List of corresponding NFO file paths.
"""
key: str
folders: list[str]
nfo_paths: list[Path]
@property
def count(self) -> int:
return len(self.folders)
def __repr__(self) -> str:
return f"DuplicateGroup(key={self.key!r}, folders={self.folders})"
@dataclass
class RenameStats:
"""Statistics from a folder rename operation."""
scanned: int = 0
renamed: int = 0
skipped: int = 0
errors: int = 0
def to_dict(self) -> dict[str, int]:
return {"scanned": self.scanned, "renamed": self.renamed, "skipped": self.skipped, "errors": self.errors}
def _scan_for_pre_existing_duplicates(anime_dir: Path) -> list[DuplicateGroup]:
"""Scan anime directory for pre-existing duplicate folders.
Groups folders by the series key extracted from their NFO files.
Folders with the same title+year (same expected name) are flagged as duplicates.
Args:
anime_dir: Path to the anime directory to scan.
Returns:
List of DuplicateGroup objects, one per series with duplicate folders.
"""
groups: dict[str, list[tuple[str, Path]]] = defaultdict(list)
for series_dir in anime_dir.iterdir():
if not series_dir.is_dir():
continue
nfo_path = series_dir / "tvshow.nfo"
if not nfo_path.exists():
continue
title, year = _parse_nfo_title_and_year(nfo_path)
if not title or not year:
continue
expected_name = _compute_expected_folder_name(title, year)
groups[expected_name].append((series_dir.name, nfo_path))
duplicates = []
for key, items in groups.items():
if len(items) > 1:
folders = [item[0] for item in items]
nfo_paths = [item[1] for item in items]
duplicates.append(DuplicateGroup(key=key, folders=folders, nfo_paths=nfo_paths))
return duplicates
def _try_merge_duplicate_group(group: DuplicateGroup, dry_run: bool = False) -> bool:
"""Attempt to merge a duplicate group automatically.
Uses the first folder as the canonical one and removes others if they are
empty or contain only symlinks.
Args:
group: The DuplicateGroup to merge.
dry_run: If True, only log actions without executing them.
Returns:
True if merge was successful, False otherwise.
"""
if len(group.folders) < 2:
return True
canonical = group.folders[0]
to_remove = group.folders[1:]
for folder in to_remove:
folder_path = group.nfo_paths[0].parent.parent / folder
if not folder_path.exists():
continue
try:
contents = list(folder_path.iterdir())
except PermissionError:
logger.warning("Permission denied accessing %s, skip merge", folder_path)
return False
except OSError:
return False
if not contents:
if dry_run:
logger.info("[DRY-RUN] Would delete empty duplicate folder: %s", folder_path)
else:
try:
folder_path.rmdir()
logger.info("Deleted empty duplicate folder: %s", folder_path)
except OSError:
return False
continue
canonical_path = folder_path.parent / canonical
all_symlinks = all(
item.is_symlink() and item.resolve() == canonical_path.resolve()
for item in contents
)
if all_symlinks:
if dry_run:
logger.info("[DRY-RUN] Would remove symlinks in duplicate folder: %s", folder_path)
else:
for item in contents:
item.unlink()
try:
folder_path.rmdir()
logger.info("Removed symlink-only duplicate folder: %s", folder_path)
except OSError:
return False
continue
logger.warning(
"Cannot auto-merge duplicate folders for '%s': %s (manual merge required)",
group.key,
[canonical] + to_remove,
)
return False
return True
def _parse_nfo_title_and_year(nfo_path: Path) -> tuple[Optional[str], Optional[str]]:
"""Parse a tvshow.nfo and return (title, year) text values.
Args:
nfo_path: Absolute path to the ``tvshow.nfo`` file.
Returns:
Tuple of (title, year) where either may be ``None`` if missing
or empty.
"""
try:
tree = etree.parse(str(nfo_path))
root = tree.getroot()
title_elem = root.find("./title")
year_elem = root.find("./year")
title = title_elem.text.strip() if title_elem is not None and title_elem.text and title_elem.text.strip() else None
year = year_elem.text.strip() if year_elem is not None and year_elem.text and year_elem.text.strip() else None
return title, year
except etree.XMLSyntaxError as exc:
logger.warning("Malformed XML in %s: %s", nfo_path, exc)
return None, None
except Exception as exc:
logger.warning("Unexpected error parsing %s: %s", nfo_path, exc)
return None, None
def _compute_expected_folder_name(title: str, year: str) -> str:
"""Compute the expected folder name from title and year.
Removes any existing year suffixes (e.g., "(2021)") before adding the
canonical one to prevent duplication across multiple folder rename runs.
Args:
title: Series title from NFO.
year: Release year from NFO.
Returns:
Sanitised folder name in the format ``"{title} ({year})"``.
"""
clean_title = _YEAR_SUFFIX_PATTERN.sub('', title).strip()
year_suffix = f" ({year})"
raw_name = f"{clean_title}{year_suffix}"
return sanitize_folder_name(raw_name)
def _is_series_being_downloaded(series_folder: str) -> bool:
"""Check whether the given series has an active or pending download.
Args:
series_folder: The series folder name (as stored in the DB).
Returns:
``True`` if the series appears in the active download or the
pending queue.
"""
try:
download_service = get_download_service()
active = download_service._active_download
if active and active.serie_folder == series_folder:
return True
for item in download_service._pending_queue:
if item.serie_folder == series_folder:
return True
return False
except Exception as exc:
logger.warning(
"Could not check download status for %s: %s", series_folder, exc
)
return True
def _remove_key_file(path: Path) -> None:
"""Remove legacy 'key' file from a series folder.
Args:
path: Path to the series folder.
"""
key_file = path / "key"
if key_file.exists():
try:
key_file.unlink()
logger.info("Removed legacy 'key' file after rename: %s", key_file)
except OSError as exc:
logger.warning("Could not remove legacy 'key' file %s: %s", key_file, exc)
def _move_file(item: Path, dest: Path) -> bool:
"""Move a single file or directory to destination.
Args:
item: Source path to move.
dest: Destination path.
Returns:
True if move succeeded, False otherwise.
"""
try:
item.rename(dest)
logger.debug("Moved %s%s", item, dest)
return True
except PermissionError as exc:
logger.warning("Permission denied moving %s: %s", item, exc)
return False
except OSError as exc:
logger.warning("OS error moving %s: %s", item, exc)
return False
def _cleanup_orphaned_folder(old_path: Path, new_path: Path, dry_run: bool = False) -> bool:
"""Clean up orphaned folder after successful rename.
After a folder is successfully renamed to new_path, this function checks
if the old_path still exists (orphaned folder) and removes it. If the
old folder contains files, they are moved to new_path before deletion.
Args:
old_path: The original folder path before rename.
new_path: The new folder path after rename.
dry_run: If True, only log actions without executing them.
Returns:
True if old folder was cleaned up (or would be in dry-run mode),
False if old folder does not exist or cleanup failed.
"""
if not old_path.exists():
logger.debug("Old folder does not exist, no cleanup needed: %s", old_path)
return False
try:
contents = list(old_path.iterdir())
except PermissionError as exc:
logger.warning("Permission denied accessing old folder %s: %s", old_path, exc)
return False
except OSError as exc:
logger.warning("OS error accessing old folder %s: %s", old_path, exc)
return False
if not contents:
if dry_run:
logger.info("[DRY-RUN] Would delete empty orphaned folder: %s", old_path)
return True
try:
old_path.rmdir()
logger.info("Deleted empty orphaned folder: %s", old_path)
return True
except PermissionError as exc:
logger.warning("Permission denied deleting folder %s: %s", old_path, exc)
return False
except OSError as exc:
logger.warning("OS error deleting folder %s: %s", old_path, exc)
return False
if dry_run:
logger.info("[DRY-RUN] Would move %d files from orphaned folder %s to %s",
len(contents), old_path, new_path)
for item in contents:
logger.info("[DRY-RUN] Would move: %s%s", item, new_path / item.name)
logger.info("[DRY-RUN] Would then delete orphaned folder: %s", old_path)
return True
files_moved = 0
errors = 0
for item in contents:
if not _move_file(item, new_path / item.name):
errors += 1
else:
files_moved += 1
if files_moved > 0:
logger.info("Moved %d files from orphaned folder to %s", files_moved, new_path)
try:
old_path.rmdir()
logger.info("Deleted orphaned folder after moving contents: %s", old_path)
return errors == 0
except OSError as exc:
logger.warning("Could not delete orphaned folder %s (may not be empty): %s", old_path, exc)
return False
async def _update_series_folder(db, series, new_folder: str) -> None:
"""Update AnimeSeries.folder in the database.
Args:
db: Database session.
series: The AnimeSeries instance to update.
new_folder: New folder name.
"""
if series is None:
return
await AnimeSeriesService.update(db, series.id, folder=new_folder)
logger.info("Updated AnimeSeries.folder: %s (id=%s)", new_folder, series.id)
def _update_episode_paths(episodes, old_series_path: Path, new_series_path: Path) -> None:
"""Update Episode.file_path for all episodes of a series.
Args:
episodes: List of Episode instances.
old_series_path: Path to the old series folder.
new_series_path: Path to the new series folder.
"""
for episode in episodes:
if not episode.file_path:
continue
old_file_path = Path(episode.file_path)
try:
old_file_path.relative_to(old_series_path)
new_file_path = new_series_path / old_file_path.relative_to(old_series_path)
episode.file_path = str(new_file_path)
logger.debug("Updated Episode.file_path: %s%s", old_file_path, new_file_path)
except ValueError:
pass
def _update_queue_destinations(
queue_items,
series_id,
old_series_path: Path,
new_series_path: Path,
) -> None:
"""Update DownloadQueueItem.file_destination for pending items.
Args:
queue_items: List of DownloadQueueItem instances.
series_id: ID of the series to filter by.
old_series_path: Path to the old series folder.
new_series_path: Path to the new series folder.
"""
for item in queue_items:
if item.series_id != series_id or not item.file_destination:
continue
old_dest = Path(item.file_destination)
try:
old_dest.relative_to(old_series_path)
new_dest = new_series_path / old_dest.relative_to(old_series_path)
item.file_destination = str(new_dest)
logger.debug("Updated DownloadQueueItem.file_destination: %s%s", old_dest, new_dest)
except ValueError:
pass
async def _update_database_paths(
old_folder: str,
new_folder: str,
anime_dir: Path,
) -> None:
"""Update all database records that reference the old folder path.
Updates:
- ``AnimeSeries.folder`` → ``new_folder``
- ``Episode.file_path`` → adjusted to new folder
- ``DownloadQueueItem.file_destination`` → adjusted to new folder
Args:
old_folder: Previous folder name.
new_folder: New folder name.
anime_dir: Root anime directory path.
"""
old_series_path = anime_dir / old_folder
new_series_path = anime_dir / new_folder
async with get_db_session() as db:
series = await AnimeSeriesService.get_by_folder(db, old_folder)
if series is None:
all_series = await AnimeSeriesService.get_all(db)
for s in all_series:
if s.folder == old_folder:
series = s
break
await _update_series_folder(db, series, new_folder)
if series is None:
return
episodes = await EpisodeService.get_by_series(db, series.id)
_update_episode_paths(episodes, old_series_path, new_series_path)
await db.flush()
queue_items = await DownloadQueueService.get_all(db, with_series=True)
_update_queue_destinations(queue_items, series.id, old_series_path, new_series_path)
await db.flush()
logger.info("Database paths updated for series '%s''%s'", old_folder, new_folder)
def _remove_duplicate_target_folder(
series_dir: Path,
current_name: str,
expected_name: str,
expected_path: Path,
) -> bool:
"""Handle the case where the target folder already exists.
Removes the source folder and its DB record to avoid orphaning
episodes/downloads.
Args:
series_dir: Path to the series directory being processed.
current_name: Current folder name.
expected_name: Expected folder name.
expected_path: Path to the expected (target) folder.
Returns:
True if folder was removed successfully, False otherwise.
"""
logger.warning(
"Cannot rename '%s''%s' — target already exists",
current_name,
expected_name,
)
try:
try:
contents = list(series_dir.iterdir())
logger.warning(
"REMOVING folder '%s' with %d items — target '%s' already exists",
current_name,
len(contents),
expected_name,
)
for item in contents:
logger.warning(" Would remove: %s", item)
except OSError as exc:
logger.warning(
"Could not list contents of folder '%s' before removal: %s",
current_name,
exc,
)
shutil.rmtree(series_dir)
logger.info(
"Removed source folder '%s' — series already exists at target",
current_name,
)
# Delete source DB record using synchronous helper
_delete_series_db_record(current_name, expected_name)
return True
except OSError as exc:
logger.error("Failed to remove source folder '%s': %s", current_name, exc)
return False
def _delete_series_db_record(current_name: str, expected_name: str) -> None:
"""Delete the series DB record for a folder that was removed.
Args:
current_name: The folder name to look up in the DB.
expected_name: The target folder name (for logging).
"""
try:
import asyncio
asyncio.run(_delete_series_db_record_async(current_name, expected_name))
except Exception as exc:
logger.warning(
"Could not delete DB record for '%s': %s",
current_name,
exc,
)
async def _delete_series_db_record_async(current_name: str, expected_name: str) -> None:
"""Async helper to delete series DB record.
Args:
current_name: The folder name to look up.
expected_name: The target folder name (for logging).
"""
async with get_db_session() as db:
source_series = await AnimeSeriesService.get_by_folder(db, current_name)
if source_series is None:
all_series = await AnimeSeriesService.get_all(db)
for s in all_series:
if s.folder == current_name:
source_series = s
break
if source_series is not None:
await AnimeSeriesService.delete(db, source_series.id)
logger.info(
"Deleted source DB record for '%s' (id=%s) — target folder '%s' retains DB record",
current_name,
source_series.id,
expected_name,
)
else:
logger.info(
"No DB record found for source folder '%s' — folder removed only",
current_name,
)
async def validate_and_rename_series_folders(dry_run: bool = False) -> dict[str, int]:
"""Validate and rename series folders to match NFO metadata.
Iterates over every subfolder in ``settings.anime_directory`` that
contains a ``tvshow.nfo``. For each folder:
1. Parse the NFO to extract ``<title>`` and ``<year>``.
2. Compute the expected folder name: ``f"{title} ({year})"``.
3. Sanitise the expected name for filesystem safety.
4. Compare with the current folder name.
5. If different, rename the folder and update the database.
Skips folders where title or year is missing/empty. Logs every
rename action.
Args:
dry_run: If True, simulate rename operations without actually
moving folders or updating the database.
Returns:
Dictionary with counts:
- ``"scanned"``: total folders scanned
- ``"renamed"``: folders renamed
- ``"skipped"``: folders skipped (missing title/year)
- ``"errors"``: folders that caused an error
"""
if not settings.anime_directory:
logger.warning("Folder rename skipped — anime directory not configured")
return RenameStats().to_dict()
anime_dir = Path(settings.anime_directory)
if not anime_dir.is_dir():
logger.warning("Folder rename skipped — anime directory not found: %s", anime_dir)
return RenameStats().to_dict()
if dry_run:
logger.info("Running in DRY-RUN mode — no changes will be made")
stats = RenameStats()
pre_existing_duplicates: set[str] = set()
duplicates = _scan_for_pre_existing_duplicates(anime_dir)
for dup_group in duplicates:
if _try_merge_duplicate_group(dup_group, dry_run=dry_run):
logger.info(
"Auto-merged duplicate group for '%s' (%d folders)",
dup_group.key,
dup_group.count,
)
else:
for folder in dup_group.folders:
pre_existing_duplicates.add(folder)
logger.warning(
"Duplicate folders detected for series '%s': %s"
"manual cleanup required (different releases or non-empty duplicates)",
dup_group.key,
dup_group.folders,
)
for series_dir in sorted(anime_dir.iterdir()):
if not series_dir.is_dir():
continue
nfo_path = series_dir / "tvshow.nfo"
if not nfo_path.exists():
continue
stats.scanned += 1
title, year = _parse_nfo_title_and_year(nfo_path)
if not title or not year:
logger.info(
"Skipping rename for '%s' — missing title or year in NFO",
series_dir.name,
)
stats.skipped += 1
continue
expected_name = _compute_expected_folder_name(title, year)
current_name = series_dir.name
if expected_name == current_name:
logger.debug("Folder name already correct: '%s'", current_name)
continue
if _is_series_being_downloaded(current_name):
logger.info(
"Skipping rename for '%s' — series has active or pending downloads",
current_name,
)
stats.skipped += 1
continue
expected_path = anime_dir / expected_name
if current_name in pre_existing_duplicates:
logger.warning(
"Skipping rename for '%s' — pre-existing duplicate folder detected",
current_name,
)
stats.errors += 1
continue
if expected_path.exists():
if _remove_duplicate_target_folder(series_dir, current_name, expected_name, expected_path):
stats.renamed += 1
else:
stats.errors += 1
continue
if len(str(expected_path)) > 4096:
logger.warning(
"Cannot rename '%s''%s' — path exceeds OS limit",
current_name,
expected_name,
)
stats.errors += 1
continue
if dry_run:
logger.info("[DRY-RUN] Would rename folder: '%s''%s'", current_name, expected_name)
stats.renamed += 1
continue
try:
old_path = series_dir
series_dir.rename(expected_path)
logger.info("Renamed folder: '%s''%s'", current_name, expected_name)
stats.renamed += 1
await _update_database_paths(current_name, expected_name, anime_dir)
_remove_key_file(expected_path)
_cleanup_orphaned_folder(old_path, expected_path, dry_run=False)
except PermissionError as exc:
logger.error(
"Permission denied renaming '%s''%s': %s",
current_name,
expected_name,
exc,
)
stats.errors += 1
except OSError as exc:
logger.error(
"OS error renaming '%s''%s': %s",
current_name,
expected_name,
exc,
)
stats.errors += 1
logger.info(
"Folder rename scan complete: scanned=%d, renamed=%d, skipped=%d, errors=%d",
stats.scanned,
stats.renamed,
stats.skipped,
stats.errors,
)
return stats.to_dict()

View File

@@ -0,0 +1,428 @@
"""Folder scan service for daily maintenance tasks.
Encapsulates the daily folder-scan logic (orphaned-file detection,
metadata refresh, and missing-episode queuing) so that the scheduler
remains clean and the scan can be tested independently.
"""
from __future__ import annotations
import asyncio
from pathlib import Path
from typing import Optional
import structlog
from lxml import etree
from src.config.settings import settings as _settings
from src.core.utils.image_downloader import ImageDownloader
logger = structlog.get_logger(__name__)
# Module-level semaphore to limit concurrent TMDB operations to 3.
_TMDB_SEMAPHORE: asyncio.Semaphore = asyncio.Semaphore(3)
# Semaphore to limit concurrent poster image downloads to 3.
_POSTER_DOWNLOAD_SEMAPHORE: asyncio.Semaphore = asyncio.Semaphore(3)
# Semaphore to limit concurrent NFO repair TMDB operations to 3.
_NFO_REPAIR_SEMAPHORE: asyncio.Semaphore = asyncio.Semaphore(3)
async def _create_missing_nfo(series_dir: Path, series_name: str) -> None:
"""Create minimal NFO for series without one.
Creates a fresh :class:`NFOService` per invocation so concurrent
tasks cannot interfere with each other.
A module-level semaphore limits concurrent TMDB operations to 3.
Args:
series_dir: Absolute path to the series folder.
series_name: Human-readable series name for log messages.
"""
from src.core.services.nfo_factory import NFOServiceFactory
async with _NFO_REPAIR_SEMAPHORE:
try:
factory = NFOServiceFactory()
nfo_service = factory.create()
await nfo_service.create_minimal_nfo(
serie_name=series_name,
serie_folder=series_dir.name,
)
except Exception as exc: # pylint: disable=broad-except
logger.error(
"NFO creation failed for %s: %s",
series_name,
exc,
)
async def _repair_one_series(series_dir: Path, series_name: str) -> None:
"""Repair a single series NFO in isolation.
Creates a fresh :class:`NFOService` and :class:`NfoRepairService` per
invocation so that each repair owns its own ``aiohttp`` session/connector
and concurrent tasks cannot interfere with each other.
A module-level semaphore (``_NFO_REPAIR_SEMAPHORE``) limits the number of
simultaneous TMDB requests to avoid rate-limiting.
Any exception is caught and logged so the asyncio task never silently
drops an unhandled error.
Args:
series_dir: Absolute path to the series folder.
series_name: Human-readable series name for log messages.
"""
from src.core.services.nfo_factory import NFOServiceFactory
from src.core.services.nfo_repair_service import NfoRepairService
async with _NFO_REPAIR_SEMAPHORE:
try:
factory = NFOServiceFactory()
nfo_service = factory.create()
repair_service = NfoRepairService(nfo_service)
await repair_service.repair_series(series_dir, series_name)
except Exception as exc: # pylint: disable=broad-except
logger.error(
"NFO repair failed for %s: %s",
series_name,
exc,
)
async def perform_nfo_repair_scan(background_loader=None) -> None:
"""Scan all series folders, repair incomplete and create missing NFO files.
Called from ``FolderScanService.run_folder_scan()`` during the scheduled
daily folder scan (not on every startup). Checks each subfolder of
``settings.anime_directory`` for a ``tvshow.nfo``:
- Missing NFOs: creates minimal NFO via ``_create_missing_nfo``
- Incomplete NFOs: repairs via ``_repair_one_series``
Each repair task creates its own isolated :class:`NFOService` /
:class:`TMDBClient` so concurrent tasks never share an ``aiohttp``
session — this prevents "Connector is closed" errors when many repairs
run in parallel. A semaphore caps TMDB concurrency at 3 to stay within
rate limits.
The ``background_loader`` parameter is accepted for backwards-compatibility
but is no longer used.
Args:
background_loader: Unused. Kept to avoid breaking call-sites.
"""
from src.core.services.nfo_repair_service import nfo_needs_repair
if not _settings.tmdb_api_key:
logger.warning("NFO repair scan skipped — TMDB API key not configured")
return
if not _settings.anime_directory:
logger.warning("NFO repair scan skipped — anime directory not configured")
return
anime_dir = Path(_settings.anime_directory)
if not anime_dir.is_dir():
logger.warning("NFO repair scan skipped — anime directory not found: %s", anime_dir)
return
queued = 0
total = 0
missing_nfo_count = 0
repair_tasks: list[asyncio.Task] = []
for series_dir in sorted(anime_dir.iterdir()):
if not series_dir.is_dir():
continue
nfo_path = series_dir / "tvshow.nfo"
series_name = series_dir.name
if not nfo_path.exists():
# Create minimal NFO for series without one
missing_nfo_count += 1
repair_tasks.append(
asyncio.create_task(
_create_missing_nfo(series_dir, series_name),
name=f"nfo_create:{series_name}",
)
)
continue
total += 1
if nfo_needs_repair(nfo_path):
queued += 1
repair_tasks.append(
asyncio.create_task(
_repair_one_series(series_dir, series_name),
name=f"nfo_repair:{series_name}",
)
)
if repair_tasks:
logger.info(
"NFO repair scan: waiting for %d repair/create tasks to complete",
len(repair_tasks),
)
await asyncio.gather(*repair_tasks, return_exceptions=True)
logger.info("NFO repair scan tasks completed")
logger.info(
"NFO repair scan complete: %d of %d series queued for repair, %d missing NFOs queued for creation",
queued,
total,
missing_nfo_count,
)
class FolderScanServiceError(Exception):
"""Service-level exception for folder-scan operations."""
class FolderScanService:
"""Performs daily maintenance scans over the anime library folder.
The service is intentionally stateless; a new instance can be created
for every scheduled invocation or test case.
"""
async def run_folder_scan(self) -> None:
"""Execute the daily folder scan.
Checks prerequisites, logs progress, and delegates to sub-task
helpers. Any unhandled exception is caught and logged so the
scheduler task never crashes.
"""
logger.info("Folder scan started")
try:
if not self._prerequisites_met():
return
# 1.3 — Repair incomplete NFO files (synchronous, waits for completion).
logger.info("Starting NFO repair scan as part of folder scan")
await perform_nfo_repair_scan(background_loader=None)
logger.info("NFO repair scan complete")
# 1.4 — Validate and rename series folders after NFO repair.
logger.info("Starting folder rename validation")
from src.server.services.scheduler.folder_rename_service import (
validate_and_rename_series_folders,
)
rename_stats = await validate_and_rename_series_folders()
logger.info(
"Folder rename validation complete",
scanned=rename_stats["scanned"],
renamed=rename_stats["renamed"],
skipped=rename_stats["skipped"],
errors=rename_stats["errors"],
)
# 1.5 — Check and download missing poster.jpg files.
logger.info("Starting poster check")
poster_stats = await self.check_and_download_missing_posters()
logger.info(
"Poster check complete",
scanned=poster_stats["scanned"],
downloaded=poster_stats["downloaded"],
skipped=poster_stats["skipped"],
errors=poster_stats["errors"],
)
logger.info("Folder scan completed")
except Exception as exc: # pylint: disable=broad-exception-caught
logger.error("Folder scan failed", error=str(exc), exc_info=True)
# ------------------------------------------------------------------
# Poster check helpers
# ------------------------------------------------------------------
async def check_and_download_missing_posters(self) -> dict[str, int]:
"""Iterate over series folders and download missing poster.jpg files.
For each folder containing a ``tvshow.nfo``:
1. Check if ``poster.jpg`` exists and is at least
:attr:`ImageDownloader.min_file_size` bytes.
2. If missing or too small, parse ``tvshow.nfo`` for a ``<thumb>``
URL (preferring ``aspect="poster"``).
3. Download the image via :class:`ImageDownloader` under a
semaphore that limits concurrency to 3.
Returns:
Dictionary with counts:
- ``"scanned"``: total folders scanned
- ``"downloaded"``: posters successfully downloaded
- ``"skipped"``: folders skipped (no NFO, no thumb URL,
or poster already valid)
- ``"errors"``: folders that caused a download error
"""
from src.config.settings import settings # noqa: PLC0415
stats = {"scanned": 0, "downloaded": 0, "skipped": 0, "errors": 0}
if not settings.anime_directory:
logger.warning("Poster check skipped — anime directory not configured")
return stats
anime_dir = Path(settings.anime_directory)
if not anime_dir.is_dir():
logger.warning(
"Poster check skipped — anime directory not found: %s", anime_dir
)
return stats
# Gather all series directories that contain a tvshow.nfo
series_dirs = [
d for d in anime_dir.iterdir()
if d.is_dir() and (d / "tvshow.nfo").exists()
]
if not series_dirs:
logger.debug("No series folders found for poster check")
return stats
# Process each series folder concurrently with semaphore
tasks = [
self._check_and_download_poster(series_dir, stats)
for series_dir in series_dirs
]
await asyncio.gather(*tasks, return_exceptions=True)
return stats
async def _check_and_download_poster(
self, series_dir: Path, stats: dict[str, int]
) -> None:
"""Check and download poster for a single series folder.
Args:
series_dir: Path to the series folder.
stats: Mutable stats dictionary to update.
"""
stats["scanned"] += 1
poster_path = series_dir / "poster.jpg"
# Check if poster already exists and is large enough
if poster_path.exists():
try:
# Default min_file_size from ImageDownloader is 1024 bytes (1 KB)
if poster_path.stat().st_size >= 1024:
logger.debug(
"Poster already valid for '%s'", series_dir.name
)
stats["skipped"] += 1
return
except OSError:
pass # Fall through to re-download
# Parse NFO for thumb URL
nfo_path = series_dir / "tvshow.nfo"
poster_url = self._extract_poster_url_from_nfo(nfo_path)
if not poster_url:
logger.info(
"No poster URL found in NFO for '%s', skipping",
series_dir.name,
)
stats["skipped"] += 1
return
# Respect the nfo_download_poster setting
from src.config.settings import settings as app_settings # noqa: PLC0415
if not app_settings.nfo_download_poster:
logger.debug(
"Poster download disabled by nfo_download_poster setting for '%s'",
series_dir.name,
)
stats["skipped"] += 1
return
# Download poster with semaphore
async with _POSTER_DOWNLOAD_SEMAPHORE:
try:
async with ImageDownloader() as downloader:
success = await downloader.download_poster(
poster_url, series_dir, skip_existing=False
)
if success:
logger.info(
"Downloaded poster for '%s'", series_dir.name
)
stats["downloaded"] += 1
else:
logger.warning(
"Failed to download poster for '%s'", series_dir.name
)
stats["errors"] += 1
except Exception as exc: # pylint: disable=broad-except
logger.error(
"Error downloading poster for '%s': %s",
series_dir.name,
exc,
)
stats["errors"] += 1
@staticmethod
def _extract_poster_url_from_nfo(nfo_path: Path) -> Optional[str]:
"""Parse tvshow.nfo and extract the poster thumb URL.
Prefers ``<thumb aspect="poster">``; falls back to the first
``<thumb>`` element if no aspect attribute is present.
Args:
nfo_path: Absolute path to the ``tvshow.nfo`` file.
Returns:
The poster URL string, or ``None`` if not found.
"""
if not nfo_path.exists():
return None
try:
tree = etree.parse(str(nfo_path))
root = tree.getroot()
# Prefer thumb with aspect="poster"
for thumb in root.findall(".//thumb"):
if thumb.get("aspect") == "poster" and thumb.text:
return thumb.text.strip()
# Fallback to first thumb with text
for thumb in root.findall(".//thumb"):
if thumb.text:
return thumb.text.strip()
return None
except etree.XMLSyntaxError:
logger.warning("Malformed XML in %s", nfo_path)
return None
except Exception: # pylint: disable=broad-except
return None
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
def _prerequisites_met(self) -> bool:
"""Verify that the environment is ready for a folder scan.
Returns:
True when ``settings.anime_directory`` exists and
``settings.tmdb_api_key`` is configured.
"""
from src.config.settings import settings # noqa: PLC0415
if not settings.tmdb_api_key:
logger.warning("Folder scan skipped — TMDB API key not configured")
return False
if not settings.anime_directory:
logger.warning("Folder scan skipped — anime directory not configured")
return False
anime_dir = Path(settings.anime_directory)
if not anime_dir.is_dir():
logger.warning(
"Folder scan skipped — anime directory not found: %s", anime_dir
)
return False
return True

View File

@@ -0,0 +1,317 @@
"""Key resolution service for orphaned anime folders.
Attempts to resolve provider keys for anime folders that have no key/data
file and no database entry, by searching the anime provider and matching
folder names to search results.
This service runs after nfo_repair_service during the daily folder scan.
"""
from __future__ import annotations
import asyncio
import re
from pathlib import Path
from typing import Optional
import structlog
from src.config.settings import settings as _settings
logger = structlog.get_logger(__name__)
# Limit concurrent provider searches to avoid rate-limiting.
_SEARCH_SEMAPHORE: asyncio.Semaphore = asyncio.Semaphore(2)
def _strip_year_from_folder(folder_name: str) -> str:
"""Remove trailing year suffix like ' (2020)' from folder name.
Args:
folder_name: Folder name, e.g. 'Rent-A-Girlfriend (2020)'
Returns:
Name without year, e.g. 'Rent-A-Girlfriend'
"""
return re.sub(r"\s*\(\d{4}\)\s*$", "", folder_name).strip()
def _extract_year_from_folder(folder_name: str) -> Optional[int]:
"""Extract year from folder name like 'Anime Name (2020)'.
Returns:
Year as int or None if not present.
"""
match = re.search(r"\((\d{4})\)$", folder_name.strip())
if match:
return int(match.group(1))
return None
def _extract_key_from_link(link: str) -> Optional[str]:
"""Extract provider key from search result link.
Args:
link: Link like '/anime/stream/rent-a-girlfriend' or full URL.
Returns:
Key slug like 'rent-a-girlfriend' or None.
"""
if not link:
return None
if "/anime/stream/" in link:
parts = link.split("/anime/stream/")[-1].split("/")
key = parts[0].strip()
return key if key else None
# If link is just a slug
if "/" not in link and link.strip():
return link.strip()
return None
def _normalize_for_comparison(text: str) -> str:
"""Normalize text for case-insensitive comparison.
Strips whitespace, lowercases, and removes common punctuation
differences that shouldn't affect matching.
Args:
text: Raw text string.
Returns:
Normalized lowercase string.
"""
normalized = text.strip().lower()
# Remove common punctuation that varies between sources
normalized = re.sub(r"[:\-–—]", " ", normalized)
# Collapse multiple spaces
normalized = re.sub(r"\s+", " ", normalized)
return normalized.strip()
async def resolve_key_for_folder(folder_name: str) -> Optional[str]:
"""Attempt to resolve the provider key for a single folder.
Strategy:
1. Strip year suffix from folder name to get search query.
2. Search the anime provider with that query.
3. If exactly ONE result matches the folder name (case-insensitive),
return the key extracted from the result link.
4. If zero or multiple matches, return None (not confident enough).
Args:
folder_name: The anime folder name, e.g. 'Rent-A-Girlfriend (2020)'.
Returns:
The provider key string, or None if resolution is not confident.
"""
search_query = _strip_year_from_folder(folder_name)
if not search_query:
logger.debug("Empty search query after stripping year from '%s'", folder_name)
return None
async with _SEARCH_SEMAPHORE:
try:
loop = asyncio.get_running_loop()
results = await loop.run_in_executor(None, _search_provider, search_query)
except Exception as exc:
logger.warning(
"Provider search failed for '%s': %s", search_query, exc
)
return None
if not results:
logger.debug("No search results for folder '%s'", folder_name)
return None
# Filter results: find exact name matches (case-insensitive)
normalized_query = _normalize_for_comparison(search_query)
exact_matches = []
for result in results:
title = result.get("title") or result.get("name") or ""
normalized_title = _normalize_for_comparison(title)
if normalized_title == normalized_query:
key = _extract_key_from_link(result.get("link", ""))
if key:
exact_matches.append((key, title))
if len(exact_matches) == 1:
resolved_key, matched_title = exact_matches[0]
logger.info(
"Resolved key for folder '%s': key='%s' (matched title: '%s')",
folder_name,
resolved_key,
matched_title,
)
return resolved_key
if len(exact_matches) > 1:
logger.info(
"Multiple exact matches for folder '%s' (%d matches), skipping",
folder_name,
len(exact_matches),
)
else:
logger.debug(
"No exact title match for folder '%s' in %d results",
folder_name,
len(results),
)
return None
def _search_provider(query: str) -> list:
"""Call the anime provider search synchronously.
Args:
query: Search term.
Returns:
List of search result dicts with 'link' and 'title'/'name' fields.
"""
from src.core.providers.provider_factory import Loaders
loader = Loaders().GetLoader("aniworld.to")
return loader.search(query)
async def perform_key_resolution_scan() -> dict[str, int]:
"""Scan all anime folders and resolve missing keys.
Iterates over all subfolders of the anime directory. For each folder
that has no corresponding database entry, attempts to resolve the
provider key via provider search and saves it to the database.
Returns:
Dictionary with counts:
- 'scanned': total folders checked
- 'resolved': keys successfully resolved and saved
- 'skipped': folders already in DB or resolution uncertain
- 'errors': folders that caused errors during resolution
"""
from src.server.database.connection import get_db_session
from src.server.database.service import AnimeSeriesService
stats = {"scanned": 0, "resolved": 0, "skipped": 0, "errors": 0}
if not _settings.anime_directory:
logger.warning("Key resolution scan skipped — anime directory not configured")
return stats
anime_dir = Path(_settings.anime_directory)
if not anime_dir.is_dir():
logger.warning(
"Key resolution scan skipped — anime directory not found: %s",
anime_dir,
)
return stats
# Collect folders that need resolution
folders_to_resolve: list[str] = []
async with get_db_session() as db:
for series_dir in sorted(anime_dir.iterdir()):
if not series_dir.is_dir():
continue
folder_name = series_dir.name
stats["scanned"] += 1
# Check if already in database
existing = await AnimeSeriesService.get_by_folder(db, folder_name)
if existing:
stats["skipped"] += 1
continue
folders_to_resolve.append(folder_name)
if not folders_to_resolve:
logger.info("Key resolution scan: all folders already have DB entries")
return stats
logger.info(
"Key resolution scan: %d folders need resolution", len(folders_to_resolve)
)
# Resolve keys one by one (provider search is rate-limited)
for folder_name in folders_to_resolve:
try:
key = await resolve_key_for_folder(folder_name)
if key:
# Save to database
await _save_resolved_key(folder_name, key)
stats["resolved"] += 1
else:
stats["skipped"] += 1
except Exception as exc:
logger.error(
"Error resolving key for folder '%s': %s",
folder_name,
exc,
)
stats["errors"] += 1
logger.info(
"Key resolution scan complete: scanned=%d, resolved=%d, skipped=%d, errors=%d",
stats["scanned"],
stats["resolved"],
stats["skipped"],
stats["errors"],
)
return stats
async def _save_resolved_key(folder_name: str, key: str) -> None:
"""Save a resolved key to the database.
Creates a new AnimeSeries entry with the resolved key and folder name.
Does NOT write any key/data file to disk.
Args:
folder_name: The anime folder name (e.g. 'Rent-A-Girlfriend (2020)').
key: The resolved provider key (e.g. 'rent-a-girlfriend').
"""
from src.server.database.connection import get_db_session
from src.server.database.service import AnimeSeriesService
name = _strip_year_from_folder(folder_name)
year = _extract_year_from_folder(folder_name)
async with get_db_session() as db:
# Double-check: another task might have resolved it concurrently
existing = await AnimeSeriesService.get_by_folder(db, folder_name)
if existing:
logger.debug(
"Folder '%s' already in DB (resolved concurrently), skipping",
folder_name,
)
return
# Also check if a series with this key already exists
existing_key = await AnimeSeriesService.get_by_key(db, key)
if existing_key:
logger.warning(
"Key '%s' already exists in DB for folder '%s', "
"cannot assign to folder '%s'",
key,
existing_key.folder,
folder_name,
)
return
await AnimeSeriesService.create(
db,
key=key,
name=name,
site="aniworld.to",
folder=folder_name,
year=year,
loading_status="pending",
episodes_loaded=False,
)
logger.info(
"Saved resolved key '%s' for folder '%s' to database",
key,
folder_name,
)

View File

@@ -0,0 +1,293 @@
"""Rescan orchestrator — coordinates all scan/cleanup operations during a rescan.
Extracts the rescan workflow from SchedulerService so scheduling and scan
logic are cleanly separated.
Called by SchedulerService.trigger_rescan() and by _run_rescan_job().
"""
from __future__ import annotations
import logging
from datetime import datetime, timezone
from typing import List, Optional
from src.server.models.config import SchedulerConfig
logger = logging.getLogger(__name__)
class RescanOrchestrator:
"""Coordinates rescan, auto-download, folder scan, and key resolution.
This class encapsulates the entire post-rescan workflow so SchedulerService
only needs to call a single method.
"""
def __init__(self, config: Optional[SchedulerConfig] = None) -> None:
"""Initialize the orchestrator.
Args:
config: Optional scheduler config. If None, operations that depend
on config flags (auto_download, folder_scan) will be skipped.
"""
self._config = config
self._last_scan_time: Optional[datetime] = None
# Cooldown tracking for auto-download to prevent rapid re-triggers
self._last_auto_download_time: Optional[datetime] = None
self._auto_download_cooldown_seconds: int = 300 # 5 minutes default
@property
def last_scan_time(self) -> Optional[datetime]:
return self._last_scan_time
# ------------------------------------------------------------------
# Auto-download
# ------------------------------------------------------------------
async def run_auto_download(self) -> int:
"""Queue and start downloads for all series with missing episodes.
Returns:
Number of episodes queued.
"""
from datetime import timedelta
from src.server.models.download import EpisodeIdentifier
from src.server.utils.dependencies import (
get_anime_service,
get_download_service,
)
# Check cooldown to prevent rapid re-triggers
now = datetime.now(timezone.utc)
if self._last_auto_download_time is not None:
elapsed = now - self._last_auto_download_time
if elapsed < timedelta(seconds=self._auto_download_cooldown_seconds):
logger.debug(
"Auto-download skipped: cooldown active (elapsed=%.1fs cooldown=%ds)",
elapsed.total_seconds(),
self._auto_download_cooldown_seconds,
)
return 0
anime_service = get_anime_service()
download_service = get_download_service()
series_list = anime_service._cached_list_missing()
queued_count = 0
for series in series_list:
episode_dict: dict = series.get("episodeDict") or {}
if not episode_dict:
continue
episodes: List[EpisodeIdentifier] = []
for season_str, ep_numbers in episode_dict.items():
for ep_num in ep_numbers:
episodes.append(
EpisodeIdentifier(season=int(season_str), episode=int(ep_num))
)
if not episodes:
continue
await download_service.add_to_queue(
serie_id=series.get("key", ""),
serie_folder=series.get("folder", series.get("name", "")),
serie_name=series.get("name", ""),
episodes=episodes,
)
queued_count += len(episodes)
logger.info(
"Auto-download queued episodes for series=%s count=%d",
series.get("key"),
len(episodes),
)
if queued_count:
await download_service.start_queue_processing()
logger.info("Auto-download queue processing started: queued=%d", queued_count)
self._last_auto_download_time = datetime.now(timezone.utc)
logger.info("Auto-download completed: queued_count=%d", queued_count)
return queued_count
# ------------------------------------------------------------------
# Folder scan
# ------------------------------------------------------------------
async def run_folder_scan(self) -> None:
"""Run the folder scan maintenance task."""
from src.server.services.scheduler.folder_scan_service import FolderScanService
folder_scan_service = FolderScanService()
await folder_scan_service.run_folder_scan()
logger.info("Folder scan completed successfully")
# ------------------------------------------------------------------
# Key resolution
# ------------------------------------------------------------------
async def run_key_resolution(self) -> dict:
"""Run the orphaned folder key resolution scan.
Returns:
Dict with resolved/skipped/errors counts.
"""
from src.server.services.scheduler.key_resolution_service import (
perform_key_resolution_scan,
)
key_stats = await perform_key_resolution_scan()
logger.info(
"Key resolution scan completed: resolved=%d, skipped=%d, errors=%d",
key_stats["resolved"],
key_stats["skipped"],
key_stats["errors"],
)
return key_stats
# ------------------------------------------------------------------
# Main orchestrator entry point
# ------------------------------------------------------------------
async def execute(self) -> dict:
"""Execute the full rescan workflow.
Runs in order:
1. anime_service.rescan()
2. auto-download (if enabled)
3. folder scan (if enabled)
4. key resolution scan (always, if anime_directory configured)
Returns:
Dict with duration and counts for each step.
"""
scan_start = datetime.now(timezone.utc)
results = {
"started_at": scan_start.isoformat(),
"duration_seconds": 0.0,
"rescan_completed": False,
"auto_download_queued": 0,
"folder_scan_completed": False,
"key_resolution": {"resolved": 0, "skipped": 0, "errors": 0},
}
await self._broadcast(
"scheduled_rescan_started",
{"timestamp": scan_start.isoformat()},
)
try:
# 1. Main library rescan
await self._run_rescan()
results["rescan_completed"] = True
# 2. Auto-download
if self._config and self._config.auto_download_after_rescan:
try:
queued = await self.run_auto_download()
results["auto_download_queued"] = queued
await self._broadcast(
"auto_download_started", {"queued_count": queued}
)
except Exception as exc:
logger.error("Auto-download failed: %s", exc, exc_info=True)
await self._broadcast(
"auto_download_error", {"error": str(exc)}
)
# 3. Folder scan
if self._config and self._config.folder_scan_enabled:
try:
await self.run_folder_scan()
results["folder_scan_completed"] = True
except Exception as exc:
logger.error("Folder scan failed: %s", exc, exc_info=True)
await self._broadcast("folder_scan_error", {"error": str(exc)})
# 4. Key resolution scan (always runs if anime_directory configured)
try:
key_stats = await self.run_key_resolution()
results["key_resolution"] = key_stats
except Exception as exc:
logger.error("Key resolution scan failed: %s", exc, exc_info=True)
self._last_scan_time = datetime.now(timezone.utc)
results["duration_seconds"] = (
self._last_scan_time - scan_start
).total_seconds()
await self._broadcast(
"scheduled_rescan_completed",
{
"timestamp": self._last_scan_time.isoformat(),
"duration_seconds": results["duration_seconds"],
},
)
logger.info(
"Scheduled library rescan completed: duration=%.2fs",
results["duration_seconds"],
)
except Exception as exc:
logger.error("Scheduled rescan failed: %s", exc, exc_info=True)
await self._broadcast(
"scheduled_rescan_error",
{"error": str(exc), "timestamp": datetime.now(timezone.utc).isoformat()},
)
raise
return results
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
async def _run_rescan(self) -> None:
"""Run the anime service rescan."""
from src.server.utils.dependencies import get_anime_service
anime_service = get_anime_service()
logger.info("Anime service obtained, calling anime_service.rescan()...")
await anime_service.rescan()
logger.info("anime_service.rescan() completed")
async def _broadcast(self, event_type: str, data: dict) -> None:
"""Broadcast a WebSocket event to all connected clients."""
try:
from src.server.services.websocket_service import get_websocket_service
ws_service = get_websocket_service()
await ws_service.manager.broadcast({"type": event_type, "data": data})
except Exception as exc:
logger.warning(
"WebSocket broadcast failed: event=%s error=%s", event_type, exc
)
# ---------------------------------------------------------------------------
# Module-level orchestrator
# ---------------------------------------------------------------------------
_orchestrator: Optional[RescanOrchestrator] = None
def get_rescan_orchestrator(
config: Optional[SchedulerConfig] = None,
) -> RescanOrchestrator:
"""Return a RescanOrchestrator singleton (or create with optional config)."""
global _orchestrator
if _orchestrator is None or config is not None:
_orchestrator = RescanOrchestrator(config=config)
logger.debug("Created new RescanOrchestrator singleton")
else:
logger.debug("Returning existing RescanOrchestrator singleton")
return _orchestrator
def reset_rescan_orchestrator() -> None:
"""Reset the orchestrator singleton (used in tests)."""
global _orchestrator
_orchestrator = None

View File

@@ -1,25 +1,34 @@
"""Scheduler service for automatic library rescans.
Uses APScheduler's AsyncIOScheduler with CronTrigger for precise
cron-based scheduling. The legacy interval-based loop has been removed
in favour of the cron approach.
cron-based scheduling.
Jobs are held in memory (no separate scheduler database). On startup,
if the last scan timestamp indicates a missed run (server was down at the
scheduled cron time), a rescan is triggered immediately.
Actual rescan logic is delegated to RescanService.
"""
from __future__ import annotations
from datetime import datetime, timezone
from typing import List, Optional
import logging
from datetime import datetime, timedelta, timezone
from typing import Optional
import structlog
from apscheduler.schedulers.asyncio import AsyncIOScheduler
from apscheduler.triggers.cron import CronTrigger
from src.server.models.config import SchedulerConfig
from src.server.services.config_service import ConfigServiceError, get_config_service
logger = structlog.get_logger(__name__)
logger = logging.getLogger(__name__)
_JOB_ID = "scheduled_rescan"
# Grace period for missed jobs (1 hour — handles server downtime between
# scheduled time and startup).
_MISFIRE_GRACE_SECONDS = 3600
class SchedulerServiceError(Exception):
"""Service-level exception for scheduler operations."""
@@ -34,7 +43,9 @@ class SchedulerService:
- Cron-based scheduling (time of day + days of week)
- Immediate manual trigger
- Live config reloading without app restart
- Auto-queueing downloads of missing episodes after rescan
Actual rescan/folder-scan/auto-download work is delegated to
RescanService.
"""
def __init__(self) -> None:
@@ -42,7 +53,6 @@ class SchedulerService:
self._is_running: bool = False
self._scheduler: Optional[AsyncIOScheduler] = None
self._config: Optional[SchedulerConfig] = None
self._last_scan_time: Optional[datetime] = None
self._scan_in_progress: bool = False
logger.info("SchedulerService initialised")
@@ -57,15 +67,18 @@ class SchedulerService:
SchedulerServiceError: If the scheduler is already running or
config cannot be loaded.
"""
logger.info("SchedulerService.start() called")
if self._is_running:
logger.warning("Scheduler start called but already running")
raise SchedulerServiceError("Scheduler is already running")
try:
config_service = get_config_service()
config = config_service.load_config()
self._config = config.scheduler
logger.info("Scheduler config loaded successfully")
except ConfigServiceError as exc:
logger.error("Failed to load scheduler configuration", error=str(exc))
logger.error("Failed to load scheduler configuration: %s", exc)
raise SchedulerServiceError(f"Failed to load config: {exc}") from exc
self._scheduler = AsyncIOScheduler()
@@ -75,6 +88,15 @@ class SchedulerService:
self._is_running = True
return
logger.info(
"Scheduler config loaded: enabled=%s time=%s days=%s auto_download=%s folder_scan=%s",
self._config.enabled,
self._config.schedule_time,
self._config.schedule_days,
self._config.auto_download_after_rescan,
self._config.folder_scan_enabled,
)
trigger = self._build_cron_trigger()
if trigger is None:
logger.warning(
@@ -82,23 +104,38 @@ class SchedulerService:
)
else:
self._scheduler.add_job(
self._perform_rescan,
_run_rescan_job,
trigger=trigger,
id=_JOB_ID,
replace_existing=True,
misfire_grace_time=300,
misfire_grace_time=_MISFIRE_GRACE_SECONDS,
coalesce=True,
)
logger.info(
"Scheduler started with cron trigger",
schedule_time=self._config.schedule_time,
schedule_days=self._config.schedule_days,
"Scheduler started with cron trigger: time=%s days=%s",
self._config.schedule_time,
self._config.schedule_days,
)
self._scheduler.start()
self._is_running = True
# Log next scheduled run for visibility.
job = self._scheduler.get_job(_JOB_ID)
if job:
next_run = job.next_run_time
logger.info(
"Scheduler next run: %s",
next_run.isoformat() if next_run else None,
)
# Startup misfire recovery: check if the last scan was missed while
# the server was down.
await self._check_missed_run()
async def stop(self) -> None:
"""Stop the APScheduler gracefully."""
logger.info("SchedulerService.stop() called")
if not self._is_running:
logger.debug("Scheduler stop called but not running")
return
@@ -106,8 +143,27 @@ class SchedulerService:
if self._scheduler and self._scheduler.running:
self._scheduler.shutdown(wait=False)
logger.info("Scheduler stopped")
else:
logger.info("Scheduler stop: scheduler was not running")
self._is_running = False
logger.info("SchedulerService stopped successfully")
async def ensure_started(self) -> None:
"""Ensure the scheduler is running (idempotent).
If already running, returns immediately. Otherwise, starts the scheduler.
This method is safe to call multiple times and from multiple callers.
Raises:
SchedulerServiceError: If startup fails (except for already running).
"""
if self._is_running:
logger.debug("Scheduler ensure_started called but already running")
return
logger.info("Scheduler ensure_started: starting scheduler")
await self.start()
async def trigger_rescan(self) -> bool:
"""Manually trigger a library rescan.
@@ -140,11 +196,12 @@ class SchedulerService:
"""
self._config = config
logger.info(
"Scheduler config reloaded",
enabled=config.enabled,
schedule_time=config.schedule_time,
schedule_days=config.schedule_days,
auto_download=config.auto_download_after_rescan,
"Scheduler config reloaded: enabled=%s time=%s days=%s auto_download=%s folder_scan=%s",
config.enabled,
config.schedule_time,
config.schedule_days,
config.auto_download_after_rescan,
config.folder_scan_enabled,
)
if not self._scheduler or not self._scheduler.running:
@@ -165,22 +222,23 @@ class SchedulerService:
if self._scheduler.get_job(_JOB_ID):
self._scheduler.reschedule_job(_JOB_ID, trigger=trigger)
logger.info(
"Scheduler rescheduled with cron trigger",
schedule_time=config.schedule_time,
schedule_days=config.schedule_days,
"Scheduler rescheduled with cron trigger: time=%s days=%s",
config.schedule_time,
config.schedule_days,
)
else:
self._scheduler.add_job(
self._perform_rescan,
_run_rescan_job,
trigger=trigger,
id=_JOB_ID,
replace_existing=True,
misfire_grace_time=300,
misfire_grace_time=_MISFIRE_GRACE_SECONDS,
coalesce=True,
)
logger.info(
"Scheduler job added with cron trigger",
schedule_time=config.schedule_time,
schedule_days=config.schedule_days,
"Scheduler job added with cron trigger: time=%s days=%s",
config.schedule_time,
config.schedule_days,
)
def get_status(self) -> dict:
@@ -189,6 +247,10 @@ class SchedulerService:
Returns:
Dict containing scheduler state and config fields.
"""
from src.server.services.rescan_service import get_rescan_service
rescan_service = get_rescan_service()
next_run: Optional[str] = None
if self._scheduler and self._scheduler.running:
job = self._scheduler.get_job(_JOB_ID)
@@ -204,7 +266,14 @@ class SchedulerService:
"auto_download_after_rescan": (
self._config.auto_download_after_rescan if self._config else False
),
"last_run": self._last_scan_time.isoformat() if self._last_scan_time else None,
"folder_scan_enabled": (
self._config.folder_scan_enabled if self._config else False
),
"last_run": (
rescan_service.last_scan_time.isoformat()
if rescan_service.last_scan_time
else None
),
"next_run": next_run,
"scan_in_progress": self._scan_in_progress,
}
@@ -231,136 +300,109 @@ class SchedulerService:
day_of_week=day_of_week,
)
logger.debug(
"CronTrigger built",
hour=hour_str,
minute=minute_str,
day_of_week=day_of_week,
"CronTrigger built: hour=%s minute=%s day_of_week=%s",
hour_str,
minute_str,
day_of_week,
)
return trigger
async def _broadcast(self, event_type: str, data: dict) -> None:
"""Broadcast a WebSocket event to all connected clients."""
async def _check_missed_run(self) -> None:
"""Check if a scheduled rescan was missed while the server was down.
Compares system_settings.last_scan_timestamp against the expected
schedule. If the last scan is overdue (more than 24h ago for a daily
schedule) but within the grace period, triggers an immediate rescan.
"""
if not self._config or not self._config.enabled:
return
if not self._config.schedule_days:
return
try:
from src.server.services.websocket_service import ( # noqa: PLC0415
get_websocket_service,
from src.server.database.connection import get_db_session
from src.server.database.system_settings_service import (
SystemSettingsService,
)
ws_service = get_websocket_service()
await ws_service.manager.broadcast({"type": event_type, "data": data})
async with get_db_session() as db:
settings = await SystemSettingsService.get_or_create(db)
last_scan = settings.last_scan_timestamp
if last_scan is None:
# Never scanned before — trigger immediately
logger.info("No previous scan recorded — triggering immediate rescan")
await self._perform_rescan()
return
# Ensure timezone-aware comparison
if last_scan.tzinfo is None:
last_scan = last_scan.replace(tzinfo=timezone.utc)
now = datetime.now(timezone.utc)
elapsed = now - last_scan
# If last scan was more than 24h + grace period ago, don't trigger
# (avoids surprise rescans after long downtime).
max_overdue = timedelta(hours=24, seconds=_MISFIRE_GRACE_SECONDS)
if elapsed > max_overdue:
logger.info(
"Last scan was %s ago (> %s) — skipping missed-run recovery",
elapsed,
max_overdue,
)
return
# Check if a run should have happened between last_scan and now.
if elapsed > timedelta(hours=23):
logger.info(
"Missed scheduled rescan detected (last scan %s ago) — triggering now",
elapsed,
)
await self._perform_rescan()
except Exception as exc: # pylint: disable=broad-exception-caught
logger.warning("WebSocket broadcast failed", event=event_type, error=str(exc))
async def _auto_download_missing(self) -> None:
"""Queue and start downloads for all series with missing episodes."""
from src.server.models.download import EpisodeIdentifier # noqa: PLC0415
from src.server.utils.dependencies import ( # noqa: PLC0415
get_anime_service,
get_download_service,
)
anime_service = get_anime_service()
download_service = get_download_service()
series_list = anime_service._cached_list_missing()
queued_count = 0
for series in series_list:
episode_dict: dict = series.get("episodeDict") or {}
if not episode_dict:
continue
episodes: List[EpisodeIdentifier] = []
for season_str, ep_numbers in episode_dict.items():
for ep_num in ep_numbers:
episodes.append(
EpisodeIdentifier(season=int(season_str), episode=int(ep_num))
)
if not episodes:
continue
await download_service.add_to_queue(
serie_id=series.get("key", ""),
serie_folder=series.get("folder", series.get("name", "")),
serie_name=series.get("name", ""),
episodes=episodes,
)
queued_count += len(episodes)
logger.info(
"Auto-download queued episodes",
series=series.get("key"),
count=len(episodes),
)
if queued_count:
await download_service.start_queue_processing()
logger.info("Auto-download queue processing started", queued=queued_count)
await self._broadcast("auto_download_started", {"queued_count": queued_count})
logger.info("Auto-download completed", queued_count=queued_count)
logger.warning("Missed-run check failed (non-fatal): %s", exc)
async def _perform_rescan(self) -> None:
"""Execute a library rescan and optionally trigger auto-download."""
"""Execute a library rescan via RescanService."""
from src.server.services.rescan_service import get_rescan_service
logger.info(
"Scheduler _perform_rescan entered: scan_in_progress=%s",
self._scan_in_progress,
)
if self._scan_in_progress:
logger.warning("Skipping rescan: previous scan still in progress")
return
self._scan_in_progress = True
scan_start = datetime.now(timezone.utc)
try:
logger.info("Starting scheduled library rescan")
from src.server.utils.dependencies import get_anime_service # noqa: PLC0415
anime_service = get_anime_service()
await self._broadcast(
"scheduled_rescan_started",
{"timestamp": scan_start.isoformat()},
)
await anime_service.rescan()
self._last_scan_time = datetime.now(timezone.utc)
duration = (self._last_scan_time - scan_start).total_seconds()
logger.info("Scheduled library rescan completed", duration_seconds=duration)
await self._broadcast(
"scheduled_rescan_completed",
{
"timestamp": self._last_scan_time.isoformat(),
"duration_seconds": duration,
},
)
# Auto-download after rescan
if self._config and self._config.auto_download_after_rescan:
logger.info("Auto-download after rescan is enabled — starting")
try:
await self._auto_download_missing()
except Exception as dl_exc: # pylint: disable=broad-exception-caught
logger.error(
"Auto-download after rescan failed",
error=str(dl_exc),
exc_info=True,
)
await self._broadcast(
"auto_download_error", {"error": str(dl_exc)}
)
else:
logger.debug("Auto-download after rescan is disabled — skipping")
except Exception as exc: # pylint: disable=broad-exception-caught
logger.error("Scheduled rescan failed", error=str(exc), exc_info=True)
await self._broadcast(
"scheduled_rescan_error",
{"error": str(exc), "timestamp": datetime.now(timezone.utc).isoformat()},
)
rescan_service = get_rescan_service(config=self._config)
await rescan_service.execute()
finally:
self._scan_in_progress = False
logger.info("Scheduled rescan finished: scan_in_progress reset to False")
# ---------------------------------------------------------------------------
# Module-level job runner
#
# APScheduler cannot serialize bound methods (SchedulerService instance
# contains a reference to the scheduler itself, creating a circular pickle
# error). Using a module-level function avoids this.
# ---------------------------------------------------------------------------
async def _run_rescan_job() -> None:
"""Module-level job entry point — delegates to the current service."""
logger.info("=" * 60)
logger.info("APScheduler triggered _run_rescan_job")
logger.info("Getting scheduler service singleton...")
svc = get_scheduler_service()
logger.info("Scheduler service obtained, calling _perform_rescan()")
await svc._perform_rescan()
logger.info("_run_rescan_job completed")
logger.info("=" * 60)
# ---------------------------------------------------------------------------
@@ -374,7 +416,10 @@ def get_scheduler_service() -> SchedulerService:
"""Return the singleton SchedulerService instance."""
global _scheduler_service
if _scheduler_service is None:
logger.info("Creating new SchedulerService singleton")
_scheduler_service = SchedulerService()
else:
logger.debug("Returning existing SchedulerService singleton")
return _scheduler_service

View File

@@ -57,6 +57,44 @@ _rate_limit_lock = Lock()
_RATE_LIMIT_WINDOW_SECONDS = 60.0
def _make_db_lookup():
"""Build a synchronous ``(folder) -> Serie | None`` callable for SerieScanner.
The returned function opens a short-lived sync DB session, queries for a
series whose ``folder`` column matches the given name, and converts the
ORM row to a ``Serie`` domain object. Returns ``None`` when the DB is not
yet initialised or no matching row is found.
"""
from src.core.entities.series import Serie
def _lookup(folder: str) -> Optional["Serie"]:
try:
from src.server.database.connection import get_sync_session
from src.server.database.service import AnimeSeriesService
db = get_sync_session()
try:
row = AnimeSeriesService.get_by_folder_sync(db, folder)
finally:
db.close()
if row is None:
return None
return Serie(
key=row.key,
name=row.name or "",
site=row.site,
folder=row.folder,
episodeDict={},
year=row.year,
)
except RuntimeError:
# DB not initialised yet (e.g. first boot before init_db())
return None
return _lookup
def get_series_app() -> SeriesApp:
"""
Dependency to get SeriesApp instance.
@@ -134,7 +172,7 @@ def get_series_app() -> SeriesApp:
),
)
_series_app = SeriesApp(anime_dir)
_series_app = SeriesApp(anime_dir, db_lookup=_make_db_lookup())
except Exception as e:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,

Some files were not shown because too many files have changed in this diff Show More