Development Guide
Document Purpose
This document provides guidance for developers working on the Aniworld project.
What This Document Contains
- Prerequisites: Required software and tools
- Environment Setup: Step-by-step local development setup
- Project Structure: Source code organization explanation
- Development Workflow: Branch strategy, commit conventions
- Coding Standards: Style guide, linting, formatting
- Running the Application: Development server, CLI usage
- Debugging Tips: Common debugging approaches
- IDE Configuration: VS Code settings, recommended extensions
- Contributing Guidelines: How to submit changes
- Code Review Process: Review checklist and expectations
What This Document Does NOT Contain
- Production deployment (see DEPLOYMENT.md)
- API reference (see API.md)
- Architecture decisions (see ARCHITECTURE.md)
- Test writing guides (see TESTING.md)
- Security guidelines (see SECURITY.md)
Target Audience
- New Developers joining the project
- Contributors (internal and external)
- Anyone setting up a development environment
Sections to Document
- Prerequisites
  - Python version
  - Conda environment
  - Node.js (if applicable)
  - Git
- Getting Started
  - Clone repository
  - Setup conda environment
  - Install dependencies
  - Configuration setup
- Project Structure Overview
- Development Server
  - Starting FastAPI server
  - Hot reload configuration
  - Debug mode
- CLI Development
- Code Style
  - PEP 8 compliance
  - Type hints requirements
  - Docstring format
  - Import organization
- Git Workflow
  - Branch naming
  - Commit message format
  - Pull request process
- Common Development Tasks
Adding Queue Deduplication
The download queue prevents duplicate entries at two levels:
In-Memory Deduplication (src/server/services/download_service.py):
- _pending_by_episode dict tracks pending episodes: key = (serie_id, season, episode)
- _add_to_pending_queue() updates the dict when adding items
- add_to_queue() checks this dict before adding episodes (includes batch-local dedup)
- _remove_from_pending_queue() cleans up the dict when items are removed
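The bookkeeping above can be sketched as follows. This is a minimal illustration, not the project's `DownloadService`: the `PendingQueue` class, its method signatures, and the generated item ids are assumptions. Only the `_pending_by_episode` name and the `(serie_id, season, episode)` key come from the description above.

```python
from typing import Dict, List, Tuple

EpisodeKey = Tuple[str, int, int]  # (serie_id, season, episode)


class PendingQueue:
    """Hypothetical stand-in for the queue's in-memory dedup bookkeeping."""

    def __init__(self) -> None:
        self._pending_by_episode: Dict[EpisodeKey, str] = {}
        self._next_id = 0

    def add_to_queue(self, serie_id: str, season: int, episodes: List[int]) -> List[str]:
        """Queue episodes, skipping any that are already pending.

        Each accepted key is recorded immediately, so duplicates inside
        the same batch are dropped as well (batch-local dedup).
        """
        created: List[str] = []
        for episode in episodes:
            key = (serie_id, season, episode)
            if key in self._pending_by_episode:
                continue  # already pending: skip instead of queueing twice
            self._next_id += 1
            item_id = f"item-{self._next_id}"
            self._pending_by_episode[key] = item_id
            created.append(item_id)
        return created

    def remove_from_pending(self, serie_id: str, season: int, episode: int) -> None:
        """Forget an episode so it can be queued again later."""
        self._pending_by_episode.pop((serie_id, season, episode), None)


queue = PendingQueue()
first = queue.add_to_queue("naruto", 1, [1, 2, 2])  # episode 2 repeats within the batch
second = queue.add_to_queue("naruto", 1, [2, 3])    # episode 2 is already pending
```

Recording the key at the moment an item is accepted is what makes a single dict lookup cover both cross-request and in-batch duplicates.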
Database Constraint (src/server/models.py):
- DownloadQueueItem has a unique index on episode_id via __table_args__
- Prevents duplicate queue entries at the database level
- Unique constraint: Index("ix_download_queue_episode_pending", "episode_id", unique=True)
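To see what the unique index buys, the sketch below reproduces the same constraint with the standard-library `sqlite3` module instead of the project's SQLAlchemy model. The table name `download_queue` and the `enqueue()` helper are assumptions made for the example. Only the index name and column come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE download_queue ("
    "id INTEGER PRIMARY KEY, episode_id INTEGER NOT NULL)"
)
# Same shape as the documented index: unique on episode_id.
conn.execute(
    "CREATE UNIQUE INDEX ix_download_queue_episode_pending "
    "ON download_queue (episode_id)"
)


def enqueue(episode_id: int) -> bool:
    """Insert a queue row; return False if the episode is already queued."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO download_queue (episode_id) VALUES (?)",
                (episode_id,),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate even though the caller tried.
        return False


results = [enqueue(42), enqueue(42), enqueue(43)]
```

The database check acts as a backstop: it still holds if the in-memory dict is lost on restart or bypassed by another code path.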
Scheduler Cooldown (src/server/services/scheduler_service.py):
- _last_auto_download_time tracks when auto-download last ran
- 5-minute cooldown prevents rapid re-triggers
- Checked at start of _auto_download_missing()
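A cooldown of this kind usually reduces to one timestamp comparison. The sketch below assumes a hypothetical `AutoDownloadGate` helper. In the project the check sits at the start of `_auto_download_missing()`, and its exact form may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

AUTO_DOWNLOAD_COOLDOWN = timedelta(minutes=5)


class AutoDownloadGate:
    """Hypothetical helper illustrating the 5-minute cooldown."""

    def __init__(self) -> None:
        self._last_auto_download_time: Optional[datetime] = None

    def should_run(self, now: datetime) -> bool:
        """Return True and record the run unless the cooldown is still active."""
        last = self._last_auto_download_time
        if last is not None and now - last < AUTO_DOWNLOAD_COOLDOWN:
            return False
        self._last_auto_download_time = now
        return True


gate = AutoDownloadGate()
start = datetime(2024, 1, 1, 3, 0, tzinfo=timezone.utc)
decisions = [
    gate.should_run(start),                         # first run is allowed
    gate.should_run(start + timedelta(minutes=2)),  # inside the cooldown
    gate.should_run(start + timedelta(minutes=5)),  # cooldown has elapsed
]
```

Passing `now` in as an argument keeps the check deterministic in tests. A production version would typically read the clock itself.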
Episode Lifecycle
Episodes transition through states stored in the episodes table:
| State | is_downloaded | file_path | Description |
|---|---|---|---|
| Missing | False | NULL | Episode not yet downloaded |
| Downloaded | True | Set | Episode exists on disk |
State Transitions:
- Missing → Downloaded: When a download completes, _remove_episode_from_missing_list() calls EpisodeService.mark_downloaded() to set is_downloaded=True and populate file_path. The episode record is NOT deleted.
Query Implications:
- get_series_with_missing_episodes(): Filters for is_downloaded=False to find series with undownloaded episodes
- get_series_with_no_episodes(): Finds series with is_downloaded=False episodes but NO is_downloaded=True episodes (series with nothing downloaded yet)
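The difference between the two queries is easiest to see on a small in-memory data set. The sketch below uses plain Python sets rather than the project's database queries. The `Episode` dataclass and both function names are simplified stand-ins, not the real ORM code.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Episode:
    """Simplified stand-in for a row in the episodes table."""
    serie_id: str
    is_downloaded: bool
    file_path: Optional[str] = None


def series_with_missing_episodes(episodes: List[Episode]) -> Set[str]:
    """Series that have at least one episode with is_downloaded=False."""
    return {e.serie_id for e in episodes if not e.is_downloaded}


def series_with_no_episodes(episodes: List[Episode]) -> Set[str]:
    """Series with missing episodes and no downloaded episode at all."""
    downloaded = {e.serie_id for e in episodes if e.is_downloaded}
    return series_with_missing_episodes(episodes) - downloaded


episodes = [
    Episode("a", True, "/anime/a/S01E01.mp4"),
    Episode("a", False),  # series "a" is partially downloaded
    Episode("b", False),  # series "b" has nothing on disk
    Episode("c", True, "/anime/c/S01E01.mp4"),
]
```

Because completed episodes stay in the table with `is_downloaded=True`, the second query can tell a partially downloaded series from one that has nothing on disk.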
Mocking the Download Queue
When testing components that use the download queue:
# Mock repository for unit tests.
# DownloadItem and DownloadService are imported from the project code.
from typing import Dict, List

import pytest


class MockQueueRepository:
    """In-memory stand-in for the queue repository."""

    def __init__(self):
        self._items: Dict[str, DownloadItem] = {}

    async def save_item(self, item: DownloadItem) -> DownloadItem:
        self._items[item.id] = item
        return item

    async def get_all_items(self) -> List[DownloadItem]:
        return list(self._items.values())


# Use in fixtures
@pytest.fixture
def mock_queue_repository():
    return MockQueueRepository()


@pytest.fixture
def download_service(mock_anime_service, mock_queue_repository):
    return DownloadService(
        anime_service=mock_anime_service,
        queue_repository=mock_queue_repository,
        max_retries=3,
    )
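Before wiring the mock into larger tests, it can help to exercise it in isolation. The sketch below runs without pytest. The `DownloadItem` dataclass is a hypothetical two-field stand-in for the project's model, and the repository is repeated so the example is self-contained.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DownloadItem:
    """Hypothetical stand-in; the project's model has more fields."""
    id: str
    serie_id: str


class MockQueueRepository:
    def __init__(self) -> None:
        self._items: Dict[str, DownloadItem] = {}

    async def save_item(self, item: DownloadItem) -> DownloadItem:
        self._items[item.id] = item
        return item

    async def get_all_items(self) -> List[DownloadItem]:
        return list(self._items.values())


async def exercise_mock() -> List[str]:
    repo = MockQueueRepository()
    await repo.save_item(DownloadItem(id="1", serie_id="a"))
    await repo.save_item(DownloadItem(id="1", serie_id="a"))  # same id overwrites
    await repo.save_item(DownloadItem(id="2", serie_id="b"))
    return [item.id for item in await repo.get_all_items()]


stored_ids = asyncio.run(exercise_mock())
```

Note that the mock keys items by id only, so it does not reproduce the database's unique constraint on episode_id. Tests that depend on that constraint need a real database or a stricter mock.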
- Troubleshooting Development Issues
Async Context Managers for aiohttp
All aiohttp.ClientSession usages must be wrapped in async with:
# Correct: session is closed on exit, even if an exception is raised
async with TMDBClient(api_key="key") as client:
    result = await client.search_tv_show("Show")

# Wrong: session may leak if an exception occurs
client = TMDBClient(api_key="key")
result = await client.search_tv_show("Show")
await client.close()  # May not be called if an exception was raised earlier
Why:
- aiohttp.ClientSession holds TCP connections that must be explicitly closed
- If an exception occurs before close(), the session leaks
- The context manager guarantees __aexit__ runs even on exceptions
Services that use aiohttp:
- TMDBClient: has __aenter__/__aexit__, use async with
- ImageDownloader: has __aenter__/__aexit__, use async with
- NFOService: wraps both above, use async with
Verification:
- Missing context manager usage triggers a __del__ warning on garbage collection
- Integration tests verify no "Unclosed client session" errors in logs
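The guarantee can be demonstrated without aiohttp. In the sketch below, `FakeClient` is a hypothetical class whose `closed` flag stands in for the state of the underlying session. It is not the project's `TMDBClient`.

```python
import asyncio
from typing import List


class FakeClient:
    """Hypothetical client; `closed` mimics the session being released."""

    def __init__(self) -> None:
        self.closed = False

    async def __aenter__(self) -> "FakeClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.close()  # runs on normal exit and when an exception propagates

    async def close(self) -> None:
        self.closed = True

    async def search(self, name: str) -> str:
        raise RuntimeError(f"lookup failed for {name}")


async def demo() -> List[bool]:
    managed = FakeClient()
    try:
        async with managed as client:
            await client.search("Show")
    except RuntimeError:
        pass

    unmanaged = FakeClient()
    try:
        await unmanaged.search("Show")
        await unmanaged.close()  # never reached because search() raised
    except RuntimeError:
        pass

    return [managed.closed, unmanaged.closed]


closed_flags = asyncio.run(demo())
```

Both clients hit the same exception, but only the one used with `async with` ends up closed.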
Scheduler Persistence and Recovery
The scheduler uses APScheduler's in-memory job store. Jobs are reconstructed from config.json on every startup — no separate database is needed.
# Jobs are built from config on startup — no persistence DB required
scheduler = AsyncIOScheduler() # default MemoryJobStore
scheduler.add_job(..., replace_existing=True)
Startup misfire recovery: On start(), the scheduler checks system_settings.last_scan_timestamp in aniworld.db. If the last scan is overdue (more than 23 hours but less than 25 hours ago), an immediate rescan is triggered. This replaces APScheduler's built-in misfire handling, which required a separate SQLite database.
Grace period: If the server was down for more than 25 hours, no automatic recovery occurs to avoid surprise rescans after long downtime.
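The recovery window described above amounts to a two-sided comparison on the age of the last scan. The following is a sketch under stated assumptions: the function name, the constants, and the handling of a missing timestamp (treated here as "nothing to recover") are illustrative and not taken from the project.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

OVERDUE_AFTER = timedelta(hours=23)  # older than this: a run was missed
GIVE_UP_AFTER = timedelta(hours=25)  # older than this: skip automatic recovery


def needs_startup_rescan(last_scan: Optional[datetime], now: datetime) -> bool:
    """Return True when the last scan falls inside the 23h to 25h window."""
    if last_scan is None:
        # Assumption: with no recorded scan there is nothing to recover.
        return False
    age = now - last_scan
    return OVERDUE_AFTER < age < GIVE_UP_AFTER


now = datetime(2024, 1, 2, 4, 0, tzinfo=timezone.utc)
recent = needs_startup_rescan(now - timedelta(hours=10), now)
missed = needs_startup_rescan(now - timedelta(hours=24), now)
stale = needs_startup_rescan(now - timedelta(hours=30), now)
```

The upper bound is what implements the grace period: a very old timestamp deliberately does not trigger a rescan.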
Health endpoint: GET /health returns scheduler_next_run and scheduler_last_run for external monitors (Uptime Kuma, Prometheus, etc.).
If server is down too long: Manual trigger via POST /api/scheduler/trigger-rescan or wait for next scheduled run.
Database Session Management
get_async_session_factory() returns a new AsyncSession instance directly (not a factory). The function name is historical — callers receive the session immediately:
# Correct usage:
db = get_async_session_factory() # db IS the session
await db.execute(...)
await db.commit()
await db.close()
Do NOT call the result again with () — that tries to call an AsyncSession object, causing 'AsyncSession' object is not callable.
For context manager usage, prefer get_db_session() (auto-commits) or get_transactional_session() (manual commit).
Health Check Endpoints
The application provides health check endpoints for monitoring and container orchestration:
GET /health
Basic health check returning service status and startup health check results.
Response fields:
- status: "healthy", "degraded", or "unhealthy" based on startup checks
- timestamp: ISO timestamp of the check
- series_app_initialized: Whether the series app is loaded
- anime_directory_configured: Whether anime_directory is set
- scheduler_next_run / scheduler_last_run: Scheduler times
- checks: Detailed startup check results (ffmpeg, DNS, anime_directory)
GET /health/ready
Readiness check for container orchestrators (Kubernetes, Docker Swarm).
Response when ready:
{
  "status": "ready",
  "ready": true,
  "timestamp": "2024-01-01T00:00:00",
  "checks": {...}
}
Response when not ready (503):
{
  "status": "not_ready",
  "ready": false,
  "timestamp": "2024-01-01T00:00:00",
  "critical_failures": ["anime_directory: not configured"],
  "checks": {...}
}
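One way to produce the two response shapes above is to derive everything from the check results. This is a sketch, not the project's handler: the `build_readiness()` name, the per-check dict layout (`status` plus `message`), and the 200 status for the ready case are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Tuple


def build_readiness(checks: Dict[str, Dict[str, str]]) -> Tuple[int, Dict[str, Any]]:
    """Build (HTTP status, body) from check results.

    `checks` maps a check name to {"status": "ok" | "warning" | "error",
    "message": "..."}; only "error" results count as critical.
    """
    critical = [
        f"{name}: {result['message']}"
        for name, result in checks.items()
        if result["status"] == "error"
    ]
    body: Dict[str, Any] = {
        "status": "not_ready" if critical else "ready",
        "ready": not critical,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "checks": checks,
    }
    if critical:
        body["critical_failures"] = critical
    return (503 if critical else 200), body


code, body = build_readiness(
    {
        "ffmpeg": {"status": "warning", "message": "not found in PATH"},
        "anime_directory": {"status": "error", "message": "not configured"},
    }
)
```

Warnings do not affect readiness in this sketch, which matches the idea that only critical failures should take the instance out of rotation.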
GET /health/detailed
Comprehensive health check including database, filesystem, and system metrics.
Startup Health Checks
On application startup, the following checks are performed:
| Check | Failure Status | Impact |
|---|---|---|
| ffmpeg | warning | HLS downloads may fail |
| dns_aniworld | warning | Provider requests may fail |
| dns_tmdb | warning | TMDB API calls may fail |
| anime_directory | error | Download service disabled |
DNS checks are warnings because failures can be transient. anime_directory errors disable the download service to prevent failures.
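Checks like these can be written with the standard library alone. The sketch below is an illustration, not the project's implementation: the function names are hypothetical, and the mapping from results to an overall status (any error gives "unhealthy", any warning gives "degraded") is an assumption consistent with the status values listed for GET /health.

```python
import shutil
import socket
from typing import Dict


def check_ffmpeg() -> str:
    """Return "ok" if an ffmpeg executable is on PATH, else "warning"."""
    return "ok" if shutil.which("ffmpeg") else "warning"


def check_dns(host: str) -> str:
    """Return "ok" if the host resolves, else "warning" (may be transient)."""
    try:
        socket.getaddrinfo(host, 443)
        return "ok"
    except OSError:  # socket.gaierror is a subclass of OSError
        return "warning"


def overall_status(results: Dict[str, str]) -> str:
    """Collapse per-check results into one status string."""
    values = set(results.values())
    if "error" in values:
        return "unhealthy"
    if "warning" in values:
        return "degraded"
    return "healthy"


example = overall_status({"ffmpeg": "ok", "dns_tmdb": "warning", "anime_directory": "ok"})
```

`check_dns()` takes the hostname as a parameter so that no provider domain has to be hardcoded in the check itself.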
Troubleshooting Development Issues
Scheduler missed a run
- Server was down at scheduled time (03:00 UTC by default).
- On restart, the scheduler checks last_scan_timestamp; if the last scan was between 23 and 25 hours ago, it triggers immediately.
- If the server was down for more than 25 hours, the missed job is skipped to avoid surprise rescans.
- Trigger manually: POST /api/scheduler/trigger-rescan
- Monitor next run: GET /health → scheduler_next_run
Scheduler not firing (no events at scheduled time)
If the scheduler appears configured but never triggers:
- Check application logs for scheduler startup: grep "Scheduler service started" fastapi_app.log
  - If missing, the scheduler failed to start; check the log for earlier errors
  - If present, the scheduler started successfully
- Verify the job is registered: grep "Scheduler started with cron trigger" fastapi_app.log
- Verify APScheduler events in logs: grep "apscheduler.executors.default" fastapi_app.log
  - Running job = job triggered
  - executed successfully = job completed
  - No output = job never fired
- Test manual trigger: curl -X POST http://localhost:8000/api/scheduler/trigger-rescan -H "Authorization: Bearer <token>"
  - If manual trigger works but cron doesn't, the issue is APScheduler configuration
- Check next_run_time via health endpoint: curl http://localhost:8000/health | jq .scheduler_next_run
  - If null, the job is not scheduled
  - If set, the scheduler knows when to run next
- Check timezone handling:
  - APScheduler uses UTC internally
  - The schedule_time config (e.g., "03:00") is interpreted as UTC
  - If you expect local time, adjust the schedule_time accordingly
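Adjusting schedule_time for a local wall-clock time is a fixed-offset conversion. The helper below is a hypothetical convenience, not part of the project. It takes the UTC offset as a number of hours so that it works without a timezone database.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc_schedule(local_hhmm: str, utc_offset_hours: float) -> str:
    """Convert a local HH:MM to the UTC HH:MM to put into schedule_time."""
    hour, minute = map(int, local_hhmm.split(":"))
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time of day matters for the result.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=local_tz)
    return local.astimezone(timezone.utc).strftime("%H:%M")


summer_berlin = local_to_utc_schedule("03:00", 2)   # UTC+2
new_york = local_to_utc_schedule("03:00", -5)       # UTC-5
wraps_midnight = local_to_utc_schedule("01:30", 2)  # lands on the previous UTC day
```

A fixed offset ignores daylight saving changes, so a value computed this way drifts by an hour when the local offset changes.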
Startup health check failures
If /health returns unhealthy status:
- anime_directory error: Directory not configured or not writable
  - Check the ANIME_DIRECTORY environment variable
  - Verify the directory exists and permissions allow write access
  - Download service will not initialize until resolved
- ffmpeg warning: ffmpeg not found in PATH
  - HLS stream downloads will fail
  - Install ffmpeg: apt install ffmpeg or brew install ffmpeg
- DNS warnings: Domain resolution failed
  - Check network connectivity
  - DNS failures are often transient; warnings don't block startup
  - Retry later to verify: GET /health
Provider Failure Handling
Download providers (VOE, Doodstream, Vidmoly, Vidoza, SpeedFiles, Streamtape,
Luluvdo) regularly break: URLs expire, sites change their player markup, geo
blocks appear, and yt-dlp extractors lag behind upstream changes. The
AniworldLoader.download() flow is designed to fail fast and rotate.
Rotation order
- The episode page is scraped for the providers AniWorld actually advertises.
- Results are ordered by the preference in DEFAULT_PROVIDERS (provider_config.py); providers not listed run last.
- For each candidate the loader:
  - Calls _check_url_alive(): HEAD probe with GET fallback. Any 4xx response or connection error skips the provider immediately.
  - Resolves the redirect via _resolve_direct_link() to obtain a direct stream URL plus headers. Provider-specific extractors (e.g. VOE) are preferred; unknown providers fall back to the embed URL so yt-dlp can attempt extraction.
  - Tries _try_direct_stream(): straight requests.get(stream=True) when Content-Type is video/* or application/octet-stream. This avoids yt-dlp entirely for direct MP4 links.
  - Falls back to yt-dlp with the ffmpeg downloader for HLS streams.
- On any failure, temp files are cleaned and the loop moves to the next provider. When the chain is exhausted, the loader logs "All download providers failed for S{season}E{episode} ...; tried=[...]" to both the application log and logs/download_errors.log.
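The ordering and fail-fast loop can be sketched with the network steps injected as callables. Everything here is illustrative: `order_providers()`, `download_with_rotation()`, and the preference list in the usage lines are assumptions, and the real loader performs the probe, link resolution, and download itself.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def order_providers(advertised: Iterable[str], preferred: List[str]) -> List[str]:
    """Sort advertised providers by preference; unlisted providers run last."""
    rank = {name: index for index, name in enumerate(preferred)}
    # sorted() is stable, so unlisted providers keep their advertised order.
    return sorted(advertised, key=lambda name: rank.get(name, len(preferred)))


def download_with_rotation(
    providers: List[str],
    is_alive: Callable[[str], bool],
    attempt: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Try providers in order; return (winner or None, providers tried)."""
    tried: List[str] = []
    for name in providers:
        tried.append(name)
        if not is_alive(name):
            continue  # dead URL: skip without attempting a download
        try:
            if attempt(name):
                return name, tried
        except Exception:
            # A real loader would log the error and clean temp files here.
            pass
    return None, tried


# Illustrative preference order, not the project's DEFAULT_PROVIDERS.
ordered = order_providers(["Vidoza", "Unknown", "VOE"], ["VOE", "Doodstream", "Vidoza"])
```

Returning the list of tried providers alongside the result makes it straightforward to build the final "tried=[...]" log line when every provider fails.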
Do not hardcode provider URLs. Provider domains shift constantly (e.g.
Doodstream alternates between dood.li, dood.so, dood.la). Only the
referer hints in PROVIDER_HEADERS are persisted — discovery still happens
at runtime through AniWorld's redirect endpoint.
HLS Stream Handling
HLS (HTTP Live Streaming) manifests (.m3u8) require yt-dlp to use the
ffmpeg downloader with --hls-use-mpegts. Both providers configure this
automatically:
ydl_opts = {
    "downloader": "ffmpeg",   # Use ffmpeg instead of native
    "hls_use_mpegts": True,   # Write transport stream (.ts) segments
}
Why this matters: Without ffmpeg, yt-dlp logs:
"Live HLS streams are not supported by the native downloader"
Requirements:
- ffmpeg must be installed and in PATH (which ffmpeg)
- Install: apt install ffmpeg (Debian/Ubuntu) or brew install ffmpeg (macOS)
- Startup health check (see Health Check Endpoints) verifies ffmpeg presence
Trade-offs:
- HLS downloads are slower than direct MP4 (reassembly of .ts segments)
- Requires more disk space during download
- May need post-processing if .ts format is not desired
Detection: VOE provider extracts HLS URLs via HLS_PATTERN regex. Other
providers let yt-dlp auto-detect from URL/content-type.
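Both detection styles are simple to approximate. In the sketch below, the regular expression is an illustrative pattern and may differ from the project's `HLS_PATTERN`. `find_hls_url()` and `looks_like_hls()` are hypothetical helper names.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Illustrative pattern: an http(s) URL containing ".m3u8", ending at
# whitespace or a quote character.
M3U8_URL = re.compile(r"https?://[^\s'\"]+\.m3u8[^\s'\"]*")


def find_hls_url(page_source: str) -> Optional[str]:
    """Return the first .m3u8 URL embedded in a page, if any."""
    match = M3U8_URL.search(page_source)
    return match.group(0) if match else None


def looks_like_hls(url: str, content_type: str = "") -> bool:
    """Guess from the URL path or Content-Type whether a stream is HLS."""
    path = urlparse(url).path.lower()
    return path.endswith(".m3u8") or "mpegurl" in content_type.lower()


page = 'var source = "https://cdn.example/stream/master.m3u8?t=abc";'
found = find_hls_url(page)
```

Checking the parsed path rather than the raw URL keeps query strings such as `?t=abc` from hiding the `.m3u8` suffix.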
Updating yt-dlp
When extractors break (typical symptoms: every provider HEAD probe succeeds
but yt-dlp raises Unable to extract or HTTP Error 404):
- Check the upstream tracker first: https://github.com/yt-dlp/yt-dlp/issues
- Upgrade in the conda environment: conda run -n AniWorld pip install --upgrade yt-dlp
- Smoke-test against a known-good episode before pinning a new floor in requirements.txt (yt-dlp>=YYYY.MM.DD).
- Re-run the provider test suite: conda run -n AniWorld python -m pytest tests/unit/test_aniworld_provider.py -v
- If a specific extractor is removed upstream, drop the provider from DEFAULT_PROVIDERS rather than patching yt-dlp in tree.
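When deciding on a new floor, it can be handy to compare date-style version strings programmatically. The helpers below are hypothetical and assume versions made of dot-separated numeric parts, as in the YYYY.MM.DD floor above.

```python
from typing import Tuple


def parse_ytdlp_version(version: str) -> Tuple[int, ...]:
    """Turn "2024.08.06" into (2024, 8, 6); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def meets_floor(installed: str, floor: str) -> bool:
    """Return True if the installed version is at or above the pinned floor."""
    return parse_ytdlp_version(installed) >= parse_ytdlp_version(floor)


newer = meets_floor("2024.08.06", "2024.07.01")
older = meets_floor("2023.12.30", "2024.01.01")
```

The installed version string can be read with `importlib.metadata.version("yt-dlp")` inside the conda environment and passed to `meets_floor()`.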
User Notification on Total Failure
SeriesApp.download_episode() already emits a download_status="failed"
WebSocket event when loader.download() returns False. Operators should
forward this to notification_service.notify_download_failed() so users see
a HIGH-priority alert. The loader keeps the failure detail in
logs/download_errors.log for post-mortem.
Series Storage
Overview
Series metadata is now stored in the database (SQLAlchemy ORM).
Legacy files (key and data per folder) are deprecated but preserved
for backward compatibility.
Architecture
- Database: Single source of truth for all series metadata
- In-Memory Cache: SeriesApp maintains a cache for performance
- Filesystem: Only used for episode files themselves, not metadata
Migration
The first startup after an upgrade automatically imports any legacy series files into the database.
Legacy Files
- key file: Contains series provider key (deprecated)
- data file: Contains Serie JSON object (deprecated)
Both are safe to delete after migration; not needed for normal operation.
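To check which folders still carry legacy files before deleting them, a small scan along these lines may help. This is not the project's importer: `find_legacy_series()` is a hypothetical helper, and it assumes that the key file holds plain text and the data file holds JSON, as described above.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def find_legacy_series(anime_dir: Path) -> List[Dict[str, Any]]:
    """List series folders that still contain a legacy key file."""
    found: List[Dict[str, Any]] = []
    for folder in sorted(p for p in anime_dir.iterdir() if p.is_dir()):
        key_file = folder / "key"
        data_file = folder / "data"
        if not key_file.is_file():
            continue  # no legacy metadata in this folder
        entry: Dict[str, Any] = {
            "folder": folder.name,
            "key": key_file.read_text(encoding="utf-8").strip(),
        }
        if data_file.is_file():
            entry["data"] = json.loads(data_file.read_text(encoding="utf-8"))
        found.append(entry)
    return found
```

Running it against the anime directory before and after the first post-upgrade startup shows which folders are candidates for cleanup. The scan itself never modifies or deletes anything.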