Scheduler used a separate SQLite file (scheduler.db) only to persist one cron job. This was originally required because APScheduler's SQLAlchemyJobStore is sync-only, creating an async/sync driver conflict when accessing the same file. The job is rebuilt from config.json on every startup regardless (replace_existing=True), so the persisted state only served misfire detection.

Moved misfire detection into the app layer by querying system_settings.last_scan_timestamp on startup: if the last scan is >23h but <25h ago, an immediate rescan is triggered.

Change summary:
- Remove SQLAlchemyJobStore; use default MemoryJobStore instead
- Add _check_missed_run() that reads last_scan_timestamp from aniworld.db
- Update docs/DEVELOPMENT.md scheduler troubleshooting section
- Update the scheduler unit test that verified SQLAlchemyJobStore
Development Guide
Document Purpose
This document provides guidance for developers working on the Aniworld project.
What This Document Contains
- Prerequisites: Required software and tools
- Environment Setup: Step-by-step local development setup
- Project Structure: Source code organization explanation
- Development Workflow: Branch strategy, commit conventions
- Coding Standards: Style guide, linting, formatting
- Running the Application: Development server, CLI usage
- Debugging Tips: Common debugging approaches
- IDE Configuration: VS Code settings, recommended extensions
- Contributing Guidelines: How to submit changes
- Code Review Process: Review checklist and expectations
What This Document Does NOT Contain
- Production deployment (see DEPLOYMENT.md)
- API reference (see API.md)
- Architecture decisions (see ARCHITECTURE.md)
- Test writing guides (see TESTING.md)
- Security guidelines (see SECURITY.md)
Target Audience
- New Developers joining the project
- Contributors (internal and external)
- Anyone setting up a development environment
Sections to Document
- Prerequisites
  - Python version
  - Conda environment
  - Node.js (if applicable)
  - Git
- Getting Started
  - Clone repository
  - Setup conda environment
  - Install dependencies
  - Configuration setup
- Project Structure Overview
- Development Server
  - Starting FastAPI server
  - Hot reload configuration
  - Debug mode
- CLI Development
- Code Style
  - PEP 8 compliance
  - Type hints requirements
  - Docstring format
  - Import organization
- Git Workflow
  - Branch naming
  - Commit message format
  - Pull request process
- Common Development Tasks
Adding Queue Deduplication
The download queue prevents duplicate entries at two levels:
In-Memory Deduplication (src/server/services/download_service.py):
- `_pending_by_episode` dict tracks pending episodes: key = `(serie_id, season, episode)`
- `_add_to_pending_queue()` updates the dict when adding items
- `add_to_queue()` checks this dict before adding episodes (includes batch-local dedup)
- `_remove_from_pending_queue()` cleans up the dict when items are removed
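As a rough illustration of the in-memory level, the sketch below keeps a dict keyed by `(serie_id, season, episode)` and skips any episode that is already pending, including repeats inside the same batch. `PendingQueue`, its method signatures, and the `item-N` ids are hypothetical stand-ins, not the actual `DownloadService` API.

```python
from typing import Dict, List, Tuple

EpisodeKey = Tuple[str, int, int]  # (serie_id, season, episode)


class PendingQueue:
    """Hypothetical stand-in for the pending-queue bookkeeping."""

    def __init__(self) -> None:
        self._pending_by_episode: Dict[EpisodeKey, str] = {}
        self._next_id = 0

    def add_to_queue(self, serie_id: str, season: int, episodes: List[int]) -> List[str]:
        """Queue episodes and return the ids of the items actually created."""
        created: List[str] = []
        for episode in episodes:
            key = (serie_id, season, episode)
            if key in self._pending_by_episode:
                # Already pending, or repeated earlier in this batch.
                continue
            self._next_id += 1
            item_id = f"item-{self._next_id}"
            self._pending_by_episode[key] = item_id
            created.append(item_id)
        return created

    def remove(self, serie_id: str, season: int, episode: int) -> None:
        """Forget an episode so it can be queued again later."""
        self._pending_by_episode.pop((serie_id, season, episode), None)
```

Because the key is registered as soon as an item is created, a duplicate inside one request is caught by the same lookup that catches duplicates across requests.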
Database Constraint (src/server/models.py):
- `DownloadQueueItem` has a unique index on `episode_id` via `__table_args__`
- Prevents duplicate queue entries at the database level
- Unique constraint: `Index("ix_download_queue_episode_pending", "episode_id", unique=True)`
Scheduler Cooldown (src/server/services/scheduler_service.py):
- `_last_auto_download_time` tracks when auto-download last ran
- 5-minute cooldown prevents rapid re-triggers
- Checked at start of `_auto_download_missing()`
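The cooldown check can be pictured roughly as follows. This is a minimal sketch, assuming the timestamp is compared against a monotonic clock; `AutoDownloadGate` and `try_acquire()` are hypothetical names, and the real check sits inside `_auto_download_missing()`.

```python
import time
from typing import Callable, Optional

COOLDOWN_SECONDS = 5 * 60  # 5-minute cooldown, as described above


class AutoDownloadGate:
    """Hypothetical helper that mirrors the cooldown bookkeeping."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_auto_download_time: Optional[float] = None

    def try_acquire(self) -> bool:
        """Return True and record the time if the cooldown has elapsed."""
        now = self._clock()
        last = self._last_auto_download_time
        if last is not None and now - last < COOLDOWN_SECONDS:
            return False  # still cooling down, skip this trigger
        self._last_auto_download_time = now
        return True
```

Injecting the clock keeps the behavior testable without sleeping for five minutes.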
Episode Lifecycle
Episodes transition through states stored in the episodes table:
| State | `is_downloaded` | `file_path` | Description |
|---|---|---|---|
| Missing | False | NULL | Episode not yet downloaded |
| Downloaded | True | Set | Episode exists on disk |
State Transitions:
- Missing → Downloaded: When download completes, `_remove_episode_from_missing_list()` calls `EpisodeService.mark_downloaded()` to set `is_downloaded=True` and populate `file_path`. The episode record is NOT deleted.
Query Implications:
- `get_series_with_missing_episodes()`: Filters for `is_downloaded=False` to find series with undownloaded episodes
- `get_series_with_no_episodes()`: Finds series with `is_downloaded=False` episodes but NO `is_downloaded=True` episodes (completely unwatched series)
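The difference between the two queries is easier to see on plain data. The sketch below reproduces the filtering logic over an in-memory list; the `Episode` dataclass and both function names are illustrative only, and the real implementations query the episodes table through the ORM.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Episode:
    """Illustrative row shape, not the project's ORM model."""

    series_key: str
    is_downloaded: bool
    file_path: Optional[str] = None


def series_with_missing_episodes(episodes: List[Episode]) -> Set[str]:
    """Series that have at least one episode with is_downloaded=False."""
    return {e.series_key for e in episodes if not e.is_downloaded}


def series_with_no_episodes(episodes: List[Episode]) -> Set[str]:
    """Series with missing episodes and no downloaded episode at all."""
    downloaded = {e.series_key for e in episodes if e.is_downloaded}
    return series_with_missing_episodes(episodes) - downloaded
```

A series with a mix of downloaded and missing episodes shows up in the first result only; a series where nothing has been downloaded shows up in both.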
Mocking the Download Queue
When testing components that use the download queue:
```python
from typing import Dict, List

import pytest

# DownloadItem and DownloadService are project classes; import them from
# wherever they live in your checkout before using this snippet.


# Mock repository for unit tests
class MockQueueRepository:
    def __init__(self):
        self._items: Dict[str, DownloadItem] = {}

    async def save_item(self, item: DownloadItem) -> DownloadItem:
        self._items[item.id] = item
        return item

    async def get_all_items(self) -> List[DownloadItem]:
        return list(self._items.values())


# Use in fixture
@pytest.fixture
def mock_queue_repository():
    return MockQueueRepository()


@pytest.fixture
def download_service(mock_anime_service, mock_queue_repository):
    return DownloadService(
        anime_service=mock_anime_service,
        queue_repository=mock_queue_repository,
        max_retries=3,
    )
```
- Troubleshooting Development Issues
Async Context Managers for aiohttp
All aiohttp.ClientSession usages must be wrapped in async with:
```python
# Correct: session properly closed on exit
async with TMDBClient(api_key="key") as client:
    result = await client.search_tv_show("Show")

# Wrong: session may leak if exception occurs
client = TMDBClient(api_key="key")
result = await client.search_tv_show("Show")
await client.close()  # May not be called if exception raised earlier
```
Why:
- `aiohttp.ClientSession` holds TCP connections that must be explicitly closed
- If an exception occurs before `close()`, the session leaks
- The context manager guarantees `__aexit__` runs even on exceptions
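To see the guarantee in isolation, the sketch below uses a fake session in place of `aiohttp.ClientSession` so it runs without network access. `FakeSession` and `ExampleClient` are hypothetical; they only mimic the `__aenter__`/`__aexit__` pattern that the real clients are described as following.

```python
import asyncio


class FakeSession:
    """Stand-in for aiohttp.ClientSession (assumption: only close() matters here)."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class ExampleClient:
    """Hypothetical client using the same context-manager pattern."""

    def __init__(self) -> None:
        self.session = FakeSession()

    async def __aenter__(self) -> "ExampleClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and when the body raises.
        await self.session.close()

    async def search(self, query: str) -> str:
        raise RuntimeError(f"lookup failed for {query!r}")


async def demo() -> bool:
    client = ExampleClient()
    try:
        async with client:
            await client.search("Show")  # raises inside the block
    except RuntimeError:
        pass
    return client.session.closed
```

Running `asyncio.run(demo())` reports that the session was closed even though the body raised, which is the property the manual `close()` call cannot guarantee.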
Services that use aiohttp:
- `TMDBClient`: has `__aenter__`/`__aexit__`, use `async with`
- `ImageDownloader`: has `__aenter__`/`__aexit__`, use `async with`
- `NFOService`: wraps both above, use `async with`
Verification:
- Missing context manager usage triggers `__del__` warning on garbage collection
- Integration tests verify no "Unclosed client session" errors in logs
Scheduler Persistence and Recovery
The scheduler uses APScheduler's in-memory job store. Jobs are reconstructed from config.json on every startup — no separate database is needed.
```python
from apscheduler.schedulers.asyncio import AsyncIOScheduler

# Jobs are built from config on startup; no persistence DB required
scheduler = AsyncIOScheduler()  # default MemoryJobStore
scheduler.add_job(..., replace_existing=True)
```
Startup misfire recovery: On start(), the scheduler checks system_settings.last_scan_timestamp in aniworld.db. If the last scan is overdue (>23h but <25h ago), an immediate rescan is triggered. This replaces APScheduler's built-in misfire handling which required a separate SQLite database.
Grace period: If the server was down for more than 25 hours, no automatic recovery occurs to avoid surprise rescans after long downtime.
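The recovery window described above boils down to one comparison. The function below is a minimal sketch of that rule only; `should_rescan_on_startup` is a hypothetical name, and how `_check_missed_run()` actually reads `last_scan_timestamp` or treats a missing value is not shown in this guide (returning False for `None` is an assumption).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_AGE = timedelta(hours=23)
MAX_AGE = timedelta(hours=25)


def should_rescan_on_startup(last_scan: Optional[datetime], now: datetime) -> bool:
    """Return True when the last scan is more than 23h but less than 25h old."""
    if last_scan is None:
        # Assumption: no recorded scan means there is nothing to recover.
        return False
    age = now - last_scan
    return MIN_AGE < age < MAX_AGE
```

With a daily schedule, a scan that is 24 hours old means exactly one run was missed; anything older than 25 hours falls outside the window and waits for the next scheduled run or a manual trigger.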
Health endpoint: GET /health returns scheduler_next_run and scheduler_last_run for external monitors (Uptime Kuma, Prometheus, etc.).
If server is down too long: Manual trigger via POST /api/scheduler/trigger-rescan or wait for next scheduled run.
Database Session Management
get_async_session_factory() returns a new AsyncSession instance directly (not a factory). The function name is historical — callers receive the session immediately:
```python
# Correct usage:
db = get_async_session_factory()  # db IS the session
await db.execute(...)
await db.commit()
await db.close()
```
Do NOT call the result again with () — that tries to call an AsyncSession object, causing 'AsyncSession' object is not callable.
For context manager usage, prefer get_db_session() (auto-commits) or get_transactional_session() (manual commit).
Health Check Endpoints
The application provides health check endpoints for monitoring and container orchestration:
GET /health
Basic health check returning service status and startup health check results.
Response fields:
- `status`: "healthy", "degraded", or "unhealthy" based on startup checks
- `timestamp`: ISO timestamp of the check
- `series_app_initialized`: Whether the series app is loaded
- `anime_directory_configured`: Whether anime_directory is set
- `scheduler_next_run` / `scheduler_last_run`: Scheduler times
- `checks`: Detailed startup check results (ffmpeg, DNS, anime_directory)
GET /health/ready
Readiness check for container orchestrators (Kubernetes, Docker Swarm).
Response when ready:
```json
{
  "status": "ready",
  "ready": true,
  "timestamp": "2024-01-01T00:00:00",
  "checks": {...}
}
```
Response when not ready (503):
```json
{
  "status": "not_ready",
  "ready": false,
  "timestamp": "2024-01-01T00:00:00",
  "critical_failures": ["anime_directory: not configured"],
  "checks": {...}
}
```
GET /health/detailed
Comprehensive health check including database, filesystem, and system metrics.
Startup Health Checks
On application startup, the following checks are performed:
| Check | Failure Status | Impact |
|---|---|---|
| `ffmpeg` | warning | HLS downloads may fail |
| `dns_aniworld` | warning | Provider requests may fail |
| `dns_tmdb` | warning | TMDB API calls may fail |
| `anime_directory` | error | Download service disabled |
DNS checks are warnings because failures can be transient. anime_directory errors disable the download service to prevent failures.
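One plausible way to fold these per-check results into the overall `status` field is sketched below. The mapping (any error gives "unhealthy", any warning gives "degraded") and the "ok" value for passing checks are assumptions made for illustration; the guide does not spell out the exact rule, and `overall_status` is a hypothetical helper.

```python
from typing import Dict


def overall_status(checks: Dict[str, str]) -> str:
    """Combine check results ("ok", "warning", "error") into one status string."""
    results = set(checks.values())
    if "error" in results:
        return "unhealthy"
    if "warning" in results:
        return "degraded"
    return "healthy"
```

For example, a failed DNS lookup on its own would yield "degraded" under this mapping, while an unconfigured `anime_directory` would yield "unhealthy".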
Troubleshooting Development Issues
Scheduler missed a run
- Server was down at scheduled time (03:00 UTC by default).
- On restart, the scheduler checks `last_scan_timestamp`; if the last scan is between 23 and 25 hours old, it triggers a rescan immediately.
- If the server was down for more than 25 hours, the missed job is skipped to avoid surprise rescans.
- Trigger manually: `POST /api/scheduler/trigger-rescan`
- Monitor next run: `GET /health` → `scheduler_next_run`
Scheduler not firing (no events at scheduled time)
If the scheduler appears configured but never triggers:
- Check application logs for scheduler startup: `grep "Scheduler service started" fastapi_app.log`
  - If missing, the scheduler failed to start; check for errors above this line
  - If present, scheduler started successfully
- Verify the job is registered: `grep "Scheduler started with cron trigger" fastapi_app.log`
- Verify APScheduler events in logs: `grep "apscheduler.executors.default" fastapi_app.log`
  - `Running job` = job triggered
  - `executed successfully` = job completed
  - No output = job never fired
- Test manual trigger: `curl -X POST http://localhost:8000/api/scheduler/trigger-rescan -H "Authorization: Bearer <token>"`
  - If manual trigger works but cron doesn't, the issue is APScheduler configuration
- Check next_run_time via health endpoint: `curl http://localhost:8000/health | jq .scheduler_next_run`
  - If `null`, the job is not scheduled
  - If set, the scheduler knows when to run next
- Check timezone handling:
  - APScheduler uses UTC internally
  - The schedule_time config (e.g., "03:00") is interpreted as UTC
  - If you expect local time, adjust the schedule_time accordingly
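For the timezone adjustment, a small conversion helper can save some mental arithmetic. This is a sketch under the assumption of a fixed UTC offset, so it ignores daylight-saving changes; `local_to_utc_schedule` is a hypothetical helper and not part of the project.

```python
from datetime import date, datetime, time, timedelta, timezone


def local_to_utc_schedule(local_hhmm: str, utc_offset_hours: float, on: date) -> str:
    """Convert a local "HH:MM" wall-clock time to the equivalent UTC "HH:MM"."""
    hour, minute = (int(part) for part in local_hhmm.split(":"))
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.combine(on, time(hour, minute), tzinfo=local_tz)
    return local_dt.astimezone(timezone.utc).strftime("%H:%M")
```

If the rescan should run at 05:00 local time in a UTC+2 zone, `local_to_utc_schedule("05:00", 2, date(2024, 7, 1))` gives `"03:00"`, which is the value to put in schedule_time.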
Startup health check failures
If /health returns unhealthy status:
- anime_directory error: Directory not configured or not writable
  - Check `ANIME_DIRECTORY` environment variable
  - Verify directory exists and permissions allow write access
  - Download service will not initialize until resolved
- ffmpeg warning: ffmpeg not found in PATH
  - HLS stream downloads will fail
  - Install ffmpeg: `apt install ffmpeg` or `brew install ffmpeg`
- DNS warnings: Domain resolution failed
  - Check network connectivity
  - DNS failures are transient; warnings don't block startup
  - Retry later to verify: `GET /health`
Provider Failure Handling
Download providers (VOE, Doodstream, Vidmoly, Vidoza, SpeedFiles, Streamtape,
Luluvdo) regularly break: URLs expire, sites change their player markup, geo
blocks appear, and yt-dlp extractors lag behind upstream changes. The
AniworldLoader.download() flow is designed to fail fast and rotate.
Rotation order
- The episode page is scraped for the providers AniWorld actually advertises.
- Results are ordered by the preference in `DEFAULT_PROVIDERS` (`provider_config.py`); providers not listed run last.
- For each candidate the loader:
  - Calls `_check_url_alive()`: HEAD probe with GET fallback. Any 4xx response or connection error skips the provider immediately.
  - Resolves the redirect via `_resolve_direct_link()` to obtain a direct stream URL plus headers. Provider-specific extractors (e.g. `VOE`) are preferred; unknown providers fall back to the embed URL so `yt-dlp` can attempt extraction.
  - Tries `_try_direct_stream()`: straight `requests.get(stream=True)` when `Content-Type` is `video/*` or `application/octet-stream`. This avoids `yt-dlp` entirely for direct MP4 links.
  - Falls back to `yt-dlp` with the ffmpeg downloader for HLS streams.
- On any failure, temp files are cleaned and the loop moves to the next provider. When the chain is exhausted, the loader logs `All download providers failed for S{season}E{episode} ...; tried=[...]` to both the application log and `logs/download_errors.log`.
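The ordering and fail-over behavior can be sketched without any network code. In the example below, `PREFERRED_ORDER`, `order_providers`, and `download_with_rotation` are hypothetical names, the preference list is illustrative rather than the real `DEFAULT_PROVIDERS` content, and the `attempt` callback stands in for the probe, resolve, and download steps of a single provider.

```python
from typing import Callable, List, Sequence, Tuple

PREFERRED_ORDER = ("VOE", "Vidmoly", "Doodstream")  # illustrative order only


def order_providers(advertised: Sequence[str], preferred: Sequence[str]) -> List[str]:
    """Preferred providers first, in preference order; unlisted ones last."""
    rank = {name: index for index, name in enumerate(preferred)}
    # sorted() is stable, so unlisted providers keep their advertised order.
    return sorted(advertised, key=lambda name: rank.get(name, len(preferred)))


def download_with_rotation(
    advertised: Sequence[str],
    attempt: Callable[[str], bool],
    preferred: Sequence[str] = PREFERRED_ORDER,
) -> Tuple[bool, List[str]]:
    """Try each provider in order; return (success, providers tried)."""
    tried: List[str] = []
    for provider in order_providers(advertised, preferred):
        tried.append(provider)
        try:
            if attempt(provider):
                return True, tried
        except Exception:
            # Any error counts as a failed provider; rotate to the next one.
            continue
    return False, tried
```

Returning the `tried` list alongside the result makes it straightforward to build the `tried=[...]` part of the final failure log line.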
Do not hardcode provider URLs. Provider domains shift constantly (e.g.
Doodstream alternates between dood.li, dood.so, dood.la). Only the
referer hints in PROVIDER_HEADERS are persisted — discovery still happens
at runtime through AniWorld's redirect endpoint.
HLS Stream Handling
HLS (HTTP Live Streaming) manifests (.m3u8) require yt-dlp to use the
ffmpeg downloader with --hls-use-mpegts. Both providers configure this
automatically:
```python
ydl_opts = {
    "downloader": "ffmpeg",   # Use ffmpeg instead of native
    "hls_use_mpegts": True,   # Write transport stream (.ts) segments
}
```
Why this matters: Without ffmpeg, yt-dlp logs:
"Live HLS streams are not supported by the native downloader"
Requirements:
- ffmpeg must be installed and in PATH (`which ffmpeg`)
- Install: `apt install ffmpeg` (Debian/Ubuntu) or `brew install ffmpeg` (macOS)
- Startup health check (see Health Check Endpoints) verifies ffmpeg presence
Trade-offs:
- HLS downloads are slower than direct MP4 (reassembly of .ts segments)
- Requires more disk space during download
- May need post-processing if .ts format is not desired
Detection: VOE provider extracts HLS URLs via HLS_PATTERN regex. Other
providers let yt-dlp auto-detect from URL/content-type.
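The project's `HLS_PATTERN` regex is not reproduced in this guide. As a simplified illustration of URL-based detection, the hypothetical helper below only checks whether the URL path ends in `.m3u8`; it does not inspect content types or page markup the way a real extractor might.

```python
from urllib.parse import urlparse


def looks_like_hls(url: str) -> bool:
    """Return True when the URL path ends in .m3u8 (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".m3u8")
```

Parsing the URL first means query strings such as access tokens do not hide the manifest extension.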
Updating yt-dlp
When extractors break (typical symptoms: every provider HEAD probe succeeds
but yt-dlp raises Unable to extract or HTTP Error 404):
- Check the upstream tracker first: https://github.com/yt-dlp/yt-dlp/issues
- Upgrade in the conda environment: `conda run -n AniWorld pip install --upgrade yt-dlp`
- Smoke-test against a known-good episode before pinning a new floor in `requirements.txt` (`yt-dlp>=YYYY.MM.DD`).
- Re-run the provider test suite: `conda run -n AniWorld python -m pytest tests/unit/test_aniworld_provider.py -v`
- If a specific extractor is removed upstream, drop the provider from `DEFAULT_PROVIDERS` rather than patching `yt-dlp` in tree.
User Notification on Total Failure
SeriesApp.download_episode() already emits a download_status="failed"
WebSocket event when loader.download() returns False. Operators should
forward this to notification_service.notify_download_failed() so users see
a HIGH-priority alert. The loader keeps the failure detail in
logs/download_errors.log for post-mortem.
Series Storage
Overview
Series metadata is now stored in the database (SQLAlchemy ORM).
Legacy files (key and data per folder) are deprecated but preserved
for backward compatibility.
Architecture
- Database: Single source of truth for all series metadata
- In-Memory Cache: SeriesApp maintains a cache for performance
- Filesystem: Only used for episode files themselves, not metadata
Migration
First startup after upgrade automatically imports any legacy series files into the database.
Legacy Files
- `key` file: Contains series provider key (deprecated)
- `data` file: Contains Serie JSON object (deprecated)
Both are safe to delete after migration; not needed for normal operation.