Aniworld/docs/infrastructure.md
Lukas 684337fd0c Add data file to database sync functionality
- Add get_all_series_from_data_files() to SeriesApp
- Sync series from data files to DB on startup
- Add unit tests for new SeriesApp method
- Add integration tests for sync functionality
- Update documentation
2025-12-13 09:32:57 +01:00

Aniworld Web Application Infrastructure

conda activate AniWorld

Project Structure

src/
├── core/                  # Core application logic
│   ├── SeriesApp.py       # Main application class
│   ├── SerieScanner.py    # Directory scanner
│   ├── entities/          # Domain entities (series.py, SerieList.py)
│   ├── interfaces/        # Abstract interfaces (providers.py, callbacks.py)
│   ├── providers/         # Content providers (aniworld, streaming)
│   └── exceptions/        # Custom exceptions
├── server/                # FastAPI web application
│   ├── fastapi_app.py     # Main FastAPI application
│   ├── controllers/       # Route controllers (health, page, error)
│   ├── api/               # API routes (auth, config, anime, download, websocket)
│   ├── models/            # Pydantic models
│   ├── services/          # Business logic services
│   ├── database/          # SQLAlchemy ORM layer
│   ├── utils/             # Utilities (dependencies, templates, security)
│   └── web/               # Frontend (templates, static assets)
├── cli/                   # CLI application
data/                      # Config, database, queue state
logs/                      # Application logs
tests/                     # Test suites

Technology Stack

| Layer     | Technology                                         |
|-----------|----------------------------------------------------|
| Backend   | FastAPI, Uvicorn, SQLAlchemy, SQLite, Pydantic     |
| Frontend  | HTML5, CSS3, Vanilla JS, Bootstrap 5, HTMX         |
| Security  | JWT (python-jose), bcrypt (passlib)                |
| Real-time | Native WebSocket                                   |

Series Identifier Convention

Throughout the codebase, three identifiers are used for anime series:

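| Identifier | Type            | Purpose                                                  | Example                  |
|------------|-----------------|----------------------------------------------------------|--------------------------|
| key        | Unique, Indexed | PRIMARY - All lookups, API operations, WebSocket events  | "attack-on-titan"        |
| folder     | String          | Display/filesystem metadata only (never for lookups)     | "Attack on Titan (2013)" |
| id         | Primary Key     | Internal database key for relationships                  | 1, 42                    |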

Key Format Requirements

  • Lowercase only: No uppercase letters allowed
  • URL-safe: Only alphanumeric characters and hyphens
  • Hyphen-separated: Words separated by single hyphens
  • No leading/trailing hyphens: Must start and end with alphanumeric
  • No consecutive hyphens: attack--titan is invalid

Valid examples: "attack-on-titan", "one-piece", "86-eighty-six", "re-zero"

Invalid examples: "Attack On Titan", "attack_on_titan", "attack on titan"
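The format rules above can be expressed as a single regular expression. The following is a minimal sketch, not the project's validator: the names `KEY_PATTERN` and `is_valid_series_key` are hypothetical, and the pattern assumes ASCII letters and digits only.

```python
import re

# Hypothetical pattern: one or more lowercase alphanumeric runs,
# joined by single hyphens. This rules out uppercase letters, spaces,
# underscores, leading/trailing hyphens, and consecutive hyphens.
KEY_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_series_key(key: str) -> bool:
    """Return True if `key` satisfies the key format rules listed above."""
    return KEY_PATTERN.fullmatch(key) is not None


for candidate in ["attack-on-titan", "86-eighty-six", "attack--titan", "Attack On Titan"]:
    print(candidate, is_valid_series_key(candidate))
```

Using `fullmatch` rather than `match` avoids accepting a key that merely starts with a valid prefix.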

Notes

  • Backward Compatibility: API endpoints accepting anime_id will check key first, then fall back to folder lookup
  • New Code: Always use key for identification; folder is metadata only
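The key-first, folder-fallback behaviour described in the notes might look roughly like the sketch below. It uses an in-memory list in place of the database, and `SERIES` and `resolve_series` are hypothetical names chosen for illustration; the actual endpoint logic may differ.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the series table.
SERIES = [
    {"id": 1, "key": "attack-on-titan", "folder": "Attack on Titan (2013)"},
    {"id": 42, "key": "one-piece", "folder": "One Piece (1999)"},
]


def resolve_series(anime_id: str) -> Optional[dict]:
    """Look up by key first; fall back to folder only for legacy callers."""
    by_key = next((s for s in SERIES if s["key"] == anime_id), None)
    if by_key is not None:
        return by_key
    # Backward-compatibility path: the caller passed a folder name.
    return next((s for s in SERIES if s["folder"] == anime_id), None)


print(resolve_series("one-piece"))
print(resolve_series("Attack on Titan (2013)"))
```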

API Endpoints

Authentication (/api/auth)

  • POST /login - Master password authentication (returns JWT)
  • POST /logout - Invalidate session
  • GET /status - Check authentication status

Configuration (/api/config)

  • GET / - Get configuration
  • PUT / - Update configuration
  • POST /validate - Validate without applying
  • GET /backups - List backups
  • POST /backups/{name}/restore - Restore backup

Anime (/api/anime)

  • GET / - List anime with missing episodes (returns key as identifier)
  • GET /{anime_id} - Get anime details (accepts key or folder for backward compatibility)
  • POST /search - Search for anime (returns key as identifier)
  • POST /add - Add new series (extracts key from link URL)
  • POST /rescan - Trigger library rescan

Response Models:

  • AnimeSummary: key (primary identifier), name, site, folder (metadata), missing_episodes, link
  • AnimeDetail: key (primary identifier), title, folder (metadata), episodes, description

Download Queue (/api/queue)

  • GET /status - Queue status and statistics
  • POST /add - Add episodes to queue
  • DELETE /{item_id} - Remove item
  • POST /start | /stop | /pause | /resume - Queue control
  • POST /retry - Retry failed downloads
  • DELETE /completed - Clear completed items

Request Models:

  • DownloadRequest: serie_id (key, primary identifier), serie_folder (filesystem path), serie_name (display), episodes, priority

Response Models:

  • DownloadItem: id, serie_id (key), serie_folder (metadata), serie_name, episode, status, progress
  • QueueStatus: is_running, is_paused, active_downloads, pending_queue, completed_downloads, failed_downloads

WebSocket (/ws/connect)

Real-time updates for downloads, scans, and queue operations.

Rooms: downloads, download_progress, scan_progress

Message Types: download_progress, download_complete, download_failed, queue_status, scan_progress, scan_complete, scan_failed

Series Identifier in Messages: All series-related WebSocket events include key as the primary identifier in their data payload:

{
    "type": "download_progress",
    "timestamp": "2025-10-17T10:30:00.000Z",
    "data": {
        "download_id": "abc123",
        "key": "attack-on-titan",
        "folder": "Attack on Titan (2013)",
        "percent": 45.2,
        "speed_mbps": 2.5,
        "eta_seconds": 180
    }
}
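A Python consumer could read the payload above as in the following sketch. `ProgressEvent` and `parse_progress` are hypothetical names, and only the fields shown in the sample message are assumed to exist.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressEvent:
    key: str                # primary identifier
    folder: Optional[str]   # metadata only
    percent: float


def parse_progress(raw: str) -> Optional[ProgressEvent]:
    """Parse a raw WebSocket message; return None for other message types."""
    message = json.loads(raw)
    if message.get("type") != "download_progress":
        return None
    data = message["data"]
    return ProgressEvent(
        key=data["key"],
        folder=data.get("folder"),
        percent=float(data["percent"]),
    )


raw = json.dumps({
    "type": "download_progress",
    "timestamp": "2025-10-17T10:30:00.000Z",
    "data": {"download_id": "abc123", "key": "attack-on-titan",
             "folder": "Attack on Titan (2013)", "percent": 45.2},
})
event = parse_progress(raw)
print(event)
```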

Database Models

| Model             | Purpose                                   |
|-------------------|-------------------------------------------|
| AnimeSeries       | Series metadata (key, name, folder, etc.) |
| Episode           | Episodes linked to series                 |
| DownloadQueueItem | Queue items with status and progress      |
| UserSession       | JWT sessions with expiry                  |

Mixins: TimestampMixin (created_at, updated_at), SoftDeleteMixin

AnimeSeries Identifier Fields

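| Field  | Type            | Purpose                                           |
|--------|-----------------|---------------------------------------------------|
| id     | Primary Key     | Internal database key for relationships           |
| key    | Unique, Indexed | PRIMARY IDENTIFIER for all lookups                |
| folder | String          | Filesystem metadata only (not for identification) |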

Database Service Methods:

  • AnimeSeriesService.get_by_key(key) - Primary lookup method
  • AnimeSeriesService.get_by_id(id) - Internal lookup by database ID
  • No get_by_folder() method exists - folder is never used for lookups

DownloadQueueItem Fields

| Field        | Type        | Purpose                                   |
|--------------|-------------|-------------------------------------------|
| id           | String (PK) | UUID for the queue item                   |
| serie_id     | String      | Series key for identification             |
| serie_folder | String      | Filesystem folder path                    |
| serie_name   | String      | Display name for the series               |
| season       | Integer     | Season number                             |
| episode      | Integer     | Episode number                            |
| status       | Enum        | pending, downloading, completed, failed   |
| priority     | Enum        | low, normal, high                         |
| progress     | Float       | Download progress percentage (0.0-100.0)  |
| error        | String      | Error message if failed                   |
| retry_count  | Integer     | Number of retry attempts                  |
| added_at     | DateTime    | When item was added to queue              |
| started_at   | DateTime    | When download started (nullable)          |
| completed_at | DateTime    | When download completed/failed (nullable) |

Data Storage

Storage Architecture

The application uses a SQLite database as the primary storage for application data; configuration is kept in a separate JSON file.

| Data Type      | Storage Location | Service                             |
|----------------|------------------|-------------------------------------|
| Anime Series   | data/aniworld.db | AnimeSeriesService                  |
| Episodes       | data/aniworld.db | AnimeSeriesService                  |
| Download Queue | data/aniworld.db | DownloadService via QueueRepository |
| User Sessions  | data/aniworld.db | AuthService                         |
| Configuration  | data/config.json | ConfigService                       |

Download Queue Storage

The download queue is stored in SQLite via QueueRepository, which wraps DownloadQueueService:

# QueueRepository provides async operations for queue items
repository = QueueRepository(session_factory)

# Save item to database
saved_item = await repository.save_item(download_item)

# Get pending items (ordered by priority and add time)
pending = await repository.get_pending_items()

# Update item status
await repository.update_status(item_id, DownloadStatus.COMPLETED)

# Update download progress
await repository.update_progress(item_id, progress=45.5, downloaded=450, total=1000, speed=2.5)
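The "ordered by priority and add time" behaviour of get_pending_items() can be illustrated with a plain sort. This is only a sketch under assumptions: it presumes that high-priority items come first and that ties are broken by the earliest added_at, and `QueueItem`, `PRIORITY_RANK`, and `order_pending` are hypothetical names rather than the repository's real query.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Assumed ranking: "high" is served before "normal", then "low".
PRIORITY_RANK = {"high": 0, "normal": 1, "low": 2}


@dataclass
class QueueItem:
    id: str
    priority: str
    added_at: datetime


def order_pending(items: List[QueueItem]) -> List[QueueItem]:
    """Sort by priority rank first, then by the time the item was added."""
    return sorted(items, key=lambda i: (PRIORITY_RANK[i.priority], i.added_at))


items = [
    QueueItem("a", "normal", datetime(2025, 1, 1, 10, 0)),
    QueueItem("b", "high", datetime(2025, 1, 1, 11, 0)),
    QueueItem("c", "normal", datetime(2025, 1, 1, 9, 0)),
]
ordered_ids = [item.id for item in order_pending(items)]
print(ordered_ids)
```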

Queue Persistence Features:

  • Queue state survives server restarts
  • Items in downloading status are reset to pending on startup
  • Failed items within retry limit are automatically re-queued
  • Completed and failed history is preserved (with limits)
  • Real-time progress updates are persisted to database
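The restart behaviour listed above can be sketched over an in-memory list. The retry limit of 3, the decision to count a re-queue as a retry, and the names `Item` and `recover_on_startup` are all assumptions for illustration; the real service works against the database.

```python
from dataclasses import dataclass
from typing import List

MAX_RETRIES = 3  # Assumed limit; the actual value is not stated in this document.


@dataclass
class Item:
    id: str
    status: str
    retry_count: int = 0


def recover_on_startup(items: List[Item], max_retries: int = MAX_RETRIES) -> List[Item]:
    """Reset interrupted downloads and re-queue retryable failures."""
    for item in items:
        if item.status == "downloading":
            # The download was interrupted by the restart.
            item.status = "pending"
        elif item.status == "failed" and item.retry_count < max_retries:
            # Assumption: re-queuing counts as one retry attempt.
            item.status = "pending"
            item.retry_count += 1
    return items


queue = [
    Item("a", "downloading"),
    Item("b", "failed", retry_count=1),
    Item("c", "failed", retry_count=3),
    Item("d", "completed"),
]
recover_on_startup(queue)
print([(i.id, i.status, i.retry_count) for i in queue])
```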

Anime Series Database Storage

# Add series to database
await AnimeSeriesService.create(db_session, series_data)

# Query series by key
series = await AnimeSeriesService.get_by_key(db_session, "attack-on-titan")

# Update series
await AnimeSeriesService.update(db_session, series_id, update_data)

Legacy File Storage (Deprecated)

The legacy file-based storage is deprecated and will be removed in v3.0.0:

  • Serie.save_to_file() - Deprecated, use AnimeSeriesService.create()
  • Serie.load_from_file() - Deprecated, use AnimeSeriesService.get_by_key()
  • SerieList.add() - Deprecated, use SerieList.add_to_db()

Deprecation warnings are raised when using these methods.
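For readers unfamiliar with the mechanism, a deprecated method typically emits its warning through the standard `warnings` module. The sketch below uses a hypothetical `LegacySerie` class and message text; it does not reproduce the project's `Serie` implementation.

```python
import warnings


class LegacySerie:
    """Hypothetical stand-in for a class with a deprecated file-based method."""

    def __init__(self, key: str) -> None:
        self.key = key

    def save_to_file(self, path: str) -> None:
        warnings.warn(
            "save_to_file() is deprecated; use AnimeSeriesService.create() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        # The legacy file write is omitted in this sketch.


# Capture the warning to show what a caller would observe.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    LegacySerie("one-piece").save_to_file("data")

print(caught[0].category.__name__, caught[0].message)
```

Python hides `DeprecationWarning` in many contexts by default, so running with `python -W default` or under pytest is usually the easiest way to see these messages.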

Core Services

SeriesApp (src/core/SeriesApp.py)

Main engine for anime series management with async support, progress callbacks, and cancellation.

Key Methods:

  • search(words) - Search for anime series
  • download(serie_folder, season, episode, key, language) - Download an episode
  • rescan() - Rescan directory for missing episodes
  • get_all_series_from_data_files() - Load all series from data files in the anime directory (used for database sync on startup)

Data File to Database Sync

On application startup, the system automatically syncs series from data files to the database:

  1. The sync starts after download_service.initialize() succeeds
  2. SeriesApp.get_all_series_from_data_files() loads all series from data files
  3. Each series is added to the database via SerieList.add_to_db()
  4. Existing series are skipped (no duplicates)
  5. Sync continues silently even if individual series fail

This ensures that series metadata stored in filesystem data files is available in the database for the web application.
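The sync steps above can be sketched as a loop that skips known keys and tolerates individual failures. Everything here is illustrative: `sync_series`, the `add_to_db` callable, and the dict-based series records are hypothetical stand-ins for SeriesApp and SerieList.add_to_db().

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Set


async def sync_series(
    loaded: Iterable[dict],
    existing_keys: Set[str],
    add_to_db: Callable[[dict], Awaitable[None]],
) -> int:
    """Add series that are not yet in the database; return how many were added."""
    added = 0
    for serie in loaded:
        if serie["key"] in existing_keys:
            continue  # already present, no duplicate is created
        try:
            await add_to_db(serie)
        except Exception:
            continue  # one bad series must not abort the whole sync
        existing_keys.add(serie["key"])
        added += 1
    return added


async def _demo():
    db_keys = {"one-piece"}

    async def fake_add(serie: dict) -> None:
        if serie["key"] == "broken":
            raise ValueError("cannot insert")

    loaded = [{"key": "one-piece"}, {"key": "broken"}, {"key": "re-zero"}]
    count = await sync_series(loaded, db_keys, fake_add)
    return count, db_keys


added_count, final_keys = asyncio.run(_demo())
print(added_count, sorted(final_keys))
```

A production version would likely log each skipped failure instead of swallowing it silently, so that broken data files can be found later.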

Callback System (src/core/interfaces/callbacks.py)

  • ProgressCallback, ErrorCallback, CompletionCallback
  • Context classes include key + optional folder fields
  • Thread-safe CallbackManager for multiple callback registration
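A thread-safe registry for multiple callbacks can be built with a lock around the callback list. The class below is a minimal sketch with hypothetical names (`SimpleCallbackManager`, `register`, `notify`); it is not the project's CallbackManager API.

```python
import threading
from typing import Callable, Dict, List

Callback = Callable[[Dict], None]


class SimpleCallbackManager:
    """Hypothetical registry that can be used from several threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._callbacks: List[Callback] = []

    def register(self, callback: Callback) -> None:
        with self._lock:
            self._callbacks.append(callback)

    def notify(self, context: Dict) -> None:
        # Copy under the lock, then call outside it so that a slow
        # callback cannot block concurrent registrations.
        with self._lock:
            snapshot = list(self._callbacks)
        for callback in snapshot:
            callback(context)


received: List[Dict] = []
manager = SimpleCallbackManager()
manager.register(received.append)
manager.notify({"key": "attack-on-titan", "folder": "Attack on Titan (2013)"})
print(received)
```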

Services (src/server/services/)

| Service          | Purpose                                   |
|------------------|-------------------------------------------|
| AnimeService     | Series management, scans (uses SeriesApp) |
| DownloadService  | Queue management, download execution      |
| ScanService      | Library scan operations with callbacks    |
| ProgressService  | Centralized progress tracking + WebSocket |
| WebSocketService | Real-time connection management           |
| AuthService      | JWT authentication, rate limiting         |
| ConfigService    | Configuration persistence with backups    |

Validation Utilities (src/server/utils/validators.py)

Provides data validation functions for ensuring data integrity across the application.

Series Key Validation

  • validate_series_key(key): Validates key format (URL-safe, lowercase, hyphens only)
    • Valid: "attack-on-titan", "one-piece", "86-eighty-six"
    • Invalid: "Attack On Titan", "attack_on_titan", "attack on titan"
  • validate_series_key_or_folder(identifier, allow_folder=True): Backward-compatible validation
    • Returns tuple (identifier, is_key) where is_key indicates if it's a valid key format
    • Set allow_folder=False to require strict key format
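The tuple-returning behaviour described above might be approximated as follows. This sketch uses the hypothetical name `check_key_or_folder`, and raising ValueError in strict mode is an assumption; the real validator's error type is not stated here.

```python
import re
from typing import Tuple

# Same key shape as described under "Key Format Requirements".
_KEY_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_key_or_folder(identifier: str, allow_folder: bool = True) -> Tuple[str, bool]:
    """Return (identifier, is_key); reject non-keys when allow_folder is False."""
    is_key = _KEY_RE.fullmatch(identifier) is not None
    if not is_key and not allow_folder:
        # Assumption: strict mode signals failure with ValueError.
        raise ValueError(f"Not a valid series key: {identifier!r}")
    return identifier, is_key


print(check_key_or_folder("one-piece"))
print(check_key_or_folder("One Piece (1999)"))
```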

Other Validators

| Function                  | Purpose                                    |
|---------------------------|--------------------------------------------|
| validate_series_name      | Series display name validation             |
| validate_episode_range    | Episode range validation (1-1000)          |
| validate_download_quality | Quality setting (360p-1080p, best, worst)  |
| validate_language         | Language codes (ger-sub, ger-dub, etc.)    |
| validate_anime_url        | Aniworld.to/s.to URL validation            |
| validate_backup_name      | Backup filename validation                 |
| validate_config_data      | Configuration data structure validation    |
| sanitize_filename         | Sanitize filenames for safe filesystem use |

Template Helpers (src/server/utils/template_helpers.py)

Provides utilities for template rendering and series data preparation.

Core Functions

| Function                 | Purpose                           |
|--------------------------|-----------------------------------|
| get_base_context         | Base context for all templates    |
| render_template          | Render template with context      |
| validate_template_exists | Check if template file exists     |
| list_available_templates | List all available template files |

Series Context Helpers

All series helpers use key as the primary identifier:

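| Function                          | Purpose                                      |
|-----------------------------------|----------------------------------------------|
| prepare_series_context            | Prepare series data for templates (uses key) |
| get_series_by_key                 | Find series by key (not folder)              |
| filter_series_by_missing_episodes | Filter series with missing episodes          |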

Example Usage:

from src.server.utils.template_helpers import prepare_series_context

series_data = [
    {"key": "attack-on-titan", "name": "Attack on Titan", "folder": "Attack on Titan (2013)"},
    {"key": "one-piece", "name": "One Piece", "folder": "One Piece (1999)"}
]
prepared = prepare_series_context(series_data, sort_by="name")
# Returns sorted list using 'key' as identifier

Frontend

Static Files

  • CSS: styles.css (Fluent UI design), ux_features.css (accessibility)
  • JS: app.js, queue.js, websocket_client.js, accessibility modules

WebSocket Client

Native WebSocket wrapper with Socket.IO-compatible API:

const socket = io();
socket.join("download_progress");
socket.on("download_progress", (data) => {
    /* ... */
});

Authentication

JWT tokens are stored in localStorage and sent with each request in the Authorization: Bearer <token> header.

Testing

# All tests
conda run -n AniWorld python -m pytest tests/ -v

# Unit tests only
conda run -n AniWorld python -m pytest tests/unit/ -v

# API tests
conda run -n AniWorld python -m pytest tests/api/ -v

Production Notes

Current (Single-Process)

  • SQLite with WAL mode
  • In-memory WebSocket connections
  • File-based config; queue state persisted in SQLite

Multi-Process Deployment

  • Switch to PostgreSQL/MySQL
  • Move WebSocket registry to Redis
  • Use distributed locking for queue operations
  • Consider Redis for session/cache storage

Code Examples

API Usage with Key Identifier

import requests

# Assumption: the server runs locally on port 8000; adjust to your deployment
BASE_URL = "http://localhost:8000"
token = "<JWT returned by POST /api/auth/login>"
headers = {"Authorization": f"Bearer {token}"}

# Fetching anime list - response includes 'key' as identifier
response = requests.get(f"{BASE_URL}/api/anime", headers=headers)
anime_list = response.json()
# Each item has: key="attack-on-titan", folder="Attack on Titan (2013)", ...

# Fetching specific anime by key (preferred)
response = requests.get(f"{BASE_URL}/api/anime/attack-on-titan", headers=headers)

# Adding to download queue using key
download_request = {
    "serie_id": "attack-on-titan",  # Use key, not folder
    "serie_folder": "Attack on Titan (2013)",  # Metadata for filesystem
    "serie_name": "Attack on Titan",
    "episodes": ["S01E01", "S01E02"],
    "priority": "normal"  # One of: low, normal, high (see DownloadQueueItem Fields)
}
response = requests.post(f"{BASE_URL}/api/queue/add", json=download_request, headers=headers)

WebSocket Event Handling

// WebSocket events always include 'key' as identifier
socket.on("download_progress", (data) => {
    const key = data.key; // Primary identifier: "attack-on-titan"
    const folder = data.folder; // Metadata: "Attack on Titan (2013)"
    updateProgressBar(key, data.percent);
});