Aniworld/docs/architecture/async_loading_architecture.md
Lukas f18c31a035 Implement async series data loading with background processing
- Add loading status fields to AnimeSeries model
- Create BackgroundLoaderService for async task processing
- Update POST /api/anime/add to return 202 Accepted immediately
- Add GET /api/anime/{key}/loading-status endpoint
- Integrate background loader with startup/shutdown lifecycle
- Create database migration script for loading status fields
- Add unit tests for BackgroundLoaderService (10 tests, all passing)
- Update AnimeSeriesService.create() to accept loading status fields

Architecture follows clean separation with no code duplication:
- BackgroundLoader orchestrates, doesn't reimplement
- Reuses existing AnimeService, NFOService, WebSocket patterns
- Database-backed status survives restarts
2026-01-19 07:14:55 +01:00


Asynchronous Series Data Loading Architecture

Version: 1.0
Date: 2026-01-18
Status: Planning Phase

Table of Contents

  1. Executive Summary
  2. Current State Analysis
  3. Reusable Components
  4. Proposed Architecture
  5. Data Flow
  6. Database Schema Changes
  7. API Specifications
  8. Error Handling Strategy
  9. Integration Points
  10. Code Reuse Strategy
  11. Implementation Plan

Executive Summary

This document describes the architecture for asynchronous series data loading with background processing. The goal is to let users add a series immediately while its metadata (episodes, NFO files, logos, images) loads in the background, so the add request no longer blocks on time-consuming operations.

Key Principles:

  • No Code Duplication: Reuse existing services and methods
  • Clean Separation: New BackgroundLoaderService orchestrates existing components
  • Progressive Enhancement: Add async loading without breaking existing functionality
  • Existing Patterns: Follow current WebSocket, service, and database patterns

Current State Analysis

Existing Services and Components

1. AnimeService (src/server/services/anime_service.py)

  • Purpose: Web layer wrapper around SeriesApp
  • Key Methods:
    • add_series_to_db(serie, db): Adds series to database with episodes
    • Event handlers for download/scan status
    • Progress tracking integration
  • Database Integration: Uses AnimeSeriesService and EpisodeService
  • Reusability: Can be reused for database operations

2. SeriesApp (src/core/SeriesApp.py)

  • Purpose: Core domain logic for series management
  • Key Functionality:
    • Series scanning and episode detection
    • Download management with progress tracking
    • Event-based status updates
  • NFO Service: Has NFOService instance for metadata generation
  • Reusability: Event system can be used for background tasks

3. NFOService (src/core/services/nfo_service.py)

  • Purpose: Create and manage tvshow.nfo files
  • Key Methods:
    • create_tvshow_nfo(serie_name, serie_folder, year, ...): Full NFO creation with images
    • check_nfo_exists(serie_folder): Check if NFO exists
    • update_tvshow_nfo(...): Update existing NFO
  • Image Downloads: Handles poster, logo, fanart downloads
  • Reusability: Direct reuse for NFO and image loading

4. WebSocketService (src/server/services/websocket_service.py)

  • Purpose: Real-time communication with clients
  • Features:
    • Connection management with room-based messaging
    • Broadcast to all or specific rooms
    • Personal messaging
  • Message Format: JSON with type field and payload
  • Reusability: Existing broadcast methods can be used

5. Database Models (src/server/database/models.py)

Current AnimeSeries Model Fields:

- id: int (PK, autoincrement)
- key: str (unique, indexed) - PRIMARY IDENTIFIER
- name: str (indexed)
- site: str
- folder: str - METADATA ONLY
- year: Optional[int]
- has_nfo: bool (default False)
- nfo_created_at: Optional[datetime]
- nfo_updated_at: Optional[datetime]
- tmdb_id: Optional[int]
- tvdb_id: Optional[int]
- episodes: relationship
- download_items: relationship

Fields to Add:

- loading_status: str - "pending", "loading_episodes", "loading_nfo", "loading_logo", "loading_images", "completed", "failed"
- episodes_loaded: bool - Whether episodes have been scanned
- logo_loaded: bool - Whether logo image exists
- images_loaded: bool - Whether poster/fanart exist
- loading_started_at: Optional[datetime]
- loading_completed_at: Optional[datetime]
- loading_error: Optional[str]

6. Current API Pattern (src/server/api/anime.py)

Current /api/anime/add endpoint:

@router.post("/add")
async def add_series(...):
    # 1. Validate link and extract key
    # 2. Fetch year from provider
    # 3. Create sanitized folder name
    # 4. Save to database
    # 5. Create folder on disk
    # 6. Trigger targeted scan for episodes
    # 7. Return complete result

Issues with Current Approach:

  • Blocks until scan completes (can take 10-30 seconds)
  • User must wait before seeing series in UI
  • NFO/images not created automatically
  • No background processing on startup for incomplete series

Reusable Components

Components That Will Be Reused (No Duplication)

1. Episode Loading

Existing Method: AnimeService.rescan() or SeriesApp.scan()

  • Already handles episode detection and database sync
  • Reuse Strategy: Call anime_service.rescan() for specific series key

2. NFO Generation

Existing Method: NFOService.create_tvshow_nfo()

  • Already downloads poster, logo, fanart
  • Reuse Strategy: Direct call via SeriesApp.nfo_service.create_tvshow_nfo()

3. Database Operations

Existing Services: AnimeSeriesService, EpisodeService

  • CRUD operations for series and episodes
  • Reuse Strategy: Use existing service methods for status updates

4. WebSocket Broadcasting

Existing Methods: WebSocketService.broadcast(), broadcast_to_room()

  • Reuse Strategy: Create new broadcast method broadcast_loading_status() following existing pattern

5. Progress Tracking

Existing Service: ProgressService

  • Reuse Strategy: May integrate for UI progress bars (optional)

Components That Need Creation

1. BackgroundLoaderService

Purpose: Orchestrate async loading tasks

  • What it does: Queue management, task scheduling, status tracking
  • What it doesn't do: Actual loading (delegates to existing services)

2. Loading Status Models

Purpose: Type-safe status tracking

  • Enums for loading status
  • Data classes for loading tasks
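A loading task could be modelled roughly as follows. This is a minimal sketch, not the project's actual class: the field names, the `remaining()` helper, and the `progress` keys (taken from the `progress` object in the API section) are assumptions made for illustration, and the status is kept as a plain string here for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


def _empty_progress() -> Dict[str, bool]:
    # One flag per component the background loader is expected to fill in.
    return {"episodes": False, "nfo": False, "logo": False, "images": False}


@dataclass
class SeriesLoadingTask:
    """Hypothetical task record passed through the loader queue."""

    key: str
    status: str = "pending"
    progress: Dict[str, bool] = field(default_factory=_empty_progress)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    error: Optional[str] = None

    def remaining(self) -> List[str]:
        """Return the components that still have to be loaded, in order."""
        return [name for name, done in self.progress.items() if not done]
```

Using `default_factory` gives every task its own `progress` dict, so updating one task cannot leak into another.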

Proposed Architecture

Component Diagram

┌─────────────────────────────────────────────────────────────┐
│                        FastAPI Application                   │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     API Layer (anime.py)                     │
│  POST /api/anime/add (202 Accepted - immediate return)      │
│  GET /api/anime/{key}/loading-status                         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│               BackgroundLoaderService (NEW)                  │
│  - add_series_loading_task(key)                              │
│  - check_missing_data(key)                                   │
│  - _worker() [background task queue consumer]                │
│  - _load_series_data(task) [orchestrator]                    │
└─────────────────────────────────────────────────────────────┘
           │              │              │              │
           ▼              ▼              ▼              ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ AnimeService │ │  NFOService  │ │  Database    │ │  WebSocket   │
│  (EXISTING)  │ │  (EXISTING)  │ │   Service    │ │   Service    │
│              │ │              │ │  (EXISTING)  │ │  (EXISTING)  │
│ - rescan()   │ │ - create_nfo │ │ - update_    │ │ - broadcast_ │
│              │ │ - download   │ │   status     │ │   loading    │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
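The `check_missing_data(key)` box in the diagram could boil down to a comparison of the model's flags. The sketch below assumes the row has already been fetched and that the `nfo` component maps to the existing `has_nfo` field; the `FLAG_FIELDS` mapping and the use of `SimpleNamespace` as a stand-in for an `AnimeSeries` row are illustrative only.

```python
from types import SimpleNamespace
from typing import List

# Component name -> AnimeSeries column that records whether it is present.
FLAG_FIELDS = {
    "episodes": "episodes_loaded",
    "nfo": "has_nfo",
    "logo": "logo_loaded",
    "images": "images_loaded",
}


def check_missing_data(series) -> List[str]:
    """Return the components whose flag is still False, in loading order."""
    return [
        component
        for component, column in FLAG_FIELDS.items()
        if not getattr(series, column, False)
    ]


# Stand-in for a database row: episodes and images are present, the rest is not.
example_row = SimpleNamespace(
    episodes_loaded=True, has_nfo=False, logo_loaded=False, images_loaded=True
)
```

Here `check_missing_data(example_row)` yields `["nfo", "logo"]`, which is the work the orchestrator would still have to delegate.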

Sequence Diagram: Add Series Flow

User → API: POST /api/anime/add {"link": "...", "name": "..."}
API → Database: Create AnimeSeries (loading_status="pending")
API → BackgroundLoader: add_series_loading_task(key)
API → User: 202 Accepted {"key": "...", "loading_status": "pending"}
API → WebSocket: broadcast_loading_status("pending")

[Background Worker Task]
BackgroundLoader → BackgroundLoader: _worker() picks up task
BackgroundLoader → Database: check_missing_data(key)
BackgroundLoader → WebSocket: broadcast("loading_episodes")
BackgroundLoader → AnimeService: rescan(key) [REUSE EXISTING]
AnimeService → Database: Update episodes
BackgroundLoader → Database: Update episodes_loaded=True

BackgroundLoader → WebSocket: broadcast("loading_nfo")
BackgroundLoader → NFOService: create_tvshow_nfo() [REUSE EXISTING]
NFOService → TMDB API: Fetch metadata
NFOService → Filesystem: Download poster/logo/fanart
BackgroundLoader → Database: Update has_nfo=True, logo_loaded=True, images_loaded=True

BackgroundLoader → Database: Update loading_status="completed"
BackgroundLoader → WebSocket: broadcast("completed")

Data Flow

Immediate Series Addition (Synchronous)

  1. User submits series link and name
  2. API validates input, extracts key
  3. API fetches year from provider (quick operation)
  4. API creates database record with loading_status="pending"
  5. API creates folder on disk
  6. API queues background loading task
  7. API returns 202 Accepted immediately
  8. WebSocket broadcasts initial status

Background Data Loading (Asynchronous)

  1. Worker picks up task from queue
  2. Worker checks what data is missing
  3. For each missing data type:
    • Update status and broadcast via WebSocket
    • Call existing service (episodes/NFO/images)
    • Update database flags
  4. Mark as completed and broadcast final status
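The worker steps above can be sketched as a queue consumer that skips components already present. Everything here is illustrative: `steps` stands in for calls to the existing services, `loaded` for the database flags, and `events` for WebSocket broadcasts. Error handling is simplified to marking the whole task as failed.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

Step = Callable[[str], Awaitable[None]]


async def worker(
    queue: "asyncio.Queue[str]",
    steps: Dict[str, Step],
    loaded: Dict[str, Dict[str, bool]],
    events: List[Tuple[str, str]],
) -> None:
    """Consume series keys forever and load whatever is still missing."""
    while True:
        key = await queue.get()
        flags = loaded.setdefault(key, {})
        try:
            for name, step in steps.items():
                if flags.get(name):
                    continue  # already present, nothing to load
                events.append((key, f"loading_{name}"))  # status broadcast
                await step(key)  # delegate to an existing service
                flags[name] = True  # persist the progress flag
            events.append((key, "completed"))
        except Exception:
            events.append((key, "failed"))
        finally:
            queue.task_done()


async def demo() -> List[Tuple[str, str]]:
    """Run one series through the worker; episodes are already loaded."""
    queue: "asyncio.Queue[str]" = asyncio.Queue()
    events: List[Tuple[str, str]] = []

    async def fake_step(key: str) -> None:
        await asyncio.sleep(0)

    steps = {"episodes": fake_step, "nfo": fake_step}
    loaded = {"attack-on-titan": {"episodes": True}}
    task = asyncio.create_task(worker(queue, steps, loaded, events))
    await queue.put("attack-on-titan")
    await queue.join()  # wait until the worker has processed the key
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return events
```

Because `except Exception` does not catch `asyncio.CancelledError`, cancelling the worker during shutdown still propagates cleanly.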

Database Schema Changes

Migration: Add Loading Status Fields

File: migrations/add_loading_status_fields.py

"""Add loading status fields to anime_series table.

Revision ID: 001_async_loading
Create Date: 2026-01-18
"""

from alembic import op
import sqlalchemy as sa
from sqlalchemy import Boolean, DateTime, String

def upgrade():
    # Add new columns
    op.add_column('anime_series',
        sa.Column('loading_status', String(50), nullable=False,
                  server_default='completed')
    )
    op.add_column('anime_series',
        sa.Column('episodes_loaded', Boolean, nullable=False,
                  server_default='1')
    )
    op.add_column('anime_series',
        sa.Column('logo_loaded', Boolean, nullable=False,
                  server_default='0')
    )
    op.add_column('anime_series',
        sa.Column('images_loaded', Boolean, nullable=False,
                  server_default='0')
    )
    op.add_column('anime_series',
        sa.Column('loading_started_at', DateTime(timezone=True),
                  nullable=True)
    )
    op.add_column('anime_series',
        sa.Column('loading_completed_at', DateTime(timezone=True),
                  nullable=True)
    )
    op.add_column('anime_series',
        sa.Column('loading_error', String(1000), nullable=True)
    )

    # Existing rows need no explicit UPDATE: the server defaults above already
    # backfill them with loading_status='completed' and episodes_loaded=1,
    # which is correct because they were added synchronously.

def downgrade():
    op.drop_column('anime_series', 'loading_error')
    op.drop_column('anime_series', 'loading_completed_at')
    op.drop_column('anime_series', 'loading_started_at')
    op.drop_column('anime_series', 'images_loaded')
    op.drop_column('anime_series', 'logo_loaded')
    op.drop_column('anime_series', 'episodes_loaded')
    op.drop_column('anime_series', 'loading_status')

Updated AnimeSeries Model

class AnimeSeries(Base, TimestampMixin):
    __tablename__ = "anime_series"

    # ... existing fields ...

    # Loading status fields (NEW)
    loading_status: Mapped[str] = mapped_column(
        String(50), default="completed", server_default="completed",
        doc="Loading status: pending, loading_episodes, loading_nfo, "
            "loading_logo, loading_images, completed, failed"
    )
    episodes_loaded: Mapped[bool] = mapped_column(
        Boolean, default=True, server_default="1",
        doc="Whether episodes have been scanned and loaded"
    )
    logo_loaded: Mapped[bool] = mapped_column(
        Boolean, default=False, server_default="0",
        doc="Whether logo.png has been downloaded"
    )
    images_loaded: Mapped[bool] = mapped_column(
        Boolean, default=False, server_default="0",
        doc="Whether poster/fanart have been downloaded"
    )
    loading_started_at: Mapped[Optional[datetime]] = mapped_column(
        DateTime(timezone=True), nullable=True,
        doc="When background loading started"
    )
    loading_completed_at: Mapped[Optional[datetime]] = mapped_column(
        DateTime(timezone=True), nullable=True,
        doc="When background loading completed"
    )
    loading_error: Mapped[Optional[str]] = mapped_column(
        String(1000), nullable=True,
        doc="Error message if loading failed"
    )

API Specifications

POST /api/anime/add

Purpose: Add a new series immediately and queue background loading

Changes from Current:

  • Returns 202 Accepted instead of 200 OK (indicates async processing)
  • Returns immediately without waiting for scan
  • Includes loading_status in response

Request:

{
    "link": "https://aniworld.to/anime/stream/attack-on-titan",
    "name": "Attack on Titan"
}

Response: 202 Accepted

{
    "status": "success",
    "message": "Series added and queued for background loading",
    "key": "attack-on-titan",
    "folder": "Attack on Titan (2013)",
    "db_id": 123,
    "loading_status": "pending",
    "loading_progress": {
        "episodes": false,
        "nfo": false,
        "logo": false,
        "images": false
    }
}

GET /api/anime/{key}/loading-status (NEW)

Purpose: Get current loading status for a series

Request:

GET /api/anime/attack-on-titan/loading-status

Response: 200 OK

{
    "key": "attack-on-titan",
    "loading_status": "loading_nfo",
    "progress": {
        "episodes": true,
        "nfo": false,
        "logo": false,
        "images": false
    },
    "started_at": "2026-01-18T10:30:00Z",
    "message": "Generating NFO file...",
    "error": null
}

When Completed:

{
    "key": "attack-on-titan",
    "loading_status": "completed",
    "progress": {
        "episodes": true,
        "nfo": true,
        "logo": true,
        "images": true
    },
    "started_at": "2026-01-18T10:30:00Z",
    "completed_at": "2026-01-18T10:30:45Z",
    "message": "All data loaded successfully",
    "error": null
}
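One way to assemble these bodies from a series row is sketched below. The helper names are hypothetical, the `nfo` flag is assumed to come from the existing `has_nfo` field, and the human-readable `message` field is omitted because its wording is not derivable from the model.

```python
from datetime import datetime, timezone
from types import SimpleNamespace
from typing import Any, Dict, Optional


def _iso_utc(value: Optional[datetime]) -> Optional[str]:
    """Format a timezone-aware datetime as an ISO 8601 string ending in Z."""
    if value is None:
        return None
    return value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")


def loading_status_response(series) -> Dict[str, Any]:
    """Build the loading-status body from a row-like object."""
    body: Dict[str, Any] = {
        "key": series.key,
        "loading_status": series.loading_status,
        "progress": {
            "episodes": series.episodes_loaded,
            "nfo": series.has_nfo,
            "logo": series.logo_loaded,
            "images": series.images_loaded,
        },
        "started_at": _iso_utc(series.loading_started_at),
        "error": series.loading_error,
    }
    # completed_at only appears once loading has finished.
    if series.loading_completed_at is not None:
        body["completed_at"] = _iso_utc(series.loading_completed_at)
    return body


# Stand-in for an AnimeSeries row that is part-way through loading.
example_row = SimpleNamespace(
    key="attack-on-titan",
    loading_status="loading_nfo",
    episodes_loaded=True,
    has_nfo=False,
    logo_loaded=False,
    images_loaded=False,
    loading_started_at=datetime(2026, 1, 18, 10, 30, tzinfo=timezone.utc),
    loading_completed_at=None,
    loading_error=None,
)
```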

WebSocket Message Format

Following Existing Pattern:

{
    "type": "series_loading_update",
    "key": "attack-on-titan",
    "loading_status": "loading_episodes",
    "progress": {
        "episodes": false,
        "nfo": false,
        "logo": false,
        "images": false
    },
    "message": "Loading episodes...",
    "timestamp": "2026-01-18T10:30:15Z"
}

Error Handling Strategy

Error Types

  1. Network Errors (TMDB API, provider site)

    • Retry with exponential backoff
    • Max 3 retries
    • Mark as failed if all retries exhausted
  2. Filesystem Errors (disk space, permissions)

    • No retry
    • Mark as failed immediately
    • Log detailed error
  3. Database Errors (connection, constraints)

    • Retry once after 1 second
    • Mark as failed if retry fails
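The network-error policy above could be implemented with a small helper like the one below. It is a sketch under stated assumptions: the helper name, the one-second base delay, and the choice of `ConnectionError` and `TimeoutError` as retryable exceptions are illustrative and not taken from the codebase.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Run operation, retrying retryable errors with doubling delays."""
    attempt = 0
    while True:
        try:
            return await operation()
        except retry_on:
            if attempt >= max_retries:
                raise  # retries exhausted: the caller marks the task failed
            # Delays grow as base_delay * 1, 2, 4, ...
            await asyncio.sleep(base_delay * (2 ** attempt))
            attempt += 1
```

Exceptions outside `retry_on`, such as `PermissionError` from the filesystem, propagate on the first attempt, which matches the "no retry" rule for filesystem errors.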

Error Recording

  • Store error message in loading_error field
  • Set loading_status to "failed"
  • Broadcast error via WebSocket
  • Log with full context for debugging

Partial Success

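  • If episodes load but NFO generation fails: set only the flags for the steps that succeeded (for example episodes_loaded=True) and leave the others False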
  • Allow manual retry for failed components
  • Show partial status in UI
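Partial success can be tracked by running each component independently and collecting per-component results. The sketch below is one possible shape; `run_components` and `summarize` are hypothetical helpers, and treating any component error as an overall `failed` status is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional, Tuple

Component = Callable[[], Awaitable[None]]


async def run_components(
    steps: Dict[str, Component],
) -> Tuple[Dict[str, bool], Dict[str, str]]:
    """Run every step; a failure in one does not prevent the others."""
    progress: Dict[str, bool] = {}
    errors: Dict[str, str] = {}
    for name, step in steps.items():
        try:
            await step()
            progress[name] = True
        except Exception as exc:
            progress[name] = False
            errors[name] = str(exc)
    return progress, errors


def summarize(
    progress: Dict[str, bool], errors: Dict[str, str]
) -> Tuple[str, Optional[str]]:
    """Derive loading_status and loading_error from the per-step results."""
    if not errors:
        return "completed", None
    message = "; ".join(f"{name}: {text}" for name, text in errors.items())
    return "failed", message
```

The `progress` dict maps directly onto the boolean flags, so a later manual retry only needs to rerun the components that are still False.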

Integration Points

1. AnimeService Integration

Current Usage:

# In anime.py API
anime_service = Depends(get_anime_service)
await anime_service.rescan()

New Usage in BackgroundLoader:

# Reuse rescan for specific series
await anime_service.rescan_series(key)

No Changes Needed to AnimeService - Reuse as-is

2. NFOService Integration

Current Access:

# Via SeriesApp
series_app.nfo_service.create_tvshow_nfo(...)

New Usage in BackgroundLoader:

# Get NFOService from SeriesApp
if series_app.nfo_service:
    await series_app.nfo_service.create_tvshow_nfo(
        serie_name=name,
        serie_folder=folder,
        year=year,
        download_poster=True,
        download_logo=True,
        download_fanart=True
    )

No Changes Needed to NFOService - Reuse as-is

3. WebSocketService Integration

Existing Pattern:

# In websocket_service.py
async def broadcast_download_progress(...):
    message = {
        "type": "download_progress",
        "key": key,
        ...
    }
    await self.broadcast(message)

New Method (Following Pattern):

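from datetime import datetime, timezone
from typing import Dict

# Method on WebSocketService, next to broadcast_download_progress()
async def broadcast_loading_status(
    self,
    key: str,
    loading_status: str,
    progress: Dict[str, bool],
    message: str
) -> None:
    """Broadcast loading status update."""
    payload = {
        "type": "series_loading_update",
        "key": key,
        "loading_status": loading_status,
        "progress": progress,
        "message": message,
        # Timezone-aware UTC timestamp in ISO 8601 format
        "timestamp": datetime.now(timezone.utc).isoformat()
    }
    await self.broadcast(payload)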

4. Database Service Integration

Existing Services:

  • AnimeSeriesService.get_by_key(db, key)
  • AnimeSeriesService.update(db, series_id, **kwargs)

New Helper Methods Needed:

# In AnimeSeriesService
@staticmethod
async def update_loading_status(
    db,
    key: str,
    loading_status: str,
    **progress_flags
):
    """Update loading status and progress flags."""
    # Called like get_by_key(db, key), without an instance, so there is
    # no `self` to go through here.
    series = await AnimeSeriesService.get_by_key(db, key)
    if series:
        for field, value in progress_flags.items():
            setattr(series, field, value)
        series.loading_status = loading_status
        await db.commit()

Code Reuse Strategy

DO NOT DUPLICATE

Episode Loading Logic

Wrong:

# DON'T create new episode scanning logic
async def _scan_episodes(self, key: str):
    # Duplicate logic...

Right:

# Reuse existing AnimeService method
await self.anime_service.rescan_series(key)

NFO Generation Logic

Wrong:

# DON'T reimplement TMDB API calls
async def _create_nfo(self, series):
    # Duplicate TMDB logic...

Right:

# Reuse existing NFOService
await self.series_app.nfo_service.create_tvshow_nfo(...)

Database CRUD Operations

Wrong:

# DON'T write raw SQL
await db.execute("UPDATE anime_series SET ...")

Right:

# Use existing service methods
await AnimeSeriesService.update(db, series_id, loading_status="completed")

WHAT TO CREATE

Task Queue Management

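import asyncio
from typing import Dict

class BackgroundLoaderService:
    def __init__(self):
        # asyncio.Queue lets the worker await new tasks without blocking the event loop
        self.task_queue: asyncio.Queue[SeriesLoadingTask] = asyncio.Queue()
        self.active_tasks: Dict[str, SeriesLoadingTask] = {}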
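Enqueueing could use the active-task map to avoid queueing the same series twice. The class below is a reduced sketch, not the real service: `LoaderQueue`, `finish()`, and the boolean return value are assumptions, and plain strings stand in for task objects.

```python
import asyncio
from typing import Dict


class LoaderQueue:
    """Hypothetical reduction of the queue-handling part of the service."""

    def __init__(self) -> None:
        self.task_queue: "asyncio.Queue[str]" = asyncio.Queue()
        self.active_tasks: Dict[str, str] = {}

    async def add_series_loading_task(self, key: str) -> bool:
        """Queue a series once; return False if it is already in flight."""
        if key in self.active_tasks:
            return False
        self.active_tasks[key] = "pending"
        await self.task_queue.put(key)
        return True

    def finish(self, key: str) -> None:
        """Forget a finished task so the series can be queued again later."""
        self.active_tasks.pop(key, None)
```

Deduplicating at enqueue time keeps a repeated add request or a startup check from scheduling duplicate scans for the same key.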

Orchestration Logic

async def _load_series_data(self, task: SeriesLoadingTask):
    """Orchestrate loading by calling existing services."""
    # Check what's missing
    # Call appropriate existing services
    # Update status

Status Tracking

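from enum import Enum

class LoadingStatus(Enum):
    # Values match the statuses listed in the AnimeSeries.loading_status doc
    PENDING = "pending"
    LOADING_EPISODES = "loading_episodes"
    LOADING_NFO = "loading_nfo"
    LOADING_LOGO = "loading_logo"
    LOADING_IMAGES = "loading_images"
    COMPLETED = "completed"
    FAILED = "failed"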

Implementation Plan

Phase 1: Database and Models (Steps 1-2 of instructions)

  • Create Alembic migration for new fields
  • Run migration to update database
  • Update AnimeSeries model with new fields
  • Test database changes

Phase 2: BackgroundLoaderService (Steps 3-4)

  • Create background_loader_service.py
  • Implement task queue and worker
  • Implement orchestration methods (calling existing services)
  • Add status tracking
  • Write unit tests

Phase 3: API Updates (Steps 5-6)

  • Update POST /api/anime/add for immediate return
  • Create GET /api/anime/{key}/loading-status endpoint
  • Update response models
  • Write API tests

Phase 4: WebSocket Integration (Step 7)

  • Add broadcast_loading_status() to WebSocketService
  • Integrate broadcasts in BackgroundLoader
  • Write WebSocket tests

Phase 5: Startup Check (Step 8)

  • Add startup event handler to check incomplete series
  • Queue incomplete series for background loading
  • Add graceful shutdown for background tasks
  • Write integration tests
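The startup check and graceful shutdown might look like the sketch below. It assumes that "incomplete" means any status other than `completed` or `failed` (failed series are left for manual retry); the function names and the use of `SimpleNamespace` rows are illustrative.

```python
import asyncio
from contextlib import suppress
from types import SimpleNamespace
from typing import Iterable, List

FINAL_STATUSES = {"completed", "failed"}


async def queue_incomplete_series(
    rows: Iterable[SimpleNamespace], queue: "asyncio.Queue[str]"
) -> List[str]:
    """Re-queue every series whose loading was interrupted by a restart."""
    queued: List[str] = []
    for row in rows:
        if row.loading_status not in FINAL_STATUSES:
            await queue.put(row.key)
            queued.append(row.key)
    return queued


async def stop_worker(worker_task: "asyncio.Task") -> None:
    """Cancel the background worker and wait until it has actually stopped."""
    worker_task.cancel()
    with suppress(asyncio.CancelledError):
        await worker_task
```

Because the status lives in the database, a series that was mid-way through `loading_nfo` when the process stopped is picked up again on the next start.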

Phase 6: Frontend (Steps 9-10)

  • Add loading indicators to series cards
  • Handle WebSocket loading status messages
  • Add CSS for loading states
  • Test UI responsiveness

Validation Checklist

Code Duplication Prevention

  • No duplicate episode loading logic (reuse AnimeService.rescan())
  • No duplicate NFO generation (reuse NFOService.create_tvshow_nfo())
  • No duplicate database operations (reuse AnimeSeriesService)
  • No duplicate WebSocket logic (extend existing patterns)
  • BackgroundLoader only orchestrates, doesn't reimplement

Architecture Quality

  • Clear separation of concerns
  • Existing functionality not broken (backward compatible)
  • New services follow project patterns
  • API design consistent with existing endpoints
  • Database changes are backward compatible (defaults for new fields)
  • All integration points documented
  • Error handling consistent across services

Service Integration

  • AnimeService methods identified for reuse
  • NFOService integration documented
  • WebSocket pattern followed
  • Database service usage clear
  • Dependency injection strategy defined

Testing Strategy

  • Unit tests for BackgroundLoaderService
  • Integration tests for end-to-end flow
  • API tests for new endpoints
  • WebSocket tests for broadcasts
  • Database migration tests

Key Design Decisions

1. Queue-Based Architecture

Rationale: Provides natural async processing, rate limiting, and graceful shutdown

2. Reuse Existing Services

Rationale: Avoid code duplication, leverage tested code, maintain consistency

3. Incremental Progress Updates

Rationale: Better UX, allows UI to show detailed progress

4. Database-Backed Status

Rationale: Survives restarts, enables startup checks, provides audit trail

5. 202 Accepted Response

Rationale: HTTP standard for async operations, clear client expectation


Next Steps

  1. Review this document with team/stakeholders
  2. Get approval on architecture approach
  3. Begin Phase 1 (Database changes)
  4. Implement incrementally following the phase plan
  5. Test thoroughly at each phase
  6. Document as you implement

Questions for Review

  1. Does BackgroundLoaderService correctly reuse existing services?
  2. Are database changes backward compatible?
  3. Is WebSocket message format consistent?
  4. Are error handling strategies appropriate?
  5. Is startup check logic sound?
  6. Are API responses following REST best practices?

Document Status: READY FOR REVIEW AND IMPLEMENTATION

This architecture ensures clean integration without code duplication while following all project patterns and best practices.