
Lukas f18c31a035 Implement async series data loading with background processing

- Add loading status fields to AnimeSeries model
- Create BackgroundLoaderService for async task processing
- Update POST /api/anime/add to return 202 Accepted immediately
- Add GET /api/anime/{key}/loading-status endpoint
- Integrate background loader with startup/shutdown lifecycle
- Create database migration script for loading status fields
- Add unit tests for BackgroundLoaderService (10 tests, all passing)
- Update AnimeSeriesService.create() to accept loading status fields

Architecture follows clean separation with no code duplication:
- BackgroundLoader orchestrates, doesn't reimplement
- Reuses existing AnimeService, NFOService, WebSocket patterns
- Database-backed status survives restarts

2026-01-19 07:14:55 +01:00


Asynchronous Series Data Loading Architecture

Version: 1.0
Date: 2026-01-18
Status: Planning Phase

Executive Summary
Current State Analysis
Reusable Components
Proposed Architecture
Data Flow
Database Schema Changes
API Specifications
Error Handling Strategy
Integration Points
Code Reuse Strategy
Implementation Plan

Executive Summary

This document describes the architecture for implementing asynchronous series data loading with background processing. The goal is to allow users to add series immediately while metadata (episodes, NFO files, logos, images) loads asynchronously in the background, improving UX by not blocking during time-consuming operations.

Key Principles:

No Code Duplication: Reuse existing services and methods
Clean Separation: New BackgroundLoaderService orchestrates existing components
Progressive Enhancement: Add async loading without breaking existing functionality
Existing Patterns: Follow current WebSocket, service, and database patterns

Current State Analysis

Existing Services and Components

1. AnimeService (`src/server/services/anime_service.py`)

Purpose: Web layer wrapper around SeriesApp
Key Methods:
- add_series_to_db(serie, db): Adds series to database with episodes
- Event handlers for download/scan status
- Progress tracking integration
Database Integration: Uses AnimeSeriesService and EpisodeService
Reusability: ✅ Can be reused for database operations

2. SeriesApp (`src/core/SeriesApp.py`)

Purpose: Core domain logic for series management
Key Functionality:
- Series scanning and episode detection
- Download management with progress tracking
- Event-based status updates
NFO Service: Has NFOService instance for metadata generation
Reusability: ✅ Event system can be used for background tasks

3. NFOService (`src/core/services/nfo_service.py`)

Purpose: Create and manage tvshow.nfo files
Key Methods:
- create_tvshow_nfo(serie_name, serie_folder, year, ...): Full NFO creation with images
- check_nfo_exists(serie_folder): Check if NFO exists
- update_tvshow_nfo(...): Update existing NFO
Image Downloads: Handles poster, logo, fanart downloads
Reusability: ✅ Direct reuse for NFO and image loading

4. WebSocketService (`src/server/services/websocket_service.py`)

Purpose: Real-time communication with clients
Features:
- Connection management with room-based messaging
- Broadcast to all or specific rooms
- Personal messaging
Message Format: JSON with type field and payload
Reusability: ✅ Existing broadcast methods can be used

5. Database Models (`src/server/database/models.py`)

Current AnimeSeries Model Fields:

- id: int (PK, autoincrement)
- key: str (unique, indexed) - PRIMARY IDENTIFIER
- name: str (indexed)
- site: str
- folder: str - METADATA ONLY
- year: Optional[int]
- has_nfo: bool (default False)
- nfo_created_at: Optional[datetime]
- nfo_updated_at: Optional[datetime]
- tmdb_id: Optional[int]
- tvdb_id: Optional[int]
- episodes: relationship
- download_items: relationship

Fields to Add:

- loading_status: str - "pending", "loading_episodes", "loading_nfo", "loading_logo", "loading_images", "completed", "failed"
- episodes_loaded: bool - Whether episodes have been scanned
- logo_loaded: bool - Whether logo image exists
- images_loaded: bool - Whether poster/fanart exist
- loading_started_at: Optional[datetime]
- loading_completed_at: Optional[datetime]
- loading_error: Optional[str]

6. Current API Pattern (`src/server/api/anime.py`)

Current /api/anime/add endpoint:

@router.post("/add")
async def add_series(...):
    # 1. Validate link and extract key
    # 2. Fetch year from provider
    # 3. Create sanitized folder name
    # 4. Save to database
    # 5. Create folder on disk
    # 6. Trigger targeted scan for episodes
    # 7. Return complete result

Issues with Current Approach:

❌ Blocks until scan completes (can take 10-30 seconds)
❌ User must wait before seeing series in UI
❌ NFO/images not created automatically
❌ No background processing on startup for incomplete series

Reusable Components

Components That Will Be Reused (No Duplication)

1. Episode Loading

Existing Method: AnimeService.rescan() or SeriesApp.scan()

Already handles episode detection and database sync
Reuse Strategy: Call anime_service.rescan() for specific series key

2. NFO Generation

Existing Method: NFOService.create_tvshow_nfo()

Already downloads poster, logo, fanart
Reuse Strategy: Direct call via SeriesApp.nfo_service.create_tvshow_nfo()

3. Database Operations

Existing Services: AnimeSeriesService, EpisodeService

CRUD operations for series and episodes
Reuse Strategy: Use existing service methods for status updates

4. WebSocket Broadcasting

Existing Methods: WebSocketService.broadcast(), broadcast_to_room()

Reuse Strategy: Create new broadcast method broadcast_loading_status() following existing pattern

5. Progress Tracking

Existing Service: ProgressService

Reuse Strategy: May integrate for UI progress bars (optional)

Components That Need Creation

1. BackgroundLoaderService

Purpose: Orchestrate async loading tasks

What it does: Queue management, task scheduling, status tracking
What it doesn't do: Actual loading (delegates to existing services)

2. Loading Status Models

Purpose: Type-safe status tracking

Enums for loading status
Data classes for loading tasks
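
A minimal sketch of what these models could look like. The status values are the ones listed in the loading_status column description in this document; the SeriesLoadingTask field names are assumptions made for illustration, not the project's actual definitions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Dict, Optional


class LoadingStatus(str, Enum):
    """Status values as listed for the loading_status column."""
    PENDING = "pending"
    LOADING_EPISODES = "loading_episodes"
    LOADING_NFO = "loading_nfo"
    LOADING_LOGO = "loading_logo"
    LOADING_IMAGES = "loading_images"
    COMPLETED = "completed"
    FAILED = "failed"


def _empty_progress() -> Dict[str, bool]:
    # Same four keys as the "progress" object in the API responses.
    return {"episodes": False, "nfo": False, "logo": False, "images": False}


@dataclass
class SeriesLoadingTask:
    """One queued unit of work for the background loader (hypothetical fields)."""
    key: str
    folder: str
    name: str
    year: Optional[int] = None
    status: LoadingStatus = LoadingStatus.PENDING
    # default_factory gives every task its own dict instead of a shared one.
    progress: Dict[str, bool] = field(default_factory=_empty_progress)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    error: Optional[str] = None
```

Deriving the enum from str keeps the values directly comparable with what is stored in the database column.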

Proposed Architecture

Component Diagram

┌─────────────────────────────────────────────────────────────┐
│                        FastAPI Application                   │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     API Layer (anime.py)                     │
│  POST /api/anime/add (202 Accepted - immediate return)      │
│  GET /api/anime/{key}/loading-status                         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│               BackgroundLoaderService (NEW)                  │
│  - add_series_loading_task(key)                              │
│  - check_missing_data(key)                                   │
│  - _worker() [background task queue consumer]                │
│  - _load_series_data(task) [orchestrator]                    │
└─────────────────────────────────────────────────────────────┘
           │              │              │              │
           ▼              ▼              ▼              ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ AnimeService │ │  NFOService  │ │  Database    │ │  WebSocket   │
│  (EXISTING)  │ │  (EXISTING)  │ │   Service    │ │   Service    │
│              │ │              │ │  (EXISTING)  │ │  (EXISTING)  │
│ - rescan()   │ │ - create_nfo │ │ - update_    │ │ - broadcast_ │
│              │ │ - download   │ │   status     │ │   loading    │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘

Sequence Diagram: Add Series Flow

User → API: POST /api/anime/add {"link": "...", "name": "..."}
API → Database: Create AnimeSeries (loading_status="pending")
API → BackgroundLoader: add_series_loading_task(key)
API → User: 202 Accepted {"key": "...", "loading_status": "pending"}
API → WebSocket: broadcast_loading_status("pending")

[Background Worker Task]
BackgroundLoader → BackgroundLoader: _worker() picks up task
BackgroundLoader → Database: check_missing_data(key)
BackgroundLoader → WebSocket: broadcast("loading_episodes")
BackgroundLoader → AnimeService: rescan(key) [REUSE EXISTING]
AnimeService → Database: Update episodes
BackgroundLoader → Database: Update episodes_loaded=True

BackgroundLoader → WebSocket: broadcast("loading_nfo")
BackgroundLoader → NFOService: create_tvshow_nfo() [REUSE EXISTING]
NFOService → TMDB API: Fetch metadata
NFOService → Filesystem: Download poster/logo/fanart
BackgroundLoader → Database: Update has_nfo=True, logo_loaded=True, images_loaded=True

BackgroundLoader → Database: Update loading_status="completed"
BackgroundLoader → WebSocket: broadcast("completed")

Data Flow

Immediate Series Addition (Synchronous)

User submits series link and name
API validates input, extracts key
API fetches year from provider (quick operation)
API creates database record with loading_status="pending"
API creates folder on disk
API queues background loading task
API returns 202 Accepted immediately
WebSocket broadcasts initial status
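
The synchronous part of the flow can be sketched without any web framework. This is an illustration under stated assumptions: add_series_immediately is a hypothetical name, a plain dict stands in for the database, and an asyncio.Queue of series keys stands in for the background loader's task queue. Validation, year lookup, folder creation and the WebSocket broadcast are left out.

```python
import asyncio
from typing import Any, Dict, Tuple


async def add_series_immediately(
    key: str,
    name: str,
    folder: str,
    db: Dict[str, Dict[str, Any]],
    queue: "asyncio.Queue[str]",
) -> Tuple[int, Dict[str, Any]]:
    """Store a pending record, enqueue the key, and return (status_code, body)."""
    # Stand-in for the database insert with loading_status="pending".
    db[key] = {
        "key": key,
        "name": name,
        "folder": folder,
        "loading_status": "pending",
        "episodes_loaded": False,
        "has_nfo": False,
        "logo_loaded": False,
        "images_loaded": False,
    }
    # Hand the slow work to the background worker. put_nowait does not
    # block on an unbounded queue, so the handler returns right away.
    queue.put_nowait(key)
    body = {
        "status": "success",
        "message": "Series added and queued for background loading",
        "key": key,
        "folder": folder,
        "loading_status": "pending",
        "loading_progress": {
            "episodes": False,
            "nfo": False,
            "logo": False,
            "images": False,
        },
    }
    return 202, body
```

The handler does no scanning or downloading itself, which is what lets it answer with 202 Accepted immediately.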

Background Data Loading (Asynchronous)

Worker picks up task from queue
Worker checks what data is missing
For each missing data type:
- Update status and broadcast via WebSocket
- Call existing service (episodes/NFO/images)
- Update database flags
Mark as completed and broadcast final status
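
The worker steps above can be sketched as a small queue consumer. This is a simplified illustration, not the BackgroundLoaderService implementation: the loaders mapping, the broadcast callable and the None shutdown sentinel are assumptions, and the real service would delegate to AnimeService and NFOService and persist the flags through AnimeSeriesService.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

# A loader takes a series key and loads one kind of data (episodes, nfo, ...).
Loader = Callable[[str], Awaitable[None]]
Broadcast = Callable[[str, str], Awaitable[None]]


async def load_series_data(
    key: str,
    flags: Dict[str, bool],
    loaders: Dict[str, Loader],
    broadcast: Broadcast,
) -> str:
    """Run only the loaders whose flag is still False; return the final status."""
    for step, loader in loaders.items():
        if flags.get(step):
            continue  # already loaded, nothing to do
        await broadcast(key, f"loading_{step}")
        try:
            await loader(key)  # delegate to the existing service
        except Exception:
            await broadcast(key, "failed")
            return "failed"
        flags[step] = True
    await broadcast(key, "completed")
    return "completed"


async def worker(
    queue: "asyncio.Queue[Optional[str]]",
    flags_by_key: Dict[str, Dict[str, bool]],
    loaders: Dict[str, Loader],
    broadcast: Broadcast,
) -> None:
    """Consume series keys until a None sentinel arrives."""
    while True:
        key = await queue.get()
        try:
            if key is None:
                return
            await load_series_data(key, flags_by_key[key], loaders, broadcast)
        finally:
            queue.task_done()
```

Because a step is skipped when its flag is already set, re-queuing a series after a partial failure only repeats the missing work.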

Database Schema Changes

Migration: Add Loading Status Fields

File: migrations/add_loading_status_fields.py

"""Add loading status fields to anime_series table.

Revision ID: 001_async_loading
Create Date: 2026-01-18
"""

from alembic import op
import sqlalchemy as sa
from sqlalchemy import Boolean, DateTime, String

def upgrade():
    # Add new columns
    op.add_column('anime_series',
        sa.Column('loading_status', String(50), nullable=False,
                  server_default='completed')
    )
    op.add_column('anime_series',
        sa.Column('episodes_loaded', Boolean, nullable=False,
                  server_default='1')
    )
    op.add_column('anime_series',
        sa.Column('logo_loaded', Boolean, nullable=False,
                  server_default='0')
    )
    op.add_column('anime_series',
        sa.Column('images_loaded', Boolean, nullable=False,
                  server_default='0')
    )
    op.add_column('anime_series',
        sa.Column('loading_started_at', DateTime(timezone=True),
                  nullable=True)
    )
    op.add_column('anime_series',
        sa.Column('loading_completed_at', DateTime(timezone=True),
                  nullable=True)
    )
    op.add_column('anime_series',
        sa.Column('loading_error', String(1000), nullable=True)
    )

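    # Existing series were added synchronously. The server defaults above
    # already mark them as completed with episodes loaded, so this UPDATE
    # only restates that explicitly.
    op.execute(
        "UPDATE anime_series SET loading_status = 'completed', "
        "episodes_loaded = 1 WHERE loading_status = 'completed'"
    )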

def downgrade():
    op.drop_column('anime_series', 'loading_error')
    op.drop_column('anime_series', 'loading_completed_at')
    op.drop_column('anime_series', 'loading_started_at')
    op.drop_column('anime_series', 'images_loaded')
    op.drop_column('anime_series', 'logo_loaded')
    op.drop_column('anime_series', 'episodes_loaded')
    op.drop_column('anime_series', 'loading_status')

Updated AnimeSeries Model

class AnimeSeries(Base, TimestampMixin):
    __tablename__ = "anime_series"

    # ... existing fields ...

    # Loading status fields (NEW)
    loading_status: Mapped[str] = mapped_column(
        String(50), default="completed", server_default="completed",
        doc="Loading status: pending, loading_episodes, loading_nfo, "
            "loading_logo, loading_images, completed, failed"
    )
    episodes_loaded: Mapped[bool] = mapped_column(
        Boolean, default=True, server_default="1",
        doc="Whether episodes have been scanned and loaded"
    )
    logo_loaded: Mapped[bool] = mapped_column(
        Boolean, default=False, server_default="0",
        doc="Whether logo.png has been downloaded"
    )
    images_loaded: Mapped[bool] = mapped_column(
        Boolean, default=False, server_default="0",
        doc="Whether poster/fanart have been downloaded"
    )
    loading_started_at: Mapped[Optional[datetime]] = mapped_column(
        DateTime(timezone=True), nullable=True,
        doc="When background loading started"
    )
    loading_completed_at: Mapped[Optional[datetime]] = mapped_column(
        DateTime(timezone=True), nullable=True,
        doc="When background loading completed"
    )
    loading_error: Mapped[Optional[str]] = mapped_column(
        String(1000), nullable=True,
        doc="Error message if loading failed"
    )

API Specifications

POST /api/anime/add

Purpose: Add a new series immediately and queue background loading

Changes from Current:

Returns 202 Accepted instead of 200 OK (indicates async processing)
Returns immediately without waiting for scan
Includes loading_status in response

Request:

{
    "link": "https://aniworld.to/anime/stream/attack-on-titan",
    "name": "Attack on Titan"
}

Response: 202 Accepted

{
    "status": "success",
    "message": "Series added and queued for background loading",
    "key": "attack-on-titan",
    "folder": "Attack on Titan (2013)",
    "db_id": 123,
    "loading_status": "pending",
    "loading_progress": {
        "episodes": false,
        "nfo": false,
        "logo": false,
        "images": false
    }
}

GET /api/anime/{key}/loading-status (NEW)

Purpose: Get current loading status for a series

Request:

GET /api/anime/attack-on-titan/loading-status

Response: 200 OK

{
    "key": "attack-on-titan",
    "loading_status": "loading_nfo",
    "progress": {
        "episodes": true,
        "nfo": false,
        "logo": false,
        "images": false
    },
    "started_at": "2026-01-18T10:30:00Z",
    "message": "Generating NFO file...",
    "error": null
}

When Completed:

{
    "key": "attack-on-titan",
    "loading_status": "completed",
    "progress": {
        "episodes": true,
        "nfo": true,
        "logo": true,
        "images": true
    },
    "started_at": "2026-01-18T10:30:00Z",
    "completed_at": "2026-01-18T10:30:45Z",
    "message": "All data loaded successfully",
    "error": null
}
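
One way the response body could be assembled from the stored columns is sketched below. The helper names build_loading_status and _iso_utc are hypothetical, a plain dict stands in for the AnimeSeries row, and the NFO flag is assumed to come from the existing has_nfo column. The human-readable message field is left out because its wording is not specified here.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _iso_utc(value: Optional[datetime]) -> Optional[str]:
    """Format an aware datetime as UTC with a trailing 'Z'; pass None through."""
    if value is None:
        return None
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_loading_status(series: Dict[str, Any]) -> Dict[str, Any]:
    """Map stored columns to the shape of the loading-status response."""
    body: Dict[str, Any] = {
        "key": series["key"],
        "loading_status": series["loading_status"],
        "progress": {
            "episodes": bool(series.get("episodes_loaded")),
            "nfo": bool(series.get("has_nfo")),
            "logo": bool(series.get("logo_loaded")),
            "images": bool(series.get("images_loaded")),
        },
        "started_at": _iso_utc(series.get("loading_started_at")),
        "error": series.get("loading_error"),
    }
    completed_at = series.get("loading_completed_at")
    if completed_at is not None:
        # completed_at only appears in the "When Completed" example.
        body["completed_at"] = _iso_utc(completed_at)
    return body
```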

WebSocket Message Format

Following Existing Pattern:

{
    "type": "series_loading_update",
    "key": "attack-on-titan",
    "loading_status": "loading_episodes",
    "progress": {
        "episodes": false,
        "nfo": false,
        "logo": false,
        "images": false
    },
    "message": "Loading episodes...",
    "timestamp": "2026-01-18T10:30:15Z"
}

Error Handling Strategy

Error Types

Network Errors (TMDB API, provider site)
- Retry with exponential backoff
- Max 3 retries
- Mark as failed if all retries exhausted
Filesystem Errors (disk space, permissions)
- No retry
- Mark as failed immediately
- Log detailed error
Database Errors (connection, constraints)
- Retry once after 1 second
- Mark as failed if retry fails
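
The retry rule for network errors can be sketched as a small helper. This is an illustration only: the name retry_with_backoff, the 1-second base delay and the choice of ConnectionError and TimeoutError as the retryable exceptions are assumptions. "Max 3 retries" is read here as one initial attempt plus up to three retries.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    *,
    max_retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await operation(); on a retryable error wait base_delay * 2**n, then retry."""
    attempt = 0
    while True:
        try:
            return await operation()
        except retry_on:
            if attempt >= max_retries:
                raise  # retries exhausted: the caller marks the series as failed
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            await asyncio.sleep(base_delay * (2 ** attempt))
            attempt += 1
```

Exceptions outside retry_on, such as PermissionError for a filesystem problem, propagate on the first attempt, which matches the "no retry" rule for filesystem errors.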

Error Recording

Store error message in loading_error field
Set loading_status to "failed"
Broadcast error via WebSocket
Log with full context for debugging

Partial Success

If episodes load but NFO generation fails: set episodes_loaded=True and leave the flags for the failed components False
Allow manual retry for failed components
Show partial status in UI
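
A manual retry only needs to know which components are still missing. The sketch below derives that list from the per-component flags; the function name missing_components and the use of has_nfo as the NFO flag are assumptions for illustration.

```python
from typing import Dict, List

# Progress key (as used in the API responses) -> model column holding its flag.
FLAG_COLUMNS: Dict[str, str] = {
    "episodes": "episodes_loaded",
    "nfo": "has_nfo",
    "logo": "logo_loaded",
    "images": "images_loaded",
}


def missing_components(series: Dict[str, bool]) -> List[str]:
    """Return the progress keys whose flag is unset, in loading order."""
    return [name for name, column in FLAG_COLUMNS.items() if not series.get(column)]
```

For a series whose episodes loaded but whose NFO step failed, this returns the NFO, logo and image components, so a retry skips the episode scan.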

Integration Points

1. AnimeService Integration

Current Usage:

# In anime.py API
anime_service = Depends(get_anime_service)
await anime_service.rescan()

New Usage in BackgroundLoader:

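# Reuse rescan for a specific series.
# rescan_series(key) is the per-series call this design assumes;
# the method listed under Reusable Components is rescan().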
await anime_service.rescan_series(key)

No Changes Needed to AnimeService - Reuse as-is

2. NFOService Integration

Current Access:

# Via SeriesApp
series_app.nfo_service.create_tvshow_nfo(...)

New Usage in BackgroundLoader:

# Get NFOService from SeriesApp
if series_app.nfo_service:
    await series_app.nfo_service.create_tvshow_nfo(
        serie_name=name,
        serie_folder=folder,
        year=year,
        download_poster=True,
        download_logo=True,
        download_fanart=True
    )

No Changes Needed to NFOService - Reuse as-is

3. WebSocketService Integration

Existing Pattern:

# In websocket_service.py
async def broadcast_download_progress(...):
    message = {
        "type": "download_progress",
        "key": key,
        ...
    }
    await self.broadcast(message)

New Method (Following Pattern):

# Module-level imports this method relies on
from datetime import datetime, timezone
from typing import Dict

async def broadcast_loading_status(
    self,
    key: str,
    loading_status: str,
    progress: Dict[str, bool],
    message: str
):
    """Broadcast loading status update."""
    payload = {
        "type": "series_loading_update",
        "key": key,
        "loading_status": loading_status,
        "progress": progress,
        "message": message,
        "timestamp": datetime.now(timezone.utc).isoformat()
    }
    await self.broadcast(payload)

4. Database Service Integration

Existing Services:

AnimeSeriesService.get_by_key(db, key)
AnimeSeriesService.update(db, series_id, **kwargs)

New Helper Methods Needed:

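# In AnimeSeriesService
@staticmethod
async def update_loading_status(
    db,
    key: str,
    loading_status: str,
    **progress_flags
):
    """Update loading status and progress flags."""
    # Called on the class, like get_by_key(db, key) above, so there is no self.
    series = await AnimeSeriesService.get_by_key(db, key)
    if series:
        # Set only the flags passed by the caller, e.g. episodes_loaded=True.
        for field, value in progress_flags.items():
            setattr(series, field, value)
        series.loading_status = loading_status
        await db.commit()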

Code Reuse Strategy

DO NOT DUPLICATE

❌ Episode Loading Logic

Wrong:

# DON'T create new episode scanning logic
async def _scan_episodes(self, key: str):
    # Duplicate logic...

Right:

# Reuse existing AnimeService method
await self.anime_service.rescan_series(key)

❌ NFO Generation Logic

Wrong:

# DON'T reimplement TMDB API calls
async def _create_nfo(self, series):
    # Duplicate TMDB logic...

Right:

# Reuse existing NFOService
await self.series_app.nfo_service.create_tvshow_nfo(...)

❌ Database CRUD Operations

Wrong:

# DON'T write raw SQL
await db.execute("UPDATE anime_series SET ...")

Right:

# Use existing service methods
await AnimeSeriesService.update(db, series_id, loading_status="completed")

WHAT TO CREATE

✅ Task Queue Management

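import asyncio
from typing import Dict

class BackgroundLoaderService:
    def __init__(self):
        # Assumed to be an asyncio.Queue so the worker can await new tasks
        # without blocking the event loop.
        self.task_queue: asyncio.Queue[SeriesLoadingTask] = asyncio.Queue()
        self.active_tasks: Dict[str, SeriesLoadingTask] = {}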

✅ Orchestration Logic

async def _load_series_data(self, task: SeriesLoadingTask):
    """Orchestrate loading by calling existing services."""
    # Check what's missing
    # Call appropriate existing services
    # Update status

✅ Status Tracking

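from enum import Enum

class LoadingStatus(Enum):
    PENDING = "pending"
    LOADING_EPISODES = "loading_episodes"
    LOADING_NFO = "loading_nfo"
    LOADING_LOGO = "loading_logo"      # listed in the loading_status column doc
    LOADING_IMAGES = "loading_images"  # listed in the loading_status column doc
    COMPLETED = "completed"
    FAILED = "failed"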

Implementation Plan

Phase 1: Database and Models (Step 1-2 of instructions)

Create Alembic migration for new fields
Run migration to update database
Update AnimeSeries model with new fields
Test database changes

Phase 2: BackgroundLoaderService (Step 3-4)

Create background_loader_service.py
Implement task queue and worker
Implement orchestration methods (calling existing services)
Add status tracking
Write unit tests

Phase 3: API Updates (Step 5-6)

Update POST /api/anime/add for immediate return
Create GET /api/anime/{key}/loading-status endpoint
Update response models
Write API tests

Phase 4: WebSocket Integration (Step 7)

Add broadcast_loading_status() to WebSocketService
Integrate broadcasts in BackgroundLoader
Write WebSocket tests

Phase 5: Startup Check (Step 8)

Add startup event handler to check incomplete series
Queue incomplete series for background loading
Add graceful shutdown for background tasks
Write integration tests
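
A rough sketch of the startup check and graceful shutdown, using only asyncio. The names LoaderLifecycle and keys_needing_load are hypothetical, plain dicts stand in for AnimeSeries rows, and treating every status other than "completed" or "failed" as incomplete is an assumption; failed series are left for manual retry here.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List, Optional


def keys_needing_load(rows: Iterable[Dict[str, str]]) -> List[str]:
    """Keys of series whose loading was interrupted or never started."""
    return [
        row["key"]
        for row in rows
        if row["loading_status"] not in ("completed", "failed")
    ]


class LoaderLifecycle:
    """Starts one queue worker on startup and cancels it on shutdown."""

    def __init__(self, handle: Callable[[str], Awaitable[None]]) -> None:
        self.queue: "asyncio.Queue[str]" = asyncio.Queue()
        self._handle = handle
        self._task: Optional["asyncio.Task[None]"] = None

    async def start(self, rows: Iterable[Dict[str, str]]) -> None:
        # Re-queue incomplete series before the worker begins consuming.
        for key in keys_needing_load(rows):
            self.queue.put_nowait(key)
        self._task = asyncio.create_task(self._worker())

    async def stop(self) -> None:
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass  # expected: the worker was cancelled on purpose
        self._task = None

    async def _worker(self) -> None:
        while True:
            key = await self.queue.get()
            try:
                await self._handle(key)
            finally:
                self.queue.task_done()
```

Because status lives in the database, a series interrupted mid-load by a restart is picked up again by start() on the next launch.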

Phase 6: Frontend (Step 9-10)

Add loading indicators to series cards
Handle WebSocket loading status messages
Add CSS for loading states
Test UI responsiveness

Validation Checklist

Code Duplication Prevention

✅ No duplicate episode loading logic (reuse AnimeService.rescan())
✅ No duplicate NFO generation (reuse NFOService.create_tvshow_nfo())
✅ No duplicate database operations (reuse AnimeSeriesService)
✅ No duplicate WebSocket logic (extend existing patterns)
✅ BackgroundLoader only orchestrates, doesn't reimplement

Architecture Quality

✅ Clear separation of concerns
✅ Existing functionality not broken (backward compatible)
✅ New services follow project patterns
✅ API design consistent with existing endpoints
✅ Database changes are backward compatible (defaults for new fields)
✅ All integration points documented
✅ Error handling consistent across services

Service Integration

✅ AnimeService methods identified for reuse
✅ NFOService integration documented
✅ WebSocket pattern followed
✅ Database service usage clear
✅ Dependency injection strategy defined

Testing Strategy

✅ Unit tests for BackgroundLoaderService
✅ Integration tests for end-to-end flow
✅ API tests for new endpoints
✅ WebSocket tests for broadcasts
✅ Database migration tests

Key Design Decisions

1. Queue-Based Architecture

Rationale: Provides natural async processing, rate limiting, and graceful shutdown

2. Reuse Existing Services

Rationale: Avoid code duplication, leverage tested code, maintain consistency

3. Incremental Progress Updates

Rationale: Better UX, allows UI to show detailed progress

4. Database-Backed Status

Rationale: Survives restarts, enables startup checks, provides audit trail

5. 202 Accepted Response

Rationale: HTTP standard for async operations, clear client expectation

Next Steps

Review this document with team/stakeholders
Get approval on architecture approach
Begin Phase 1 (Database changes)
Implement incrementally following the phase plan
Test thoroughly at each phase
Document as you implement

Questions for Review

✅ Does BackgroundLoaderService correctly reuse existing services?
✅ Are database changes backward compatible?
✅ Is WebSocket message format consistent?
✅ Are error handling strategies appropriate?
✅ Is startup check logic sound?
✅ Are API responses following REST best practices?

Document Status: ✅ READY FOR REVIEW AND IMPLEMENTATION

This architecture ensures clean integration without code duplication while following all project patterns and best practices.
