Lukas 6d30747f25 Fix stale data file updates on download completion

When episodes are downloaded successfully, the in-memory Serie.episodeDict
is updated, but the deprecated data file was not being synced. This caused
UI to show episodes as missing when already downloaded.

Changes:
- Update data file in _remove_episode_from_memory when download completes
- DB is authoritative; data file is optional backup (deprecated)
- Gracefully skip update if data file doesn't exist

New integration tests for episode download sync:
- Verify episode removed from missing list after download
- Verify in-memory cache updated after download
- Verify data file updated after download (when it exists)
- Verify downloads work without data file

2026-05-26 18:57:04 +02:00

Database Documentation

Document Purpose

This document describes the database schema, models, and data layer of the Aniworld application.

1. Database Overview

Technology

Database Engine: SQLite 3 (default), PostgreSQL supported
ORM: SQLAlchemy 2.0 with async support (aiosqlite)
Location: data/aniworld.db (configurable via DATABASE_URL)

Source: src/config/settings.py

Connection Configuration

# Default connection string
DATABASE_URL = "sqlite+aiosqlite:///./data/aniworld.db"

# PostgreSQL alternative
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost/aniworld"

Source: src/server/database/connection.py
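
As a rough illustration of the configuration above, the connection string could be resolved as follows. This is only a sketch: `resolve_database_url` is a hypothetical helper, not a function taken from src/config/settings.py. Only the default URL and the `DATABASE_URL` variable name come from this document.

```python
import os
from typing import Mapping, Optional

# Default taken from this document; the helper itself is hypothetical.
DEFAULT_DATABASE_URL = "sqlite+aiosqlite:///./data/aniworld.db"


def resolve_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return DATABASE_URL from the environment, or the SQLite default."""
    source = os.environ if env is None else env
    # An unset or empty variable falls back to the default SQLite file.
    return source.get("DATABASE_URL") or DEFAULT_DATABASE_URL
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test.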

2. Entity Relationship Diagram

+---------------------+       +-------------------+       +-------------------+       +------------------------+
| system_settings     |       |   anime_series    |       |     episodes      |       |  download_queue_item   |
+---------------------+       +-------------------+       +-------------------+       +------------------------+
| id (PK)             |       | id (PK)           |<--+   | id (PK)           |   +-->| id (PK, VARCHAR)       |
| initial_scan_...    |       | key (UNIQUE)      |   |   | series_id (FK)----+---+   | series_id (FK)---------+
| initial_nfo_scan... |       | name              |   +---|                   |       | status                 |
| initial_media_...   |       | site              |       | season            |       | priority               |
| last_scan_timestamp |       | folder            |       | episode_number    |       | season                 |
| created_at          |       | created_at        |       | title             |       | episode                |
| updated_at          |       | updated_at        |       | file_path         |       | progress_percent       |
+---------------------+       +-------------------+       | is_downloaded     |       | error_message          |
                                                          | created_at        |       | retry_count            |
                                                          | updated_at        |       | added_at               |
                                                          +-------------------+       | started_at             |
                                                                                      | completed_at           |
                                                                                      | created_at             |
                                                                                      | updated_at             |
                                                                                      +------------------------+

3. Table Schemas

3.1 system_settings

Stores application-wide system settings and initialization state.

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY, AUTOINCREMENT	Internal database ID (only one row)
`initial_scan_completed`	BOOLEAN	NOT NULL, DEFAULT FALSE	Whether initial anime folder scan is complete
`initial_nfo_scan_completed`	BOOLEAN	NOT NULL, DEFAULT FALSE	Whether initial NFO scan is complete
`initial_media_scan_completed`	BOOLEAN	NOT NULL, DEFAULT FALSE	Whether initial media scan is complete
`last_scan_timestamp`	DATETIME	NULLABLE	Timestamp of last completed scan
`created_at`	DATETIME	NOT NULL, DEFAULT NOW	Record creation timestamp
`updated_at`	DATETIME	NOT NULL, ON UPDATE NOW	Last update timestamp

Purpose:

This table tracks the initialization status of the application to ensure that expensive one-time setup operations (like scanning the entire anime directory) only run on the first startup, not on every restart.

Only one row exists in this table
The initial_scan_completed flag prevents redundant full directory scans on each startup
The NFO and media scan flags similarly track completion of those setup tasks

Source: src/server/database/models.py, src/server/database/system_settings_service.py
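
The single-row pattern described above can be sketched with the standard library `sqlite3` module so that it runs without dependencies. The function names are hypothetical and the table is reduced to one flag; the application itself goes through SQLAlchemy and system_settings_service.py.

```python
import sqlite3


def needs_initial_scan(conn: sqlite3.Connection) -> bool:
    """Create the single settings row on first use and report whether a scan is due."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS system_settings ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " initial_scan_completed BOOLEAN NOT NULL DEFAULT 0)"
    )
    row = conn.execute(
        "SELECT initial_scan_completed FROM system_settings LIMIT 1"
    ).fetchone()
    if row is None:
        # First startup: insert the only row this table will ever hold.
        conn.execute("INSERT INTO system_settings (initial_scan_completed) VALUES (0)")
        return True
    return not bool(row[0])


def mark_initial_scan_completed(conn: sqlite3.Connection) -> None:
    """Set the flag so later startups skip the full directory scan."""
    conn.execute("UPDATE system_settings SET initial_scan_completed = 1")


conn = sqlite3.connect(":memory:")
first_start = needs_initial_scan(conn)   # no row yet, so a scan is due
mark_initial_scan_completed(conn)
second_start = needs_initial_scan(conn)  # flag is set, scan is skipped
```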

3.2 anime_series

Stores anime series metadata. Corresponds to the core Serie class.

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY, AUTOINCREMENT	Internal database ID
`key`	VARCHAR(255)	UNIQUE, NOT NULL, INDEX	Primary identifier - provider-assigned URL-safe key
`name`	VARCHAR(500)	NOT NULL, INDEX	Display name of the series
`site`	VARCHAR(500)	NOT NULL	Provider site URL
`folder`	VARCHAR(1000)	NOT NULL	Filesystem folder name (metadata only)
`year`	INTEGER	NULLABLE	Release year of the series
`nfo_path`	VARCHAR(1000)	NULLABLE	Path to tvshow.nfo metadata file
`tmdb_id`	INTEGER	NULLABLE, INDEX	TMDB (The Movie Database) ID for metadata
`tvdb_id`	INTEGER	NULLABLE, INDEX	TVDB (TheTVDB) ID for metadata
`has_nfo`	BOOLEAN	NOT NULL, DEFAULT FALSE	Whether tvshow.nfo exists
`loading_status`	VARCHAR(50)	NOT NULL, DEFAULT 'completed'	Status: pending, loading_episodes, loading_nfo, completed, failed
`created_at`	DATETIME	NOT NULL, DEFAULT NOW	Record creation timestamp
`updated_at`	DATETIME	NOT NULL, ON UPDATE NOW	Last update timestamp

Identifier Convention:

key is the primary identifier for all operations (e.g., "attack-on-titan")
folder is metadata only for filesystem operations (e.g., "Attack on Titan (2013)")
id is used only for database relationships

EpisodeDict Mapping:

The episodeDict (season → episode numbers mapping) is stored as individual Episode records:

Each Episode has season and episode_number columns
Relationship: AnimeSeries.episodes returns all Episode records for that series

Source: src/server/database/models.py
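
To make the mapping concrete, the sketch below rebuilds an episodeDict-style mapping from (season, episode_number) pairs. `build_episode_dict` is a hypothetical helper written for this illustration; it assumes the mapping has the shape season -> list of episode numbers.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_episode_dict(rows: Iterable[Tuple[int, int]]) -> Dict[int, List[int]]:
    """Group (season, episode_number) pairs into {season: [episode numbers]}."""
    grouped = defaultdict(list)
    for season, episode_number in rows:
        grouped[season].append(episode_number)
    # Sort seasons and episode numbers so the result is deterministic.
    return {season: sorted(numbers) for season, numbers in sorted(grouped.items())}


episode_dict = build_episode_dict([(1, 3), (2, 1), (1, 1)])
```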

3.3 episodes

Stores missing episodes that need to be downloaded. Episodes are automatically managed during scans:

New missing episodes are added to the database
Episodes that are no longer missing (files now exist) are removed from the database
When an episode is downloaded, it can be marked with is_downloaded=True or removed from tracking

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY, AUTOINCREMENT	Internal database ID
`series_id`	INTEGER	FOREIGN KEY, NOT NULL, INDEX	Reference to anime_series.id
`season`	INTEGER	NOT NULL	Season number (1-based)
`episode_number`	INTEGER	NOT NULL	Episode number within season
`title`	VARCHAR(500)	NULLABLE	Episode title if known
`file_path`	VARCHAR(1000)	NULLABLE	Local file path if downloaded
`is_downloaded`	BOOLEAN	NOT NULL, DEFAULT FALSE	Download status flag
`created_at`	DATETIME	NOT NULL, DEFAULT NOW	Record creation timestamp
`updated_at`	DATETIME	NOT NULL, ON UPDATE NOW	Last update timestamp

Foreign Key:

series_id -> anime_series.id (ON DELETE CASCADE)

Source: src/server/database/models.py

3.4 download_queue_item

Stores download queue items with status tracking.

Column	Type	Constraints	Description
`id`	VARCHAR(36)	PRIMARY KEY	UUID identifier
`series_id`	INTEGER	FOREIGN KEY, NOT NULL	Reference to anime_series.id
`season`	INTEGER	NOT NULL	Season number
`episode`	INTEGER	NOT NULL	Episode number
`status`	VARCHAR(20)	NOT NULL, DEFAULT 'pending'	Download status
`priority`	VARCHAR(10)	NOT NULL, DEFAULT 'NORMAL'	Queue priority
`progress_percent`	FLOAT	NULLABLE	Download progress (0-100)
`error_message`	TEXT	NULLABLE	Error description if failed
`retry_count`	INTEGER	NOT NULL, DEFAULT 0	Number of retry attempts
`source_url`	VARCHAR(2000)	NULLABLE	Download source URL
`added_at`	DATETIME	NOT NULL, DEFAULT NOW	When added to queue
`started_at`	DATETIME	NULLABLE	When download started
`completed_at`	DATETIME	NULLABLE	When download completed/failed
`created_at`	DATETIME	NOT NULL, DEFAULT NOW	Record creation timestamp
`updated_at`	DATETIME	NOT NULL, ON UPDATE NOW	Last update timestamp

Status Values: pending, downloading, paused, completed, failed, cancelled

Priority Values: LOW, NORMAL, HIGH

Foreign Key:

series_id -> anime_series.id (ON DELETE CASCADE)

Source: src/server/database/models.py

4. Indexes

Table	Index Name	Columns	Purpose
`system_settings`	N/A (single row)	N/A	Only one row, no indexes needed
`anime_series`	`ix_anime_series_key`	`key`	Fast lookup by primary identifier
`anime_series`	`ix_anime_series_name`	`name`	Search by name
`episodes`	`ix_episodes_series_id`	`series_id`	Join with series
`download_queue_item`	`ix_download_series_id`	`series_id`	Filter by series
`download_queue_item`	`ix_download_status`	`status`	Filter by status

5. Model Layer

5.1 SQLAlchemy ORM Models

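# src/server/database/models.py
from typing import List

from sqlalchemy import Integer, String
from sqlalchemy.orm import Mapped, mapped_column, relationship

# Base and TimestampMixin are defined elsewhere in the project (not shown).
# Excerpt: year, nfo_path, tmdb_id, tvdb_id, has_nfo and loading_status are omitted.

class AnimeSeries(Base, TimestampMixin):
    __tablename__ = "anime_series"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    key: Mapped[str] = mapped_column(String(255), unique=True, index=True)
    name: Mapped[str] = mapped_column(String(500), index=True)
    site: Mapped[str] = mapped_column(String(500))
    folder: Mapped[str] = mapped_column(String(1000))

    # One-to-many: deleting a series also deletes its Episode rows
    episodes: Mapped[List["Episode"]] = relationship(
        "Episode", back_populates="series", cascade="all, delete-orphan"
    )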

Source: src/server/database/models.py

5.2 Pydantic API Models

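# src/server/models/download.py
from pydantic import BaseModel

# EpisodeIdentifier, DownloadStatus and DownloadPriority are defined elsewhere (not shown).

class DownloadItem(BaseModel):
    id: str
    serie_id: str      # Maps to anime_series.key
    serie_folder: str  # Metadata only
    serie_name: str
    episode: EpisodeIdentifier
    status: DownloadStatus
    priority: DownloadPriority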

Source: src/server/models/download.py

5.3 Model Mapping

API Field	Database Column	Notes
`serie_id`	`anime_series.key`	Primary identifier
`serie_folder`	`anime_series.folder`	Metadata only
`serie_name`	`anime_series.name`	Display name
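
A minimal sketch of this mapping, using a dataclass as a stand-in for a database row. `SeriesRow` and `to_api_fields` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SeriesRow:
    """Hypothetical stand-in for one anime_series row."""
    id: int
    key: str
    name: str
    folder: str


def to_api_fields(row: SeriesRow) -> Dict[str, str]:
    """Map database columns to the API field names from the table above."""
    return {
        "serie_id": row.key,         # provider key, not the integer id
        "serie_folder": row.folder,  # metadata only
        "serie_name": row.name,
    }


fields = to_api_fields(
    SeriesRow(id=7, key="attack-on-titan", name="Attack on Titan", folder="Attack on Titan (2013)")
)
```

Note that the integer `id` is not exposed: the API identifies a series by its `key`.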

6. Transaction Support

6.1 Overview

The database layer provides transaction utilities (a decorator, context managers, and savepoints) that keep compound operations consistent: either every step commits or none does. Any write operation can be wrapped in an explicit transaction.

Source: src/server/database/transaction.py

6.2 Transaction Utilities

Component	Type	Description
`@transactional`	Decorator	Wraps function in transaction boundary
`atomic()`	Async context mgr	Provides atomic operation block
`atomic_sync()`	Sync context mgr	Sync version of atomic()
`TransactionContext`	Class	Explicit sync transaction control
`AsyncTransactionContext`	Class	Explicit async transaction control
`TransactionManager`	Class	Helper for manual transaction management

6.3 Transaction Propagation Modes

Mode	Behavior
`REQUIRED`	Use existing transaction or create new (default)
`REQUIRES_NEW`	Always create new transaction
`NESTED`	Create savepoint within existing transaction

6.4 Usage Examples

Using @transactional decorator:

from sqlalchemy.ext.asyncio import AsyncSession

from src.server.database.transaction import transactional

@transactional()
async def compound_operation(db: AsyncSession, data: dict):
    # All operations commit together or rollback on error
    series = await AnimeSeriesService.create(db, ...)
    episode = await EpisodeService.create(db, series_id=series.id, ...)
    return series, episode

Using atomic() context manager:

from sqlalchemy.ext.asyncio import AsyncSession

from src.server.database.transaction import atomic

async def some_function(db: AsyncSession):
    async with atomic(db) as tx:
        await operation1(db)
        await operation2(db)
        # Auto-commits on success, rolls back on exception

Using savepoints for partial rollback:

async with atomic(db) as tx:
    await outer_operation(db)

    async with tx.savepoint() as sp:
        await risky_operation(db)
        if error_condition:
            await sp.rollback()  # Only rollback nested ops

    await final_operation(db)  # Still executes

Source: src/server/database/transaction.py
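
The savepoint behaviour can also be observed at the SQL level. The sketch below does not use the project's atomic() or savepoint() helpers; it issues raw SAVEPOINT statements through the standard library `sqlite3` module to show the semantics the NESTED mode presumably relies on. The `log` table is made up for the demonstration.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so BEGIN / COMMIT are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('outer')")

conn.execute("SAVEPOINT risky")
conn.execute("INSERT INTO log VALUES ('risky')")
conn.execute("ROLLBACK TO SAVEPOINT risky")  # undo only the nested work
conn.execute("RELEASE SAVEPOINT risky")

conn.execute("INSERT INTO log VALUES ('final')")  # outer transaction is still open
conn.execute("COMMIT")

steps = [row[0] for row in conn.execute("SELECT step FROM log ORDER BY rowid")]
```

Only the nested insert is discarded; the rows written before and after the savepoint are committed together.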

6.5 Connection Module Additions

Function	Description
`get_transactional_session`	Session without auto-commit for transactions
`TransactionManager`	Helper class for manual transaction control
`is_session_in_transaction`	Check if session is in active transaction
`get_session_transaction_depth`	Get nesting depth of transactions

Source: src/server/database/connection.py

7. Repository Pattern

The QueueRepository class provides data access abstraction.

class QueueRepository:
    async def save_item(self, item: DownloadItem) -> None:
        """Save or update a download item (atomic operation)."""

    async def get_all_items(self) -> List[DownloadItem]:
        """Get all items from database."""

    async def delete_item(self, item_id: str) -> bool:
        """Delete item by ID."""

    async def clear_all(self) -> int:
        """Clear all items (atomic operation)."""

Note: Compound operations (save_item, clear_all) are wrapped in atomic() transactions.

Source: src/server/services/queue_repository.py
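
One benefit of the repository abstraction is that it can be swapped for an in-memory fake in tests. The sketch below is hypothetical: `QueueItem` is a simplified stand-in for DownloadItem, and it assumes that clear_all() returns the number of removed items, which the interface above suggests but does not state.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class QueueItem:
    """Simplified, hypothetical stand-in for DownloadItem."""
    id: str
    status: str = "pending"


class InMemoryQueueRepository:
    """Offers the same four methods as QueueRepository, backed by a dict."""

    def __init__(self) -> None:
        self._items: Dict[str, QueueItem] = {}

    async def save_item(self, item: QueueItem) -> None:
        self._items[item.id] = item  # insert or overwrite by ID

    async def get_all_items(self) -> List[QueueItem]:
        return list(self._items.values())

    async def delete_item(self, item_id: str) -> bool:
        # True only if an item with that ID actually existed.
        return self._items.pop(item_id, None) is not None

    async def clear_all(self) -> int:
        removed = len(self._items)  # assumption: return value is the removed count
        self._items.clear()
        return removed


async def demo() -> Tuple[bool, bool, int]:
    repo = InMemoryQueueRepository()
    await repo.save_item(QueueItem(id="a"))
    await repo.save_item(QueueItem(id="b"))
    deleted = await repo.delete_item("a")
    missing = await repo.delete_item("does-not-exist")
    cleared = await repo.clear_all()
    return deleted, missing, cleared


result = asyncio.run(demo())
```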

8. Database Service

The AnimeSeriesService provides async CRUD operations.

class AnimeSeriesService:
    @staticmethod
    async def create(
        db: AsyncSession,
        key: str,
        name: str,
        site: str,
        folder: str
    ) -> AnimeSeries:
        """Create a new anime series."""

    @staticmethod
    async def get_by_key(
        db: AsyncSession,
        key: str
    ) -> Optional[AnimeSeries]:
        """Get a series by its provider key (anime_series.key, the primary identifier)."""

Bulk Operations

Services provide bulk operations for transaction-safe batch processing:

Service	Method	Description
`EpisodeService`	`bulk_mark_downloaded`	Mark multiple episodes at once
`DownloadQueueService`	`bulk_delete`	Delete multiple queue items
`DownloadQueueService`	`clear_all`	Clear entire queue
`UserSessionService`	`rotate_session`	Atomically revoke the old session and create a new one
`UserSessionService`	`cleanup_expired`	Bulk delete expired sessions

Source: src/server/database/service.py

9. Data Integrity Rules

Validation Constraints

Field	Rule	Error Message
`anime_series.key`	Non-empty, max 255 chars	"Series key cannot be empty"
`anime_series.name`	Non-empty, max 500 chars	"Series name cannot be empty"
`episodes.season`	0-1000	"Season number must be non-negative"
`episodes.episode_number`	0-10000	"Episode number must be non-negative"

Source: src/server/database/models.py
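
The rules in the table could be enforced along these lines. This is a sketch, not the validators from models.py: the function names are hypothetical, and the messages for values above the upper bounds are invented here because the table only documents one message per field.

```python
def validate_series_key(key: str) -> str:
    """Reject empty keys and keys longer than the VARCHAR(255) column."""
    if not key or not key.strip():
        raise ValueError("Series key cannot be empty")
    if len(key) > 255:
        raise ValueError("Series key exceeds 255 characters")  # hypothetical message
    return key


def validate_season(season: int) -> int:
    """Accept season numbers in the documented range 0-1000."""
    if season < 0:
        raise ValueError("Season number must be non-negative")
    if season > 1000:
        raise ValueError("Season number exceeds 1000")  # hypothetical message
    return season
```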

Cascade Rules

Deleting an anime_series row deletes all related episodes and download_queue_item rows (ON DELETE CASCADE on both series_id foreign keys).
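
The cascade can be demonstrated with the standard library `sqlite3` module. The tables below are trimmed-down versions of the real schema. One detail worth knowing: SQLite only enforces foreign keys, including ON DELETE CASCADE, when `PRAGMA foreign_keys` is enabled on the connection. Whether the application sets this pragma or relies on the ORM-level cascade is not stated in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Without this pragma SQLite ignores foreign key constraints entirely.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE anime_series (id INTEGER PRIMARY KEY, key TEXT UNIQUE NOT NULL)")
conn.execute(
    "CREATE TABLE episodes ("
    " id INTEGER PRIMARY KEY,"
    " series_id INTEGER NOT NULL REFERENCES anime_series(id) ON DELETE CASCADE,"
    " season INTEGER NOT NULL,"
    " episode_number INTEGER NOT NULL)"
)

conn.execute("INSERT INTO anime_series (id, key) VALUES (1, 'attack-on-titan')")
conn.executemany(
    "INSERT INTO episodes (series_id, season, episode_number) VALUES (1, 1, ?)",
    [(1,), (2,)],
)

# Deleting the parent row removes both child rows.
conn.execute("DELETE FROM anime_series WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM episodes").fetchone()[0]
```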

10. Migration Strategy

Currently, SQLAlchemy's create_all() is used for schema creation.

# src/server/database/connection.py
async def init_db():
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

For production migrations, Alembic is recommended but not yet implemented.

Source: src/server/database/connection.py
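
The reason a migration tool becomes necessary is that create_all() only creates tables that do not exist yet; it does not alter existing ones. The sketch below reproduces that limitation with plain `sqlite3` and `CREATE TABLE IF NOT EXISTS`, followed by the kind of explicit step a migration would contain. The simplified table definition is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First release: the table is created.
conn.execute("CREATE TABLE IF NOT EXISTS anime_series (id INTEGER PRIMARY KEY, key TEXT)")

# A later release adds a column to the model. Re-running creation is a no-op
# because the table already exists, so the new column never appears.
conn.execute(
    "CREATE TABLE IF NOT EXISTS anime_series (id INTEGER PRIMARY KEY, key TEXT, year INTEGER)"
)
columns_before = [row[1] for row in conn.execute("PRAGMA table_info(anime_series)")]

# An explicit migration step is needed to bring the table up to date.
if "year" not in columns_before:
    conn.execute("ALTER TABLE anime_series ADD COLUMN year INTEGER")
columns_after = [row[1] for row in conn.execute("PRAGMA table_info(anime_series)")]
```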

11. Common Query Patterns

Get all series with missing episodes

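from sqlalchemy import select
from sqlalchemy.orm import selectinload

result = await db.execute(
    select(AnimeSeries).options(selectinload(AnimeSeries.episodes))
)
for serie in result.scalars():
    # Episode rows not flagged as downloaded are the missing ones
    missing = [e for e in serie.episodes if not e.is_downloaded]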

Get pending downloads ordered by priority

from sqlalchemy import case, select

items = await db.execute(
    select(DownloadQueueItem)
    .where(DownloadQueueItem.status == "pending")
    .order_by(
        case(
            (DownloadQueueItem.priority == "HIGH", 1),
            (DownloadQueueItem.priority == "NORMAL", 2),
            (DownloadQueueItem.priority == "LOW", 3),
        ),
        DownloadQueueItem.added_at
    )
)
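
The CASE expression is needed because priorities are stored as strings, and sorting them alphabetically would give HIGH, LOW, NORMAL. The same ordering rule, sketched in plain Python with made-up queue items:

```python
from datetime import datetime

# Explicit rank per priority; lower rank is served first.
PRIORITY_RANK = {"HIGH": 1, "NORMAL": 2, "LOW": 3}

items = [
    {"id": "a", "priority": "LOW", "added_at": datetime(2026, 1, 1, 10, 0)},
    {"id": "b", "priority": "HIGH", "added_at": datetime(2026, 1, 1, 12, 0)},
    {"id": "c", "priority": "NORMAL", "added_at": datetime(2026, 1, 1, 9, 0)},
    {"id": "d", "priority": "HIGH", "added_at": datetime(2026, 1, 1, 11, 0)},
]

# Sort by rank first, then by queue time, mirroring the ORDER BY clause above.
ordered = sorted(items, key=lambda item: (PRIORITY_RANK[item["priority"]], item["added_at"]))
ordered_ids = [item["id"] for item in ordered]
```

Within the same priority, the item that was queued earlier comes first.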

12. Series Storage: Database vs Files (Deprecated)

File-Based Storage (Deprecated since v2.0)

Prior to v2.0, series metadata was stored in two files per anime folder:

File	Contents
`key`	Series provider key (e.g., `"attack-on-titan"`)
`data`	JSON serialization of `Serie` object

File structure example:

/anime/Attack on Titan (2013)/
├── key          # Contains: attack-on-titan
├── data         # Contains: {"key": "...", "name": "...", "episodeDict": {...}}
├── Season 1/
│   └── ...

Database Storage (Current)

Since v2.0, all series metadata is stored in the anime_series table with Episode records for episode tracking. This provides:

ACID transactions for data consistency
Foreign key constraints (cascade delete)
Indexed queries for fast lookups
No filesystem dependency for metadata

Migration from Files to Database

The Serie.save_to_file() and Serie.load_from_file() methods are deprecated but still functional for backward compatibility during migration:

from src.core.entities.series import Serie

# Old file-based loading (deprecated)
serie = Serie.load_from_file("/anime/Attack on Titan (2013)/data")

# New database-based loading
from src.server.database.service import AnimeSeriesService
serie = await AnimeSeriesService.get_by_key(db, "attack-on-titan")

Removing File Dependencies

Once the database schema is verified to support all Serie fields, file-based storage can be removed. Current status:

✅ Schema verified: All Serie fields have corresponding DB columns
✅ Migration complete: All existing series migrated to database
❌ File cleanup: Remove key and data files (pending)

Note: The save_to_file() and load_from_file() methods will be removed in v3.0.0.
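
For a one-off migration script, a legacy `data` file could be read without the deprecated Serie methods. The sketch below is hypothetical: `read_legacy_data_file` is not part of the codebase, and it assumes episodeDict maps season numbers to lists of episode numbers, as the file structure example suggests.

```python
import json
import tempfile
from pathlib import Path


def read_legacy_data_file(path: Path) -> dict:
    """Parse a legacy `data` file into series fields plus (season, episode) pairs."""
    raw = json.loads(path.read_text(encoding="utf-8"))
    # JSON object keys are always strings, so season numbers are converted back to int.
    missing = [
        (int(season), int(number))
        for season, numbers in raw.get("episodeDict", {}).items()
        for number in numbers
    ]
    return {"key": raw["key"], "name": raw.get("name", ""), "missing": sorted(missing)}


# Demonstration with a throwaway file instead of a real anime folder.
with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "data"
    data_file.write_text(
        json.dumps({"key": "attack-on-titan", "name": "Attack on Titan", "episodeDict": {"1": [2, 1]}}),
        encoding="utf-8",
    )
    parsed = read_legacy_data_file(data_file)
```

Each (season, episode) pair in `missing` would correspond to one Episode row with is_downloaded=False.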

13. Series Persistence Flow

When a directory scan discovers or updates series, the scanner persists data to the database instead of writing to disk files.

Scan Flow

Scan Directory
    │
    ▼
Find MP4 Files → Extract Serie Key
    │
    ▼
Check DB for Existing Series (by key)
    │
    ├─── EXISTS ──────────────────────► Update Series Metadata
    │                                        │
    │                                        ▼
    │                                 Sync Episodes to DB
    │                                      │
    │◄──────────────────────────────────────┘
    │
    └─── NEW ───────────────────────────► Create New Series Record
                                             │
                                             ▼
                                      Create Episode Records
                                             │
                                             ▼
                                      Return to Scan Loop

Key Methods

SerieScanner._persist_serie_to_db()

Called after get_missing_episodes_and_season() computes episodeDict
Uses AnimeSeriesService.get_by_key() to check if series exists
If exists: calls AnimeSeriesService.update() + _sync_episodes_to_db()
If new: calls AnimeSeriesService.create() + creates episodes

SerieScanner._sync_episodes_to_db()

Gets existing episodes from DB via EpisodeService.get_by_series()
Compares with new episodeDict
Removes episodes that are no longer missing, except those marked is_downloaded=True, which are always preserved
Adds new missing episodes

SerieList.add_to_db()

Used when adding a newly discovered series via the API
Creates filesystem folder + database record + episode records

Episode Sync Logic

# Pseudocode; call arguments are elided with "..."

# For each episode in the DB but not in the new episodeDict:
if episode.is_downloaded:
    # Keep: the file exists, so the record is preserved
    pass
else:
    # Remove: the episode is no longer missing
    await EpisodeService.delete(db, ...)

# For each episode in the new episodeDict but not in the DB:
# add it as a new missing episode
await EpisodeService.create(db, ..., is_downloaded=False)
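
The sync rules above boil down to a set difference with one exception for downloaded episodes. A runnable sketch, independent of the database: `plan_episode_sync` is a hypothetical helper that only computes which records to delete and which to add.

```python
from typing import Dict, List, Set, Tuple

EpisodeKey = Tuple[int, int]  # (season, episode_number)


def plan_episode_sync(
    existing: Dict[EpisodeKey, bool],
    episode_dict: Dict[int, List[int]],
) -> Tuple[List[EpisodeKey], List[EpisodeKey]]:
    """Return (to_delete, to_add); `existing` maps each episode to its is_downloaded flag."""
    wanted: Set[EpisodeKey] = {
        (season, number) for season, numbers in episode_dict.items() for number in numbers
    }
    # Delete records that are no longer missing, unless they are flagged as downloaded.
    to_delete = sorted(
        key for key, downloaded in existing.items() if key not in wanted and not downloaded
    )
    # Add episodes the scan reports as missing that have no record yet.
    to_add = sorted(wanted - set(existing))
    return to_delete, to_add


to_delete, to_add = plan_episode_sync(
    existing={(1, 1): False, (1, 2): True, (1, 3): False},
    episode_dict={1: [3, 4]},
)
```

Here episode (1, 2) survives because it is marked as downloaded, (1, 1) is removed, and (1, 4) is added.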

Transaction Handling

DB operations use their own session with commit/rollback
If DB write fails, error is logged and scan continues
File-based save_to_file() no longer called during scan
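
The "log and continue" behaviour can be sketched as a loop that isolates each series in its own try block. The names below (`persist_all`, `flaky_persist`) are hypothetical; the real scanner works with its own session handling.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger("scan_sketch")


def persist_all(series_keys: Iterable[str], persist: Callable[[str], None]) -> List[str]:
    """Call persist(key) for every series; log failures and keep scanning."""
    failed = []
    for key in series_keys:
        try:
            persist(key)
        except Exception:
            # One failing write must not abort the rest of the scan.
            logger.exception("Failed to persist series %s", key)
            failed.append(key)
    return failed


def flaky_persist(key: str) -> None:
    """Stand-in for a DB write that fails for one particular series."""
    if key == "broken-series":
        raise RuntimeError("simulated DB write failure")


failed = persist_all(["attack-on-titan", "broken-series", "one-piece"], flaky_persist)
```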

Migration Path

v2.x: Scanner writes to both DB (primary) and files (fallback)
v3.0: Scanner writes only to DB, file methods removed

14. Series Persistence

Schema

AnimeSeries Table: Stores series metadata (key, name, site, folder, year). The table below is abridged; section 3.2 lists every column.

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY	Auto-increment
`key`	VARCHAR(255)	UNIQUE, NOT NULL	Series provider key
`name`	VARCHAR(500)	NOT NULL	Display name
`site`	VARCHAR(500)	NOT NULL	Provider site URL
`folder`	VARCHAR(1000)	NOT NULL	Filesystem folder name

Episode Table: Stores per-episode metadata (season, episode_number, is_downloaded)

Column	Type	Constraints	Description
`id`	INTEGER	PRIMARY KEY	Auto-increment
`series_id`	INTEGER	FOREIGN KEY → anime_series	Parent series
`season`	INTEGER	NOT NULL	Season number
`episode_number`	INTEGER	NOT NULL	Episode number
`is_downloaded`	BOOLEAN	DEFAULT FALSE	Download status

Relationships

AnimeSeries.episodes → List of Episode objects (one-to-many)
Episode.series → Parent AnimeSeries (many-to-one)
Cascade delete: Deleting a series removes all its episodes

Queries

# Get all series with episodes
AnimeSeriesService.get_all(db, with_episodes=True)

# Get by provider key
AnimeSeriesService.get_by_key(db, key)

# Get by folder path
AnimeSeriesService.get_by_folder(db, folder)

15. Database Location

Environment	Default Location
Development	`./data/aniworld.db`
Production	Via `DATABASE_URL` environment variable
Testing	In-memory SQLite (`sqlite+aiosqlite:///:memory:`)
