docs: add comprehensive documentation files

Added documentation for API, architecture, configuration, database,
development guide, testing, and navigation. Includes helper scripts,
diagrams, and guides for NFO files and migration.
2026-06-06 23:15:46 +02:00
parent 4076b9dd43
commit 486c5440f2
20 changed files with 6301 additions and 0 deletions

Docs/DATABASE.md Normal file

@@ -0,0 +1,642 @@
# Database Documentation
## Document Purpose
This document describes the database schema, models, and data layer of the Aniworld application.
---
## 1. Database Overview
### Technology
- **Database Engine**: SQLite 3 (default), PostgreSQL supported
- **ORM**: SQLAlchemy 2.0 with async support (aiosqlite)
- **Location**: `data/aniworld.db` (configurable via `DATABASE_URL`)
Source: [src/config/settings.py](../src/config/settings.py#L53-L55)
### Connection Configuration
```python
# Default connection string
DATABASE_URL = "sqlite+aiosqlite:///./data/aniworld.db"
# PostgreSQL alternative
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost/aniworld"
```
Source: [src/server/database/connection.py](../src/server/database/connection.py)
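
As a rough illustration of how the `DATABASE_URL` override could be resolved, the sketch below reads the variable from the environment and falls back to the default SQLite URL. The helper name `resolve_database_url` is hypothetical and is not taken from `settings.py`; the real settings module may resolve the value differently.

```python
import os

# Default connection string, as documented above.
DEFAULT_DATABASE_URL = "sqlite+aiosqlite:///./data/aniworld.db"


def resolve_database_url(env=None):
    """Return DATABASE_URL from the environment, or the SQLite default.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip()
    # An unset or blank variable falls back to the default.
    return url or DEFAULT_DATABASE_URL


if __name__ == "__main__":
    print(resolve_database_url({}))
    print(resolve_database_url(
        {"DATABASE_URL": "postgresql+asyncpg://user:pass@localhost/aniworld"}
    ))
```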
---
## 2. Entity Relationship Diagram
```
+---------------------+   +-------------------+       +-------------------+       +------------------------+
| system_settings     |   | episodes          |       | anime_series      |       | download_queue_item    |
+---------------------+   +-------------------+       +-------------------+       +------------------------+
| id (PK)             |   | id (PK)           |   +-->| id (PK)           |<--+   | id (PK, VARCHAR)       |
| initial_scan_...    |   | series_id (FK) ---+---+   | key (UNIQUE)      |   +---+--- series_id (FK)      |
| initial_nfo_scan... |   | season            |       | name              |       | status                 |
| initial_media_...   |   | episode_number    |       | site              |       | priority               |
| last_scan_timestamp |   | title             |       | folder            |       | season                 |
| created_at          |   | file_path         |       | created_at        |       | episode                |
| updated_at          |   | is_downloaded     |       | updated_at        |       | progress_percent       |
+---------------------+   | created_at        |       +-------------------+       | error_message          |
                          | updated_at        |                                   | retry_count            |
                          +-------------------+                                   | added_at               |
                                                                                  | started_at             |
                                                                                  | completed_at           |
                                                                                  | created_at             |
                                                                                  | updated_at             |
                                                                                  +------------------------+
```
---
## 3. Table Schemas
### 3.1 system_settings
Stores application-wide system settings and initialization state.
| Column | Type | Constraints | Description |
| ------------------------------ | -------- | -------------------------- | --------------------------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID (only one row) |
| `initial_scan_completed` | BOOLEAN | NOT NULL, DEFAULT FALSE | Whether initial anime folder scan is complete |
| `initial_nfo_scan_completed` | BOOLEAN | NOT NULL, DEFAULT FALSE | Whether initial NFO scan is complete |
| `initial_media_scan_completed` | BOOLEAN | NOT NULL, DEFAULT FALSE | Whether initial media scan is complete |
| `last_scan_timestamp` | DATETIME | NULLABLE | Timestamp of last completed scan |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Purpose:**
This table tracks the initialization status of the application to ensure that expensive one-time setup operations (like scanning the entire anime directory) only run on the first startup, not on every restart.
- Only one row exists in this table
- The `initial_scan_completed` flag prevents redundant full directory scans on each startup
- The NFO and media scan flags similarly track completion of those setup tasks
Source: [src/server/database/models.py](../src/server/database/models.py), [src/server/database/system_settings_service.py](../src/server/database/system_settings_service.py)
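
The single-row pattern described above can be sketched with the standard-library `sqlite3` module. This is a simplified illustration, not the logic in `system_settings_service.py`: the function names are hypothetical and only one of the flags is modeled.

```python
import sqlite3

# Reduced version of the system_settings table (one flag only).
SCHEMA = """
CREATE TABLE IF NOT EXISTS system_settings (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    initial_scan_completed BOOLEAN NOT NULL DEFAULT 0
)
"""


def needs_initial_scan(conn):
    """Return True until the single settings row is flagged as scanned."""
    conn.execute(SCHEMA)
    row = conn.execute(
        "SELECT initial_scan_completed FROM system_settings LIMIT 1"
    ).fetchone()
    if row is None:
        # First startup: create the one and only settings row.
        conn.execute(
            "INSERT INTO system_settings (initial_scan_completed) VALUES (0)"
        )
        return True
    return not bool(row[0])


def mark_initial_scan_done(conn):
    """Set the flag so later startups skip the expensive scan."""
    conn.execute("UPDATE system_settings SET initial_scan_completed = 1")


conn = sqlite3.connect(":memory:")
first_start = needs_initial_scan(conn)   # no row yet, scan required
mark_initial_scan_done(conn)
second_start = needs_initial_scan(conn)  # flag set, scan skipped
```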
### 3.2 anime_series
Stores anime series metadata. Corresponds to the core `Serie` class.
| Column | Type | Constraints | Description |
| ---------------- | ------------- | -------------------------- | ------------------------------------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID |
| `key` | VARCHAR(255) | UNIQUE, NOT NULL, INDEX | **Primary identifier** - provider-assigned URL-safe key |
| `name` | VARCHAR(500) | NOT NULL, INDEX | Display name of the series |
| `site` | VARCHAR(500) | NOT NULL | Provider site URL |
| `folder` | VARCHAR(1000) | NOT NULL | Filesystem folder name (metadata only) |
| `year` | INTEGER | NULLABLE | Release year of the series |
| `nfo_path` | VARCHAR(1000) | NULLABLE | Path to tvshow.nfo metadata file |
| `tmdb_id` | INTEGER | NULLABLE, INDEX | TMDB (The Movie Database) ID for metadata |
| `tvdb_id` | INTEGER | NULLABLE, INDEX | TVDB (TheTVDB) ID for metadata |
| `has_nfo` | BOOLEAN | NOT NULL, DEFAULT FALSE | Whether tvshow.nfo exists |
| `loading_status` | VARCHAR(50) | NOT NULL, DEFAULT 'completed' | Status: pending, loading_episodes, loading_nfo, completed, failed |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Identifier Convention:**
- `key` is the **primary identifier** for all operations (e.g., `"attack-on-titan"`)
- `folder` is **metadata only** for filesystem operations (e.g., `"Attack on Titan (2013)"`)
- `id` is used only for database relationships
**EpisodeDict Mapping:**
The `episodeDict` (season → episode numbers mapping) is stored as individual `Episode` records:
- Each `Episode` has `season` and `episode_number` columns
- Relationship: `AnimeSeries.episodes` returns all Episode records for that series
Source: [src/server/database/models.py](../src/server/database/models.py#L23-L150)
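
To make the mapping concrete, the following sketch flattens an `episodeDict` into one row per episode and rebuilds it again. Plain dictionaries stand in for `Episode` records, and both function names are hypothetical.

```python
def episode_dict_to_rows(series_id, episode_dict):
    """Flatten {season: [episode numbers]} into one row dict per episode."""
    return [
        {"series_id": series_id, "season": season, "episode_number": number}
        for season in sorted(episode_dict)
        for number in sorted(episode_dict[season])
    ]


def rows_to_episode_dict(rows):
    """Rebuild the {season: [episode numbers]} mapping from episode rows."""
    result = {}
    for row in rows:
        result.setdefault(row["season"], []).append(row["episode_number"])
    return {season: sorted(numbers) for season, numbers in sorted(result.items())}


rows = episode_dict_to_rows(1, {1: [3, 4], 2: [1]})
round_trip = rows_to_episode_dict(rows)
```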
### 3.3 episodes
Stores **missing episodes** that need to be downloaded. Episodes are automatically managed during scans:
- New missing episodes are added to the database
- Episodes that are no longer missing (files now exist) are removed from the database
- When an episode is downloaded, it can be marked with `is_downloaded=True` or removed from tracking
| Column | Type | Constraints | Description |
| ---------------- | ------------- | ---------------------------- | ----------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID |
| `series_id` | INTEGER | FOREIGN KEY, NOT NULL, INDEX | Reference to anime_series.id |
| `season` | INTEGER | NOT NULL | Season number (normally 1-based; validation accepts 0-1000) |
| `episode_number` | INTEGER | NOT NULL | Episode number within season |
| `title` | VARCHAR(500) | NULLABLE | Episode title if known |
| `file_path` | VARCHAR(1000) | NULLABLE | Local file path if downloaded |
| `is_downloaded` | BOOLEAN | NOT NULL, DEFAULT FALSE | Download status flag |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Foreign Key:**
- `series_id` -> `anime_series.id` (ON DELETE CASCADE)
Source: [src/server/database/models.py](../src/server/database/models.py#L122-L181)
### 3.4 download_queue_item
Stores download queue items with status tracking.
| Column | Type | Constraints | Description |
| ------------------ | ------------- | --------------------------- | ------------------------------ |
| `id` | VARCHAR(36) | PRIMARY KEY | UUID identifier |
| `series_id` | INTEGER | FOREIGN KEY, NOT NULL | Reference to anime_series.id |
| `season` | INTEGER | NOT NULL | Season number |
| `episode` | INTEGER | NOT NULL | Episode number |
| `status` | VARCHAR(20) | NOT NULL, DEFAULT 'pending' | Download status |
| `priority` | VARCHAR(10) | NOT NULL, DEFAULT 'NORMAL' | Queue priority |
| `progress_percent` | FLOAT | NULLABLE | Download progress (0-100) |
| `error_message` | TEXT | NULLABLE | Error description if failed |
| `retry_count` | INTEGER | NOT NULL, DEFAULT 0 | Number of retry attempts |
| `source_url` | VARCHAR(2000) | NULLABLE | Download source URL |
| `added_at` | DATETIME | NOT NULL, DEFAULT NOW | When added to queue |
| `started_at` | DATETIME | NULLABLE | When download started |
| `completed_at` | DATETIME | NULLABLE | When download completed/failed |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Status Values:** `pending`, `downloading`, `paused`, `completed`, `failed`, `cancelled`
**Priority Values:** `LOW`, `NORMAL`, `HIGH`
**Foreign Key:**
- `series_id` -> `anime_series.id` (ON DELETE CASCADE)
Source: [src/server/database/models.py](../src/server/database/models.py#L200-L300)
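
The status and priority values listed above could be modeled as string enums, with a UUID string as the item ID. The sketch below is an assumption about how such types might look: the enum names match those referenced by the API model, but their definitions and the `QueueItemSketch` dataclass are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class DownloadStatus(str, Enum):
    # Values copied from the "Status Values" list above.
    PENDING = "pending"
    DOWNLOADING = "downloading"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


class DownloadPriority(str, Enum):
    # Values copied from the "Priority Values" list above.
    LOW = "LOW"
    NORMAL = "NORMAL"
    HIGH = "HIGH"


@dataclass
class QueueItemSketch:
    """Hypothetical in-memory stand-in for a download_queue_item row."""

    series_id: int
    season: int
    episode: int
    status: DownloadStatus = DownloadStatus.PENDING
    priority: DownloadPriority = DownloadPriority.NORMAL
    retry_count: int = 0
    # str(uuid4()) is 36 characters long, which fits VARCHAR(36).
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


item = QueueItemSketch(series_id=1, season=1, episode=5)
```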
---
## 4. Indexes
| Table | Index Name | Columns | Purpose |
| --------------------- | ----------------------- | ----------- | --------------------------------- |
| `system_settings` | N/A (single row) | N/A | Only one row, no indexes needed |
| `anime_series` | `ix_anime_series_key` | `key` | Fast lookup by primary identifier |
| `anime_series` | `ix_anime_series_name` | `name` | Search by name |
| `episodes` | `ix_episodes_series_id` | `series_id` | Join with series |
| `download_queue_item` | `ix_download_series_id` | `series_id` | Filter by series |
| `download_queue_item` | `ix_download_status` | `status` | Filter by status |
---
## 5. Model Layer
### 5.1 SQLAlchemy ORM Models
```python
# src/server/database/models.py
class AnimeSeries(Base, TimestampMixin):
    __tablename__ = "anime_series"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    key: Mapped[str] = mapped_column(String(255), unique=True, index=True)
    name: Mapped[str] = mapped_column(String(500), index=True)
    site: Mapped[str] = mapped_column(String(500))
    folder: Mapped[str] = mapped_column(String(1000))

    episodes: Mapped[List["Episode"]] = relationship(
        "Episode", back_populates="series", cascade="all, delete-orphan"
    )
```
Source: [src/server/database/models.py](../src/server/database/models.py#L23-L87)
### 5.2 Pydantic API Models
```python
# src/server/models/download.py
class DownloadItem(BaseModel):
    id: str
    serie_id: str  # Maps to anime_series.key
    serie_folder: str  # Metadata only
    serie_name: str
    episode: EpisodeIdentifier
    status: DownloadStatus
    priority: DownloadPriority
```
Source: [src/server/models/download.py](../src/server/models/download.py#L63-L118)
### 5.3 Model Mapping
| API Field | Database Column | Notes |
| -------------- | --------------------- | ------------------ |
| `serie_id` | `anime_series.key` | Primary identifier |
| `serie_folder` | `anime_series.folder` | Metadata only |
| `serie_name` | `anime_series.name` | Display name |
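
The mapping in the table can be expressed as a small conversion function. This is only a sketch: a plain dictionary stands in for an `anime_series` row, and the function name is hypothetical.

```python
def series_row_to_api_fields(row):
    """Map anime_series columns to the API field names from the table above."""
    return {
        "serie_id": row["key"],          # primary identifier
        "serie_folder": row["folder"],   # metadata only
        "serie_name": row["name"],       # display name
    }


api_fields = series_row_to_api_fields(
    {"id": 7, "key": "attack-on-titan", "folder": "Attack on Titan (2013)", "name": "Attack on Titan"}
)
```
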
---
## 6. Transaction Support
### 6.1 Overview
The database layer provides transaction utilities (a decorator, context managers, and savepoints) so that compound operations either commit together or roll back together. Any write operation can be wrapped in an explicit transaction.
Source: [src/server/database/transaction.py](../src/server/database/transaction.py)
### 6.2 Transaction Utilities
| Component | Type | Description |
| ------------------------- | ----------------- | ---------------------------------------- |
| `@transactional` | Decorator | Wraps function in transaction boundary |
| `atomic()` | Async context mgr | Provides atomic operation block |
| `atomic_sync()` | Sync context mgr | Sync version of atomic() |
| `TransactionContext` | Class | Explicit sync transaction control |
| `AsyncTransactionContext` | Class | Explicit async transaction control |
| `TransactionManager` | Class | Helper for manual transaction management |
### 6.3 Transaction Propagation Modes
| Mode | Behavior |
| -------------- | ------------------------------------------------ |
| `REQUIRED` | Use existing transaction or create new (default) |
| `REQUIRES_NEW` | Always create new transaction |
| `NESTED` | Create savepoint within existing transaction |
### 6.4 Usage Examples
**Using @transactional decorator:**
```python
from sqlalchemy.ext.asyncio import AsyncSession

from src.server.database.transaction import transactional


@transactional()
async def compound_operation(db: AsyncSession, data: dict):
    # All operations commit together or roll back on error
    series = await AnimeSeriesService.create(db, ...)
    episode = await EpisodeService.create(db, series_id=series.id, ...)
    return series, episode
```
**Using atomic() context manager:**
```python
from src.server.database.transaction import atomic


async def some_function(db: AsyncSession):
    async with atomic(db) as tx:
        await operation1(db)
        await operation2(db)
        # Auto-commits on success, rolls back on exception
```
**Using savepoints for partial rollback:**
```python
async with atomic(db) as tx:
    await outer_operation(db)
    async with tx.savepoint() as sp:
        await risky_operation(db)
        if error_condition:
            await sp.rollback()  # Only roll back nested ops
    await final_operation(db)  # Still executes
```
Source: [src/server/database/transaction.py](../src/server/database/transaction.py)
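
For readers unfamiliar with savepoints, the same partial-rollback behavior can be reproduced at the SQL level with the standard-library `sqlite3` module. This sketch does not use the project's `atomic()` helper; the table name `log` and the step labels are made up for the demonstration.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so BEGIN / SAVEPOINT / COMMIT are issued explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO log VALUES ('outer_operation')")

conn.execute("SAVEPOINT risky")
conn.execute("INSERT INTO log VALUES ('risky_operation')")
conn.execute("ROLLBACK TO SAVEPOINT risky")  # undo only the nested work
conn.execute("RELEASE SAVEPOINT risky")

conn.execute("INSERT INTO log VALUES ('final_operation')")
conn.execute("COMMIT")

steps = [row[0] for row in conn.execute("SELECT step FROM log ORDER BY rowid")]
print(steps)  # ['outer_operation', 'final_operation']
```

The outer insert and the final insert are committed, while the work done inside the savepoint is discarded.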
### 6.5 Connection Module Additions
| Function | Description |
| ------------------------------- | -------------------------------------------- |
| `get_transactional_session` | Session without auto-commit for transactions |
| `TransactionManager` | Helper class for manual transaction control |
| `is_session_in_transaction` | Check if session is in active transaction |
| `get_session_transaction_depth` | Get nesting depth of transactions |
Source: [src/server/database/connection.py](../src/server/database/connection.py)
---
## 7. Repository Pattern
The `QueueRepository` class provides data access abstraction.
```python
class QueueRepository:
    async def save_item(self, item: DownloadItem) -> None:
        """Save or update a download item (atomic operation)."""

    async def get_all_items(self) -> List[DownloadItem]:
        """Get all items from database."""

    async def delete_item(self, item_id: str) -> bool:
        """Delete item by ID."""

    async def clear_all(self) -> int:
        """Clear all items (atomic operation)."""
```
Note: Compound operations (`save_item`, `clear_all`) are wrapped in `atomic()` transactions.
Source: [src/server/services/queue_repository.py](../src/server/services/queue_repository.py)
---
## 8. Database Service
The `AnimeSeriesService` provides async CRUD operations.
```python
class AnimeSeriesService:
    @staticmethod
    async def create(
        db: AsyncSession,
        key: str,
        name: str,
        site: str,
        folder: str
    ) -> AnimeSeries:
        """Create a new anime series."""

    @staticmethod
    async def get_by_key(
        db: AsyncSession,
        key: str
    ) -> Optional[AnimeSeries]:
        """Get a series by its provider key (the `key` column, not the database `id`)."""
```
### Bulk Operations
Services provide bulk operations for transaction-safe batch processing:
| Service | Method | Description |
| ---------------------- | ---------------------- | ------------------------------ |
| `EpisodeService` | `bulk_mark_downloaded` | Mark multiple episodes at once |
| `DownloadQueueService` | `bulk_delete` | Delete multiple queue items |
| `DownloadQueueService` | `clear_all` | Clear entire queue |
| `UserSessionService` | `rotate_session` | Revoke old + create new atomic |
| `UserSessionService` | `cleanup_expired` | Bulk delete expired sessions |
Source: [src/server/database/service.py](../src/server/database/service.py)
---
## 9. Data Integrity Rules
### Validation Constraints
| Field | Rule | Error Message |
| ------------------------- | ------------------------ | ------------------------------------- |
| `anime_series.key` | Non-empty, max 255 chars | "Series key cannot be empty" |
| `anime_series.name` | Non-empty, max 500 chars | "Series name cannot be empty" |
| `episodes.season` | 0-1000 | "Season number must be non-negative" |
| `episodes.episode_number` | 0-10000 | "Episode number must be non-negative" |
Source: [src/server/database/models.py](../src/server/database/models.py#L89-L119)
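
The rules in the table can be sketched as plain validation functions. This is not the code in `models.py`: the function names are hypothetical, and the messages for the upper bounds are assumptions, since the table only documents the empty and negative cases.

```python
def validate_series_key(key):
    """Check the documented rules for anime_series.key."""
    if not key or not key.strip():
        raise ValueError("Series key cannot be empty")
    if len(key) > 255:
        # Assumed message; only the empty-key message is documented.
        raise ValueError("Series key must not exceed 255 characters")
    return key


def validate_season(season):
    """Check the documented 0-1000 range for episodes.season."""
    if season < 0:
        raise ValueError("Season number must be non-negative")
    if season > 1000:
        # Assumed message; only the negative case is documented.
        raise ValueError("Season number must not exceed 1000")
    return season
```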
### Cascade Rules
- Deleting an `anime_series` row deletes all related `episodes` and `download_queue_item` rows
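
The cascade rule can be demonstrated at the SQL level with the standard-library `sqlite3` module. Note that SQLite only enforces foreign key actions when `PRAGMA foreign_keys = ON` is set on the connection; whether the application sets this pragma or relies on the ORM-level cascade is not stated here, so treat the sketch as an illustration with reduced, hypothetical table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores ON DELETE CASCADE unless this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE anime_series (id INTEGER PRIMARY KEY, key TEXT UNIQUE NOT NULL);
CREATE TABLE episodes (
    id INTEGER PRIMARY KEY,
    series_id INTEGER NOT NULL REFERENCES anime_series(id) ON DELETE CASCADE
);
CREATE TABLE download_queue_item (
    id TEXT PRIMARY KEY,
    series_id INTEGER NOT NULL REFERENCES anime_series(id) ON DELETE CASCADE
);
INSERT INTO anime_series (id, key) VALUES (1, 'attack-on-titan');
INSERT INTO episodes (series_id) VALUES (1), (1);
INSERT INTO download_queue_item (id, series_id) VALUES ('item-1', 1);
""")

# Deleting the parent row removes the dependent rows in both child tables.
conn.execute("DELETE FROM anime_series WHERE id = 1")
remaining_episodes = conn.execute("SELECT COUNT(*) FROM episodes").fetchone()[0]
remaining_items = conn.execute("SELECT COUNT(*) FROM download_queue_item").fetchone()[0]
```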
---
## 10. Migration Strategy
Currently, SQLAlchemy's `create_all()` is used for schema creation.
```python
# src/server/database/connection.py
async def init_db():
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)
```
For production migrations, Alembic is recommended but not yet implemented.
Source: [src/server/database/connection.py](../src/server/database/connection.py)
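
One reason a migration tool becomes necessary: `create_all()` only creates tables that do not exist yet and does not alter tables that are already present. The sketch below approximates that behavior with `CREATE TABLE IF NOT EXISTS` in the standard-library `sqlite3` module; the reduced table definition is made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First deployment: the table is created.
conn.execute(
    "CREATE TABLE IF NOT EXISTS anime_series (id INTEGER PRIMARY KEY, key TEXT)"
)

# Later deployment: the definition gained a `year` column, but the table
# already exists, so this statement changes nothing.
conn.execute(
    "CREATE TABLE IF NOT EXISTS anime_series "
    "(id INTEGER PRIMARY KEY, key TEXT, year INTEGER)"
)
columns_before = [row[1] for row in conn.execute("PRAGMA table_info(anime_series)")]

# A migration step has to alter the existing table explicitly.
conn.execute("ALTER TABLE anime_series ADD COLUMN year INTEGER")
columns_after = [row[1] for row in conn.execute("PRAGMA table_info(anime_series)")]
```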
---
## 11. Common Query Patterns
### Get all series with missing episodes
```python
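from sqlalchemy import select
from sqlalchemy.orm import selectinload

# Eager-load episodes so the loop does not trigger lazy loads
result = await db.execute(
    select(AnimeSeries).options(selectinload(AnimeSeries.episodes))
)
for serie in result.scalars():
    # Episodes not yet flagged as downloaded are still missing
    missing = [e for e in serie.episodes if not e.is_downloaded]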
```
### Get pending downloads ordered by priority
```python
from sqlalchemy import case, select

items = await db.execute(
    select(DownloadQueueItem)
    .where(DownloadQueueItem.status == "pending")
    .order_by(
        # Map priority names to a numeric rank: HIGH first, LOW last
        case(
            (DownloadQueueItem.priority == "HIGH", 1),
            (DownloadQueueItem.priority == "NORMAL", 2),
            (DownloadQueueItem.priority == "LOW", 3),
        ),
        DownloadQueueItem.added_at
    )
)
```
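
The `CASE` expression is needed because sorting the priority strings alphabetically would give `HIGH`, `LOW`, `NORMAL`. The equivalent raw SQL can be tried with the standard-library `sqlite3` module; the rows below are made-up sample data and the table is reduced to the columns used by the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE download_queue_item ("
    "id TEXT PRIMARY KEY, status TEXT, priority TEXT, added_at TEXT)"
)
# Made-up sample rows for illustration.
conn.executemany(
    "INSERT INTO download_queue_item VALUES (?, ?, ?, ?)",
    [
        ("a", "pending", "LOW", "2024-01-01T10:00:00"),
        ("b", "pending", "HIGH", "2024-01-01T12:00:00"),
        ("c", "pending", "NORMAL", "2024-01-01T11:00:00"),
        ("d", "pending", "HIGH", "2024-01-01T09:00:00"),
        ("e", "completed", "HIGH", "2024-01-01T08:00:00"),
    ],
)

query = """
SELECT id FROM download_queue_item
WHERE status = 'pending'
ORDER BY CASE priority
             WHEN 'HIGH' THEN 1
             WHEN 'NORMAL' THEN 2
             WHEN 'LOW' THEN 3
         END,
         added_at
"""
ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)  # ['d', 'b', 'c', 'a']
```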
---
## 12. Series Storage: Database vs Files (Deprecated)
### File-Based Storage (Deprecated since v2.0)
Prior to v2.0, series metadata was stored in two files per anime folder:
| File | Contents |
| -------- | ------------------------------------------------------- |
| `key` | Series provider key (e.g., `"attack-on-titan"`) |
| `data` | JSON serialization of `Serie` object |
File structure example:
```
/anime/Attack on Titan (2013)/
├── key # Contains: attack-on-titan
├── data # Contains: {"key": "...", "name": "...", "episodeDict": {...}}
├── Season 1/
│ └── ...
```
### Database Storage (Current)
Since v2.0, all series metadata is stored in the `anime_series` table with `Episode` records for episode tracking. This provides:
- **ACID transactions** for data consistency
- **Foreign key constraints** (cascade delete)
- **Indexed queries** for fast lookups
- **No filesystem dependency** for metadata
### Migration from Files to Database
The `Serie.save_to_file()` and `Serie.load_from_file()` methods are deprecated but still functional for backward compatibility during migration:
```python
from src.core.entities.series import Serie
# Old file-based loading (deprecated)
serie = Serie.load_from_file("/anime/Attack on Titan (2013)/data")
# New database-based loading
from src.server.database.service import AnimeSeriesService
serie = await AnimeSeriesService.get_by_key(db, "attack-on-titan")
```
### Removing File Dependencies
After verifying that the database schema supports all fields, file-based storage can be removed:
1. ✅ Schema verified: All `Serie` fields have corresponding DB columns
2. ✅ Migration complete: All existing series migrated to database
3. ❌ File cleanup: Remove `key` and `data` files (pending)
**Note:** The `save_to_file()` and `load_from_file()` methods will be removed in v3.0.0.
---
## 13. Series Persistence Flow
When a directory scan discovers or updates series, the scanner persists data to the database instead of writing to disk files.
### Scan Flow
```
Scan Directory
      │
      ▼
Find MP4 Files → Extract Serie Key
      │
      ▼
Check DB for Existing Series (by key)
      │
      ├─── EXISTS ──────────────────────► Update Series Metadata
      │                                       │
      │                                       ▼
      │                                   Sync Episodes to DB
      │                                       │
      │◄──────────────────────────────────────┘
      │
      └─── NEW ─────────────────────────► Create New Series Record
                                              │
                                              ▼
                                          Create Episode Records
                                              │
                                              ▼
                                          Return to Scan Loop
```
### Key Methods
**SerieScanner._persist_serie_to_db()**
- Called after `get_missing_episodes_and_season()` computes episodeDict
- Uses `AnimeSeriesService.get_by_key()` to check if series exists
- If exists: calls `AnimeSeriesService.update()` + `_sync_episodes_to_db()`
- If new: calls `AnimeSeriesService.create()` + creates episodes
**SerieScanner._sync_episodes_to_db()**
- Gets existing episodes from DB via `EpisodeService.get_by_series()`
- Compares with new episodeDict
- Removes episodes that are no longer missing, except those with `is_downloaded=True`, which are always preserved
- Adds new missing episodes
**SerieList.add_to_db()**
- Used when adding a newly discovered series via the API
- Creates filesystem folder + database record + episode records
### Episode Sync Logic
```python
# For each episode in DB but not in new episodeDict:
if episode.is_downloaded:
    # Keep - file exists, don't remove
    pass
else:
    # Remove - no longer missing
    EpisodeService.delete()

# For each episode in new episodeDict but not in DB:
# Add as new missing episode
EpisodeService.create(is_downloaded=False)
```
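
The sync rules above can also be written as a pure function that computes which rows to delete and which to add, without touching the database. This is a sketch under the assumption that episodes are identified by `(season, episode_number)` pairs; the function name `plan_episode_sync` is hypothetical and is not part of `SerieScanner`.

```python
def plan_episode_sync(existing, missing_now):
    """Compute which episode rows to delete and which to add.

    existing:    dict mapping (season, episode_number) -> is_downloaded flag
    missing_now: set of (season, episode_number) from the new episodeDict
    """
    # In DB but no longer missing: delete, unless flagged as downloaded.
    to_delete = sorted(
        key
        for key, is_downloaded in existing.items()
        if key not in missing_now and not is_downloaded
    )
    # Missing now but not yet tracked in DB: add as new missing episodes.
    to_add = sorted(missing_now - set(existing))
    return to_delete, to_add


existing = {(1, 1): False, (1, 2): True, (1, 3): False}
missing_now = {(1, 3), (1, 4)}
to_delete, to_add = plan_episode_sync(existing, missing_now)
print(to_delete)  # [(1, 1)]
print(to_add)     # [(1, 4)]
```

Episode `(1, 2)` is kept because it is flagged as downloaded, and `(1, 3)` is kept because it is still missing.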
### Transaction Handling
- DB operations use their own session with commit/rollback
- If DB write fails, error is logged and scan continues
- File-based `save_to_file()` no longer called during scan
### Migration Path
1. v2.x: Scanner writes to both DB (primary) and files (fallback)
2. v3.0: Scanner writes only to DB, file methods removed
---
## 14. Series Persistence
### Schema
**AnimeSeries Table**: Stores series metadata (key, name, site, folder, year)
| Column | Type | Constraints | Description |
|-----------|--------------|---------------------------|----------------------|
| `id` | INTEGER | PRIMARY KEY | Auto-increment |
| `key` | VARCHAR(255) | UNIQUE, NOT NULL | Series provider key |
| `name` | VARCHAR(500) | NOT NULL | Display name |
| `site` | VARCHAR(500) | | Provider site URL |
| `folder` | VARCHAR(1000)| | Filesystem folder |
**Episode Table**: Stores per-episode metadata (season, episode_number, is_downloaded)
| Column | Type | Constraints | Description |
|-----------------|--------------|---------------------------|----------------------|
| `id` | INTEGER | PRIMARY KEY | Auto-increment |
| `series_id` | INTEGER | FOREIGN KEY → anime_series| Parent series |
| `season` | INTEGER | NOT NULL | Season number |
| `episode_number`| INTEGER | NOT NULL | Episode number |
| `is_downloaded` | BOOLEAN | DEFAULT FALSE | Download status |
### Relationships
- `AnimeSeries.episodes` → List of Episode objects (one-to-many)
- `Episode.series` → Parent AnimeSeries (many-to-one)
- Cascade delete: Deleting a series removes all its episodes
### Queries
```python
# Get all series with episodes
AnimeSeriesService.get_all(db, with_episodes=True)
# Get by provider key
AnimeSeriesService.get_by_key(db, key)
# Get by folder path
AnimeSeriesService.get_by_folder(db, folder)
```
---
## 15. Database Location
| Environment | Default Location |
| ----------- | ------------------------------------------------- |
| Development | `./data/aniworld.db` |
| Production | Via `DATABASE_URL` environment variable |
| Testing | In-memory SQLite (`sqlite+aiosqlite:///:memory:`) |