Aniworld/docs/DATABASE.md
Lukas 1ba67357dc Add database transaction support with atomic operations
- Create transaction.py with @transactional decorator, atomic() context manager
- Add TransactionPropagation modes: REQUIRED, REQUIRES_NEW, NESTED
- Add savepoint support for nested transactions with partial rollback
- Update connection.py with TransactionManager, get_transactional_session
- Update service.py with bulk operations (bulk_mark_downloaded, bulk_delete)
- Wrap QueueRepository.save_item() and clear_all() in atomic transactions
- Add comprehensive tests (66 transaction tests, 90% coverage)
- All 1090 tests passing
2025-12-25 18:05:33 +01:00

422 lines
17 KiB
Markdown

# Database Documentation
## Document Purpose
This document describes the database schema, models, and data layer of the Aniworld application.
---
## 1. Database Overview
### Technology
- **Database Engine**: SQLite 3 (default), PostgreSQL supported
- **ORM**: SQLAlchemy 2.0 with async support (aiosqlite)
- **Location**: `data/aniworld.db` (configurable via `DATABASE_URL`)
Source: [src/config/settings.py](../src/config/settings.py#L53-L55)
### Connection Configuration
```python
# Default connection string
DATABASE_URL = "sqlite+aiosqlite:///./data/aniworld.db"
# PostgreSQL alternative
DATABASE_URL = "postgresql+asyncpg://user:pass@localhost/aniworld"
```
Source: [src/server/database/connection.py](../src/server/database/connection.py)
---
## 2. Entity Relationship Diagram
```
+-------------------+ +-------------------+ +------------------------+
| anime_series | | episodes | | download_queue_item |
+-------------------+ +-------------------+ +------------------------+
| id (PK) |<--+ | id (PK) | +-->| id (PK, VARCHAR) |
| key (UNIQUE) | | | series_id (FK)----+---+ | series_id (FK)---------+
| name | +---| | | status |
| site | | season | | priority |
| folder | | episode_number | | season |
| created_at | | title | | episode |
| updated_at | | file_path | | progress_percent |
+-------------------+ | is_downloaded | | error_message |
| created_at | | retry_count |
| updated_at | | added_at |
+-------------------+ | started_at |
| completed_at |
| created_at |
| updated_at |
+------------------------+
```
---
## 3. Table Schemas
### 3.1 anime_series
Stores anime series metadata.
| Column | Type | Constraints | Description |
| ------------ | ------------- | -------------------------- | ------------------------------------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID |
| `key` | VARCHAR(255) | UNIQUE, NOT NULL, INDEX | **Primary identifier** - provider-assigned URL-safe key |
| `name` | VARCHAR(500) | NOT NULL, INDEX | Display name of the series |
| `site` | VARCHAR(500) | NOT NULL | Provider site URL |
| `folder` | VARCHAR(1000) | NOT NULL | Filesystem folder name (metadata only) |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Identifier Convention:**
- `key` is the **primary identifier** for all operations (e.g., `"attack-on-titan"`)
- `folder` is **metadata only** for filesystem operations (e.g., `"Attack on Titan (2013)"`)
- `id` is used only for database relationships
Source: [src/server/database/models.py](../src/server/database/models.py#L23-L87)
### 3.2 episodes
Stores individual episode information.
| Column | Type | Constraints | Description |
| ---------------- | ------------- | ---------------------------- | ----------------------------- |
| `id` | INTEGER | PRIMARY KEY, AUTOINCREMENT | Internal database ID |
| `series_id` | INTEGER | FOREIGN KEY, NOT NULL, INDEX | Reference to anime_series.id |
| `season` | INTEGER | NOT NULL | Season number (1-based) |
| `episode_number` | INTEGER | NOT NULL | Episode number within season |
| `title` | VARCHAR(500) | NULLABLE | Episode title if known |
| `file_path` | VARCHAR(1000) | NULLABLE | Local file path if downloaded |
| `is_downloaded` | BOOLEAN | NOT NULL, DEFAULT FALSE | Download status flag |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Foreign Key:**
- `series_id` -> `anime_series.id` (ON DELETE CASCADE)
Source: [src/server/database/models.py](../src/server/database/models.py#L122-L181)
### 3.3 download_queue_item
Stores download queue items with status tracking.
| Column | Type | Constraints | Description |
| ------------------ | ------------- | --------------------------- | ------------------------------ |
| `id` | VARCHAR(36) | PRIMARY KEY | UUID identifier |
| `series_id` | INTEGER | FOREIGN KEY, NOT NULL | Reference to anime_series.id |
| `season` | INTEGER | NOT NULL | Season number |
| `episode` | INTEGER | NOT NULL | Episode number |
| `status` | VARCHAR(20) | NOT NULL, DEFAULT 'pending' | Download status |
| `priority` | VARCHAR(10) | NOT NULL, DEFAULT 'NORMAL' | Queue priority |
| `progress_percent` | FLOAT | NULLABLE | Download progress (0-100) |
| `error_message` | TEXT | NULLABLE | Error description if failed |
| `retry_count` | INTEGER | NOT NULL, DEFAULT 0 | Number of retry attempts |
| `source_url` | VARCHAR(2000) | NULLABLE | Download source URL |
| `added_at` | DATETIME | NOT NULL, DEFAULT NOW | When added to queue |
| `started_at` | DATETIME | NULLABLE | When download started |
| `completed_at` | DATETIME | NULLABLE | When download completed/failed |
| `created_at` | DATETIME | NOT NULL, DEFAULT NOW | Record creation timestamp |
| `updated_at` | DATETIME | NOT NULL, ON UPDATE NOW | Last update timestamp |
**Status Values:** `pending`, `downloading`, `paused`, `completed`, `failed`, `cancelled`
**Priority Values:** `LOW`, `NORMAL`, `HIGH`
**Foreign Key:**
- `series_id` -> `anime_series.id` (ON DELETE CASCADE)
Source: [src/server/database/models.py](../src/server/database/models.py#L200-L300)
---
## 4. Indexes
| Table | Index Name | Columns | Purpose |
| --------------------- | ----------------------- | ----------- | --------------------------------- |
| `anime_series` | `ix_anime_series_key` | `key` | Fast lookup by primary identifier |
| `anime_series` | `ix_anime_series_name` | `name` | Search by name |
| `episodes` | `ix_episodes_series_id` | `series_id` | Join with series |
| `download_queue_item` | `ix_download_series_id` | `series_id` | Filter by series |
| `download_queue_item` | `ix_download_status` | `status` | Filter by status |
---
## 5. Model Layer
### 5.1 SQLAlchemy ORM Models
```python
# src/server/database/models.py
class AnimeSeries(Base, TimestampMixin):
__tablename__ = "anime_series"
id: Mapped[int] = mapped_column(Integer, primary_key=True)
key: Mapped[str] = mapped_column(String(255), unique=True, index=True)
name: Mapped[str] = mapped_column(String(500), index=True)
site: Mapped[str] = mapped_column(String(500))
folder: Mapped[str] = mapped_column(String(1000))
episodes: Mapped[List["Episode"]] = relationship(
"Episode", back_populates="series", cascade="all, delete-orphan"
)
```
Source: [src/server/database/models.py](../src/server/database/models.py#L23-L87)
### 5.2 Pydantic API Models
```python
# src/server/models/download.py
class DownloadItem(BaseModel):
id: str
serie_id: str # Maps to anime_series.key
serie_folder: str # Metadata only
serie_name: str
episode: EpisodeIdentifier
status: DownloadStatus
priority: DownloadPriority
```
Source: [src/server/models/download.py](../src/server/models/download.py#L63-L118)
### 5.3 Model Mapping
| API Field | Database Column | Notes |
| -------------- | --------------------- | ------------------ |
| `serie_id` | `anime_series.key` | Primary identifier |
| `serie_folder` | `anime_series.folder` | Metadata only |
| `serie_name` | `anime_series.name` | Display name |
---
## 6. Transaction Support
### 6.1 Overview
The database layer provides comprehensive transaction support to ensure data consistency across compound operations. All write operations can be wrapped in explicit transactions.
Source: [src/server/database/transaction.py](../src/server/database/transaction.py)
### 6.2 Transaction Utilities
| Component | Type | Description |
| ------------------------- | ----------------- | ---------------------------------------- |
| `@transactional` | Decorator | Wraps function in transaction boundary |
| `atomic()` | Async context mgr | Provides atomic operation block |
| `atomic_sync()` | Sync context mgr | Sync version of atomic() |
| `TransactionContext` | Class | Explicit sync transaction control |
| `AsyncTransactionContext` | Class | Explicit async transaction control |
| `TransactionManager` | Class | Helper for manual transaction management |
### 6.3 Transaction Propagation Modes
| Mode | Behavior |
| -------------- | ------------------------------------------------ |
| `REQUIRED` | Use existing transaction or create new (default) |
| `REQUIRES_NEW` | Always create new transaction |
| `NESTED` | Create savepoint within existing transaction |
### 6.4 Usage Examples
**Using @transactional decorator:**
```python
from src.server.database.transaction import transactional
@transactional()
async def compound_operation(db: AsyncSession, data: dict):
# All operations commit together or rollback on error
series = await AnimeSeriesService.create(db, ...)
episode = await EpisodeService.create(db, series_id=series.id, ...)
return series, episode
```
**Using atomic() context manager:**
```python
from src.server.database.transaction import atomic
async def some_function(db: AsyncSession):
async with atomic(db) as tx:
await operation1(db)
await operation2(db)
# Auto-commits on success, rolls back on exception
```
**Using savepoints for partial rollback:**
```python
async with atomic(db) as tx:
await outer_operation(db)
async with tx.savepoint() as sp:
await risky_operation(db)
if error_condition:
await sp.rollback() # Only rollback nested ops
await final_operation(db) # Still executes
```
Source: [src/server/database/transaction.py](../src/server/database/transaction.py)
### 6.5 Connection Module Additions
| Function | Description |
| ------------------------------- | -------------------------------------------- |
| `get_transactional_session` | Session without auto-commit for transactions |
| `TransactionManager` | Helper class for manual transaction control |
| `is_session_in_transaction` | Check if session is in active transaction |
| `get_session_transaction_depth` | Get nesting depth of transactions |
Source: [src/server/database/connection.py](../src/server/database/connection.py)
---
## 7. Repository Pattern
The `QueueRepository` class provides data access abstraction.
```python
class QueueRepository:
async def save_item(self, item: DownloadItem) -> None:
"""Save or update a download item (atomic operation)."""
async def get_all_items(self) -> List[DownloadItem]:
"""Get all items from database."""
async def delete_item(self, item_id: str) -> bool:
"""Delete item by ID."""
async def clear_all(self) -> int:
"""Clear all items (atomic operation)."""
```
Note: Compound operations (`save_item`, `clear_all`) are wrapped in `atomic()` transactions.
Source: [src/server/services/queue_repository.py](../src/server/services/queue_repository.py)
---
## 8. Database Service
The `AnimeSeriesService` provides async CRUD operations.
```python
class AnimeSeriesService:
@staticmethod
async def create(
db: AsyncSession,
key: str,
name: str,
site: str,
folder: str
) -> AnimeSeries:
"""Create a new anime series."""
@staticmethod
async def get_by_key(
db: AsyncSession,
key: str
) -> Optional[AnimeSeries]:
"""Get series by primary key identifier."""
```
### Bulk Operations
Services provide bulk operations for transaction-safe batch processing:
| Service | Method | Description |
| ---------------------- | ---------------------- | ------------------------------ |
| `EpisodeService` | `bulk_mark_downloaded` | Mark multiple episodes at once |
| `DownloadQueueService` | `bulk_delete` | Delete multiple queue items |
| `DownloadQueueService` | `clear_all` | Clear entire queue |
| `UserSessionService` | `rotate_session` | Revoke old + create new atomic |
| `UserSessionService` | `cleanup_expired` | Bulk delete expired sessions |
Source: [src/server/database/service.py](../src/server/database/service.py)
---
## 9. Data Integrity Rules
### Validation Constraints
| Field | Rule | Error Message |
| ------------------------- | ------------------------ | ------------------------------------- |
| `anime_series.key` | Non-empty, max 255 chars | "Series key cannot be empty" |
| `anime_series.name` | Non-empty, max 500 chars | "Series name cannot be empty" |
| `episodes.season` | 0-1000 | "Season number must be non-negative" |
| `episodes.episode_number` | 0-10000 | "Episode number must be non-negative" |
Source: [src/server/database/models.py](../src/server/database/models.py#L89-L119)
### Cascade Rules
- Deleting `anime_series` deletes all related `episodes` and `download_queue_item`
---
## 10. Migration Strategy
Currently, SQLAlchemy's `create_all()` is used for schema creation.
```python
# src/server/database/connection.py
async def init_db():
async with engine.begin() as conn:
await conn.run_sync(Base.metadata.create_all)
```
For production migrations, Alembic is recommended but not yet implemented.
Source: [src/server/database/connection.py](../src/server/database/connection.py)
---
## 11. Common Query Patterns
### Get all series with missing episodes
```python
series = await db.execute(
select(AnimeSeries).options(selectinload(AnimeSeries.episodes))
)
for serie in series.scalars():
downloaded = [e for e in serie.episodes if e.is_downloaded]
```
### Get pending downloads ordered by priority
```python
items = await db.execute(
select(DownloadQueueItem)
.where(DownloadQueueItem.status == "pending")
.order_by(
case(
(DownloadQueueItem.priority == "HIGH", 1),
(DownloadQueueItem.priority == "NORMAL", 2),
(DownloadQueueItem.priority == "LOW", 3),
),
DownloadQueueItem.added_at
)
)
```
---
## 12. Database Location
| Environment | Default Location |
| ----------- | ------------------------------------------------- |
| Development | `./data/aniworld.db` |
| Production | Via `DATABASE_URL` environment variable |
| Testing | In-memory SQLite (`sqlite+aiosqlite:///:memory:`) |