refactor: move sync_series_from_data_files to anime_service

- Moved _sync_series_to_database from fastapi_app.py to anime_service.py
- Renamed to sync_series_from_data_files for better clarity
- Updated all imports and test references
- Removed completed TODO tasks from instructions.md
Lukas 2025-12-13 09:58:32 +01:00
parent 684337fd0c
commit 5f6ac8e507
4 changed files with 106 additions and 243 deletions

@@ -120,166 +120,3 @@ For each task completed:
- Good foundation for future enhancements if needed
---
## 📋 TODO Tasks
### Task 1: Add `get_all_series_from_data_files()` Method to SeriesApp
**Status**: [x] Completed
**Description**: Add a new method to `SeriesApp` that returns all series data found in data files from the filesystem.
**File to Modify**: `src/core/SeriesApp.py`
**Requirements**:
1. Add a new method `get_all_series_from_data_files() -> List[Serie]` to `SeriesApp`
2. This method should scan the `directory_to_search` for all data files
3. Load and return all `Serie` objects found in data files
4. Use the existing `SerieList.load_series()` pattern for file discovery
5. Return an empty list if no data files are found
6. Include proper logging for debugging
7. Method should be synchronous (can be wrapped with `asyncio.to_thread` if needed)
**Implementation Details**:
```python
def get_all_series_from_data_files(self) -> List[Serie]:
    """
    Get all series from data files in the anime directory.

    Scans the directory_to_search for all 'data' files and loads
    the Serie metadata from each file.

    Returns:
        List of Serie objects found in data files
    """
    # Use SerieList's file-based loading to get all series
    # Return list of Serie objects from self.list.keyDict.values()
```
**Acceptance Criteria**:
- [x] Method exists in `SeriesApp`
- [x] Method returns `List[Serie]`
- [x] Method scans filesystem for data files
- [x] Proper error handling for missing/corrupt files
- [x] Logging added for operations
- [x] Unit tests written and passing
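**Illustrative sketch** (not the project's code): the requirements above ask for a synchronous scan that tolerates missing or corrupt files and can be wrapped with `asyncio.to_thread`. The snippet below shows that shape under loudly stated assumptions: the helper names `load_series_from_data_files` and `load_series_async` are hypothetical, each series folder is assumed to contain a file named `data`, and the file is assumed to hold JSON. The real `Serie` and `SerieList` classes are not shown in this commit, so plain dictionaries stand in for them.

```python
import asyncio
import json
import logging
from pathlib import Path
from typing import Dict, List

logger = logging.getLogger(__name__)


def load_series_from_data_files(directory: str) -> List[Dict]:
    """Synchronously collect metadata from every '<folder>/data' file.

    Assumption: data files contain JSON. Unreadable or corrupt files
    are logged and skipped so one bad file cannot abort the scan.
    """
    root = Path(directory)
    if not root.is_dir():
        logger.warning("Directory does not exist: %s", directory)
        return []

    series: List[Dict] = []
    for data_file in sorted(root.glob("*/data")):
        try:
            series.append(json.loads(data_file.read_text(encoding="utf-8")))
        except (OSError, ValueError) as exc:
            # json.JSONDecodeError and UnicodeDecodeError are ValueErrors
            logger.warning("Skipping data file %s: %s", data_file, exc)

    logger.debug("Loaded %d series from %s", len(series), directory)
    return series


async def load_series_async(directory: str) -> List[Dict]:
    """Run the blocking scan in a worker thread (Python 3.9+)."""
    return await asyncio.to_thread(load_series_from_data_files, directory)
```

Keeping the scan synchronous and wrapping it at the call site means the same function can be used from both blocking code and the async startup path.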
---
### Task 2: Sync Series from Data Files to Database on Setup Complete
**Status**: [x] Completed
**Description**: When the application setup is complete (anime directory configured), automatically sync all series from data files to the database.
**Files to Modify**:
- `src/server/fastapi_app.py` (lifespan function)
- `src/server/services/` (if needed for service layer)
**Requirements**:
1. After `download_service.initialize()` succeeds in the lifespan function
2. Call `SeriesApp.get_all_series_from_data_files()` to get all series
3. For each series, use `SerieList.add_to_db()` to save to database (uses existing DB schema)
4. Skip series that already exist in database (handled by `add_to_db`)
5. Log the sync progress and results
6. Do NOT modify database model definitions
**Implementation Details**:
```python
# In lifespan function, after download_service.initialize():
try:
    from src.server.database.connection import get_db_session

    # Get all series from data files using SeriesApp
    series_app = SeriesApp(settings.anime_directory)
    all_series = series_app.get_all_series_from_data_files()

    if all_series:
        async with get_db_session() as db:
            serie_list = SerieList(settings.anime_directory, db_session=db, skip_load=True)
            added_count = 0
            for serie in all_series:
                result = await serie_list.add_to_db(serie, db)
                if result:
                    added_count += 1
            await db.commit()
            logger.info("Synced %d new series to database", added_count)
except Exception as e:
    logger.warning("Failed to sync series to database: %s", e)
```
**Acceptance Criteria**:
- [x] Series from data files are synced to database on startup
- [x] Existing series in database are not duplicated
- [x] Database schema is NOT modified
- [x] Proper error handling (app continues even if sync fails)
- [x] Logging added for sync operations
- [x] Integration tests written and passing
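**Illustrative sketch** (not the project's code): the "skip existing, count new" behaviour required above can be shown in isolation. `SerieList.add_to_db()` is not included in this commit, so this example swaps in a different technique: SQLite's `INSERT OR IGNORE` against a throwaway in-memory table. The table layout and the function name `sync_series` are hypothetical stand-ins and say nothing about the real schema, which the task says must not be modified.

```python
import sqlite3
from typing import Iterable, Tuple


def sync_series(conn: sqlite3.Connection, series: Iterable[Tuple[str, str]]) -> int:
    """Insert (key, name) pairs, skipping keys that already exist.

    Returns the number of rows actually inserted.
    """
    # Hypothetical stand-in table; the primary key enforces uniqueness.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS series (key TEXT PRIMARY KEY, name TEXT NOT NULL)"
    )
    added = 0
    for key, name in series:
        cur = conn.execute(
            "INSERT OR IGNORE INTO series (key, name) VALUES (?, ?)",
            (key, name),
        )
        # rowcount is 1 when the row was inserted and 0 when it was ignored
        added += cur.rowcount
    conn.commit()
    return added
```

Running the sync a second time with the same input adds nothing, which is the property the acceptance criteria describe as "existing series in database are not duplicated".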
---
### Task 3: Validation - Verify Data File to Database Sync
**Status**: [x] Completed
**Description**: Create validation tests to ensure the data file to database sync works correctly.
**File to Create**: `tests/integration/test_data_file_db_sync.py`
**Requirements**:
1. Test `get_all_series_from_data_files()` returns correct data
2. Test that series are correctly added to database
3. Test that duplicate series are not created
4. Test that sync handles empty directories gracefully
5. Test that sync handles corrupt data files gracefully
6. Test end-to-end startup sync behavior
**Test Cases**:
```python
class TestDataFileDbSync:
    """Test data file to database synchronization."""

    async def test_get_all_series_from_data_files_returns_list(self):
        """Test that get_all_series_from_data_files returns a list."""
        pass

    async def test_get_all_series_from_data_files_empty_directory(self):
        """Test behavior with empty anime directory."""
        pass

    async def test_series_sync_to_db_creates_records(self):
        """Test that series are correctly synced to database."""
        pass

    async def test_series_sync_to_db_no_duplicates(self):
        """Test that duplicate series are not created."""
        pass

    async def test_series_sync_handles_corrupt_files(self):
        """Test that corrupt data files don't crash the sync."""
        pass

    async def test_startup_sync_integration(self):
        """Test end-to-end startup sync behavior."""
        pass
```
**Acceptance Criteria**:
- [x] All test cases implemented
- [x] Tests use pytest async fixtures
- [x] Tests use temporary directories for isolation
- [x] Tests cover happy path and error cases
- [x] All tests passing
- [x] Code coverage > 80% for new code
---

@@ -34,6 +34,7 @@ from src.server.controllers.page_controller import router as page_router
from src.server.middleware.auth import AuthMiddleware
from src.server.middleware.error_handler import register_exception_handlers
from src.server.middleware.setup_redirect import SetupRedirectMiddleware
from src.server.services.anime_service import sync_series_from_data_files
from src.server.services.progress_service import get_progress_service
from src.server.services.websocket_service import get_websocket_service
@@ -41,78 +42,6 @@ from src.server.services.websocket_service import get_websocket_service
# module-level globals. This makes testing and multi-instance hosting safer.
async def _sync_series_to_database(
    anime_directory: str,
    logger
) -> int:
    """
    Sync series from data files to the database.

    Scans the anime directory for data files and adds any new series
    to the database. Existing series are skipped (no duplicates).

    Args:
        anime_directory: Path to the anime directory with data files
        logger: Logger instance for logging operations

    Returns:
        Number of new series added to the database
    """
    try:
        import asyncio

        from src.core.entities.SerieList import SerieList
        from src.core.SeriesApp import SeriesApp
        from src.server.database.connection import get_db_session

        # Get all series from data files using SeriesApp
        series_app = SeriesApp(anime_directory)
        all_series = await asyncio.to_thread(
            series_app.get_all_series_from_data_files
        )

        if not all_series:
            logger.info("No series found in data files to sync")
            return 0

        logger.info(
            "Found %d series in data files, syncing to database...",
            len(all_series)
        )

        async with get_db_session() as db:
            serie_list = SerieList(
                anime_directory,
                db_session=db,
                skip_load=True
            )
            added_count = 0
            for serie in all_series:
                result = await serie_list.add_to_db(serie, db)
                if result:
                    added_count += 1
                    logger.debug(
                        "Added series to database: %s (key=%s)",
                        serie.name,
                        serie.key
                    )
            # Commit happens automatically via get_db_session context
            logger.info(
                "Synced %d new series to database (skipped %d existing)",
                added_count,
                len(all_series) - added_count
            )

        return added_count

    except Exception as e:
        logger.warning(
            "Failed to sync series to database: %s",
            e,
            exc_info=True
        )
        return 0
@asynccontextmanager
async def lifespan(app: FastAPI):
    """Manage application lifespan (startup and shutdown)."""
@@ -138,6 +67,10 @@ async def lifespan(app: FastAPI):
        config_service = get_config_service()
        config = config_service.load_config()

        logger.debug(
            "Config loaded: other=%s", config.other
        )

        # Sync anime_directory from config.json to settings
        if config.other and config.other.get("anime_directory"):
            settings.anime_directory = str(config.other["anime_directory"])
@@ -145,6 +78,10 @@ async def lifespan(app: FastAPI):
                "Loaded anime_directory from config: %s",
                settings.anime_directory
            )
        else:
            logger.debug(
                "anime_directory not found in config.other"
            )
    except Exception as e:
        logger.warning("Failed to load config from config.json: %s", e)
@@ -172,15 +109,23 @@ async def lifespan(app: FastAPI):
    try:
        from src.server.utils.dependencies import get_download_service

        logger.info(
            "Checking anime_directory setting: '%s'",
            settings.anime_directory
        )

        if settings.anime_directory:
            download_service = get_download_service()
            await download_service.initialize()
            logger.info("Download service initialized and queue restored")

            # Sync series from data files to database
-           await _sync_series_to_database(
+           sync_count = await sync_series_from_data_files(
                settings.anime_directory, logger
            )
            logger.info(
                "Data file sync complete. Added %d series.", sync_count
            )
        else:
            logger.info(
                "Download service initialization skipped - "

@@ -361,3 +361,84 @@ class AnimeService:
def get_anime_service(series_app: SeriesApp) -> AnimeService:
    """Factory used for creating AnimeService with a SeriesApp instance."""
    return AnimeService(series_app)


async def sync_series_from_data_files(
    anime_directory: str,
    logger=None
) -> int:
    """
    Sync series from data files to the database.

    Scans the anime directory for data files and adds any new series
    to the database. Existing series are skipped (no duplicates).

    This function is typically called during application startup to ensure
    series metadata stored in filesystem data files is available in the
    database.

    Args:
        anime_directory: Path to the anime directory with data files
        logger: Optional logger instance for logging operations.
            If not provided, uses structlog.

    Returns:
        Number of new series added to the database
    """
    log = logger or structlog.get_logger(__name__)

    try:
        from src.core.entities.SerieList import SerieList
        from src.server.database.connection import get_db_session

        log.info(
            "Starting data file to database sync",
            directory=anime_directory
        )

        # Get all series from data files using SeriesApp
        series_app = SeriesApp(anime_directory)
        all_series = await asyncio.to_thread(
            series_app.get_all_series_from_data_files
        )

        if not all_series:
            log.info("No series found in data files to sync")
            return 0

        log.info(
            "Found series in data files, syncing to database",
            count=len(all_series)
        )

        async with get_db_session() as db:
            serie_list = SerieList(
                anime_directory,
                db_session=db,
                skip_load=True
            )
            added_count = 0
            for serie in all_series:
                result = await serie_list.add_to_db(serie, db)
                if result:
                    added_count += 1
                    log.debug(
                        "Added series to database",
                        name=serie.name,
                        key=serie.key
                    )

            log.info(
                "Data file sync complete",
                added=added_count,
                skipped=len(all_series) - added_count
            )

        return added_count

    except Exception as e:
        log.warning(
            "Failed to sync series to database",
            error=str(e),
            exc_info=True
        )
        return 0

@@ -187,19 +187,19 @@ class TestSerieListAddToDb:
class TestSyncSeriesToDatabase:
-   """Test _sync_series_to_database function from fastapi_app."""
+   """Test sync_series_from_data_files function from anime_service."""

    @pytest.mark.asyncio
    async def test_sync_with_empty_directory(self):
        """Test sync with empty anime directory."""
-       from src.server.fastapi_app import _sync_series_to_database
+       from src.server.services.anime_service import sync_series_from_data_files

        with tempfile.TemporaryDirectory() as tmp_dir:
            mock_logger = Mock()

            with patch('src.core.SeriesApp.Loaders'), \
                 patch('src.core.SeriesApp.SerieScanner'):
-               count = await _sync_series_to_database(tmp_dir, mock_logger)
+               count = await sync_series_from_data_files(tmp_dir, mock_logger)

            assert count == 0
            # Should log that no series were found
@@ -213,7 +213,7 @@ class TestSyncSeriesToDatabase:
        from files and the sync function attempts to add them to the DB.
        The actual DB interaction is tested in test_add_to_db_creates_record.
        """
-       from src.server.fastapi_app import _sync_series_to_database
+       from src.server.services.anime_service import sync_series_from_data_files

        with tempfile.TemporaryDirectory() as tmp_dir:
            # Create test data files
@@ -241,7 +241,7 @@ class TestSyncSeriesToDatabase:
                 patch('src.core.SeriesApp.SerieScanner'):
                # The function should return 0 because DB isn't available
                # but should not crash
-               count = await _sync_series_to_database(tmp_dir, mock_logger)
+               count = await sync_series_from_data_files(tmp_dir, mock_logger)

                # Since no real DB, it will fail gracefully
                assert isinstance(count, int)
@@ -251,7 +251,7 @@ class TestSyncSeriesToDatabase:
    @pytest.mark.asyncio
    async def test_sync_handles_exceptions_gracefully(self):
        """Test that sync handles exceptions without crashing."""
-       from src.server.fastapi_app import _sync_series_to_database
+       from src.server.services.anime_service import sync_series_from_data_files

        mock_logger = Mock()

@@ -262,7 +262,7 @@ class TestSyncSeriesToDatabase:
            'src.core.SeriesApp.SerieList',
            side_effect=Exception("Test error")
        ):
-           count = await _sync_series_to_database(
+           count = await sync_series_from_data_files(
                "/fake/path", mock_logger
            )