docs: Add Task 3 status report

- Created comprehensive status document (task3_status.md)
- Documents 80% completion: all core infrastructure done
- Identifies test refinement needs (implementation API mismatch)
- Provides priority list for remaining work
- Updated instructions.md to reflect in-progress status
2026-01-11 20:34:40 +01:00
parent 4895e487c0
commit 641fa09251
2 changed files with 258 additions and 2 deletions
--- a/docs/instructions.md
+++ b/docs/instructions.md
@@ -112,10 +112,11 @@ For each task completed:

 ### 🎬 NFO Metadata Integration

-#### Task 3: Adapt Scraper Code for NFO Generation
+#### Task 3: Adapt Scraper Code for NFO Generation ⚠️ **IN PROGRESS (80% Complete)**

 **Priority:** High  
-**Estimated Time:** 4-5 hours
+**Estimated Time:** 4-5 hours (2-3 hours remaining for test fixes alone)  
+**Status:** Core infrastructure complete, tests need refinement. See [task3_status.md](task3_status.md) for detailed status.

 Adapt code from `/home/lukas/Volume/repo/scraper/` to create tvshow.nfo files using TMDB API.

--- a/docs/task3_status.md
+++ b/docs/task3_status.md
@@ -0,0 +1,255 @@
+# Task 3: NFO Metadata Integration - Status Report
+
+## Summary
+Task 3 focuses on creating tvshow.nfo files and downloading media (poster/logo/fanart) using the TMDB API, with code adapted from the scraper repository.
+
+## ✅ Completed (80%)
+
+### 1. Core Infrastructure (100%)
+- ✅ **TMDB API Client** (`src/core/services/tmdb_client.py` - 270 lines)
+  - Async HTTP client using aiohttp
+  - Search TV shows by name and year
+  - Get detailed show information with external IDs
+  - Get show images (posters, backdrops, logos)
+  - Download images from TMDB
+  - Response caching to reduce API calls
+  - Rate limit handling (429 status)
+  - Retry logic with exponential backoff
+  - Proper error handling (401, 404, 500)
+  - Context manager support
+
+- ✅ **NFO XML Generator** (`src/core/utils/nfo_generator.py` - 180 lines)
+  - Generate Kodi/XBMC XML from TVShowNFO models
+  - Handle all standard Kodi fields
+  - Support ratings, actors, images, unique IDs
+  - XML validation function
+  - Proper encoding (UTF-8)
+  - Handle special characters and Unicode
+
+- ✅ **Image Downloader** (`src/core/utils/image_downloader.py` - 296 lines)
+  - Download images from URLs
+  - Validate images using PIL (format, size)
+  - Retry logic with exponential backoff
+  - Skip existing files option
+  - Minimum file size check (1 KB)
+  - Download specific types: poster.jpg, logo.png, fanart.jpg
+  - Concurrent downloads via download_all_media()
+  - Proper error handling
+
+- ✅ **NFO Service** (`src/core/services/nfo_service.py` - 390 lines)
+  - Orchestrates TMDB client, NFO generator, and image downloader
+  - check_nfo_exists() - Check if tvshow.nfo exists
+  - create_tvshow_nfo() - Create NFO by scraping TMDB
+  - _find_best_match() - Match search results with year filter
+  - _tmdb_to_nfo_model() - Convert TMDB data to TVShowNFO model
+  - _download_media_files() - Download poster/logo/fanart
+  - Handle search ambiguity
+  - Proper error handling and logging
+
+### 2. Configuration (100%)
+- ✅ Added NFO settings to `src/config/settings.py`:
+  - TMDB_API_KEY: API key for TMDB access
+  - NFO_AUTO_CREATE: Auto-create NFOs when scanning (default: False)
+  - NFO_UPDATE_ON_SCAN: Update existing NFOs (default: False)
+  - NFO_DOWNLOAD_POSTER: Download poster.jpg (default: True)
+  - NFO_DOWNLOAD_LOGO: Download logo.png (default: True)
+  - NFO_DOWNLOAD_FANART: Download fanart.jpg (default: True)
+  - NFO_IMAGE_SIZE: Image size to download (default: "original")
+
+### 3. Dependencies (100%)
+- ✅ Updated `requirements.txt`:
+  - aiohttp>=3.9.0 (async HTTP client)
+  - lxml>=5.0.0 (XML generation/validation)
+  - pillow>=10.0.0 (image validation)
+- ✅ Installed in conda environment
+
+## ⚠️ Needs Refinement (20%)
+
+### 1. Unit Tests (40% complete, needs major updates)
+
+**Current Status:**
+- ✅ Test files created for all modules:
+  - `tests/unit/test_tmdb_client.py` (16 tests, all failing)
+  - `tests/unit/test_nfo_generator.py` (21 tests, 18 passing, 3 failing)
+  - `tests/unit/test_image_downloader.py` (23 tests, all failing)
+
+**Issues:**
+The tests were written based on assumptions about the API that don't match the actual implementation:
+
+1. **ImageDownloader Issues:**
+   - Tests assume context manager support (`__aenter__`), which is not implemented
+   - Tests assume a `_validate_image()` method; the actual method is `validate_image()` (no leading underscore)
+   - Tests assume a `session` attribute, but ImageDownloader creates sessions internally
+   - Tests try to mock `session.get()`, but the implementation calls `aiohttp.ClientSession()` directly inside the method
+   - Tests assume `ImageValidationError` exception, but only `ImageDownloadError` exists
+
+2. **NFO Generator Issues:**
+   - 3 tests failing due to XML validation logic differences
+   - Need to review actual lxml etree behavior
+
+3. **TMDB Client Issues:**
+   - Tests assume a `session` attribute for mocking; this still needs to be checked against the actual implementation
+   - Tests assume a `_make_request()` method; this still needs to be verified against the actual API
+
+**Refactoring Needed:**
+- Review actual implementation APIs
+- Update test mocks to match implementation
+- Consider adding context manager support to ImageDownloader
+- Simplify the test approach: use `@patch` on `aiohttp.ClientSession` instead of mocking internals
+- Add integration tests with real API calls (optional, for manual verification)
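+
+The `@patch` suggestion above could look roughly like the sketch below. It is not taken from the repository: `download_bytes` is a hypothetical stand-in for an `ImageDownloader` method that opens its own session, and `fake_aiohttp` is a placeholder namespace used only so the sketch runs without the dependency installed. In the real tests the patch target would be `aiohttp.ClientSession` as seen by the module under test.
+
+```python
+import asyncio
+import types
+from unittest.mock import AsyncMock, MagicMock, patch
+
+# Placeholder for the aiohttp module (assumption, keeps the sketch stdlib-only).
+fake_aiohttp = types.SimpleNamespace(ClientSession=None)
+
+
+async def download_bytes(url):
+    """Hypothetical method body: creates a session internally, as the report describes."""
+    async with fake_aiohttp.ClientSession() as session:
+        async with session.get(url) as resp:
+            resp.raise_for_status()
+            return await resp.read()
+
+
+def make_session_mock(payload):
+    """Build a replacement session class whose get() yields a canned response."""
+    resp = MagicMock()
+    resp.read = AsyncMock(return_value=payload)
+
+    # Object returned by session.get(...); it must work with "async with".
+    get_cm = MagicMock()
+    get_cm.__aenter__ = AsyncMock(return_value=resp)
+    get_cm.__aexit__ = AsyncMock(return_value=False)
+
+    session = MagicMock()
+    session.get.return_value = get_cm
+
+    # Object returned by ClientSession(); it must also work with "async with".
+    session_cm = MagicMock()
+    session_cm.__aenter__ = AsyncMock(return_value=session)
+    session_cm.__aexit__ = AsyncMock(return_value=False)
+
+    return MagicMock(return_value=session_cm), session
+
+
+session_cls, session = make_session_mock(b"fake-image-bytes")
+with patch.object(fake_aiohttp, "ClientSession", session_cls):
+    data = asyncio.run(download_bytes("https://example.invalid/poster.jpg"))
+
+session.get.assert_called_once_with("https://example.invalid/poster.jpg")
+```
+
+Because the session class itself is replaced, the tests do not need a `session` attribute on the object under test.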
+
+### 2. Integration with SerieList (Not started)
+
+**Needs Implementation:**
+- Integrate NFOService into SerieList scan process
+- Add auto-create logic based on NFO_AUTO_CREATE setting
+- Add update logic based on NFO_UPDATE_ON_SCAN setting
+- Test end-to-end NFO creation flow
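+
+A minimal sketch of the auto-create step, assuming the scan calls a hook once per series folder. `StubNFOService` and `ensure_nfo` are hypothetical names, and the async signatures of `check_nfo_exists()` and `create_tvshow_nfo()` are assumptions; only the method names and the `NFO_AUTO_CREATE` setting come from this report. Failures are logged and swallowed so that a TMDB error does not stop the scan.
+
+```python
+import asyncio
+import logging
+from pathlib import Path
+
+logger = logging.getLogger("nfo")
+
+
+class StubNFOService:
+    """Hypothetical stand-in for NFOService with assumed signatures."""
+
+    def __init__(self, fail_for=()):
+        self.fail_for = set(fail_for)
+        self.created = []
+
+    async def check_nfo_exists(self, folder: Path) -> bool:
+        return (folder / "tvshow.nfo").exists()
+
+    async def create_tvshow_nfo(self, name: str, folder: Path) -> None:
+        if name in self.fail_for:
+            raise RuntimeError("TMDB request failed")
+        self.created.append(name)
+
+
+async def ensure_nfo(service, name: str, folder: Path, auto_create: bool) -> bool:
+    """Create a missing NFO if enabled. Returns True only when one was created."""
+    if not auto_create:
+        return False
+    try:
+        if await service.check_nfo_exists(folder):
+            return False
+        await service.create_tvshow_nfo(name, folder)
+        return True
+    except Exception:
+        # Log with traceback and keep scanning instead of propagating the error.
+        logger.exception("NFO creation failed for %s; continuing scan", name)
+        return False
+```
+
+The same hook could take a second flag for `NFO_UPDATE_ON_SCAN` once the update path exists.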
+
+### 3. CLI Commands (Not started)
+
+**Optional Enhancement:**
+Add CLI commands for NFO management:
+```bash
+# Create NFO for specific series
+python src/cli/Main.py nfo create "Attack on Titan" --year 2013
+
+# Update existing NFO
+python src/cli/Main.py nfo update "Attack on Titan"
+
+# Bulk create for all series
+python src/cli/Main.py nfo create-all
+
+# Check NFO status
+python src/cli/Main.py nfo status
+```
+
+## 📊 Coverage Status
+
+**Current:**
+- `src/core/services/tmdb_client.py`: 0% (all 16 tests failing)
+- `src/core/utils/nfo_generator.py`: 0% reported (3 of 21 tests failing)
+- `src/core/utils/image_downloader.py`: 0% (all 23 tests failing)
+- `src/core/services/nfo_service.py`: Not tested yet
+
+**Target:**
+- All modules: > 85% coverage
+
+## 🔧 Next Steps (Priority Order)
+
+### High Priority
+1. **Fix Unit Tests** (2-3 hours)
+   - Update test_image_downloader.py to match actual API
+   - Fix test_nfo_generator.py validation tests
+   - Update test_tmdb_client.py mocking strategy
+   - Add test_nfo_service.py with comprehensive tests
+   - Run tests and achieve > 85% coverage
+
+2. **Manual Integration Testing** (1 hour)
+   - Create test script to verify TMDB client with real API
+   - Test NFO generation with sample data
+   - Test image downloads
+   - Verify generated NFO is valid Kodi format
+
+### Medium Priority
+3. **Integrate with SerieList** (1-2 hours)
+   - Add NFOService to SerieList.load_series()
+   - Implement auto-create logic
+   - Implement update logic
+   - Add logging for NFO operations
+   - Test with existing series folders
+
+4. **CLI Commands** (1-2 hours, optional)
+   - Create nfo_commands.py module
+   - Implement create, update, status commands
+   - Add to CLI menu
+   - Test commands
+
+### Low Priority
+5. **Documentation** (30 minutes)
+   - Document TMDB API setup (getting API key)
+   - Document NFO file format and Kodi compatibility
+   - Add examples to README
+   - Update ARCHITECTURE.md with NFO components
+
+6. **API Endpoints** (Future, separate task)
+   - POST /api/series/{id}/nfo - Create/update NFO
+   - GET /api/series/{id}/nfo - Get NFO status
+   - DELETE /api/series/{id}/nfo - Delete NFO
+
+## 🐛 Known Issues
+
+1. **NFOService.update_tvshow_nfo()** - Not implemented
+   - Marked with `raise NotImplementedError`
+   - Need to parse existing NFO to extract TMDB ID
+   - Then refetch and regenerate
+
+2. **Test Failures** - See "Unit Tests" section above
+
+3. **No Error Recovery** - TMDB API failures during a scan are not handled yet
+   - Need to handle gracefully
+   - Don't block scan if NFO creation fails
+   - Log errors but continue
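+
+For issue 1, extracting the TMDB ID from an existing file might be sketched as follows. This assumes the generated NFO stores the ID in a Kodi-style `<uniqueid type="tmdb">` element; `extract_tmdb_id` is a hypothetical helper, and the IDs in the sample are placeholders, not real TMDB or TVDB values.
+
+```python
+import xml.etree.ElementTree as ET
+from typing import Optional
+
+
+def extract_tmdb_id(nfo_xml: str) -> Optional[int]:
+    """Return the TMDB ID found in a tvshow.nfo document, or None if absent."""
+    try:
+        root = ET.fromstring(nfo_xml)
+    except ET.ParseError:
+        # Malformed NFO: treat it the same as a missing ID.
+        return None
+    for node in root.iter("uniqueid"):
+        text = (node.text or "").strip()
+        if node.get("type") == "tmdb" and text.isdigit():
+            return int(text)
+    return None
+
+
+# Placeholder document; the numeric IDs are made up for illustration.
+sample = """<tvshow>
+  <title>Attack on Titan</title>
+  <uniqueid type="tvdb">67890</uniqueid>
+  <uniqueid type="tmdb" default="true">12345</uniqueid>
+</tvshow>"""
+
+tmdb_id = extract_tmdb_id(sample)
+```
+
+With the ID in hand, `update_tvshow_nfo()` could refetch the show details and regenerate the file, falling back to a name search when the function returns `None`.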
+
+## 📝 Testing Checklist
+
+Once tests are fixed, verify:
+
+- [ ] TMDBClient can search for shows
+- [ ] TMDBClient handles year filtering
+- [ ] TMDBClient gets detailed show info
+- [ ] TMDBClient downloads images
+- [ ] TMDBClient handles rate limits
+- [ ] TMDBClient handles API errors
+- [ ] NFO generator creates valid XML
+- [ ] NFO generator handles Unicode
+- [ ] NFO generator escapes special chars
+- [ ] ImageDownloader validates images
+- [ ] ImageDownloader retries on failure
+- [ ] ImageDownloader skips existing files
+- [ ] NFOService creates complete NFO
+- [ ] NFOService downloads all media
+- [ ] NFOService handles missing images
+- [ ] All tests pass with > 85% coverage
+
+## 💡 Recommendations
+
+### Immediate Actions
+1. Invest time in fixing tests - they provide essential validation
+2. Add simple integration test script for manual verification
+3. Test with a few real anime series to validate Kodi compatibility
+
+### Architecture Improvements
+1. Consider adding context manager to ImageDownloader for consistency
+2. Add more detailed logging in NFOService for debugging
+3. Consider caching TMDB results more aggressively
+
+### Future Enhancements
+1. Support for episode-level NFO files (episodedetails)
+2. Support for season-level NFO files
+3. Background task for bulk NFO creation
+4. Web UI for NFO management
+5. TMDB language/region settings
+6. Fallback to TVDB if TMDB fails
+
+## 🎯 Completion Criteria
+
+Task 3 will be considered complete when:
+- ✅ All core components implemented (DONE)
+- ✅ Configuration added (DONE)
+- ✅ Dependencies installed (DONE)
+- ⚠️ Unit tests pass with > 85% coverage (PENDING)
+- ⚠️ Integration with SerieList (PENDING)
+- ⚠️ Manual testing validates Kodi compatibility (PENDING)
+- ⚠️ Documentation updated (PENDING)
+
+## ⏱️ Estimated Time to Complete
+- Fix tests: 2-3 hours
+- Integration: 1-2 hours
+- Documentation: 30 minutes
+- **Total: 3.5-5.5 hours**
+
+---
+
+**Last Updated:** 2026-01-11  
+**Status:** 80% Complete - Core infrastructure done, tests need fixing