Series Identifier Standardization - Validation Instructions

Overview

This document provides comprehensive instructions for AI agents to validate the Series Identifier Standardization change across the Aniworld codebase. The change standardizes key as the primary identifier for series and relegates folder to metadata-only status.

Summary of the Change

| Field | Purpose | Usage |
|-------|---------|-------|
| key | Primary Identifier - Provider-assigned, URL-safe (e.g., attack-on-titan) | All lookups, API operations, database queries, WebSocket events |
| folder | Metadata Only - Filesystem folder name (e.g., Attack on Titan (2013)) | Display purposes, filesystem operations only |
| id | Database Primary Key - Internal auto-increment integer | Database relationships only |
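
The roles of the three fields can be illustrated with a minimal sketch. The `SeriesRecord` class, its validation rule, and the `index_by_key` helper are hypothetical names chosen for illustration and are not taken from the Aniworld codebase; the sketch only shows that lookups go through key while folder and id ride along as data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class SeriesRecord:
    """Hypothetical record illustrating the key/folder/id convention."""

    key: str                  # primary identifier, e.g. "attack-on-titan"
    folder: str               # metadata only, e.g. "Attack on Titan (2013)"
    id: Optional[int] = None  # database primary key, relationships only

    def __post_init__(self) -> None:
        # Assumption: an empty or whitespace-only key is treated as invalid.
        if not isinstance(self.key, str) or not self.key.strip():
            raise ValueError("key must be a non-empty string")


def index_by_key(records: Iterable[SeriesRecord]) -> Dict[str, SeriesRecord]:
    """Build a lookup table addressed by key, never by folder."""
    return {record.key: record for record in records}


catalog = index_by_key(
    [SeriesRecord(key="attack-on-titan", folder="Attack on Titan (2013)")]
)
display_name = catalog["attack-on-titan"].folder  # folder is read for display only
```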

Validation Checklist

Phase 2: Application Layer Services

Files to validate:

  1. src/server/services/anime_service.py

    • Class docstring explains key vs folder convention
    • All public methods accept key parameter for series identification
    • No methods accept folder as an identifier parameter
    • Event handler methods document key/folder convention
    • Progress tracking uses key in progress IDs where possible
  2. src/server/services/download_service.py

    • DownloadItem uses serie_id (which should be the key)
    • serie_folder is documented as metadata only
    • Queue operations look up series by key not folder
    • Persistence format includes serie_id as the key identifier
  3. src/server/services/websocket_service.py

    • Module docstring explains key/folder convention
    • Broadcast methods include key in message payloads
    • folder is documented as optional/display only
    • Event broadcasts use key as primary identifier
  4. src/server/services/scan_service.py

    • Scan operations use key for identification
    • Progress events include key field
  5. src/server/services/progress_service.py

    • Progress tracking includes key in metadata where applicable
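
As a rough illustration of what the download service checks are looking for, the sketch below models a queue whose items carry serie_id (the key) and whose lookups never compare serie_folder. `QueueItem`, `DownloadQueue`, and the `episode` field are hypothetical stand-ins, not the project's actual DownloadItem or queue implementation.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class QueueItem:
    """Hypothetical stand-in for a download queue entry."""

    serie_id: str                       # the series key, not the database id
    episode: int                        # assumed field, for illustration only
    serie_folder: Optional[str] = None  # metadata only, used for display


class DownloadQueue:
    def __init__(self) -> None:
        self._items: List[QueueItem] = []

    def add(self, item: QueueItem) -> None:
        self._items.append(item)

    def items_for_series(self, key: str) -> List[QueueItem]:
        # The lookup matches on serie_id (the key); serie_folder is never compared.
        return [item for item in self._items if item.serie_id == key]

    def to_persisted(self) -> List[Dict[str, object]]:
        # The persisted form keeps serie_id so entries can be resolved by key later.
        return [asdict(item) for item in self._items]


queue = DownloadQueue()
queue.add(QueueItem("attack-on-titan", 1, "Attack on Titan (2013)"))
queue.add(QueueItem("attack-on-titan", 2))
matches = queue.items_for_series("attack-on-titan")
persisted = queue.to_persisted()
```

A reviewer can apply the same reading to the real service: any comparison against a folder value inside a lookup path is a candidate violation.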

Validation Commands:

# Check service layer for folder-based lookups
grep -rn "by_folder\|folder.*=.*identifier\|folder.*lookup" src/server/services/ --include="*.py"

# Verify key is used in services
grep -rn "serie_id\|series_key\|key.*identifier" src/server/services/ --include="*.py"

Phase 3: API Endpoints and Responses

Files to validate:

  1. src/server/api/anime.py

    • AnimeSummary model has key field with proper description
    • AnimeDetail model has key field with proper description
    • API docstrings explain key is the primary identifier
    • folder field descriptions state "metadata only"
    • Endpoint paths use key parameter (e.g., /api/anime/{key})
    • No endpoints use folder as path parameter for lookups
  2. src/server/api/download.py

    • Download endpoints use serie_id (key) for operations
    • Request models document key/folder convention
    • Response models include key as primary identifier
  3. src/server/models/anime.py

    • Module docstring explains identifier convention
    • AnimeSeriesResponse has key field properly documented
    • SearchResult has key field properly documented
    • Field validators normalize key to lowercase
    • folder fields document metadata-only purpose
  4. src/server/models/download.py

    • DownloadItem has serie_id documented as the key
    • serie_folder documented as metadata only
    • Field descriptions are clear about primary vs metadata
  5. src/server/models/websocket.py

    • Module docstring explains key/folder convention
    • Message models document key as primary identifier
    • folder documented as optional display metadata
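
The model checks above (described fields, lowercase normalization of key, folder marked as metadata only) can be sketched with the standard library. This uses `dataclasses` field metadata as a stand-in for the project's real model definitions, which are not shown in this document; `AnimeSummarySketch`, `normalize_key`, and `describe` are hypothetical names.

```python
from dataclasses import dataclass, field, fields
from typing import Dict


def normalize_key(value: str) -> str:
    """Assumed normalization: trim whitespace and lowercase the key."""
    if not isinstance(value, str) or not value.strip():
        raise ValueError("key must be a non-empty string")
    return value.strip().lower()


@dataclass
class AnimeSummarySketch:
    key: str = field(
        metadata={"description": "Primary identifier (provider-assigned, URL-safe)"}
    )
    name: str = field(metadata={"description": "Display name"})
    folder: str = field(
        default="",
        metadata={"description": "Filesystem folder name (metadata only)"},
    )

    def __post_init__(self) -> None:
        self.key = normalize_key(self.key)


def describe(model) -> Dict[str, str]:
    """Collect field descriptions so a reviewer can check their wording."""
    return {f.name: f.metadata.get("description", "") for f in fields(model)}


summary = AnimeSummarySketch(key=" Attack-On-Titan ", name="Attack on Titan")
descriptions = describe(AnimeSummarySketch)
folder_is_marked = "metadata only" in descriptions["folder"]
```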

Validation Commands:

# Check API endpoints for folder-based paths
grep -rn "folder.*Path\|/{folder}" src/server/api/ --include="*.py"

# Verify key is used in endpoints
grep -rn "/{key}\|series_key\|serie_id" src/server/api/ --include="*.py"

# Check model field descriptions
grep -rn "Field.*description.*identifier\|Field.*description.*key\|Field.*description.*folder" src/server/models/ --include="*.py"

Phase 4: Frontend Integration

Files to validate:

  1. src/server/web/static/js/app.js

    • selectedSeries Set uses key values, not folder
    • seriesData array comments indicate key as primary identifier
    • Selection operations use key property
    • API calls pass key for series identification
    • WebSocket message handlers extract key from data
    • No code uses folder for series lookups
  2. src/server/web/static/js/queue.js

    • Queue items reference series by key or serie_id
    • WebSocket handlers extract key from messages
    • UI operations use key for identification
    • serie_folder used only for display
  3. src/server/web/static/js/websocket_client.js

    • Message handling preserves key field
    • No transformation that loses key information
  4. HTML Templates (src/server/web/templates/)

    • Data attributes use key for identification (e.g., data-key)
    • No data-folder used for identification purposes
    • Display uses folder or name appropriately

Validation Commands:

# Check JavaScript for folder-based lookups
grep -rn "\.folder\s*==\|folder.*identifier\|getByFolder" src/server/web/static/js/ --include="*.js"

# Check data attributes in templates
grep -rn "data-key\|data-folder\|data-series" src/server/web/templates/ --include="*.html"
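
Where a tally is more convenient than raw grep output, the template check can be approximated in Python. The sketch below is an assumption-laden helper, not part of the project: it counts data-key and data-folder attributes in a template string using a simple regular expression and does not parse HTML properly.

```python
import re
from typing import Dict

# Matches data-key="..." and data-folder="..." attributes with double quotes.
DATA_ATTR = re.compile(r'\bdata-(key|folder)\s*=\s*"([^"]*)"')


def audit_template(html: str) -> Dict[str, int]:
    """Return how many data-key and data-folder attributes a template contains."""
    counts = {"key": 0, "folder": 0}
    for match in DATA_ATTR.finditer(html):
        counts[match.group(1)] += 1
    return counts


# Hypothetical template fragments for illustration.
current = '<div class="series-card" data-key="attack-on-titan">Attack on Titan (2013)</div>'
legacy = '<div class="series-card" data-folder="Attack on Titan (2013)"></div>'

current_counts = audit_template(current)
legacy_counts = audit_template(legacy)
```

A non-zero folder count is only a prompt for manual review, since data-folder may legitimately be present for display.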

Phase 5: Database Operations

Files to validate:

  1. src/server/database/models.py

    • AnimeSeries model has key column with unique constraint
    • key column is indexed
    • Model docstring explains identifier convention
    • folder column docstring states "metadata only"
    • Validators check key is not empty
    • No folder uniqueness constraint (unless intentional)
  2. src/server/database/service.py

    • AnimeSeriesService has get_by_key() method
    • Class docstring explains lookup convention
    • No get_by_folder() without deprecation
    • All CRUD operations use key for identification
    • Logging uses key in messages
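
The database expectations can be demonstrated in isolation with the standard-library `sqlite3` module. This is a sketch under stated assumptions: the table name, column layout, and helper functions are hypothetical, and the project's own models in src/server/database/models.py may be defined differently. In SQLite, a UNIQUE constraint also creates an index on the column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE anime_series (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        "key" TEXT NOT NULL UNIQUE CHECK (length(trim("key")) > 0),
        folder TEXT NOT NULL
    )
    """
)


def add_series(conn: sqlite3.Connection, key: str, folder: str) -> int:
    cursor = conn.execute(
        'INSERT INTO anime_series ("key", folder) VALUES (?, ?)', (key, folder)
    )
    return cursor.lastrowid


def get_by_key(conn: sqlite3.Connection, key: str):
    """Look up a series by key; folder is returned but never filtered on."""
    return conn.execute(
        'SELECT id, "key", folder FROM anime_series WHERE "key" = ?', (key,)
    ).fetchone()


add_series(conn, "attack-on-titan", "Attack on Titan (2013)")

# A second row with the same key must be rejected by the unique constraint.
try:
    add_series(conn, "attack-on-titan", "Some Other Folder")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The same folder under a different (hypothetical) key is allowed,
# because folder carries no uniqueness constraint.
add_series(conn, "example-second-key", "Attack on Titan (2013)")
row = get_by_key(conn, "attack-on-titan")
```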

Validation Commands:

# Check database models
grep -rn "unique=True\|index=True" src/server/database/models.py

# Check service lookups
grep -rn "get_by_key\|get_by_folder\|filter.*key\|filter.*folder" src/server/database/service.py

Phase 6: WebSocket Events

Files to validate:

  1. All WebSocket broadcast calls should include key in payload:

    • download_progress → includes key
    • download_complete → includes key
    • download_failed → includes key
    • scan_progress → includes key (where applicable)
    • queue_status → items include key
  2. Message format validation:

    {
      "type": "download_progress",
      "data": {
        "key": "attack-on-titan",      // PRIMARY - always present
        "folder": "Attack on Titan (2013)",  // OPTIONAL - display only
        "progress": 45.5,
        ...
      }
    }
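
The format above can be checked mechanically. The function below is a minimal sketch, not the project's validator: `validate_message` is a hypothetical name, and it only covers single-series events shaped like the example (queue_status, whose items carry the key, would need a separate check).

```python
from typing import Any, Dict, List


def validate_message(message: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the payload conforms."""
    data = message.get("data")
    if not isinstance(data, dict):
        return ["data must be an object"]

    problems: List[str] = []

    # key is the primary identifier and must always be present.
    key = data.get("key")
    if not isinstance(key, str) or not key.strip():
        problems.append("data.key must be a non-empty string")

    # folder is optional and display-only, but must be a string when present.
    folder = data.get("folder")
    if folder is not None and not isinstance(folder, str):
        problems.append("data.folder must be a string when present")

    return problems


good = {
    "type": "download_progress",
    "data": {"key": "attack-on-titan", "folder": "Attack on Titan (2013)", "progress": 45.5},
}
missing_key = {"type": "download_progress", "data": {"folder": "Attack on Titan (2013)"}}

good_problems = validate_message(good)
missing_key_problems = validate_message(missing_key)
```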
    

Validation Commands:

# Check WebSocket broadcast calls
grep -rn "broadcast.*key\|send_json.*key" src/server/services/ --include="*.py"

# Check message construction
grep -rn '"key":\|"folder":' src/server/services/ --include="*.py"

Phase 7: Test Coverage

Test files to validate:

  1. tests/unit/test_serie_class.py

    • Tests for key validation (empty, whitespace, None)
    • Tests for key as primary identifier
    • Tests for folder as metadata only
  2. tests/unit/test_anime_service.py

    • Service tests use key for operations
    • Mock objects have proper key attributes
  3. tests/unit/test_database_models.py

    • Tests for key uniqueness constraint
    • Tests for key validation
  4. tests/unit/test_database_service.py

    • Tests for get_by_key() method
    • No tests for deprecated folder lookups
  5. tests/api/test_anime_endpoints.py

    • API tests use key in requests
    • Mock FakeSerie has proper key attribute
    • Comments explain key/folder convention
  6. tests/unit/test_websocket_service.py

    • WebSocket tests verify key in messages
    • Broadcast tests include key in payload
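
For reference, the key validation cases listed for tests/unit/test_serie_class.py (empty, whitespace, None) might look like the sketch below. It uses `unittest` so that it is self-contained; pytest also collects `unittest.TestCase` classes. The `validate_key` function is a hypothetical stand-in for whatever validation the real Serie class performs.

```python
import unittest


def validate_key(key):
    """Hypothetical validator: reject None, empty, and whitespace-only keys."""
    if not isinstance(key, str) or not key.strip():
        raise ValueError("key must be a non-empty string")
    return key.strip()


class KeyValidationTests(unittest.TestCase):
    def test_rejects_empty_whitespace_and_none(self):
        for bad in ("", "   ", None):
            with self.subTest(key=bad):
                with self.assertRaises(ValueError):
                    validate_key(bad)

    def test_accepts_url_safe_key(self):
        self.assertEqual(validate_key("attack-on-titan"), "attack-on-titan")


# Run the suite programmatically so the example does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(KeyValidationTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```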

Validation Commands:

# Run all tests
conda run -n AniWorld python -m pytest tests/ -v --tb=short

# Run specific test files
conda run -n AniWorld python -m pytest tests/unit/test_serie_class.py -v
conda run -n AniWorld python -m pytest tests/unit/test_database_models.py -v
conda run -n AniWorld python -m pytest tests/api/test_anime_endpoints.py -v

# Search tests for identifier usage
grep -rn "key.*identifier\|folder.*metadata" tests/ --include="*.py"

Common Issues to Check

1. Inconsistent Naming

Look for inconsistent parameter names:

  • serie_key vs series_key vs key
  • serie_id should refer to key, not database id
  • serie_folder vs folder
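
A quick way to surface mixed spellings is to count each variant in a source file. The helper below is a hypothetical sketch operating on a string; the sample source is invented for illustration and does not come from the codebase.

```python
import re
from collections import Counter

# Parameter-name variants called out in the list above.
NAME_VARIANTS = re.compile(r"\b(serie_key|series_key|serie_id|serie_folder)\b")


def count_identifier_names(source: str) -> Counter:
    """Count how often each naming variant appears in a piece of source code."""
    return Counter(NAME_VARIANTS.findall(source))


sample = '''
def queue_download(serie_id, serie_folder): ...
def remove(series_key): ...
def refresh(serie_key): ...
'''

counts = count_identifier_names(sample)
# Both spellings in one file suggest the naming needs to be unified.
mixed_spelling = counts["serie_key"] > 0 and counts["series_key"] > 0
```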

2. Missing Documentation

Check that ALL models, services, and APIs document:

  • What key is and how to use it
  • That folder is metadata only

3. Legacy Code Patterns

Search for deprecated patterns:

# Bad - using folder for lookup
series = get_by_folder(folder_name)

# Good - using key for lookup
series = get_by_key(series_key)

4. API Response Consistency

Verify all API responses include:

  • key field (primary identifier)
  • folder field (optional, for display)

5. Frontend Data Flow

Verify the frontend:

  • Stores key in selection sets
  • Passes key to API calls
  • Uses folder only for display

Deprecation Warnings

The following should have deprecation warnings (for removal in v3.0.0):

  1. Any get_by_folder() or GetByFolder() methods
  2. Any API endpoints that accept folder as a lookup parameter
  3. Any frontend code that uses folder for identification

Example deprecation:

import warnings

def get_by_folder(self, folder: str):
    """DEPRECATED: Use get_by_key() instead."""
    warnings.warn(
        "get_by_folder() is deprecated, use get_by_key(). "
        "Will be removed in v3.0.0",
        DeprecationWarning,
        stacklevel=2
    )
    # ... implementation
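
To confirm that such a warning is actually emitted, a check along the following lines can be used. The `LegacyLookup` class and its in-memory data are hypothetical; only the `warnings` usage mirrors the example above.

```python
import warnings


class LegacyLookup:
    """Hypothetical service with one current and one deprecated lookup."""

    def __init__(self, series_by_key):
        self._series_by_key = series_by_key

    def get_by_key(self, key):
        return self._series_by_key.get(key)

    def get_by_folder(self, folder):
        """DEPRECATED: Use get_by_key() instead."""
        warnings.warn(
            "get_by_folder() is deprecated, use get_by_key(). "
            "Will be removed in v3.0.0",
            DeprecationWarning,
            stacklevel=2,
        )
        for series in self._series_by_key.values():
            if series["folder"] == folder:
                return series
        return None


service = LegacyLookup(
    {"attack-on-titan": {"key": "attack-on-titan", "folder": "Attack on Titan (2013)"}}
)

# Record warnings instead of printing them, and make sure none are filtered out.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    found = service.get_by_folder("Attack on Titan (2013)")

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
```

The same assertion can be written with pytest's `pytest.warns(DeprecationWarning)` context manager in the project's test suite.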

Automated Validation Script

Run this script to perform automated checks:

#!/bin/bash
# identifier_validation.sh

echo "=== Series Identifier Standardization Validation ==="
echo ""

echo "1. Checking core entities..."
grep -rn "PRIMARY IDENTIFIER\|metadata only" src/core/entities/ --include="*.py" | head -20

echo ""
echo "2. Checking for deprecated folder lookups..."
grep -rn "get_by_folder\|GetByFolder\|getByFolder" src/ --include="*.py" --include="*.js"

echo ""
echo "3. Checking API models for key field..."
grep -rn 'key.*Field\|Field.*key' src/server/models/ --include="*.py" | head -20

echo ""
echo "4. Checking database models..."
grep -rn "key.*unique\|key.*index" src/server/database/models.py

echo ""
echo "5. Checking frontend key usage..."
grep -rn "selectedSeries\|\.key\|data-key" src/server/web/static/js/ --include="*.js" | head -20

echo ""
echo "6. Running tests..."
conda run -n AniWorld python -m pytest tests/unit/test_serie_class.py -v --tb=short

echo ""
echo "=== Validation Complete ==="

Expected Results

After validation, you should confirm:

  1. All core entities use key as primary identifier
  2. All services look up series by key
  3. All API endpoints use key for operations
  4. All database queries use key for lookups
  5. Frontend uses key for selection and API calls
  6. WebSocket events include key in payload
  7. All tests pass
  8. Documentation clearly explains the convention
  9. Deprecation warnings exist for legacy patterns

Sign-off

Once validation is complete, update this section:

  • Phase 1: Core Entities - Validated by: AI Agent Date: 28 Nov 2025
  • Phase 2: Services - Validated by: AI Agent Date: 28 Nov 2025
  • Phase 3: API - Validated by: ___ Date: ___
  • Phase 4: Frontend - Validated by: ___ Date: ___
  • Phase 5: Database - Validated by: ___ Date: ___
  • Phase 6: WebSocket - Validated by: ___ Date: ___
  • Phase 7: Tests - Validated by: ___ Date: ___

Final Approval: ___ Date: ___