Add startup health checks and /health/ready endpoint
- Add _run_startup_health_checks() function in fastapi_app.py - Check ffmpeg availability (warning) - Check DNS resolution for aniworld.to and api.themoviedb.org (warning) - Check anime_directory configuration and writability (error) - Store startup checks in app.state for health endpoint access - Add /health/ready endpoint for container orchestrators - Returns not_ready with 503 when critical failures present - Includes critical_failures list for debugging - Update /health endpoint to include startup check results - Status reflects worst check (error > warning > ok) - Document health check endpoints in DEVELOPMENT.md - Add unit tests for startup health checks - Add unit tests for /health/ready endpoint Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
@@ -165,6 +165,61 @@ scheduler = AsyncIOScheduler(jobstores=jobstores)
|
||||
|
||||
**If server is down >1 hour:** No automatic recovery. Manual trigger via `POST /api/scheduler/trigger-rescan` or wait for next scheduled run.
|
||||
|
||||
### Health Check Endpoints
|
||||
|
||||
The application provides health check endpoints for monitoring and container orchestration:
|
||||
|
||||
#### `GET /health`
|
||||
Basic health check returning service status and startup health check results.
|
||||
|
||||
**Response fields:**
|
||||
- `status`: "healthy", "degraded", or "unhealthy" based on startup checks
|
||||
- `timestamp`: ISO timestamp of the check
|
||||
- `series_app_initialized`: Whether the series app is loaded
|
||||
- `anime_directory_configured`: Whether anime_directory is set
|
||||
- `scheduler_next_run` / `scheduler_last_run`: Scheduler times
|
||||
- `checks`: Detailed startup check results (ffmpeg, DNS, anime_directory)
|
||||
|
||||
#### `GET /health/ready`
|
||||
Readiness check for container orchestrators (Kubernetes, Docker Swarm).
|
||||
|
||||
**Response when ready:**
|
||||
```json
|
||||
{
|
||||
"status": "ready",
|
||||
"ready": true,
|
||||
"timestamp": "2024-01-01T00:00:00",
|
||||
"checks": {...}
|
||||
}
|
||||
```
|
||||
|
||||
**Response when not ready (503):**
|
||||
```json
|
||||
{
|
||||
"status": "not_ready",
|
||||
"ready": false,
|
||||
"timestamp": "2024-01-01T00:00:00",
|
||||
"critical_failures": ["anime_directory: not configured"],
|
||||
"checks": {...}
|
||||
}
|
||||
```
|
||||
|
||||
#### `GET /health/detailed`
|
||||
Comprehensive health check including database, filesystem, and system metrics.
|
||||
|
||||
#### Startup Health Checks
|
||||
|
||||
On application startup, the following checks are performed:
|
||||
|
||||
| Check | Failure Status | Impact |
|
||||
|-------|---------------|--------|
|
||||
| `ffmpeg` | warning | HLS downloads may fail |
|
||||
| `dns_aniworld` | warning | Provider requests may fail |
|
||||
| `dns_tmdb` | warning | TMDB API calls may fail |
|
||||
| `anime_directory` | error | Download service disabled |
|
||||
|
||||
DNS checks are warnings because failures can be transient. anime_directory errors disable the download service to prevent failures.
|
||||
|
||||
### Troubleshooting Development Issues
|
||||
|
||||
#### Scheduler missed a run
|
||||
@@ -175,3 +230,21 @@ scheduler = AsyncIOScheduler(jobstores=jobstores)
|
||||
4. Trigger manually: `POST /api/scheduler/trigger-rescan`
|
||||
5. Monitor next run: `GET /health` → `scheduler_next_run`
|
||||
6. If problem repeats, increase `misfire_grace_time` in `scheduler_service.py`.
|
||||
|
||||
#### Startup health check failures
|
||||
|
||||
If `/health` returns `unhealthy` status:
|
||||
|
||||
1. **anime_directory error**: Directory not configured or not writable
|
||||
- Check `ANIME_DIRECTORY` environment variable
|
||||
- Verify directory exists and permissions allow write access
|
||||
- Download service will not initialize until resolved
|
||||
|
||||
2. **ffmpeg warning**: ffmpeg not found in PATH
|
||||
- HLS stream downloads will fail
|
||||
- Install ffmpeg: `apt install ffmpeg` or `brew install ffmpeg`
|
||||
|
||||
3. **DNS warnings**: Domain resolution failed
|
||||
- Check network connectivity
|
||||
- DNS failures are transient — warnings don't block startup
|
||||
- Retry later to verify: `GET /health`
|
||||
|
||||
Reference in New Issue
Block a user