refactor(scheduler): drop separate scheduler.db in favour of MemoryJobStore
Scheduler used a separate SQLite file (scheduler.db) only to persist one cron job. This was originally required because APScheduler's SQLAlchemyJobStore is sync-only, creating an async/sync driver conflict when accessing the same file. The job is rebuilt from config.json on every startup regardless (replace_existing=True), so the persisted state only served misfire detection. Moved misfire detection into the app layer by querying system_settings.last_scan_timestamp on startup: if the last scan is >23h but <25h ago, an immediate rescan is triggered. Change summary: - Remove SQLAlchemyJobStore; use default MemoryJobStore instead - Add _check_missed_run() that reads last_scan_timestamp from aniworld.db - Update docs/DEVELOPMENT.md scheduler troubleshooting section - Update the scheduler unit test that verified SQLAlchemyJobStore
This commit is contained in:
@@ -162,24 +162,21 @@ await client.close() # May not be called if exception raised earlier
|
||||
|
||||
### Scheduler Persistence and Recovery
|
||||
|
||||
APScheduler stores jobs in `data/scheduler.db` (SQLite) so they survive process restarts:
|
||||
The scheduler uses APScheduler's in-memory job store. Jobs are reconstructed from `config.json` on every startup — no separate database is needed.
|
||||
|
||||
```python
|
||||
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
|
||||
|
||||
jobstores = {
|
||||
"default": SQLAlchemyJobStore(url="sqlite:///./data/scheduler.db"),
|
||||
}
|
||||
scheduler = AsyncIOScheduler(jobstores=jobstores)
|
||||
# Jobs are built from config on startup — no persistence DB required
|
||||
scheduler = AsyncIOScheduler() # default MemoryJobStore
|
||||
scheduler.add_job(..., replace_existing=True)
|
||||
```
|
||||
|
||||
**Grace period:** `misfire_grace_time=3600` (1 hour). If server is down at scheduled time and restarts within 1 hour, missed job runs automatically via APScheduler coalesce behavior.
|
||||
**Startup misfire recovery:** On `start()`, the scheduler checks `system_settings.last_scan_timestamp` in `aniworld.db`. If the last scan is overdue (>23h but <25h ago), an immediate rescan is triggered. This replaces APScheduler's built-in misfire handling which required a separate SQLite database.
|
||||
|
||||
**Startup recovery:** On `start()`, scheduler loads persisted jobs from DB. APScheduler handles missed jobs internally when `coalesce=True`.
|
||||
**Grace period:** If the server was down for more than 25 hours, no automatic recovery occurs to avoid surprise rescans after long downtime.
|
||||
|
||||
**Health endpoint:** `GET /health` returns `scheduler_next_run` and `scheduler_last_run` for external monitors (Uptime Kuma, Prometheus, etc.).
|
||||
|
||||
**If server is down >1 hour:** No automatic recovery. Manual trigger via `POST /api/scheduler/trigger-rescan` or wait for next scheduled run.
|
||||
**If server is down too long:** Manual trigger via `POST /api/scheduler/trigger-rescan` or wait for next scheduled run.
|
||||
|
||||
### Database Session Management
|
||||
|
||||
@@ -257,30 +254,27 @@ DNS checks are warnings because failures can be transient. anime_directory error
|
||||
#### Scheduler missed a run
|
||||
|
||||
1. Server was down at scheduled time (03:00 UTC by default).
|
||||
2. Check `data/scheduler.db` exists — if not, jobs are not persisted.
|
||||
3. If server was down >1 hour, missed job is dropped (misfire window exceeded).
|
||||
2. On restart, the scheduler checks `last_scan_timestamp` — if overdue by 23-25h, it triggers immediately.
|
||||
3. If server was down >25 hours, missed job is skipped to avoid surprise rescans.
|
||||
4. Trigger manually: `POST /api/scheduler/trigger-rescan`
|
||||
5. Monitor next run: `GET /health` → `scheduler_next_run`
|
||||
6. If problem repeats, increase `misfire_grace_time` in `scheduler_service.py`.
|
||||
|
||||
#### Scheduler not firing (no events at scheduled time)
|
||||
|
||||
If the scheduler appears configured but never triggers:
|
||||
|
||||
1. **Verify scheduler.db contains the job:**
|
||||
```bash
|
||||
sqlite3 data/scheduler.db "SELECT id, next_run_time FROM apscheduler_jobs;"
|
||||
```
|
||||
- `next_run_time` should be in the future
|
||||
- If it's in the past, the server was down when the job should have fired
|
||||
|
||||
2. **Check application logs for scheduler startup:**
|
||||
1. **Check application logs for scheduler startup:**
|
||||
```
|
||||
grep "Scheduler service started" fastapi_app.log
|
||||
```
|
||||
- If missing, the scheduler failed to start — check for errors above this line
|
||||
- If present, scheduler started successfully
|
||||
|
||||
2. **Verify the job is registered:**
|
||||
```
|
||||
grep "Scheduler started with cron trigger" fastapi_app.log
|
||||
```
|
||||
|
||||
3. **Verify APScheduler events in logs:**
|
||||
```
|
||||
grep "apscheduler.executors.default" fastapi_app.log
|
||||
|
||||
Reference in New Issue
Block a user