BanGUI/Docs/Tasks.md
Lukas d5a78a251a Remove Tasks.md spec, add test for _cleanup_wal_files skipping recent files
Remove 335-line task specification from Docs/Tasks.md.
Add test confirming _cleanup_wal_files skips recently-modified WAL/SHM files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-24 22:05:34 +02:00


## Task: Investigate Orphaned SQLite Shared Memory Files on Startup
### Issue in Detail
The log shows repeated warnings:
```
event=orphaned_sqlite_file_removed path=/data/bangui.db-shm
```
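This occurs at `19:39:48` and again at `19:49:39` (after restart). The `-shm` file is SQLite's shared-memory file for WAL mode. SQLite deletes it, together with the `-wal` file, when the last connection closes cleanly, so finding one at startup indicates an **unclean shutdown** (a crash or SIGKILL rather than a graceful exit on SIGTERM).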
### Why This Happens
1. **Docker stop timeout:** Docker sends SIGTERM, waits `stop_grace_period` (default 10s), then sends SIGKILL. The backend allows 25s for graceful shutdown, but if the container's `stop_grace_period` is shorter, the process is killed before cleanup completes.
2. **Missing connection close:** If the application crashes or is killed, SQLite connections are not closed, leaving the `-wal` and `-shm` files behind.
3. **`_cleanup_wal_files()` is a workaround, not a fix:** It removes stale files on the *next* startup, but the underlying cause (unclean shutdown) remains.
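The file lifecycle described above can be observed with Python's standard `sqlite3` module. This is a minimal sketch, not BanGUI code: the function names and the `demo.db` file are made up for illustration, and it assumes a local filesystem where WAL mode is available.

```python
import os
import sqlite3
import tempfile


def wal_sidecar_files(db_path):
    """Return the WAL sidecar suffixes that currently exist next to db_path."""
    return sorted(s for s in ("-wal", "-shm") if os.path.exists(db_path + s))


def demo_wal_lifecycle():
    """Open a throwaway database in WAL mode and report its sidecar files."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")  # hypothetical file name
        conn = sqlite3.connect(db_path)
        # The PRAGMA returns the journal mode that is actually in effect.
        mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        while_open = wal_sidecar_files(db_path)
        # A clean close of the last connection checkpoints the WAL and
        # removes both sidecar files.
        conn.close()
        after_close = wal_sidecar_files(db_path)
    return mode, while_open, after_close


if __name__ == "__main__":
    print(demo_wal_lifecycle())
```

Both sidecar files should be present while the connection is open and gone after the clean close. A process that is killed never reaches `conn.close()`, which is how the files end up orphaned.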
### How to Fix It
1. **Verify Docker Compose `stop_grace_period`:** In `Docker/compose.prod.yml`, ensure the backend service has `stop_grace_period: 30s` (the 25s internal timeout plus a margin).
2. **Improve shutdown logging:** Add explicit logs when the database connection is closed during lifespan shutdown.
3. **Consider `PRAGMA journal_mode = DELETE` for single-process setups:** WAL mode lets readers proceed while a write is in progress, but if BanGUI runs with a single worker and single process, DELETE mode eliminates the `-wal`/`-shm` files entirely. Evaluate the tradeoff.
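As one possible shape for the shutdown logging in step 2, the sketch below wraps the connection in an async context manager and logs when it is opened and closed. It is an illustration under assumptions: the logger name, the event names `db_connection_opened` / `db_connection_closed`, and the `lifespan` signature are hypothetical and do not reflect BanGUI's actual lifespan handler.

```python
import asyncio
import logging
import sqlite3
from contextlib import asynccontextmanager

log = logging.getLogger("bangui.lifespan")  # hypothetical logger name


@asynccontextmanager
async def lifespan(db_path):
    """Open the database on startup and log explicitly when it is closed."""
    conn = sqlite3.connect(db_path)
    log.info("event=db_connection_opened path=%s", db_path)  # hypothetical event
    try:
        yield conn
    finally:
        # Runs on graceful shutdown (SIGTERM handled), never on SIGKILL.
        conn.close()
        log.info("event=db_connection_closed path=%s", db_path)  # hypothetical event


async def main():
    logging.basicConfig(level=logging.INFO)
    async with lifespan(":memory:") as conn:
        conn.execute("SELECT 1")


if __name__ == "__main__":
    asyncio.run(main())
```

If the close event is missing from the log before a restart, the process was most likely killed before cleanup finished, which points back at the `stop_grace_period` setting.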
### Issues and Pitfalls
1. **WAL mode is required for reads concurrent with writes:** If you switch to DELETE mode, readers block writers and a writer blocks readers. This may degrade API performance under load.
2. **The `_cleanup_wal_files()` 10-second threshold:** Files modified within the last 10 seconds are skipped. If the container restarts rapidly (e.g., health check failure → restart), the files may not be cleaned up on that startup.
### Documentation References
- **`Docs/Deployment.md`:** Docker deployment configuration and graceful shutdown behavior.
- **`Docs/Architekture.md`:** Deployment constraints and process-local state.
### Tests to Write
#### 1. `test_cleanup_wal_files_removes_stale_files`
- **Setup:** Create fake `-wal` and `-shm` files with an mtime more than 10s in the past.
- **Action:** Call `_cleanup_wal_files()`.
- **Assert:** Files are removed.
#### 2. `test_cleanup_wal_files_skips_recent_files`
- **Setup:** Create fake `-wal` and `-shm` files with an mtime less than 10s in the past.
- **Action:** Call `_cleanup_wal_files()`.
- **Assert:** Files are NOT removed.
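
The two tests could look roughly like the sketch below. Because the real `_cleanup_wal_files()` is not shown here, the sketch tests a stand-in, `cleanup_wal_files_sketch`, whose signature and behavior are assumptions based only on the 10-second threshold described above. The helper `make_sidecars` is likewise hypothetical, and `tmp_path` is expected to be a directory such as the one pytest's `tmp_path` fixture provides.

```python
import os
import tempfile
import time
from pathlib import Path

STALE_AFTER_SECONDS = 10  # threshold taken from the task description


def cleanup_wal_files_sketch(db_path, now=None):
    """Stand-in for _cleanup_wal_files(): remove sidecars older than the threshold."""
    now = time.time() if now is None else now
    removed = []
    for suffix in ("-wal", "-shm"):
        sidecar = Path(str(db_path) + suffix)
        if sidecar.exists() and now - sidecar.stat().st_mtime > STALE_AFTER_SECONDS:
            sidecar.unlink()
            removed.append(sidecar)
    return removed


def make_sidecars(db_path, age_seconds):
    """Create empty -wal/-shm files whose mtime lies age_seconds in the past."""
    mtime = time.time() - age_seconds
    paths = []
    for suffix in ("-wal", "-shm"):
        sidecar = Path(str(db_path) + suffix)
        sidecar.write_bytes(b"")
        os.utime(sidecar, (mtime, mtime))  # set atime and mtime explicitly
        paths.append(sidecar)
    return paths


def test_cleanup_wal_files_removes_stale_files(tmp_path):
    sidecars = make_sidecars(tmp_path / "bangui.db", age_seconds=60)
    cleanup_wal_files_sketch(tmp_path / "bangui.db")
    assert not any(p.exists() for p in sidecars)


def test_cleanup_wal_files_skips_recent_files(tmp_path):
    sidecars = make_sidecars(tmp_path / "bangui.db", age_seconds=1)
    cleanup_wal_files_sketch(tmp_path / "bangui.db")
    assert all(p.exists() for p in sidecars)


if __name__ == "__main__":
    # Run both tests without pytest, each in its own temporary directory.
    for test in (test_cleanup_wal_files_removes_stale_files,
                 test_cleanup_wal_files_skips_recent_files):
        with tempfile.TemporaryDirectory() as tmp:
            test(Path(tmp))
```

Setting the mtime with `os.utime` keeps the tests fast and deterministic, since nothing has to sleep past the threshold.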