BanGUI/Docs/Tasks.md
Lukas 408eb900eb Remove Tasks.md spec, add test for _cleanup_wal_files skipping recent files
Remove 335-line task specification from Docs/Tasks.md.
Add test confirming _cleanup_wal_files skips recently-modified WAL/SHM files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-23 23:04:04 +02:00

Task: Investigate Orphaned SQLite Shared Memory Files on Startup

Issue in Detail

The log shows repeated warnings:

event=orphaned_sqlite_file_removed path=/data/bangui.db-shm

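This occurs at 19:39:48 and again at 19:49:39 (after restart). The -shm file is SQLite's shared-memory file for WAL mode. SQLite normally deletes it, together with the -wal file, when the last connection closes cleanly, so finding it at startup indicates an unclean shutdown (a crash, or SIGKILL instead of a graceful SIGTERM).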

Why This Happens

  1. Docker stop timeout: Docker sends SIGTERM, waits stop_grace_period (default 10s), then sends SIGKILL. The backend allows 25s for graceful shutdown, so if the container's stop_grace_period is shorter than that, the process can be killed before cleanup completes.
  2. Missing connection close: If the application crashes or is killed, SQLite connections are not closed, leaving the -wal and -shm files behind.
  3. _cleanup_wal_files() is a workaround, not a fix: It removes stale files on the next startup, but the underlying cause (unclean shutdown) remains.
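
The lifecycle behind points 2 and 3 can be reproduced outside BanGUI. The following is a minimal sketch using only Python's standard sqlite3 module; the database name demo.db and both helper functions are hypothetical and not part of the backend. It lists the sidecar files while a WAL connection is open and again after a clean close.

```python
import sqlite3
import tempfile
from pathlib import Path


def sidecar_files(db_path: Path) -> list[str]:
    """Return the names of the -wal/-shm files that exist next to db_path."""
    candidates = [Path(f"{db_path}{suffix}") for suffix in ("-wal", "-shm")]
    return sorted(p.name for p in candidates if p.exists())


def observe_wal_lifecycle(workdir: Path) -> tuple[list[str], list[str]]:
    """Report sidecar files while a WAL connection is open and after a clean close."""
    db_path = workdir / "demo.db"  # hypothetical path, not /data/bangui.db
    conn = sqlite3.connect(str(db_path))
    conn.execute("PRAGMA journal_mode=WAL").fetchone()
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.commit()
    while_open = sidecar_files(db_path)
    conn.close()  # clean close of the last connection: SQLite removes the sidecars
    after_close = sidecar_files(db_path)
    return while_open, after_close


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(observe_wal_lifecycle(Path(tmp)))
```

On a local filesystem the first list should contain both sidecars and the second should be empty. A process that receives SIGKILL never reaches the close() call, which is the situation the startup warning reports.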

How to Fix It

  1. Verify Docker Compose stop_grace_period: In Docker/compose.prod.yml, ensure the backend service has stop_grace_period: 30s (matching the 25s internal timeout + margin).
  2. Improve shutdown logging: Add explicit logs when the database connection is closed during lifespan shutdown.
  3. Consider PRAGMA journal_mode = DELETE for single-process setups: WAL mode is beneficial when readers and a writer run concurrently, but if BanGUI runs with a single worker and single process, DELETE mode eliminates the -wal/-shm files entirely (a short-lived -journal file is used during write transactions instead). Evaluate the tradeoff.
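
Step 2 could look roughly like the sketch below. It assumes the backend opens its connection inside an async lifespan context manager; the logger name, the event names db_connection_opened and db_connection_closed, and the db_path parameter are all hypothetical, and the real lifespan function will have a different signature.

```python
import asyncio
import logging
import sqlite3
from contextlib import asynccontextmanager
from typing import AsyncIterator

log = logging.getLogger("bangui.shutdown")  # hypothetical logger name


@asynccontextmanager
async def lifespan(db_path: str) -> AsyncIterator[sqlite3.Connection]:
    """Open the database on startup and log an explicit close on shutdown."""
    conn = sqlite3.connect(db_path)
    log.info("event=db_connection_opened path=%s", db_path)
    try:
        yield conn
    finally:
        # Reached on graceful shutdown; a SIGKILL skips this block entirely.
        conn.close()
        log.info("event=db_connection_closed path=%s", db_path)


def run_once(db_path: str) -> None:
    """Drive the lifespan once, as a server would across startup and shutdown."""

    async def _serve() -> None:
        async with lifespan(db_path) as conn:
            conn.execute("SELECT 1")

    asyncio.run(_serve())
```

If the close event is missing from the log before a restart, the process was most likely terminated before the finally block ran, which points back at the stop_grace_period setting.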

Issues and Pitfalls

  1. WAL mode lets readers and a writer run concurrently: If you switch to DELETE mode, readers block writers and a writer blocks readers. This may degrade API performance under load.
  2. The _cleanup_wal_files() 10-second threshold: Files modified within the last 10 seconds are skipped. If the container restarts rapidly (e.g., health check failure → restart), stale files from the previous run may not be cleaned up on that startup.
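
Before weighing the tradeoff in point 1, it may help to confirm which files each journal mode actually produces. This is a small sketch with the standard sqlite3 module; the function name and the demo file names are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def journal_mode_sidecars(workdir: Path, mode: str) -> tuple[str, list[str]]:
    """Set a journal mode, write one row, and list sidecar suffixes while still open."""
    db_path = workdir / f"demo-{mode.lower()}.db"  # hypothetical file name
    conn = sqlite3.connect(str(db_path))
    try:
        # The PRAGMA returns the mode actually in effect, which is worth checking.
        (active,) = conn.execute(f"PRAGMA journal_mode={mode}").fetchone()
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        present = sorted(
            suffix
            for suffix in ("-wal", "-shm", "-journal")
            if Path(f"{db_path}{suffix}").exists()
        )
        return active, present
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for mode in ("DELETE", "WAL"):
            print(mode, journal_mode_sidecars(Path(tmp), mode))
```

In DELETE mode no sidecar should remain after the commit, while WAL mode keeps -wal and -shm for as long as a connection is open. The sketch says nothing about throughput under load, which still has to be measured against the real API.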

Documentation References

  • Docs/Deployment.md: Docker deployment configuration and graceful shutdown behavior.
  • Docs/Architekture.md: Deployment constraints and process-local state.

Tests to Write

1. test_cleanup_wal_files_removes_stale_files

  • Setup: Create fake -wal and -shm files whose mtime is more than 10s in the past.
  • Action: Call _cleanup_wal_files().
  • Assert: Files are removed.

2. test_cleanup_wal_files_skips_recent_files

  • Setup: Create fake -wal and -shm files whose mtime is less than 10s in the past.
  • Action: Call _cleanup_wal_files().
  • Assert: Files are NOT removed.
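
Both tests could be sketched with pytest's tmp_path fixture as shown below. The real _cleanup_wal_files() and its signature are not shown in this document, so cleanup_wal_files here is a hypothetical stand-in that only mirrors the described 10-second rule; make_sidecars is likewise an assumed helper. In the actual test suite the stand-in would be replaced by a call to the backend function.

```python
import os
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_SECONDS = 10.0  # threshold taken from the task description


def cleanup_wal_files(db_path: Path, now: Optional[float] = None) -> list[Path]:
    """Hypothetical stand-in: remove -wal/-shm files older than the threshold."""
    now = time.time() if now is None else now
    removed = []
    for suffix in ("-wal", "-shm"):
        sidecar = Path(f"{db_path}{suffix}")
        if sidecar.exists() and now - sidecar.stat().st_mtime > STALE_AFTER_SECONDS:
            sidecar.unlink()
            removed.append(sidecar)
    return removed


def make_sidecars(db_path: Path, age_seconds: float) -> list[Path]:
    """Create empty fake -wal/-shm files whose mtime lies age_seconds in the past."""
    stamp = time.time() - age_seconds
    files = []
    for suffix in ("-wal", "-shm"):
        sidecar = Path(f"{db_path}{suffix}")
        sidecar.touch()
        os.utime(sidecar, (stamp, stamp))  # set atime and mtime explicitly
        files.append(sidecar)
    return files


def test_cleanup_wal_files_removes_stale_files(tmp_path: Path) -> None:
    files = make_sidecars(tmp_path / "bangui.db", age_seconds=60)
    cleanup_wal_files(tmp_path / "bangui.db")
    assert not any(f.exists() for f in files)


def test_cleanup_wal_files_skips_recent_files(tmp_path: Path) -> None:
    files = make_sidecars(tmp_path / "bangui.db", age_seconds=1)
    cleanup_wal_files(tmp_path / "bangui.db")
    assert all(f.exists() for f in files)
```

Setting the mtime with os.utime avoids sleeping in the test, and ages of 60s and 1s keep both cases well away from the 10-second boundary.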