# Incremental Refresh Plan Date: 2026-03-23 Status: executable in MVP mode ## 1. Objective Keep canonical store current for open periods without running full historical load each time. ## 2. Schedule recommendation - business hours: every 15-60 minutes for critical sets - off-hours: one consolidation run - manual targeted run on-demand for urgent drill-down ## 3. Standard command Example incremental window: ```powershell python scripts/run_refresh.py --mode incremental --from-date 2026-01-01T00:00:00 --limit-per-set 200 ``` Targeted catch-up: ```powershell python scripts/run_refresh.py --mode targeted --target-id 68.02 --limit-per-set 200 ``` ## 4. Operational controls 1. Watch latest run status via `GET /refresh/runs`. 2. Watch store health via `GET /store/stats`. 3. Alert on consecutive `failed` runs. 4. Alert on repeated growth of `failed_entity_sets`. ## 5. Idempotency and consistency - Entity writes are upsert-based (`source_entity`, `source_id`). - Links for each source entity are replaced each run to avoid stale edges. - Checkpoints update only for successfully processed entity sets. ## 6. Known MVP limits - Date filtering is best-effort from common date fields. - No CDC stream; refresh is pull-based. - Large enterprise-wide slices still require separate analytical batching strategy.