Incremental Refresh Plan
Date: 2026-03-23
Status: executable in MVP mode
1. Objective
Keep the canonical store current for open periods without running a full historical load on every refresh.
2. Schedule recommendation
- business hours: every 15-60 minutes for critical sets
- off-hours: one consolidation run
- on demand: manual targeted run for urgent drill-down
3. Standard command
Example incremental window:
python scripts/run_refresh.py --mode incremental --from-date 2026-01-01T00:00:00 --limit-per-set 200
Targeted catch-up:
python scripts/run_refresh.py --mode targeted --target-id 68.02 --limit-per-set 200
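For scheduled runs, the two commands above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown in this section; the helper names refresh_argv and run_refresh are hypothetical, and treating a non-zero exit code as failure is an assumption about scripts/run_refresh.py.

```python
import subprocess
import sys
from datetime import datetime
from typing import List, Optional


def refresh_argv(
    mode: str,
    *,
    from_date: Optional[datetime] = None,
    target_id: Optional[str] = None,
    limit_per_set: int = 200,
) -> List[str]:
    """Build the argument list for scripts/run_refresh.py (hypothetical helper)."""
    argv = [sys.executable, "scripts/run_refresh.py", "--mode", mode]
    if from_date is not None:
        # Same second-resolution ISO format as the example command.
        argv += ["--from-date", from_date.replace(microsecond=0).isoformat()]
    if target_id is not None:
        argv += ["--target-id", target_id]
    argv += ["--limit-per-set", str(limit_per_set)]
    return argv


def run_refresh(argv: List[str]) -> int:
    """Run the refresh and return its exit code.

    Assumption: the script exits non-zero on failure.
    """
    return subprocess.run(argv, check=False).returncode
```

Usage: run_refresh(refresh_argv("incremental", from_date=datetime(2026, 1, 1))) reproduces the incremental example, and refresh_argv("targeted", target_id="68.02") reproduces the targeted one.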
4. Operational controls
- Watch latest run status via GET /refresh/runs.
- Watch store health via GET /store/stats.
- Alert on consecutive failed runs.
- Alert on repeated growth of failed_entity_sets.
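The two alert rules can be expressed as small pure functions over data already fetched from the endpoints. This is a sketch under assumptions: it supposes GET /refresh/runs yields a newest-first list of objects with a "status" field whose failure value is "failed", and that a series of failed_entity_sets counts has been collected across runs. Neither response shape is confirmed by this plan, and the function names and default threshold are hypothetical.

```python
from typing import Iterable, Mapping, Sequence


def consecutive_failures(runs: Iterable[Mapping[str, str]]) -> int:
    """Count failed runs at the head of a newest-first run list.

    Assumption: each run record carries a "status" field.
    """
    count = 0
    for run in runs:
        if run.get("status") != "failed":
            break
        count += 1
    return count


def should_alert(runs: Iterable[Mapping[str, str]], threshold: int = 2) -> bool:
    """Alert once the latest `threshold` runs have all failed."""
    return consecutive_failures(runs) >= threshold


def is_growing(counts: Sequence[int]) -> bool:
    """True if failed_entity_sets counts (oldest first) rise at every step."""
    return len(counts) >= 2 and all(b > a for a, b in zip(counts, counts[1:]))
```

A single failed run followed by a success does not trigger should_alert, which keeps transient source errors from paging anyone.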
5. Idempotency and consistency
- Entity writes are upsert-based (source_entity, source_id).
- Links for each source entity are replaced each run to avoid stale edges.
- Checkpoints update only for successfully processed entity sets.
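The write pattern described above can be illustrated with SQLite. This is only a sketch: the table layout, the payload and target columns, and the write_entity helper are assumptions made for illustration, not the store's actual schema. It shows an upsert keyed on (source_entity, source_id) and a delete-then-insert replacement of links inside one transaction, so re-running the same input leaves the store unchanged.

```python
import sqlite3
from typing import Iterable


def init(conn: sqlite3.Connection) -> None:
    """Create a hypothetical minimal schema for the example."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS entities (
            source_entity TEXT NOT NULL,
            source_id     TEXT NOT NULL,
            payload       TEXT,
            PRIMARY KEY (source_entity, source_id)
        );
        CREATE TABLE IF NOT EXISTS links (
            source_entity TEXT NOT NULL,
            source_id     TEXT NOT NULL,
            target        TEXT NOT NULL
        );
        """
    )


def write_entity(
    conn: sqlite3.Connection,
    source_entity: str,
    source_id: str,
    payload: str,
    targets: Iterable[str],
) -> None:
    """Upsert one entity and replace its links atomically."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "INSERT INTO entities (source_entity, source_id, payload) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT(source_entity, source_id) "
            "DO UPDATE SET payload = excluded.payload",
            (source_entity, source_id, payload),
        )
        # Drop all existing edges first so none go stale.
        conn.execute(
            "DELETE FROM links WHERE source_entity = ? AND source_id = ?",
            (source_entity, source_id),
        )
        conn.executemany(
            "INSERT INTO links (source_entity, source_id, target) VALUES (?, ?, ?)",
            [(source_entity, source_id, t) for t in targets],
        )
```

Writing the same entity twice with different link sets leaves one entity row and only the second set of links. The ON CONFLICT clause requires SQLite 3.24 or newer.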
6. Known MVP limits
- Date filtering is best-effort from common date fields.
- No CDC stream; refresh is pull-based.
- Large enterprise-wide slices still require a separate analytical batching strategy.
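To make the first limit concrete, best-effort date filtering might look like the sketch below. The candidate field names are hypothetical, and keeping records that have no parseable date is an assumed policy (over-fetching rather than silently dropping); the actual refresh script may behave differently.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional

# Hypothetical list of "common date fields", checked in order.
CANDIDATE_DATE_FIELDS = ("updated_at", "modified", "date", "created_at")


def record_date(record: Mapping[str, Any]) -> Optional[datetime]:
    """Return the first parseable ISO date among the candidate fields."""
    for field in CANDIDATE_DATE_FIELDS:
        value = record.get(field)
        if not isinstance(value, str):
            continue
        try:
            parsed = datetime.fromisoformat(value)
        except ValueError:
            continue  # unparseable value: try the next field
        if parsed.tzinfo is not None:
            # Normalize to naive UTC so comparisons never mix aware and naive.
            parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
        return parsed
    return None


def in_window(record: Mapping[str, Any], from_date: datetime) -> bool:
    """Keep records dated on or after from_date (a naive UTC datetime).

    Assumed policy: undated records are kept, since filtering is best-effort.
    """
    found = record_date(record)
    return found is None or found >= from_date
```

Under this policy a record with no recognizable date field is always re-processed, which is safe given the upsert-based writes but adds load on large sets.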