Incremental Refresh Plan

Date: 2026-03-23
Status: executable in MVP mode

1. Objective

Keep the canonical store current for open periods without rerunning the full historical load on every refresh.

2. Schedule recommendation

  • Business hours: every 15 to 60 minutes for critical entity sets.
  • Off-hours: one consolidation run.
  • On demand: a manual targeted run for urgent drill-down.
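
The cadence above could be encoded as a small helper that a scheduler consults before each run. This is only a sketch: the weekday 09:00 to 18:00 window, the 30-minute interval, and the name cadence_for are assumptions for illustration, not settings taken from the project.

```python
from datetime import datetime, time, timedelta
from typing import Optional, Tuple

# Assumed values; the plan only specifies "every 15-60 minutes".
BUSINESS_START = time(9, 0)
BUSINESS_END = time(18, 0)
BUSINESS_INTERVAL = timedelta(minutes=30)


def cadence_for(now: datetime) -> Tuple[str, Optional[timedelta]]:
    """Return (window name, repeat interval) for the given moment.

    Off-hours has no repeat interval because the plan calls for a
    single consolidation run there.
    """
    is_weekday = now.weekday() < 5  # Monday=0 .. Friday=4
    in_hours = BUSINESS_START <= now.time() < BUSINESS_END
    if is_weekday and in_hours:
        return "business_hours", BUSINESS_INTERVAL
    return "off_hours", None
```

Manual targeted runs are deliberately left out of the helper, since they are triggered by an operator rather than by the clock.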

3. Standard command

Example incremental window:

python scripts/run_refresh.py --mode incremental --from-date 2026-01-01T00:00:00 --limit-per-set 200

Targeted catch-up:

python scripts/run_refresh.py --mode targeted --target-id 68.02 --limit-per-set 200
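
When the refresh is launched from another Python process (for example a scheduler), the two commands above can be assembled programmatically. The sketch below uses only the flags shown in this section; the helper names build_refresh_command and run_refresh are hypothetical, and which flags the script treats as required is not stated in the plan.

```python
import subprocess
import sys
from datetime import datetime
from typing import List, Optional

SCRIPT = "scripts/run_refresh.py"


def build_refresh_command(
    mode: str,
    *,
    from_date: Optional[datetime] = None,
    target_id: Optional[str] = None,
    limit_per_set: int = 200,
) -> List[str]:
    """Build the argv list for run_refresh.py from the documented flags."""
    cmd = [sys.executable, SCRIPT, "--mode", mode]
    if from_date is not None:
        # Same timestamp format as the example: 2026-01-01T00:00:00
        cmd += ["--from-date", from_date.isoformat(timespec="seconds")]
    if target_id is not None:
        cmd += ["--target-id", target_id]
    cmd += ["--limit-per-set", str(limit_per_set)]
    return cmd


def run_refresh(cmd: List[str]) -> int:
    """Run the command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True).returncode
```

Passing the arguments as a list avoids shell quoting issues with values such as the target id.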

4. Operational controls

  1. Watch latest run status via GET /refresh/runs.
  2. Watch store health via GET /store/stats.
  3. Alert on consecutive failed runs.
  4. Alert on repeated growth of failed_entity_sets.
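
The two alert rules can be expressed as pure functions over data already fetched from the endpoints above. This is a sketch under assumptions: the plan does not document the response shape, so the "status" key, the "failed" value, the newest-first ordering of runs, and the thresholds are all hypothetical.

```python
from typing import Dict, List, Sequence


def consecutive_failures(runs: List[Dict]) -> int:
    """Count failed runs at the head of a newest-first run list."""
    count = 0
    for run in runs:
        if run.get("status") != "failed":  # assumed field and value
            break
        count += 1
    return count


def keeps_growing(counts: Sequence[int], steps: int = 2) -> bool:
    """True if the last `steps` samples each exceeded the one before.

    `counts` holds failed_entity_sets sizes ordered oldest to newest.
    """
    if len(counts) < steps + 1:
        return False  # not enough history to judge
    tail = counts[-(steps + 1):]
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))


def should_alert(runs: List[Dict], counts: Sequence[int], max_failures: int = 2) -> bool:
    """Combine both rules; the threshold of 2 is an assumed default."""
    return consecutive_failures(runs) >= max_failures or keeps_growing(counts)
```

Keeping the rules free of HTTP calls makes them easy to test and lets the polling code stay a thin wrapper.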

5. Idempotency and consistency

  • Entity writes are upserts keyed on (source_entity, source_id), so rerunning a refresh does not create duplicates.
  • Links for each source entity are replaced on each run to avoid stale edges.
  • Checkpoints update only for successfully processed entity sets.
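
The three guarantees above can be illustrated with an in-memory stand-in for the store. This is a minimal sketch, not the project's implementation: the CanonicalStore class, its attributes, and refresh_entity_set are hypothetical names, and a real store would sit on a database.

```python
from typing import Dict, Iterable, List, Set, Tuple

Key = Tuple[str, str]  # (source_entity, source_id)


class CanonicalStore:
    """Hypothetical in-memory store used only for this illustration."""

    def __init__(self) -> None:
        self.entities: Dict[Key, dict] = {}
        self.links: Dict[Key, Set[Key]] = {}
        self.checkpoints: Dict[str, str] = {}

    def upsert_entity(self, source_entity: str, source_id: str, payload: dict) -> None:
        # Insert or overwrite by key, so repeated runs never duplicate rows.
        self.entities[(source_entity, source_id)] = dict(payload)

    def replace_links(self, source: Key, targets: Iterable[Key]) -> None:
        # Replace the whole link set so edges missing upstream disappear.
        self.links[source] = set(targets)


def refresh_entity_set(
    store: CanonicalStore,
    entity_set: str,
    records: List[Tuple[str, dict, List[Key]]],
    checkpoint: str,
) -> bool:
    """Apply records, then advance the checkpoint only if all succeeded."""
    try:
        for source_id, payload, targets in records:
            store.upsert_entity(entity_set, source_id, payload)
            store.replace_links((entity_set, source_id), targets)
    except Exception:
        return False  # checkpoint stays where it was
    store.checkpoints[entity_set] = checkpoint
    return True
```

If a set fails partway, the rows written before the failure remain, which is harmless here because the next run upserts them again from the unchanged checkpoint.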

6. Known MVP limits

  • Date filtering is best-effort and relies on common date fields.
  • No CDC stream; refresh is pull-based.
  • Large enterprise-wide slices still require a separate analytical batching strategy.
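
To make the first limit concrete, best-effort date filtering might look like the sketch below. Everything specific here is an assumption: the candidate field names, the ISO 8601 string format without a timezone offset, and the choice to keep records that carry no recognizable date.

```python
from datetime import datetime
from typing import Dict, List, Optional

# Hypothetical field names; the real list is not given in the plan.
CANDIDATE_DATE_FIELDS = ("Date", "Period", "updated_at")


def extract_date(record: Dict) -> Optional[datetime]:
    """Return the first parseable date among the candidate fields."""
    for field in CANDIDATE_DATE_FIELDS:
        value = record.get(field)
        if isinstance(value, str):
            try:
                # Assumes naive ISO timestamps such as 2026-01-01T00:00:00
                return datetime.fromisoformat(value)
            except ValueError:
                continue  # unparseable, try the next field
    return None


def filter_since(records: List[Dict], from_date: datetime, keep_undated: bool = True) -> List[Dict]:
    """Keep records dated on or after from_date.

    Records with no usable date are kept by default, trading extra
    work for not silently dropping data.
    """
    kept = []
    for record in records:
        found = extract_date(record)
        if found is None:
            if keep_undated:
                kept.append(record)
        elif found >= from_date:
            kept.append(record)
    return kept
```

The keep_undated switch shows why the filter is only best-effort: without a reliable date field, the refresh either over-fetches or risks missing changes.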