Analytics Store Design (Layer 5)

Date: 2026-03-23
Status: MVP implemented (canonical_layer/features.py, feature_metrics, anomaly_signals)

1. Purpose

The analytics store keeps derived metrics and anomaly signals on top of the canonical accounting store, so that heavy reasoning does not have to run directly against the live 1C bridge.

2. Physical tables

feature_runs

Tracks each feature-engine execution.

Core fields:

  • id
  • status
  • started_at, finished_at
  • baseline_window_hours
  • entities_total
  • metrics_written
  • anomalies_written
  • details_json
  • error_message

feature_metrics

Stores derived metric rows.

Core fields:

  • feature_run_id
  • metric_key
  • scope
  • scope_id
  • metric_type
  • metric_value
  • attributes_json
  • computed_at

anomaly_signals

Stores anomaly events and the snapshot of currently active anomalies.

Core fields:

  • feature_run_id
  • signal_type
  • severity
  • scope
  • scope_id
  • score
  • details_json
  • detected_at
  • is_active
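
The three tables above can be pictured with a minimal SQLite sketch. Only the table and column names come from this document; the column types, the foreign keys, the use of SQLite, and the helper names `create_store` and `active_anomalies` are assumptions made for illustration and may not match the actual storage backend.

```python
import sqlite3

# Column names follow the field lists above; types and constraints are assumed.
SCHEMA = """
CREATE TABLE feature_runs (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL,
    started_at TEXT,
    finished_at TEXT,
    baseline_window_hours INTEGER,
    entities_total INTEGER,
    metrics_written INTEGER,
    anomalies_written INTEGER,
    details_json TEXT,
    error_message TEXT
);
CREATE TABLE feature_metrics (
    feature_run_id INTEGER REFERENCES feature_runs(id),
    metric_key TEXT NOT NULL,
    scope TEXT,
    scope_id TEXT,
    metric_type TEXT,
    metric_value REAL,
    attributes_json TEXT,
    computed_at TEXT
);
CREATE TABLE anomaly_signals (
    feature_run_id INTEGER REFERENCES feature_runs(id),
    signal_type TEXT NOT NULL,
    severity TEXT,
    scope TEXT,
    scope_id TEXT,
    score REAL,
    details_json TEXT,
    detected_at TEXT,
    is_active INTEGER
);
"""


def create_store(path=":memory:"):
    """Open a SQLite database and create the three analytics tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def active_anomalies(conn):
    """Return the active anomaly snapshot, highest score first."""
    cursor = conn.execute(
        "SELECT signal_type, scope, scope_id, score "
        "FROM anomaly_signals WHERE is_active = 1 "
        "ORDER BY score DESC"
    )
    return cursor.fetchall()
```

Under these assumptions, reading the active snapshot is a filter on is_active, while the full table keeps the event history per feature run.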

3. Current metric families (MVP)

  • Global volume metrics:
    • canonical_entities_total
    • canonical_links_total
    • canonical_entity_sets_total
  • Structural metrics:
    • avg_links_per_entity
    • high_link_threshold
  • Per-source entity metrics:
    • entity_count
    • empty_display_share
    • avg_links_per_entity
  • Tokenized accounting hints:
    • account_token_frequency
  • Operational freshness:
    • refresh_age_hours
  • Optional drift:
    • entity_count_drift_ratio (computed only if a previous successful feature run exists)
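
To make a few of the metric names above concrete, here is a hedged sketch of how they might be computed. The formulas are assumptions inferred from the metric names, not taken from canonical_layer/features.py, and the function signatures are hypothetical.

```python
def avg_links_per_entity(entities_total, links_total):
    """Assumed definition: total links divided by total entities."""
    if entities_total == 0:
        return 0.0
    return links_total / entities_total


def empty_display_share(display_values):
    """Assumed definition: share of entities whose display value is missing or blank."""
    if not display_values:
        return 0.0
    empty = sum(1 for value in display_values if not (value or "").strip())
    return empty / len(display_values)


def entity_count_drift_ratio(current_count, previous_count):
    """Assumed definition: relative change against the previous successful run.

    Returns None when there is no usable baseline, mirroring the
    "optional" nature of the drift metric.
    """
    if previous_count is None or previous_count == 0:
        return None
    return (current_count - previous_count) / previous_count
```

Returning None rather than 0.0 for a missing baseline keeps "no drift" distinguishable from "drift not computable", which matters if an anomaly rule thresholds on this value.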

4. Access surface

API endpoints:

  • POST /features/run
  • GET /features/stats
  • GET /features/runs
  • GET /features/metrics
  • GET /features/anomalies
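
As an illustration of the endpoint list, the following sketch builds requests with the Python standard library. The paths and HTTP methods are from this document; the base URL, the JSON request and response format, the empty POST body, and the helper names `build_request` and `call` are assumptions.

```python
import json
import urllib.request

# Method and path pairs as listed above; the short keys are hypothetical.
ENDPOINTS = {
    "run": ("POST", "/features/run"),
    "stats": ("GET", "/features/stats"),
    "runs": ("GET", "/features/runs"),
    "metrics": ("GET", "/features/metrics"),
    "anomalies": ("GET", "/features/anomalies"),
}


def build_request(base_url, name):
    """Build a urllib Request for one of the feature endpoints."""
    method, path = ENDPOINTS[name]
    headers = {"Accept": "application/json"}
    data = None
    if method == "POST":
        # Assumption: the run endpoint accepts an empty JSON object.
        data = b"{}"
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(base_url, name, timeout=30):
    """Send the request and decode the response, assuming a JSON body."""
    request = build_request(base_url, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `call("http://localhost:8000", "anomalies")` would fetch anomaly signals from a service on a hypothetical local address.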

CLI:

  • python scripts/run_features.py