# Analytics Store Design (Layer 5)
Date: 2026-03-23
Status: MVP implemented (`canonical_layer/features.py`, `feature_metrics`, `anomaly_signals`)
## 1. Purpose
The analytics store keeps derived metrics and anomaly signals on top of the canonical accounting store.
This avoids running heavy reasoning directly against the live 1C bridge.
## 2. Physical tables
### `feature_runs`
Tracks each feature-engine execution.
Core fields:
- `id`
- `status`
- `started_at`, `finished_at`
- `baseline_window_hours`
- `entities_total`
- `metrics_written`
- `anomalies_written`
- `details_json`
- `error_message`
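
The run lifecycle implied by these fields can be sketched as follows. This is a minimal illustration that assumes a SQLite backend; the column types, the status values (`running`, `success`, `failed`), and the helpers `start_run` and `finish_run` are hypothetical and are not taken from `canonical_layer/features.py`.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Assumed DDL: field names come from the list above, column types are guesses.
FEATURE_RUNS_DDL = """
CREATE TABLE IF NOT EXISTS feature_runs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    status TEXT NOT NULL,
    started_at TEXT NOT NULL,
    finished_at TEXT,
    baseline_window_hours INTEGER,
    entities_total INTEGER,
    metrics_written INTEGER,
    anomalies_written INTEGER,
    details_json TEXT,
    error_message TEXT
)
"""


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def start_run(conn: sqlite3.Connection, baseline_window_hours: int) -> int:
    """Insert a row for a run that has just started and return its id."""
    cur = conn.execute(
        "INSERT INTO feature_runs (status, started_at, baseline_window_hours) "
        "VALUES (?, ?, ?)",
        ("running", _now(), baseline_window_hours),
    )
    return cur.lastrowid


def finish_run(conn, run_id, *, entities_total, metrics_written,
               anomalies_written, details=None, error=None) -> None:
    """Close the run: store counters and mark it as success or failed."""
    status = "failed" if error else "success"
    conn.execute(
        "UPDATE feature_runs SET status = ?, finished_at = ?, "
        "entities_total = ?, metrics_written = ?, anomalies_written = ?, "
        "details_json = ?, error_message = ? WHERE id = ?",
        (status, _now(), entities_total, metrics_written, anomalies_written,
         json.dumps(details or {}), error, run_id),
    )


# Illustrative values only; they do not come from a real run.
conn = sqlite3.connect(":memory:")
conn.execute(FEATURE_RUNS_DDL)
run_id = start_run(conn, baseline_window_hours=24)
finish_run(conn, run_id, entities_total=120, metrics_written=14,
           anomalies_written=2)
row = conn.execute(
    "SELECT status, metrics_written FROM feature_runs WHERE id = ?", (run_id,)
).fetchone()
```

Keeping `error_message` separate from `details_json` lets a failed run be filtered with a plain column check instead of parsing JSON.
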
### `feature_metrics`
Stores derived metric rows.
Core fields:
- `feature_run_id`
- `metric_key`
- `scope`
- `scope_id`
- `metric_type`
- `metric_value`
- `attributes_json`
- `computed_at`
### `anomaly_signals`
Stores anomaly events and the snapshot of currently active anomalies.
Core fields:
- `feature_run_id`
- `signal_type`
- `severity`
- `scope`
- `scope_id`
- `score`
- `details_json`
- `detected_at`
- `is_active`
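
One way to keep both the event history and the active snapshot in a single table is to flip `is_active` on older rows whenever a new run writes its signals. The sketch below assumes that behaviour along with a SQLite backend; the surrogate `id` column, the column types, the signal names, and the helpers `replace_active_signals` and `active_signals` are all hypothetical.

```python
import sqlite3

# Assumed DDL: the surrogate `id` column and all types are guesses.
ANOMALY_SIGNALS_DDL = """
CREATE TABLE anomaly_signals (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    feature_run_id INTEGER NOT NULL,
    signal_type TEXT NOT NULL,
    severity TEXT NOT NULL,
    scope TEXT NOT NULL,
    scope_id TEXT,
    score REAL,
    details_json TEXT,
    detected_at TEXT NOT NULL,
    is_active INTEGER NOT NULL DEFAULT 1
)
"""


def replace_active_signals(conn, run_id, signals, detected_at):
    """Mark all earlier signals inactive, then insert this run's as active."""
    conn.execute("UPDATE anomaly_signals SET is_active = 0 WHERE is_active = 1")
    conn.executemany(
        "INSERT INTO anomaly_signals (feature_run_id, signal_type, severity, "
        "scope, scope_id, score, details_json, detected_at, is_active) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?, 1)",
        [
            (run_id, s["signal_type"], s["severity"], s["scope"],
             s["scope_id"], s["score"], "{}", detected_at)
            for s in signals
        ],
    )


def active_signals(conn):
    """Return the current snapshot: only rows still flagged as active."""
    return conn.execute(
        "SELECT signal_type, scope_id FROM anomaly_signals "
        "WHERE is_active = 1 ORDER BY id"
    ).fetchall()


# Illustrative data; signal names and scope ids are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute(ANOMALY_SIGNALS_DDL)
replace_active_signals(conn, 1, [
    {"signal_type": "stale_refresh", "severity": "warning",
     "scope": "global", "scope_id": None, "score": 0.6},
], "2026-03-22T10:00:00Z")
replace_active_signals(conn, 2, [
    {"signal_type": "entity_count_drift", "severity": "high",
     "scope": "source", "scope_id": "source_a", "score": 0.9},
], "2026-03-23T10:00:00Z")

snapshot = active_signals(conn)
history_rows = conn.execute("SELECT COUNT(*) FROM anomaly_signals").fetchone()[0]
```

After the second call, the history holds both rows while the snapshot contains only the signal from run 2.
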
## 3. Current metric families (MVP)
- Global volume metrics:
  - `canonical_entities_total`
  - `canonical_links_total`
  - `canonical_entity_sets_total`
- Structural metrics:
  - `avg_links_per_entity`
  - `high_link_threshold`
- Per-source entity metrics:
  - `entity_count`
  - `empty_display_share`
  - `avg_links_per_entity`
- Tokenized accounting hints:
  - `account_token_frequency`
- Operational freshness:
  - `refresh_age_hours`
- Optional drift:
  - `entity_count_drift_ratio` (if a previous successful feature run exists)
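
A few of these metrics can be sketched as plain functions. The formulas below are plausible readings of the metric names, not definitions confirmed by the feature engine: the ratio conventions, the zero-division handling, and the function signatures are all assumptions.

```python
from typing import Optional, Sequence


def avg_links_per_entity(links_total: int, entities_total: int) -> float:
    """Assumed definition: total links divided by total entities."""
    return links_total / entities_total if entities_total else 0.0


def empty_display_share(display_values: Sequence[Optional[str]]) -> float:
    """Assumed definition: share of entities whose display text is missing or blank."""
    if not display_values:
        return 0.0
    empty = sum(1 for v in display_values if v is None or not v.strip())
    return empty / len(display_values)


def entity_count_drift_ratio(current: int,
                             previous: Optional[int]) -> Optional[float]:
    """Assumed definition: relative change against the previous successful run.

    Returns None when there is no usable baseline, which mirrors the
    "optional" nature of this metric.
    """
    if not previous:
        return None
    return (current - previous) / previous


# Illustrative inputs only.
links_ratio = avg_links_per_entity(300, 120)                        # 2.5
blank_share = empty_display_share(["Invoice 1", "", None, "  "])    # 0.75
drift = entity_count_drift_ratio(120, 100)                          # 0.2
no_baseline = entity_count_drift_ratio(120, None)                   # None
```

Returning `None` rather than `0.0` when no baseline exists keeps "no drift" distinguishable from "drift not computable".
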
## 4. Access surface
API endpoints:
- `POST /features/run`
- `GET /features/stats`
- `GET /features/runs`
- `GET /features/metrics`
- `GET /features/anomalies`
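
A minimal client-side sketch for these endpoints, using only the standard library. The base URL `http://localhost:8000` and the helper `build_request` are assumptions; request bodies, query parameters, and response shapes are not specified here, so the example stops at building the requests.

```python
import urllib.request

# Assumption: the API is served locally on port 8000.
BASE_URL = "http://localhost:8000"


def build_request(method: str, path: str,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a request for one of the feature endpoints."""
    return urllib.request.Request(base_url.rstrip("/") + path, method=method)


trigger = build_request("POST", "/features/run")
anomalies = build_request("GET", "/features/anomalies")

# To send a request against a running service:
#   with urllib.request.urlopen(anomalies) as resp:
#       body = resp.read()
```
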
CLI:
- `python scripts/run_features.py`