# Anomaly Engine Spec (Layer 5 MVP)

Date: 2026-03-23

Status: implemented in `canonical_layer/features.py`

## 1. Engine role

Detect suspicious patterns from canonical data and refresh operations without direct write access to 1C.

## 2. Input data

- `canonical_entities`
- `canonical_links`
- `refresh_runs` (for freshness context)
- previous successful `feature_runs` and `feature_metrics` (for drift context)

## 3. Implemented anomaly rules

### 3.1 `no_canonical_data`

Trigger:
- canonical entity count is zero.

Severity:
- `high`

### 3.2 `empty_display_share_high`

Trigger:
- per-source-entity count >= 50
- and `empty_display_share >= 0.2`

Severity:
- `medium`

### 3.3 `high_link_degree`

Trigger:
- entity link count exceeds dynamic threshold `max(10, mean + 3*std)`

Severity:
- `medium` or `high` by score multiplier over threshold.

### 3.4 `missing_refresh_baseline`

Trigger:
- no successful refresh run exists.

Severity:
- `high`

### 3.5 `stale_refresh`

Trigger:
- `refresh_age_hours` exceeds configured threshold (`ANOMALY_STALE_REFRESH_THRESHOLD_HOURS`)

Severity:
- `high`

### 3.6 `entity_count_drift`

Trigger:
- previous successful feature run exists
- absolute drift ratio for `entity_count` is >= 0.3
- and absolute count difference >= 10

Severity:
- `medium` or `high` (if drift ratio >= 1.0)

## 4. Output contract

Each anomaly signal contains:
- `signal_type`
- `severity`
- `scope`
- `scope_id`
- `score`
- `details`
- `is_active`

## 5. Execution

- API: `POST /features/run`
- CLI: `python scripts/run_features.py`
- PowerShell: `scripts/run_features.ps1`
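
To illustrate how the `high_link_degree` and `entity_count_drift` rules from section 3 could produce signals in the shape of the output contract from section 4, here is a minimal Python sketch. It is not the code in `canonical_layer/features.py`. The function names, the `scope` values, the use of population standard deviation, the score definitions, the `high_multiplier=2.0` cutoff, and dividing the drift by the previous count are all assumptions made for illustration; only the thresholds (`max(10, mean + 3*std)`, `0.3`, `10`, `1.0`) and the signal field names come from this spec.

```python
from statistics import mean, pstdev
from typing import Optional


def make_signal(signal_type, severity, scope, scope_id, score, details):
    """Build a dict with the fields listed in the output contract."""
    return {
        "signal_type": signal_type,
        "severity": severity,
        "scope": scope,          # scope values here are hypothetical
        "scope_id": scope_id,
        "score": score,
        "details": details,
        "is_active": True,
    }


def link_degree_threshold(link_counts):
    """Dynamic threshold max(10, mean + 3*std).

    Assumption: population standard deviation; the spec does not say which.
    """
    if not link_counts:
        return 10.0
    return max(10.0, mean(link_counts) + 3 * pstdev(link_counts))


def high_link_degree_signals(link_counts_by_entity, high_multiplier=2.0):
    """Emit one signal per entity whose link count exceeds the threshold.

    Assumption: score = count / threshold, and severity becomes "high"
    once the score reaches the hypothetical high_multiplier.
    """
    threshold = link_degree_threshold(list(link_counts_by_entity.values()))
    signals = []
    for entity_id, count in link_counts_by_entity.items():
        if count > threshold:
            score = count / threshold
            severity = "high" if score >= high_multiplier else "medium"
            signals.append(make_signal(
                "high_link_degree", severity, "entity", entity_id, score,
                {"link_count": count, "threshold": threshold},
            ))
    return signals


def entity_count_drift_signal(previous_count: Optional[int], current_count: int):
    """Return a drift signal, or None when the rule does not trigger.

    Assumption: drift ratio = |current - previous| / previous, with the
    denominator clamped to 1 to avoid division by zero.
    """
    if previous_count is None:  # no previous successful feature run
        return None
    diff = abs(current_count - previous_count)
    ratio = diff / max(previous_count, 1)
    if ratio < 0.3 or diff < 10:
        return None
    severity = "high" if ratio >= 1.0 else "medium"
    return make_signal(
        "entity_count_drift", severity, "global", None, ratio,
        {"previous": previous_count, "current": current_count},
    )
```

As a usage example, calling `entity_count_drift_signal(100, 140)` under these assumptions yields a `medium` signal (ratio 0.4, difference 40), while `entity_count_drift_signal(10, 15)` yields nothing because the absolute difference stays below 10 even though the ratio is 0.5.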