# Anomaly Engine Spec (Layer 5 MVP)

Date: 2026-03-23
Status: implemented in `canonical_layer/features.py`

## 1. Engine role

Detect suspicious patterns in canonical data and refresh operations. The engine has no direct write access to 1C.

## 2. Input data

- `canonical_entities`
- `canonical_links`
- `refresh_runs` (for freshness context)
- previous successful `feature_runs` and `feature_metrics` (for drift context)

## 3. Implemented anomaly rules

### 3.1 `no_canonical_data`

Trigger:

- canonical entity count is zero.

Severity:

- `high`

### 3.2 `empty_display_share_high`

Trigger:

- per-source-entity count is >= 50
- and `empty_display_share >= 0.2`

Severity:

- `medium`

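The two conditions above can be sketched as a small predicate. This is an illustration rather than the code in `canonical_layer/features.py`: the function name, the constant names, and the assumption that `empty_display_share` is the number of entities with an empty display value divided by the entity count for that source entity are all hypothetical.

```python
# Hypothetical constants mirroring the trigger values in this section.
MIN_ENTITY_COUNT = 50
EMPTY_DISPLAY_SHARE_THRESHOLD = 0.2


def empty_display_share_high(entity_count: int, empty_display_count: int) -> bool:
    """Return True when both trigger conditions hold for one source entity."""
    if entity_count < MIN_ENTITY_COUNT:
        # Too few rows for the share to be meaningful.
        return False
    # Assumption: the share is empty-display rows over all rows.
    empty_display_share = empty_display_count / entity_count
    return empty_display_share >= EMPTY_DISPLAY_SHARE_THRESHOLD
```

The minimum-count guard keeps small source entities from firing the rule on a handful of empty rows.
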
### 3.3 `high_link_degree`

Trigger:

- entity link count exceeds the dynamic threshold `max(10, mean + 3*std)`

Severity:

- `medium` or `high`, depending on the score multiplier over the threshold.

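A minimal sketch of the dynamic threshold, using only the standard library. Several details are assumptions because the spec does not state them: the population standard deviation (`pstdev`) is used rather than the sample one, the score is taken to be `link_count / threshold`, and the function names are hypothetical. The cutoff between `medium` and `high` is not given here, so the sketch stops at the score.

```python
from statistics import mean, pstdev

MIN_LINK_DEGREE_THRESHOLD = 10  # floor from the formula max(10, mean + 3*std)


def link_degree_threshold(link_counts: list[int]) -> float:
    """Compute max(10, mean + 3*std) over per-entity link counts."""
    if not link_counts:
        return float(MIN_LINK_DEGREE_THRESHOLD)
    # Assumption: population standard deviation.
    return max(MIN_LINK_DEGREE_THRESHOLD, mean(link_counts) + 3 * pstdev(link_counts))


def high_link_degree_scores(link_counts: dict[str, int]) -> dict[str, float]:
    """Map each flagged entity id to its multiplier over the threshold."""
    threshold = link_degree_threshold(list(link_counts.values()))
    return {
        entity_id: count / threshold  # assumption: score is the multiplier
        for entity_id, count in link_counts.items()
        if count > threshold
    }
```

The floor of 10 matters for sparse graphs: when most entities have one or two links, `mean + 3*std` alone would flag entities with only a few links.
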
### 3.4 `missing_refresh_baseline`

Trigger:

- no successful refresh run exists.

Severity:

- `high`

### 3.5 `stale_refresh`

Trigger:

- `refresh_age_hours` exceeds the configured threshold (`ANOMALY_STALE_REFRESH_THRESHOLD_HOURS`)

Severity:

- `high`

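The two refresh rules (3.4 and 3.5) can be read as one decision on the timestamp of the last successful refresh run. The sketch below assumes `refresh_age_hours` is the elapsed time since that timestamp; the function name and its parameters are hypothetical, and the threshold is passed in explicitly rather than read from `ANOMALY_STALE_REFRESH_THRESHOLD_HOURS`, since how that setting is loaded is not described here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def refresh_signal(
    last_success_at: Optional[datetime],
    now: datetime,
    threshold_hours: float,
) -> Optional[str]:
    """Return the refresh-related signal type, or None when the data is fresh."""
    if last_success_at is None:
        # No successful refresh run exists at all.
        return "missing_refresh_baseline"
    # Assumption: age is measured from the last successful run to `now`.
    refresh_age_hours = (now - last_success_at).total_seconds() / 3600.0
    if refresh_age_hours > threshold_hours:
        return "stale_refresh"
    return None
```

Both timestamps should be timezone-aware (or both naive); subtracting one kind from the other raises `TypeError` in Python.
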
### 3.6 `entity_count_drift`

Trigger:

- a previous successful feature run exists
- the absolute drift ratio for `entity_count` is >= 0.3
- and the absolute count difference is >= 10

Severity:

- `medium`, or `high` if the drift ratio is >= 1.0

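The drift rule combines a relative and an absolute check, which the following sketch makes explicit. It assumes the drift ratio is the absolute difference divided by the previous count (with a denominator floor of 1 to avoid division by zero); the spec does not define the denominator, and the function and constant names are hypothetical.

```python
from typing import Optional

DRIFT_RATIO_THRESHOLD = 0.3   # minimum relative change
DRIFT_HIGH_RATIO = 1.0        # ratio at which severity becomes high
MIN_ABSOLUTE_DIFF = 10        # minimum absolute change


def entity_count_drift_severity(
    previous_count: Optional[int], current_count: int
) -> Optional[str]:
    """Return 'medium' or 'high' when the rule fires, otherwise None."""
    if previous_count is None:
        return None  # no previous successful feature run to compare against
    diff = abs(current_count - previous_count)
    # Assumption: ratio is relative to the previous count, floored at 1.
    ratio = diff / max(previous_count, 1)
    if ratio < DRIFT_RATIO_THRESHOLD or diff < MIN_ABSOLUTE_DIFF:
        return None
    return "high" if ratio >= DRIFT_HIGH_RATIO else "medium"
```

Requiring both conditions means a jump from 10 to 15 entities (ratio 0.5, difference 5) stays silent, while a jump from 100 to 130 (ratio 0.3, difference 30) is reported as `medium`.
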
## 4. Output contract

Each anomaly signal contains:

- `signal_type`
- `severity`
- `scope`
- `scope_id`
- `score`
- `details`
- `is_active`

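For illustration, the field list above could map onto a dataclass like the one below. Only the field names come from this spec; the class name, the field types, the defaults, and the example values (including the `scope` and `scope_id` strings) are assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class AnomalySignal:
    """Hypothetical container for one anomaly signal."""

    signal_type: str          # e.g. "stale_refresh"
    severity: str             # "medium" or "high"
    scope: str                # assumed: what the signal applies to
    scope_id: str             # assumed: identifier within that scope
    score: float
    details: dict[str, Any] = field(default_factory=dict)
    is_active: bool = True


# Example values are made up for the sketch.
signal = AnomalySignal(
    signal_type="entity_count_drift",
    severity="medium",
    scope="global",
    scope_id="entity_count",
    score=0.3,
    details={"previous": 100, "current": 130},
)
payload = asdict(signal)  # plain dict, ready for JSON serialization
```
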
## 5. Execution

- API: `POST /features/run`
- CLI: `python scripts/run_features.py`
- PowerShell: `scripts/run_features.ps1`