Anomaly Engine Spec (Layer 5 MVP)

Date: 2026-03-23
Status: implemented in canonical_layer/features.py

1. Engine role

Detect suspicious patterns in canonical data and refresh operations. The engine has no direct write access to 1C.

2. Input data

  • canonical_entities
  • canonical_links
  • refresh_runs (for freshness context)
  • previous successful feature_runs and feature_metrics (for drift context)

3. Implemented anomaly rules

3.1 no_canonical_data

Trigger:

  • canonical entity count is zero.

Severity:

  • high

3.2 empty_display_share_high

Trigger:

  • per-source-entity count >= 50
  • and empty_display_share >= 0.2

Severity:

  • medium
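
The two conditions above can be sketched as follows. This is a minimal illustration, not the code in canonical_layer/features.py: the dict keys `source_entity` and `display` are hypothetical, and treating None or whitespace-only strings as "empty" is an assumption.

```python
from collections import defaultdict

MIN_ENTITY_COUNT = 50   # per-source-entity count from the trigger
MIN_EMPTY_SHARE = 0.2   # empty_display_share from the trigger


def empty_display_share_signals(entities):
    """Return (source_entity, share) pairs that meet both trigger conditions.

    `entities` is an iterable of dicts with the hypothetical keys
    `source_entity` and `display`.
    """
    totals = defaultdict(int)
    empties = defaultdict(int)
    for entity in entities:
        source = entity["source_entity"]
        totals[source] += 1
        # Assumption: None and whitespace-only values count as empty.
        if not (entity.get("display") or "").strip():
            empties[source] += 1

    flagged = []
    for source, total in totals.items():
        share = empties[source] / total
        if total >= MIN_ENTITY_COUNT and share >= MIN_EMPTY_SHARE:
            flagged.append((source, share))
    return flagged
```

Requiring a minimum count keeps small source entities from being flagged on the strength of a handful of empty rows.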

3.3 Entity link count outlier

Trigger:

  • entity link count exceeds dynamic threshold max(10, mean + 3*std)

Severity:

  • medium or high, depending on how many times the link count exceeds the threshold.
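
A sketch of the dynamic threshold max(10, mean + 3*std), assuming mean and std are taken over the per-entity link counts. The spec does not say whether population or sample standard deviation is used; population (`pstdev`) is assumed here. The function names and the score definition (link count divided by threshold) are illustrative only.

```python
from statistics import mean, pstdev

FLOOR = 10.0  # lower bound of the dynamic threshold


def link_count_threshold(link_counts):
    """Compute max(10, mean + 3*std) over a list of per-entity link counts."""
    if not link_counts:
        return FLOOR
    # Assumption: population standard deviation.
    return max(FLOOR, mean(link_counts) + 3 * pstdev(link_counts))


def link_count_outliers(counts_by_entity):
    """Map each outlier entity id to a score (count / threshold, assumed)."""
    threshold = link_count_threshold(list(counts_by_entity.values()))
    return {
        entity_id: count / threshold
        for entity_id, count in counts_by_entity.items()
        if count > threshold
    }
```

The floor of 10 prevents sparse datasets, where mean and std are both small, from flagging entities with only a few links.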

3.4 missing_refresh_baseline

Trigger:

  • no successful refresh run exists.

Severity:

  • high

3.5 stale_refresh

Trigger:

  • refresh_age_hours exceeds configured threshold (ANOMALY_STALE_REFRESH_THRESHOLD_HOURS)

Severity:

  • high
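
The staleness check can be sketched as below. The threshold is passed in as a parameter because the spec does not state how ANOMALY_STALE_REFRESH_THRESHOLD_HOURS is loaded or what its default is; the function names are hypothetical.

```python
from datetime import datetime, timezone


def refresh_age_hours(last_success_at, now=None):
    """Hours elapsed since the last successful refresh run."""
    now = now or datetime.now(timezone.utc)
    return (now - last_success_at).total_seconds() / 3600.0


def is_stale(last_success_at, threshold_hours, now=None):
    """True when the refresh age strictly exceeds the threshold."""
    return refresh_age_hours(last_success_at, now) > threshold_hours
```

Both timestamps should be timezone-aware; Python raises a TypeError when an aware datetime is subtracted from a naive one.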

3.6 entity_count_drift

Trigger:

  • previous successful feature run exists
  • absolute drift ratio for entity_count is >= 0.3
  • and absolute count difference >= 10

Severity:

  • high if drift ratio >= 1.0, otherwise medium
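
A minimal sketch of the drift rule. It assumes the drift ratio is the absolute difference divided by the previous entity_count, with the denominator clamped to 1 to avoid division by zero; the spec does not define either detail, and the function name is hypothetical.

```python
MIN_DRIFT_RATIO = 0.3   # from the trigger
MIN_ABS_DIFF = 10       # from the trigger
HIGH_DRIFT_RATIO = 1.0  # from the severity rule


def entity_count_drift_severity(previous, current):
    """Return "medium", "high", or None when the rule does not fire."""
    diff = abs(current - previous)
    # Assumption: ratio is relative to the previous run's count.
    ratio = diff / max(previous, 1)
    if ratio >= MIN_DRIFT_RATIO and diff >= MIN_ABS_DIFF:
        return "high" if ratio >= HIGH_DRIFT_RATIO else "medium"
    return None
```

The absolute-difference condition suppresses noise on small datasets: a change from 10 to 15 entities is a 50% drift but only 5 entities, so no signal is raised.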

4. Output contract

Each anomaly signal contains:

  • signal_type
  • severity
  • scope
  • scope_id
  • score
  • details
  • is_active
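
The contract could be modeled as a dataclass like the one below. Only the field names come from the list above; the types, the defaults, and the example values (including the `"global"` scope) are assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class AnomalySignal:
    signal_type: str            # e.g. "stale_refresh"
    severity: str               # "medium" or "high"
    scope: str                  # assumed: what the signal applies to
    scope_id: Optional[str]     # assumed: identifier within the scope
    score: float
    details: Dict[str, Any] = field(default_factory=dict)
    is_active: bool = True


# Hypothetical example instance.
signal = AnomalySignal(
    signal_type="stale_refresh",
    severity="high",
    scope="global",
    scope_id=None,
    score=1.0,
    details={"refresh_age_hours": 30.0},
)
payload = asdict(signal)  # plain dict, ready for JSON serialization
```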

5. Execution

  • API: POST /features/run
  • CLI: python scripts/run_features.py
  • PowerShell: scripts/run_features.ps1
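
For the API route, a request could be built as sketched below. The base URL `http://localhost:8000` and the empty request body are assumptions; the spec names only the method and path.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical host and port


def build_run_request(base_url=BASE_URL):
    """Build (but do not send) a POST request to /features/run."""
    return urllib.request.Request(
        url=f"{base_url}/features/run",
        data=b"",  # assumption: the endpoint needs no request body
        method="POST",
    )


request = build_run_request()
# To actually trigger a run against a live service:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```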