---
name: domain-case-loop
description: "Use this skill when a user wants to iteratively refine one NDC_1C domain case or one linked multi-step domain scenario through a multi-agent loop: automated capture, JSON analysis, minimal domain patch, rerun, and before/after verdict."
---
# Domain case loop
This skill packages the standard workflow for iterating on one concrete domain case or one linked multi-step domain scenario in NDC_1C.
## Use this skill when
- the user wants to improve one domain question end-to-end;
- the answer exists but is noisy, heuristic, partial, or business-useless;
- the route is wrong even if the wording looks better;
- there is a gap between exact compute intent and actual fallback output;
- there are follow-up / continuation bugs that corrupt business context;
- the user has a cascade of linked questions that should reuse one assistant session and semantic state;
- the bug appears only in colloquial/slang wording or in UI-generated follow-up phrasing such as `По выбранному объекту "...": ...` (roughly "For the selected object ...").
## Do not use this skill when
- the user is asking for a broad architecture rewrite;
- there is no concrete domain case or no reproducible input;
- the task is only prose editing with no technical/domain component;
- the task is a generic repo cleanup unrelated to domain capability behavior.
## Repo-specific runtime map
Read `references/repo_runtime_map.md` before the first real cycle.
Use these repo-native capture paths:
- automated capture: `python scripts/domain_case_loop.py run-case ...`
- linked multi-step capture: `python scripts/domain_case_loop.py run-scenario --manifest path/to/manifest.json`
- full domain question pool capture: `python scripts/domain_case_loop.py run-pack --manifest path/to/pack.json`
- autonomous full-pack loop: `python scripts/domain_case_loop.py run-pack-loop --manifest path/to/pack.json`
- import existing technical export: `python scripts/domain_case_loop.py import-export ...`
- `run-case` defaults to the repo's live local profile: `local / qwen2.5-14b-instruct-1m / http://127.0.0.1:1234/v1`
- override with `--llm-provider`, `--llm-model`, `--llm-base-url`, `--llm-api-key` when needed
- `run-pack-loop` defaults to `gpt-5.4` for analyst and `gpt-5.4-mini` for coder; tune with `--analyst-codex-model`, `--coder-codex-model`, `--analyst-reasoning-effort`, `--coder-reasoning-effort`
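The manifest-based commands above can also be assembled programmatically. The following is a minimal sketch that uses only the subcommands and flags named in this section; the helper name `build_capture_command` is hypothetical and not part of the repo, and `run-case` / `import-export` are left out because their arguments are not spelled out here.

```python
from typing import Dict, List, Optional


def build_capture_command(
    subcommand: str,
    manifest: str,
    extra_flags: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build an argv list for the manifest-based subcommands listed above.

    Hypothetical helper: it only concatenates documented pieces and does not
    validate that the flags are accepted by the real script.
    """
    # Only the subcommands documented with a --manifest flag are accepted here.
    allowed = {"run-scenario", "run-pack", "run-pack-loop"}
    if subcommand not in allowed:
        raise ValueError(f"unsupported subcommand: {subcommand}")
    argv = ["python", "scripts/domain_case_loop.py", subcommand, "--manifest", manifest]
    # Extra flags (for example --analyst-codex-model) are appended as flag/value pairs.
    for flag, value in (extra_flags or {}).items():
        argv.extend([flag, str(value)])
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from the repo root, for example `build_capture_command("run-pack-loop", "path/to/pack.json", {"--analyst-codex-model": "gpt-5.4"})`.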
## Workflow
### Scenario mode
Use scenario mode when the user brings a linked chain such as:
- "what is on stock now"
- "who supplied this item"
- "which documents bought it"
- "was it later sold"
In scenario mode:
- create `scenario_manifest.json` first;
- keep one shared `session_id`;
- capture each step under `artifacts/domain_runs/<scenario_id>/steps/<step_id>/`;
- preserve semantic carryover via explicit `scenario_state.json`, not vague model memory.
Use `references/scenario_manifest_template.json`.
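To illustrate what explicit carryover might look like, here is a small sketch. Only `session_id` and the step directory layout come from this document; the `completed_steps` and `anchors` keys, both function names, and the example ids are assumptions made for illustration, and the real `scenario_state.json` schema may differ.

```python
from pathlib import PurePosixPath


def step_dir(scenario_id: str, step_id: str) -> PurePosixPath:
    """Return the artifact directory for one scenario step."""
    return PurePosixPath("artifacts/domain_runs") / scenario_id / "steps" / step_id


def carry_over(state: dict, step_id: str, anchors: dict) -> dict:
    """Return a new scenario state with this step's resolved anchors merged in.

    The session_id is copied unchanged so every step reuses one session.
    The other keys are hypothetical placeholders for semantic state.
    """
    return {
        "session_id": state["session_id"],
        "completed_steps": [*state.get("completed_steps", []), step_id],
        "anchors": {**state.get("anchors", {}), **anchors},
    }
```

The point of the sketch is that each step reads the previous state and writes an updated copy, so a later question such as "who supplied this item" resolves "this item" from recorded anchors rather than from model memory.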
### Pack mode
Use pack mode when the user brings a whole domain pool and wants grouped orchestration rather than one isolated chain.
In pack mode:
- group the question pool into several coherent scenarios;
- capture each scenario under `artifacts/domain_runs/<pack_id>/scenarios/<scenario_id>/`;
- write aggregate `pack_state.json` and `pack_summary.md`;
- treat unresolved scenarios as enablement backlog, not as a reason to drop the domain.
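The aggregation step can be pictured roughly as follows. This is a sketch under assumptions: the function name and the output keys are hypothetical and do not describe the actual `pack_state.json` schema; the status values are the ones listed under the final status step.

```python
from typing import Dict


def summarize_pack(scenario_statuses: Dict[str, str]) -> dict:
    """Aggregate per-scenario statuses into a pack-level summary.

    Anything that is not 'accepted' is kept as enablement backlog
    instead of being dropped from the pack.
    """
    backlog = sorted(
        scenario_id
        for scenario_id, status in scenario_statuses.items()
        if status != "accepted"
    )
    return {
        "total": len(scenario_statuses),
        "accepted": len(scenario_statuses) - len(backlog),
        "enablement_backlog": backlog,
    }
```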
### Autonomous pack-loop mode
Use autonomous pack-loop mode when the user wants the system to continue with analyst/coder iterations until the analyst gate is reached or the loop hits a real blocker.
In autonomous pack-loop mode:
- run `python scripts/domain_case_loop.py run-pack-loop --manifest ...`;
- keep each iteration under `artifacts/domain_runs/<loop_id>/iterations/<iteration_id>/`;
- read `analyst_verdict.json` before any coder patch;
- let coder patch only the highest-value domain targets from the current analyst verdict;
- stop only on `accepted`, `blocked`, explicit `requires_user_decision = true`, or `max_iterations`;
- do not stop just because the analyst returns `needs_exact_capability` or `partial` if autonomous domain enablement work still remains;
- treat `quality score >= 80` as the target gate, not as permission to keep pushing through hard blockers, missing essential observations, or unsafe fixes;
- for follow-up-heavy domains, include conversational variants, slang/typo variants, and UI-generated selected-object follow-ups in the acceptance slice instead of validating only one canonical wording.
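The stop rule above can be sketched as a small predicate. It assumes the analyst verdict is loaded as a dict with a `status` key; that key name is an assumption, while `requires_user_decision` and the status values are taken from this document.

```python
def should_stop(verdict: dict, iteration: int, max_iterations: int) -> bool:
    """Decide whether the autonomous pack loop should stop after this iteration."""
    # An explicit user-decision flag always stops the loop.
    if verdict.get("requires_user_decision") is True:
        return True
    # Terminal statuses: the gate was reached or the runtime is truly blocked.
    if verdict.get("status") in {"accepted", "blocked"}:
        return True
    # 'partial' and 'needs_exact_capability' continue until the iteration budget ends.
    return iteration >= max_iterations
```

Note that `needs_exact_capability` and `partial` fall through to the iteration check, which matches the rule that they do not stop the loop on their own.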
### Step 1 - Normalize the case
Create `artifacts/domain_runs/<case_id>/case_brief.md` with:
- domain name
- raw user question
- expected business meaning
- expected exact capability
- expected result mode
- known constraints
- acceptance criteria draft
Use `references/case_brief_template.md`.
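Before writing the brief, it can help to check that none of the listed fields is empty. The sketch below is illustrative only: it represents the brief as a plain dict keyed by the field names above, which is an assumption and not the format of the referenced template.

```python
REQUIRED_BRIEF_FIELDS = (
    "domain name",
    "raw user question",
    "expected business meaning",
    "expected exact capability",
    "expected result mode",
    "known constraints",
    "acceptance criteria draft",
)


def missing_brief_fields(brief: dict) -> list:
    """Return the required fields that are absent or blank, in the listed order."""
    return [
        field
        for field in REQUIRED_BRIEF_FIELDS
        if not str(brief.get(field) or "").strip()
    ]
```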
### Step 2 - Capture baseline
Preferred path:
- run `python scripts/domain_case_loop.py run-case ...`
Fallback path:
- if the user already has a copied technical export markdown, run `python scripts/domain_case_loop.py import-export ...`
Required artifacts:
- `baseline_output.md`
- `baseline_debug.json`
- `baseline_turn.json`
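A quick completeness check for the capture can look like this. The function name is hypothetical; the file names follow the `baseline_*` pattern listed above, and the same pattern applies to the `rerun_*` artifacts in Step 5 (where `patch_summary.md` would need a separate check).

```python
from typing import Iterable, List

# Suffixes shared by the baseline_* and rerun_* artifact names in this document.
REQUIRED_SUFFIXES = ("output.md", "debug.json", "turn.json")


def missing_artifacts(present_names: Iterable[str], phase: str = "baseline") -> List[str]:
    """Return required artifact file names that are not in present_names."""
    present = set(present_names)
    required = [f"{phase}_{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in required if name not in present]
```

For a real run directory, `present_names` could come from `os.listdir(case_dir)`; an empty result means the capture is complete enough to hand to the analyst.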
### Step 3 - Analyst verdict
Spawn `domain_analyst` and provide:
- `case_brief.md`
- `baseline_turn.json`
- `baseline_output.md`
- `baseline_debug.json`
- optional relevant code excerpts or file paths
Require a full verdict using `references/verdict_template.md`.
The verdict must explicitly say whether the case is:
- an existing in-contour regression;
- a missing route/intent/capability inside project scope;
- a true out-of-scope request.
### Step 4 - Domain patch
Spawn `domain_coder` with:
- the case brief
- the analyst verdict
- baseline artifacts
Require:
- a minimal patch
- zero architecture drift
- rerun after changes
- if the domain is in project scope but outside the current contour, convert the verdict into capability enablement work instead of closing the case as unsupported
### Step 5 - Rerun
Capture:
- `rerun_output.md`
- `rerun_debug.json`
- `rerun_turn.json`
- `patch_summary.md`
### Step 6 - Before/after analysis
Spawn `domain_analyst` again for:
- before/after comparison
- final status recommendation
- quality score from 0 to 100
### Step 7 - Final status
Write `final_status.md` with one of:
- accepted
- partial
- blocked
- needs_exact_capability
`needs_exact_capability` is the default status when the business/domain request is valid for the project, but the current contour is missing the route, intent, capability, or domain bootstrap needed to answer it.
`needs_exact_capability` does not automatically stop autonomous pack-loop mode. Treat it as "continue domain enablement work" unless the analyst explicitly marks `requires_user_decision = true`, the runtime is truly blocked, or the loop hits `max_iterations`.
Autonomous pack-loop mode should stop early and ask the user when at least one of these is true:
- a required observation anchor is missing and cannot be recovered safely from artifacts, 1C, or the current scenario state;
- the next patch would introduce a hack, brittle workaround, hidden heuristic masking, or another low-trust shortcut;
- the next patch would cause risky architecture drift, disproportionate complexity, or a contour expansion with unclear blast radius;
- a business-critical ambiguity or scope tradeoff cannot be resolved from repo context and artifacts alone.
The `accepted` status requires:
- quality score >= 80
- no unresolved P0 defects
- no silent heuristic masking
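The three acceptance conditions combine as a simple conjunction. In this sketch the function and parameter names are illustrative assumptions; only the threshold of 80 and the three conditions come from the list above.

```python
def is_accepted(quality_score: int, unresolved_p0: int, heuristic_masking: bool) -> bool:
    """Return True only when all three acceptance conditions hold."""
    return quality_score >= 80 and unresolved_p0 == 0 and not heuristic_masking
```

A score of 95 with one unresolved P0 defect is therefore still not `accepted`; the score is a necessary condition, not a sufficient one.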
## Hard rules
- Do not count heuristic candidates as confirmed business answers.
- If exact data should exist in 1C/MCP, prefer exact route work over prompt cosmetics.
- If exact data does not exist yet in the reachable contour, return a technical insufficiency with a crisp blocker.
- If the user case belongs to a project-relevant domain but is outside the current contour, do not treat that as a terminal rejection. Treat it as domain enablement work and record the missing route/intent/capability explicitly.
- Raise `requires_user_decision = true` when the loop would otherwise have to guess a missing anchor, choose between materially different risky implementations, or push through a hacky/suspicious fix path.
- Never fabricate 1C data.
- Keep domain fixes minimal and localized.
- Preserve successful baseline scenarios.
- Treat follow-up continuity as a state-machine problem, not a wording problem.
- Do not accept a domain as hardened if only canonical phrasing works while colloquial or UI-generated follow-up phrasing still breaks the exact contour.
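The last rule can be expressed as a check over phrasing variants. This is a sketch with assumed names: the variant keys `canonical`, `colloquial`, and `ui_follow_up` and the function itself are hypothetical labels for the kinds of wording mentioned above.

```python
from typing import Dict, List

REQUIRED_VARIANTS = ("canonical", "colloquial", "ui_follow_up")


def is_hardened(results: Dict[str, List[bool]]) -> bool:
    """Return True only if every variant kind was tested and every run passed.

    results maps a variant kind to the pass/fail outcomes of its test phrasings.
    A missing or empty kind counts as not hardened.
    """
    return all(
        bool(results.get(kind)) and all(results[kind])
        for kind in REQUIRED_VARIANTS
    )
```

Treating an untested variant kind as a failure keeps a domain from being marked hardened on canonical wording alone.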
## Domain-specific framing
For this repository:
- architecture must remain unchanged;
- 1C/MCP is the primary source of truth;
- analyst output must be detailed and business-readable;
- answers should be suitable for product hardening, not just debugging notes;
- machine-readable turn artifacts are first-class inputs for analysis;
- new user domains may be unmarked in the current repo; missing markup is expected and should be handled as enablement, not as a reason to stop the loop.
## Recommended artifact set
Use the artifact layout from `references/artifact_layout.md`.