
name	description
domain-case-loop	Use this skill when a user wants to iteratively refine one NDC_1C domain case or one linked multi-step domain scenario through a multi-agent loop: automated capture, JSON analysis, minimal domain patch, rerun, and before/after verdict.

Domain case loop

This skill packages the standard workflow for iterating on one concrete domain case or one linked multi-step domain scenario in NDC_1C.

Use this skill when

the user wants to improve one domain question end-to-end;
the answer exists but is noisy, heuristic, partial, or business-useless;
the route is wrong even if the wording looks better;
there is a gap between exact compute intent and actual fallback output;
there are follow-up / continuation bugs that corrupt business context;
the user has a cascade of linked questions that should reuse one assistant session and semantic state;
the bug appears only in colloquial/slang wording or in UI-generated follow-up phrasing such as По выбранному объекту "...": ... ("For the selected object "...": ...").

Do not use this skill when

the user is asking for a broad architecture rewrite;
there is no concrete domain case or no reproducible input;
the task is only prose editing with no technical/domain component;
the task is a generic repo cleanup unrelated to domain capability behavior.

Repo-specific runtime map

Read references/repo_runtime_map.md before the first real cycle. For follow-up-heavy domains, also read references/scenario_tree_acceptance_canon.md before scenario mode, pack mode, or autonomous pack-loop mode. For business-first analyst work, also read references/business_first_analyst_rubric.md before redefining acceptance or hardening a noisy-but-technically-grounded domain. If docs/orchestration/active_domain_contract.json exists, treat it as the single mutable source of truth for the current domain and prefer it over older scattered pool/pack prose docs.

Use these repo-native capture paths:

automated capture: python scripts/domain_case_loop.py run-case ...
linked multi-step capture: python scripts/domain_case_loop.py run-scenario --manifest path/to/manifest.json
full domain question pool capture: python scripts/domain_case_loop.py run-pack --manifest path/to/pack.json
autonomous full-pack loop: python scripts/domain_case_loop.py run-pack-loop --manifest path/to/pack.json
import existing technical export: python scripts/domain_case_loop.py import-export ...
run-case defaults to the repo's live local profile: local / qwen2.5-14b-instruct-1m / http://127.0.0.1:1234/v1
override with --llm-provider, --llm-model, --llm-base-url, --llm-api-key when needed
run-pack-loop defaults to gpt-5.4 for the independent business analyst and lead-handoff repair mode; opt into the old autonomous coder loop only with --repair-mode auto-coder
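
The capture commands above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named build_capture_cmd; only the subcommand names and the flags named in this section come from the skill. The elided `...` arguments of run-case and import-export are left to the caller via extra_args, and whether the --llm-* overrides apply to subcommands other than run-case is not confirmed here.

```python
from typing import Dict, List, Optional, Sequence

SCRIPT = "scripts/domain_case_loop.py"
# Subcommands that this section shows with a --manifest argument.
MANIFEST_SUBCOMMANDS = {"run-scenario", "run-pack", "run-pack-loop"}


def build_capture_cmd(
    subcommand: str,
    manifest: Optional[str] = None,
    repair_mode: Optional[str] = None,
    llm_overrides: Optional[Dict[str, str]] = None,
    extra_args: Sequence[str] = (),
) -> List[str]:
    """Build an argv list for one capture path (hypothetical helper)."""
    cmd = ["python", SCRIPT, subcommand]
    if subcommand in MANIFEST_SUBCOMMANDS:
        if not manifest:
            raise ValueError(f"{subcommand} requires --manifest")
        cmd += ["--manifest", manifest]
    if repair_mode is not None:
        # --repair-mode is only described for run-pack-loop.
        if subcommand != "run-pack-loop":
            raise ValueError("--repair-mode applies to run-pack-loop only")
        cmd += ["--repair-mode", repair_mode]
    # Keys such as "provider" or "base_url" map to --llm-provider, --llm-base-url.
    for key, value in (llm_overrides or {}).items():
        cmd += [f"--llm-{key.replace('_', '-')}", value]
    cmd += list(extra_args)
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repo root.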

Workflow

Scenario mode

Use scenario mode when the user brings a linked chain such as:

"what is on stock now"
"who supplied this item"
"which documents bought it"
"was it later sold"

In scenario mode:

model the domain as a scenario tree, not as a flat list of prompts;
define one root plus critical child drilldowns and the primary user path;
treat selected-object follow-up branches as first-class business paths when the UI exposes selectable entities;
create scenario_manifest.json first;
keep one shared session_id;
capture each step under artifacts/domain_runs/<scenario_id>/steps/<step_id>/;
preserve semantic carryover via explicit scenario_state.json, not vague model memory;
require a scenario_acceptance_matrix.md artifact that records node/edge coverage and paraphrase-family coverage.

Use references/scenario_manifest_template.json.
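
The step layout and the explicit carryover rule above can be sketched as follows. This is an illustration only: the real schema of scenario_state.json is not shown in this skill, so the keys completed_steps and carryover, and both function names, are assumptions.

```python
import copy
from pathlib import Path


def step_dir(scenario_id: str, step_id: str) -> Path:
    """Directory for one captured step, following the layout named above."""
    return Path("artifacts") / "domain_runs" / scenario_id / "steps" / step_id


def advance_scenario_state(state: dict, step_id: str, carryover: dict) -> dict:
    """Return a new state with this step's resolved facts merged in.

    The shared session_id is never touched, and the input state is not
    mutated, so every step can persist its own snapshot.
    Key names here are hypothetical, not the repo's confirmed schema.
    """
    new_state = copy.deepcopy(state)
    new_state.setdefault("completed_steps", []).append(step_id)
    new_state.setdefault("carryover", {}).update(carryover)
    return new_state
```

Writing the returned dictionary to scenario_state.json after every step keeps carryover inspectable, instead of relying on what the model happens to remember.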

Pack mode

Use pack mode when the user brings a whole domain pool and wants grouped orchestration rather than one isolated chain.

In pack mode:

group the question pool into several coherent scenarios;
define the root and critical branches inside each scenario instead of validating only isolated prompts;
capture each scenario under artifacts/domain_runs/<pack_id>/scenarios/<scenario_id>/;
write aggregate pack_state.json and pack_summary.md;
aggregate scenario acceptance through node/edge coverage rather than a raw question count;
treat unresolved scenarios as enablement backlog, not as a reason to drop the domain.
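
Aggregating by node/edge coverage rather than by question count could look like the sketch below. The input shape (one dictionary of boolean node and edge results per scenario) and the function name are assumptions made for illustration; the actual pack_state.json format is not specified here.

```python
def aggregate_coverage(scenarios: dict) -> dict:
    """Sum green/total node and edge results across all scenarios in a pack.

    `scenarios` maps scenario_id -> {"nodes": {name: bool}, "edges": {name: bool}}
    (hypothetical shape). Scenarios with any red node or edge go to the
    enablement backlog instead of being dropped.
    """
    totals = {"nodes": [0, 0], "edges": [0, 0]}
    backlog = []
    for scenario_id, matrix in scenarios.items():
        scenario_green = True
        for kind in ("nodes", "edges"):
            results = matrix.get(kind, {})
            green = sum(1 for ok in results.values() if ok)
            totals[kind][0] += green
            totals[kind][1] += len(results)
            scenario_green = scenario_green and green == len(results)
        if not scenario_green:
            backlog.append(scenario_id)
    return {
        "nodes": tuple(totals["nodes"]),  # (green, total)
        "edges": tuple(totals["edges"]),  # (green, total)
        "enablement_backlog": backlog,
    }
```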

Autonomous pack-loop mode

Use pack-loop mode when the user wants the system to run live replay, produce a strong business-first analyst verdict, and continue toward repair evidence until the analyst gate is reached or the loop hits a real blocker.

In autonomous pack-loop mode:

run python scripts/domain_case_loop.py run-pack-loop --manifest ...;
keep each iteration under artifacts/domain_runs/<loop_id>/iterations/<iteration_id>/;
read analyst_verdict.json before any coder patch;
by default, stop after the analyst verdict with business_audit.md and lead_coder_handoff.md so Lead Codex repairs code in the main context;
let an autonomous coder patch only when --repair-mode auto-coder is explicitly selected, and only against the highest-value domain targets from the current analyst verdict;
stop only on accepted, blocked, explicit requires_user_decision = true, or max_iterations;
do not stop just because the analyst returns needs_exact_capability or partial if autonomous domain enablement work still remains;
treat quality score >= 80 as the target gate, not as permission to keep pushing through hard blockers, missing essential observations, or unsafe fixes;
for follow-up-heavy domains, include conversational variants, slang/typo variants, and UI-generated selected-object follow-ups in the acceptance slice instead of validating only one canonical wording;
do not mark a domain path as hardened only because the root node works; critical edges and drilldowns must pass as well;
treat broken tree edges, missing carryover, or wrong answer shape as blockers for acceptance even when the underlying root intent is already exact.
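
The stop conditions above reduce to a small predicate. A minimal sketch, assuming the analyst verdict is loaded as a dictionary with a status field; that key name and the function name are assumptions, while requires_user_decision and max_iterations are the names used in this skill.

```python
STOP_STATUSES = {"accepted", "blocked"}


def should_stop(verdict: dict, iteration: int, max_iterations: int) -> bool:
    """Decide whether the pack loop ends after this iteration.

    `iteration` is 1-based. needs_exact_capability and partial do not stop
    the loop on their own; they mean enablement work continues.
    """
    if verdict.get("requires_user_decision") is True:
        return True
    if verdict.get("status") in STOP_STATUSES:
        return True
    return iteration >= max_iterations
```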

Step 1 - Normalize the case

Create artifacts/domain_runs/<case_id>/case_brief.md with:

domain name
raw user question
expected business meaning
expected exact capability
expected result mode
primary user path
required paraphrase families
required carryover invariants
known constraints
acceptance criteria draft

Use references/case_brief_template.md.
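
When the template is not at hand, a skeleton with the fields listed above can be generated. This is only a sketch: the heading style and the TODO placeholder are assumptions and may differ from references/case_brief_template.md, which remains the authoritative format.

```python
CASE_BRIEF_FIELDS = (
    "domain name",
    "raw user question",
    "expected business meaning",
    "expected exact capability",
    "expected result mode",
    "primary user path",
    "required paraphrase families",
    "required carryover invariants",
    "known constraints",
    "acceptance criteria draft",
)


def render_case_brief(values: dict) -> str:
    """Render a case_brief.md skeleton; missing fields are marked TODO."""
    lines = []
    for field in CASE_BRIEF_FIELDS:
        lines.append(f"## {field}")
        lines.append(str(values.get(field, "TODO")))
        lines.append("")
    return "\n".join(lines)
```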

Step 2 - Capture baseline

Preferred path:

run python scripts/domain_case_loop.py run-case ...

Fallback path:

if the user already has a copied technical export markdown, run python scripts/domain_case_loop.py import-export ...

Required artifacts:

baseline_output.md
baseline_debug.json
baseline_turn.json
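
A quick completeness check before spawning the analyst might look like this. It is a sketch that assumes the three baseline files sit directly in the case directory; the authoritative layout is in references/artifact_layout.md, and the function name is hypothetical.

```python
from pathlib import Path

BASELINE_ARTIFACTS = (
    "baseline_output.md",
    "baseline_debug.json",
    "baseline_turn.json",
)


def missing_baseline_artifacts(case_dir) -> list:
    """Return the required baseline files that are absent from case_dir."""
    case_dir = Path(case_dir)
    return [name for name in BASELINE_ARTIFACTS if not (case_dir / name).is_file()]
```

An empty list means Step 3 can start; anything else means the baseline capture should be repeated or imported first.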

Step 3 - Analyst verdict

Spawn domain_analyst and provide:

case_brief.md
baseline_turn.json
baseline_output.md
baseline_debug.json
scenario_acceptance_matrix.md when the case is follow-up-heavy or scenario-based
optional relevant code excerpts or file paths

Require a full verdict using references/verdict_template.md.

The verdict must explicitly say whether the case is:

an existing in-contour regression;
a missing route/intent/capability inside project scope;
a true out-of-scope request;
a runtime_capability_gap, semantic_understanding_gap, edge_carryover_gap, answer_shape_mismatch, ordering_semantics_mismatch, or loop_coverage_gap;
an object_memory_gap, followup_action_resolution_gap, bundle_reuse_gap, field_mapping_gap, business_utility_gap, or domain_anchor_gap when that is the real blocker.

Step 4 - Domain patch

Spawn domain_coder with:

the case brief
the analyst verdict
baseline artifacts

Require:

a minimal patch
zero architecture drift
rerun after changes
if the domain is in project scope but outside the current contour, convert the verdict into capability enablement work instead of closing the case as unsupported

Step 5 - Rerun

Capture:

rerun_output.md
rerun_debug.json
rerun_turn.json
patch_summary.md
updated scenario_acceptance_matrix.md when the rerun belongs to a scenario or pack

Step 6 - Before/after analysis

Spawn domain_analyst again for:

before/after comparison
final status recommendation
quality score from 0 to 100

Step 7 - Final status

Write final_status.md with one of:

accepted
partial
blocked
needs_exact_capability

needs_exact_capability is the default status when the business/domain request is valid for the project, but the current contour is missing the route, intent, capability, or domain bootstrap needed to answer it.

needs_exact_capability does not automatically stop autonomous pack-loop mode. Treat it as "continue domain enablement work" unless the analyst explicitly marks requires_user_decision = true, the runtime is truly blocked, or the loop hits max_iterations.

Autonomous pack-loop mode should stop early and ask the user when at least one of these is true:

a required observation anchor is missing and cannot be recovered safely from artifacts, 1C, or the current scenario state;
the next patch would introduce a hack, brittle workaround, hidden heuristic masking, or another low-trust shortcut;
the next patch would cause risky architecture drift, disproportionate complexity, or a contour expansion with unclear blast radius;
a business-critical ambiguity or scope tradeoff cannot be resolved from repo context and artifacts alone.

Accepted requires:

quality score >= 80
no unresolved P0 defects
no silent heuristic masking
critical scenario-tree edges on the primary user path are green
canonical, colloquial, and UI-selected-object variants are green for critical branches
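
The acceptance requirements above can be expressed as one gate. A minimal sketch with hypothetical field names (quality_score, unresolved_p0_defects, silent_heuristic_masking, critical_edges_green, variants_green); the real verdict schema comes from references/verdict_template.md and may differ.

```python
REQUIRED_VARIANTS = ("canonical", "colloquial", "ui_selected_object")


def is_accepted(report: dict) -> bool:
    """Return True only when every acceptance requirement holds.

    Missing fields fail closed: an absent score, an unknown masking flag,
    or an unreported variant all block acceptance.
    """
    variants = report.get("variants_green", {})
    return (
        report.get("quality_score", 0) >= 80
        and not report.get("unresolved_p0_defects")
        and report.get("silent_heuristic_masking", True) is False
        and report.get("critical_edges_green") is True
        and all(variants.get(name) is True for name in REQUIRED_VARIANTS)
    )
```

Failing closed on missing fields matches the rule that a high score alone is not permission to accept.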

Hard rules

Do not count heuristic candidates as confirmed business answers.
If exact data should exist in 1C/MCP, prefer exact route work over prompt cosmetics.
If exact data does not exist yet in the reachable contour, return a technical insufficiency with a crisp blocker.
If the user case belongs to a project-relevant domain but is outside the current contour, do not treat that as a terminal rejection. Treat it as domain enablement work and record the missing route/intent/capability explicitly.
Raise requires_user_decision = true when the loop would otherwise have to guess a missing anchor, choose between materially different risky implementations, or push through a hacky/suspicious fix path.
Never fabricate 1C data.
Keep domain fixes minimal and localized.
Preserve successful baseline scenarios.
Treat follow-up continuity as a state-machine problem, not a wording problem.
Do not accept a domain as hardened if only canonical phrasing works while colloquial or UI-generated follow-up phrasing still breaks the exact contour.
Do not accept a domain as hardened if the root node works but a critical selected-object or drilldown edge still breaks.
Treat temporal carryover loss in a cascading scenario as a real regression: if the user says на эту дату / на ту дату ("as of this date" / "as of that date"), the analyst must verify that the exact carried date or period survived into extracted_filters.
Treat answer-shape mismatch as a scoring defect: if the user asked for items / residues / contracts, do not accept an answer that switched to raw documents, movements, or another lower-level object without saying so explicitly.
Treat ordering semantics as part of correctness when the wording implies ranking or chronology, for example старые закупки ("old purchases") => oldest-first rather than newest-first.
Treat primary user-path failures as more important than supporting-path polish: if the user cannot go from root list -> selected object -> first drilldown, the scenario is not accepted.
Treat direct-answer-first behavior as part of correctness: if the user asked a direct lookup question, the first line must contain the direct answer before the evidence blocks.
Treat business usefulness as part of correctness: factual-but-business-useless output is not acceptance-quality output.
Treat stable follow-up object memory as part of correctness: when the prior turn already resolved the relevant item/object, the next turn must not re-ask for it.
Treat object-centric dialog state as part of correctness: short follow-ups like по ней ("for it"), по этой позиции ("for this item"), когда купили ее ("when was it bought"), покажи документы по этой позиции ("show the documents for this item") must resolve against the active selected item before broader routing guesses.
Treat reusable supplier/date/document bundles as part of correctness: adjacent follow-ups over the same item should reuse a resolved provenance bundle when available.
Treat action-first follow-up behavior as part of correctness: when the user asks кто ("who"), когда ("when"), каким документом ("by which document"), or покажи документы ("show the documents") over a selected object, the answer must begin with that action's result rather than with a generic trace narrative.
Treat answer layering as part of correctness: user-facing answer first, proof second, service or methodological notes last.
Treat stable answer_object state as part of correctness: once supplier/date/document facts are already resolved, adjacent narrow follow-ups should derive from that bundle instead of replaying a full search.
Treat narrow selected-object micro-actions as compact answers by default: кто ("who"), когда ("when"), каким документом ("by which document"), покажи документы ("show the documents"), сумма ("amount"), все закупки ("all purchases") should return the requested fact first and should not open with a generic multi-block trace packet.
Treat temporal honesty as part of correctness: if the exact requested window has no evidence and the runtime auto-broadens to nearest available rows, the answer must separate the exact-window outcome from the out-of-window evidence.
Treat supplier/buyer field truth as part of correctness: do not surface organization as supplier or buyer without proven mapping.
Do not accept top-of-answer system scaffolding such as status, what was considered, row counts, or exact contour above the user-facing answer on business-critical turns.
Do not accept numbered block scaffolding such as Блок 1/2/3 ("Block 1/2/3") in narrow business follow-ups unless the user explicitly asked for a structured report.
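
Two of the hard rules above, temporal carryover and ordering semantics, lend themselves to mechanical checks. The sketch below assumes extracted_filters is a flat dictionary and that dates are ISO-8601 strings; both are assumptions, as are the function names and the sample values, which are illustrative rather than real 1C data.

```python
def lost_carryover(carried: dict, extracted_filters: dict) -> dict:
    """Return carried filters that did not survive into extracted_filters.

    An empty result means the exact carried date or period is intact.
    """
    return {
        key: value
        for key, value in carried.items()
        if extracted_filters.get(key) != value
    }


def is_oldest_first(dates: list) -> bool:
    """Check chronological order; ISO-8601 strings compare correctly as text."""
    return all(earlier <= later for earlier, later in zip(dates, dates[1:]))
```

A non-empty result from lost_carryover on a follow-up turn is a candidate edge_carryover_gap, and a False result from is_oldest_first on an "old purchases" request is a candidate ordering_semantics_mismatch.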

Domain-specific framing

For this repository:

architecture must remain unchanged;
1C/MCP is the primary source of truth;
analyst output must be detailed and business-readable;
answers should be suitable for product hardening, not just debugging notes;
machine-readable turn artifacts are first-class inputs for analysis.
New user domains may be unmarked in the current repo. Missing markup is expected and should be handled as enablement, not as a reason to stop the loop.

Recommended artifact set

Use the artifact layout from references/artifact_layout.md.
