ORCHESTRATION - Strengthen the domain loop with a business-first canon for analyst and orchestrator

2026-04-14 17:33:57 +03:00 · 2026-04-14 17:33:57 +03:00 · cb0eb450d7
parent f6a2c8e0a3
commit cb0eb450d7
19 changed files with 813 additions and 66 deletions
--- a/.codex/agents/domain_analyst.toml
+++ b/.codex/agents/domain_analyst.toml
@ -17,21 +17,25 @@ You read:
 Your job is to produce a detailed verdict in Russian with strong business focus.
-Always answer in a strict structure:
+When the caller asks for prose, use this strict structure:
-1. Смысл вопроса
+1. Question meaning
-2. Главный пользовательский путь и дерево сценария
+2. Primary user path and scenario tree
-3. Что реально посчитано
+3. Expected direct answer
-4. Где расхождение по бизнес-смыслу
+4. What the system actually computed
-5. Где route / capability mismatch
+5. Business mismatch
-6. Evidence quality
+6. Route / capability mismatch
-7. P0 defects
+7. State continuity and selected-object memory
-8. P1 defects
+8. Field truth and evidence quality
-9. P2 defects
+9. P0 defects
-10. Minimal patch directions
+10. P1 defects
-11. Acceptance matrix for rerun
+11. P2 defects
-12. Acceptance criteria for rerun
+12. Minimal patch directions
-13. Quality score
+13. Acceptance matrix for rerun
-14. Loop decision
+14. Acceptance criteria for rerun
 15. Quality score
 16. Loop decision
 When the caller asks for JSON, map the same logic into machine-readable fields. Do not collapse the business analysis into one generic summary.
 Rules:
 - Call out non-business garbage explicitly.
@ -46,9 +50,16 @@ Rules:
 - Verify answer granularity explicitly: if the user asked for item-level residues, do not accept a document-level dump as a correct answer.
 - Verify sort/order semantics when the wording implies chronology or ranking, for example `старые закупки` should be oldest-first.
 - Treat the acceptance unit as a scenario tree, not a flat list of prompts.
- Under `Главный пользовательский путь и дерево сценария`, explicitly name the root node, critical child nodes, critical edges, and the primary user path.
+- Evaluate the answer in business-first order: first direct answer quality, then usefulness, then technical support.
- Under `Acceptance matrix for rerun`, list at least the critical nodes/edges and mark each one by wording family: `canonical`, `colloquial`, `ui_selected_object`.
+- Explicitly state what the first line of the answer should have been for the user.
- Distinguish these defect classes explicitly when relevant: `semantic_understanding_gap`, `edge_carryover_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, `runtime_capability_gap`, `loop_coverage_gap`.
+- If the answer is technically grounded but business-useless, say so directly and lower the score.
 - Treat selected-object continuity and reusable answer-object memory as first-class analysis objects.
 - Call out when the runtime found the underlying document/trace but failed to retain the resolved business object for the next follow-up.
 - Distinguish `object_memory_gap`, `field_mapping_gap`, `business_utility_gap`, and `domain_anchor_gap` from pure route gaps.
 - Check field truth explicitly: supplier must not be mislabeled as organization, buyer must not be mislabeled as organization, and document-side fields must not be presented as business truth without evidence.
 - Under the scenario-tree section, explicitly name the root node, critical child nodes, critical edges, and the primary user path.
 - Under the acceptance matrix, list at least the critical nodes/edges and mark each one by wording family: `canonical`, `colloquial`, `ui_selected_object`.
 - Distinguish these defect classes explicitly when relevant: `semantic_understanding_gap`, `edge_carryover_gap`, `object_memory_gap`, `field_mapping_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, `runtime_capability_gap`, `business_utility_gap`, `loop_coverage_gap`, `domain_anchor_gap`.
 - If the root node works but the primary user path is broken at the first selected-object drilldown, treat that as a real failure of domain hardening.
 - If the runtime nearly supports the path but the loop never validated the realistic wording family, call it `loop_coverage_gap`, not product success.
@ -56,6 +67,6 @@ Quality score:
 - Output one integer score from 0 to 100.
 - Score >= 80 means the case can be accepted only if there is no unresolved P0.
 - Score >= 80 also requires the primary user path and its critical edges to be green across canonical, colloquial, and UI-selected-object coverage where applicable.
- If score < 80, loop_decision must be continue, partial, blocked, or needs_exact_capability.
+- Score >= 80 also requires `direct_answer_ok = true` and `business_usefulness_ok = true` for the primary user path.
 """
 nickname_candidates = ["Lens", "Vector", "Delta"]
--- a/.codex/agents/orchestrator.toml
+++ b/.codex/agents/orchestrator.toml
@ -1,5 +1,5 @@
 name = "orchestrator"
-description = "Coordinates a repo-native domain-case or scenario loop for NDC_1C: baseline or scenario capture, analyst verdict, minimal domain patch, rerun, and 80-point acceptance gate."
+description = "Coordinates a repo-native domain-case or scenario loop for NDC_1C: baseline or scenario capture, minimal domain patching, rerun, and business-first acceptance."
 model = "gpt-5.4"
 model_reasoning_effort = "high"
 sandbox_mode = "workspace-write"
@ -46,6 +46,10 @@ Hard rules:
 - For cascading date-sensitive scenarios, rerun at least one `на эту дату` / `на ту дату` follow-up and verify that the originating date or period survives into debug filters.
 - If the business question asks for residues/items/contracts but the answer switched to raw documents or movements, treat that as a real defect, not as acceptable detail.
 - If the wording implies chronology or ranking such as `старые закупки`, verify oldest-first ordering explicitly.
 - Require the analyst to judge business usefulness, not only technical groundedness.
 - Require the analyst to judge whether the direct answer appears in the first line when the user asked a direct lookup question.
 - Treat selected-object continuity, pronoun resolution, and reusable resolved-object state as mandatory audit targets for follow-up-heavy domains.
 - Distinguish runtime capability gaps from state-layer continuity gaps and from business-presentation gaps before choosing coder tasks.
 - If the root node works but the first critical selected-object or drilldown edge is still broken, do not treat the scenario as hardened.
 - Require an explicit `scenario_acceptance_matrix.md` artifact for follow-up-heavy domains and packs.
 - Use the matrix to drive coder tasks: patch the narrowest broken edge or wording family first, not the whole domain at once.
@ -57,6 +61,7 @@ Acceptance gate:
 - accepted requires no business-critical regression in rerun
 - accepted requires green critical edges on the primary user path
 - accepted requires green coverage for canonical + colloquial + UI-selected-object variants on critical branches when those branches exist in the product UX
 - accepted requires `direct_answer_ok = true` and `business_usefulness_ok = true` on the primary user path
 Required artifacts per cycle:
 - case_brief.md
--- a/.codex/skills/domain-case-loop/SKILL.md
+++ b/.codex/skills/domain-case-loop/SKILL.md
@ -28,6 +28,7 @@ This skill packages the standard workflow for iterating on one concrete domain c
 Read `references/repo_runtime_map.md` before the first real cycle.
 For follow-up-heavy domains, also read `references/scenario_tree_acceptance_canon.md` before scenario mode, pack mode, or autonomous pack-loop mode.
 For business-first analyst work, also read `references/business_first_analyst_rubric.md` before redefining acceptance or hardening a noisy-but-technically-grounded domain.
 If `docs/orchestration/active_domain_contract.json` exists, treat it as the single mutable source of truth for the current domain and prefer it over older scattered pool/pack prose docs.
 Use these repo-native capture paths:
@ -136,6 +137,7 @@ The verdict must explicitly say whether the case is:
 - a missing route/intent/capability inside project scope;
 - a true out-of-scope request.
 - a `runtime_capability_gap`, `semantic_understanding_gap`, `edge_carryover_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, or `loop_coverage_gap`.
 - an `object_memory_gap`, `field_mapping_gap`, `business_utility_gap`, or `domain_anchor_gap` when that is the real blocker.
 ### Step 4 - Domain patch
@ -208,6 +210,9 @@ Accepted requires:
 - Treat answer-shape mismatch as a scoring defect: if the user asked for items / residues / contracts, do not accept an answer that switched to raw documents, movements, or another lower-level object without saying so explicitly.
 - Treat ordering semantics as part of correctness when the wording implies ranking or chronology, for example `старые закупки` => oldest-first rather than newest-first.
 - Treat primary user-path failures as more important than supporting-path polish: if the user cannot go from root list -> selected object -> first drilldown, the scenario is not accepted.
 - Treat direct-answer-first behavior as part of correctness: if the user asked a direct lookup question, the first line must contain the direct answer before the evidence blocks.
 - Treat business usefulness as part of correctness: factual-but-business-useless output is not acceptance-quality output.
 - Treat stable follow-up object memory as part of correctness: when the prior turn already resolved the relevant item/object, the next turn must not re-ask for it.
 ## Domain-specific framing
--- a/.codex/skills/domain-case-loop/references/business_first_analyst_rubric.md
+++ b/.codex/skills/domain-case-loop/references/business_first_analyst_rubric.md
@ -0,0 +1,128 @@
 # Business-first analyst rubric
 Use this rubric when evaluating one domain case, one multi-step scenario, or one full domain pack.
 The analyst must not stop at route/debug correctness. The analyst must judge whether the answer is actually useful for a real business user.
 ## Core principle
 The analyst evaluates five layers at once:
 - user intent;
 - scenario tree and state continuity;
 - business usefulness of the answer;
 - evidence and field truthfulness;
 - root cause and smallest defensible fix direction.
 ## Required analyst questions
 For every critical turn or critical edge, answer these questions explicitly:
 1. What did the user really ask?
 - State the business meaning in one short sentence.
 - Name the minimum direct answer the user expected.
 2. What should the first line of the answer have been?
 - If the user asked a direct lookup question, the first line must contain the direct answer.
 - Technical explanation, limitations, and evidence come after the direct answer.
 3. What object and scope had to survive from previous turns?
 - selected item / selected contract / selected counterparty;
 - originating date or period;
 - warehouse or organization scope when still relevant;
 - reusable resolved bundle, for example provenance trace or sale trace.
 4. Did the answer stay on the same business object?
 - item question -> item answer;
 - supplier question -> supplier answer;
 - buyer question -> buyer answer;
 - old-stock question -> old-stock item list.
 If the system silently switched to raw documents, movements, or another lower-level object, call it an answer-shape defect.
 5. Are the surfaced fields truthful and correctly labeled?
 - do not confuse supplier with organization;
 - do not confuse buyer with organization;
 - do not present a document-side technical field as a business truth unless that mapping is proven.
 ## Business usefulness rules
 An answer is not accepted as business-useful when any of these are true:
 - the direct answer is not placed first;
 - the answer opens with technical hedging instead of the user-facing result;
 - a weaker question is answered than the one the user asked;
 - the answer requires the user to reconstruct the conclusion from low-level evidence;
 - the answer uses ambiguous field labels for business-critical entities.
 ## State continuity rules
 Follow-up continuity is a first-class acceptance object.
 The analyst must verify:
 - selected object continuity;
 - date/period continuity;
 - reusable evidence continuity;
 - pronoun resolution continuity.
 Important pronoun examples:
 - `эту позицию`
 - `этот товар`
 - `его`
 - `по нему`
 - `по этой позиции`
 If the previous turn already resolved a concrete object, the next turn must reuse it instead of asking for the anchor again.
 ## Reusable answer-object cache
 For follow-up-heavy domains, the analyst should explicitly look for evidence that the product behaves as if it had a reusable resolved object bundle.
 Examples:
 - `current_item`
 - `current_as_of_date`
 - `current_provenance_trace`
 - `current_sale_trace`
 - `first_purchase_date`
 - `supplier_if_known`
 - `source_document_if_known`
 If the runtime recomputes everything from scratch and loses the already resolved object, call that out as a state-layer defect.
 ## Root-cause layers
 Use one or more of these root-cause layers explicitly:
 - `semantic_understanding_gap`
 - `runtime_capability_gap`
 - `edge_carryover_gap`
 - `object_memory_gap`
 - `field_mapping_gap`
 - `answer_shape_mismatch`
 - `ordering_semantics_mismatch`
 - `business_utility_gap`
 - `loop_coverage_gap`
 - `domain_anchor_gap`
 ## Minimum machine-readable verdict fields
 The analyst verdict should expose at least:
 - `user_intent_summary`
 - `expected_direct_answer`
 - `actual_direct_answer`
 - `direct_answer_ok`
 - `business_usefulness_ok`
 - `business_utility_score`
 - `direct_answer_priority_score`
 - `state_continuity_score`
 - `answer_shape_score`
 - `evidence_clarity_score`
 - `root_cause_layers`
 - `broken_edge_ids`
 - `violated_invariants`
 ## Inventory-specific reminders
 For inventory follow-up chains, verify all of these:
 - the selected item remains the current focus object after the user clicks a result;
 - provenance questions answer supplier/date/document first, not only raw movement rows;
 - `когда купили` can reuse the already resolved provenance bundle;
 - supplier and organization are not mixed up in the surfaced answer;
 - `на эту дату` keeps the original stock date unless the user explicitly changed it.
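
The state-continuity and reusable-cache rules in this rubric can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `state` object whose keys reuse the bundle names listed in the rubric (`current_item`, `current_as_of_date`) and a hypothetical helper `resolveFocusItem`; neither is taken from the NDC_1C runtime. JavaScript's `\b` does not treat Cyrillic letters as word characters, so the sketch guards the short pronoun `его` with `\p{L}` lookarounds instead.

```javascript
// Pronoun cues copied from the rubric's "Important pronoun examples" list.
const PRONOUN_CUES =
  /(?:эту\s+позицию|этот\s+товар|по\s+нему|по\s+этой\s+позиции|(?<!\p{L})его(?!\p{L}))/iu;

// Hypothetical helper: decide whether a follow-up can reuse the already
// resolved focus object instead of asking the user for the anchor again.
function resolveFocusItem(message, state) {
  const hasPronoun = PRONOUN_CUES.test(message);
  if (hasPronoun && state.current_item) {
    return {
      item: state.current_item,
      asOfDate: state.current_as_of_date ?? null,
      reused: true,
    };
  }
  // No pronoun cue or no resolved object: the caller has to resolve an anchor.
  return { item: null, asOfDate: null, reused: false };
}
```

A turn that returns `reused: false` even though the previous turn resolved the item is the situation the rubric classifies as `object_memory_gap`.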
--- a/.codex/skills/domain-case-loop/references/case_brief_template.md
+++ b/.codex/skills/domain-case-loop/references/case_brief_template.md
@ -9,6 +9,10 @@
 ## Expected business meaning
 - ...
 ## Expected direct answer
 - first line should say:
 - minimum acceptable business answer:
 ## Expected capability
 - ...
@ -31,6 +35,13 @@
 - warehouse if relevant
 - organization if relevant
 - expected answer shape
 - direct-answer-first when the user asked a direct lookup question
 - reusable resolved-object continuity when the user asks a follow-up about the same selected object
 ## Field truth constraints
 - do not confuse supplier with organization
 - do not confuse buyer with organization
 - do not surface technical document-side fields as business truth without proof
 ## Contour status
 - in_contour / outside_current_contour / unknown
@ -53,3 +64,5 @@
 - root node works
 - critical edges on the primary user path work
 - colloquial and UI-generated follow-up variants work
 - direct answer is placed first where expected
 - output is business-useful, not only technically grounded
--- a/.codex/skills/domain-case-loop/references/scenario_tree_acceptance_canon.md
+++ b/.codex/skills/domain-case-loop/references/scenario_tree_acceptance_canon.md
@ -12,6 +12,16 @@ The unit of acceptance is a **scenario tree**:
 If the root works but a critical child transition breaks, the domain is **not** hardened.
 ## Business-first framing
 Every accepted node or edge must be both technically grounded and business-useful.
 This means:
 - the direct answer is surfaced first when the user asked a direct lookup question;
 - the answer stays on the requested business object;
 - evidence and caveats support the answer instead of replacing it;
 - field labels are truthful for business entities such as supplier, buyer, organization, warehouse, and document.
 ## Model the domain as a tree
 For each scenario, define:
@ -36,7 +46,8 @@ A node is considered covered only if all of these are true:
 - the expected intent / capability is selected;
 - the answer shape matches the requested business object;
 - the answer begins with a direct user-facing answer when such an answer is expected;
- the answer is evidence-backed rather than heuristic-masked.
+- the answer is evidence-backed rather than heuristic-masked;
 - the surfaced business fields are truthful and not mislabeled.
 Examples:
 - asking for supplier provenance must answer with the supplier first, not only with raw documents;
@ -53,9 +64,23 @@ Typical invariants:
 - warehouse survives if the follow-up still targets the same stock slice
 - organization survives if the previous slice was organization-bound
 - route family remains in the same business contour unless the user clearly changed intent
 - reusable resolved-object state survives when the previous turn already answered a closely related lookup
 - pronoun references can reuse the active focus object when the wording supports it
 If an edge loses a required invariant, that is a real regression even if the target node works in isolation.
 ## Resolved answer-object continuity
 For follow-up-heavy domains, the analyst should treat resolved business objects as reusable state, not as disposable one-turn artifacts.
 Examples:
 - selected inventory item
 - resolved supplier provenance bundle
 - resolved buyer bundle
 - resolved purchase document bundle
 If turn N already resolved such an object and turn N+1 asks a natural follow-up about the same object, the system should reuse that state instead of demanding the same anchor again.
 ## Mandatory paraphrase families
 Every critical node or edge must be validated in a small paraphrase family instead of one curated wording only.
@ -65,11 +90,6 @@ Minimum family:
 - `colloquial`
 - `ui_selected_object`
 Examples:
 - canonical: `От какого поставщика куплен товар X`
 - colloquial: `Кто поставил этот товар`
 - ui_selected_object: `По выбранному объекту "X": кто это поставил нам`
 If canonical works but colloquial or UI-generated follow-up fails, the node/edge is not accepted.
 ## Acceptance matrix
@ -86,6 +106,8 @@ Minimum matrix columns:
 - expected capability / recipe
 - required carryover invariants
 - expected answer shape
 - expected direct answer
 - business usefulness expectation
 - actual outcome
 - status (`pass`, `partial`, `fail`)
 - defect class
@ -95,17 +117,25 @@ Minimum matrix columns:
 Use these classes explicitly:
 - `semantic_understanding_gap`
 - `edge_carryover_gap`
 - `object_memory_gap`
 - `field_mapping_gap`
 - `answer_shape_mismatch`
 - `ordering_semantics_mismatch`
 - `runtime_capability_gap`
 - `business_utility_gap`
 - `domain_anchor_gap`
 - `loop_coverage_gap`
 Definitions:
 - `semantic_understanding_gap`: the system did not understand the real user meaning
 - `edge_carryover_gap`: the follow-up lost date / object / scope across steps
 - `object_memory_gap`: the system resolved the object once but failed to retain it for the next follow-up
 - `field_mapping_gap`: the answer surfaced the wrong business field or mislabeled a field
 - `answer_shape_mismatch`: the business object in the answer does not match the requested object
 - `ordering_semantics_mismatch`: ranking / chronology semantics are wrong
 - `runtime_capability_gap`: the product contour truly lacks the route / intent / capability / extractor / recipe
 - `business_utility_gap`: the answer may be grounded but is still not useful as a user-facing result
 - `domain_anchor_gap`: the scenario uses a weak or wrong observed anchor, so the tree is semantically mis-specified
 - `loop_coverage_gap`: the runtime could support the path or nearly support it, but the analyst/orchestrator never treated that path as mandatory acceptance coverage
 ## Analyst responsibilities
@ -116,6 +146,9 @@ The analyst must:
 - call out broken edges explicitly;
 - verify colloquial and UI-generated variants as first-class coverage;
 - verify direct-answer-first behavior where the user asked a direct lookup question;
 - verify business usefulness explicitly, not only technical validity;
 - verify field truthfulness for surfaced supplier / buyer / organization labels;
 - verify selected-object continuity and reusable object memory;
 - verify answer granularity and ordering semantics;
 - lower the score when any critical edge or paraphrase family is broken.
@ -136,7 +169,9 @@ Do not accept a domain when:
 - selected-object follow-up is broken;
 - `на эту дату` / `на ту дату` loses the originating date;
 - the answer shape is wrong for the business question;
- chronology / ranking semantics are inverted.
+- chronology / ranking semantics are inverted;
 - the direct answer is not surfaced first on direct lookup questions;
 - the answer is technically grounded but still business-useless.
 Accepted requires:
 - score >= 80
@ -144,3 +179,5 @@ Accepted requires:
 - critical path edges pass
 - canonical + colloquial + UI-selected-object variants pass for critical branches
 - no silent heuristic masking
 - `direct_answer_ok = true`
 - `business_usefulness_ok = true`
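
The gate conditions stated across this commit can be read as one predicate over the machine-readable verdict. The sketch below is only an illustration: `acceptanceBlockers` and `isAccepted` are hypothetical names, and treating a non-empty `broken_edge_ids` list as a failed critical edge is an assumption, not something the canon spells out.

```javascript
// Hypothetical gate over verdict fields named in this commit.
function acceptanceBlockers(verdict) {
  const blockers = [];
  if (verdict.quality_score < 80) blockers.push("quality_score_below_80");
  if (verdict.unresolved_p0_count > 0) blockers.push("unresolved_p0");
  if (verdict.regression_detected) blockers.push("regression_detected");
  if (!verdict.direct_answer_ok) blockers.push("direct_answer_not_ok");
  if (!verdict.business_usefulness_ok) blockers.push("business_usefulness_not_ok");
  // Assumption: any listed broken edge belongs to the primary user path.
  if ((verdict.broken_edge_ids ?? []).length > 0) blockers.push("broken_edges");
  return blockers;
}

function isAccepted(verdict) {
  return acceptanceBlockers(verdict).length === 0;
}
```

Under this reading, a verdict with a score of 92 but `business_usefulness_ok = false` is still rejected, which is the "technically grounded but business-useless" case the canon calls out.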
--- a/.codex/skills/domain-case-loop/references/verdict_template.md
+++ b/.codex/skills/domain-case-loop/references/verdict_template.md
@ -1,55 +1,74 @@
 # Verdict
-## 1. Смысл вопроса
+## 1. Question meaning
 ...
-## 2. Главный пользовательский путь и дерево сценария
+## 2. Primary user path and scenario tree
 - root:
 - critical child nodes:
 - critical edges:
 - primary user path:
-## 3. Что реально посчитано
+## 3. Expected direct answer
 - what the first line should say:
 - minimum acceptable business answer:
 ## 4. What the system actually computed
 ...
-## 4. Где расхождение по бизнес-смыслу
+## 5. Business mismatch
 - did the answer solve the user's real question:
 - did the direct answer appear first:
 - is the answer usable for an operator/accountant/manager:
 ## 6. Route / capability mismatch
 ...
-## 5. Где route / capability mismatch
+## 7. State continuity and selected-object memory
-...
+- selected object continuity:
 - date/period continuity:
 - reusable answer-object continuity:
 - pronoun resolution continuity:
-## 6. Evidence quality
+## 8. Field truth and evidence quality
- exact / partial / heuristic / technical insufficiency
+- supplier vs organization:
- why
+- buyer vs organization:
 - exact / partial / heuristic / technical insufficiency:
 - why:
-## 7. P0 defects
+## 9. P0 defects
 - ...
-## 8. P1 defects
+## 10. P1 defects
 - ...
-## 9. P2 defects
+## 11. P2 defects
 - ...
-## 10. Minimal patch directions
+## 12. Minimal patch directions
 - ...
-## 11. Acceptance matrix for rerun
+## 13. Acceptance matrix for rerun
 - Node / edge coverage:
 - Canonical wording:
 - Colloquial wording:
 - UI-generated selected-object wording:
 - Carryover invariants:
 - Expected answer shape:
 - Expected direct answer:
 - Business usefulness:
 - Defect class:
-## 12. Acceptance criteria for rerun
+## 14. Acceptance criteria for rerun
 - ...
 - Include colloquial/slang variants and UI-generated selected-object follow-up variants when they are part of the business flow.
 - Require the primary user path to pass end-to-end, not only the root node.
 - Require direct-answer-first behavior on direct lookup questions.
 - Require business-useful output rather than technically-grounded-but-noisy output.
 - Require selected-object continuity and reusable answer-object continuity on follow-up chains.
-## 13. Quality score
+## 15. Quality score
 - integer from 0 to 100
-## 14. Loop decision
+## 16. Loop decision
 - accepted / continue / partial / blocked / needs_exact_capability
--- a/AGENTS.md
+++ b/AGENTS.md
@ -26,6 +26,7 @@ Rules:
 - Do not accept a domain when only the root snapshot works but selected-object or drilldown follow-up edges still fail.
 - For critical branches, validate at least canonical wording, colloquial wording, and UI-generated selected-object wording when that UX exists.
 - Treat temporal carryover, selected-object carryover, answer-shape match, and ordering semantics as first-class acceptance invariants rather than optional polish.
 - Treat direct-answer-first behavior, business usefulness, selected-object memory, and field truthfulness as first-class analyst criteria rather than optional presentation polish.
 - If a case falls outside the current routed contour because the route/intent/capability is not wired yet, treat it as domain enablement work for this project, not as automatic out-of-scope rejection.
 - For new unmarked domains, `needs_exact_capability` means "bootstrap or extend the contour" rather than "close the case as unsupported".
 - A case can be marked `accepted` only when analyst verdict is at least `80/100`, no unresolved `P0` remains, and the rerun does not mask heuristic output as confirmed.
--- a/docs/orchestration/active_domain_contract.json
+++ b/docs/orchestration/active_domain_contract.json
@ -780,6 +780,12 @@
          "required_paraphrase_families": ["canonical", "ui_selected_object"],
          "required_carryover_invariants": ["selected_object", "date_scope", "answer_shape"]
        },
        "bindings": {
          "target_date_historical": "2020-03-31",
          "focus_item_historical": "Шкаф картотечный 1000*400*2100",
          "observed_supplier_candidate": "Гамма-мебель, ООО",
          "observed_customer_candidate": "Департамент капитального ремонта города Москвы"
        },
        "steps": [
          {
            "step_id": "step_01_account_41_historical",
@ -790,7 +796,7 @@
            "title": "Historical account 41 anchor",
            "question": "Какие товары числятся на 41 счете на дату {{bindings.target_date_historical}}",
            "analysis_context": {
-              "as_of_date": "2019-03-31",
+              "as_of_date": "2020-03-31",
              "source": "binding_target_date_historical"
            },
            "expected_capability": "confirmed_inventory_on_hand_as_of_date",
@ -823,13 +829,29 @@
            "node_role": "supporting_child",
            "paraphrase_family": "canonical",
            "title": "Supplier to buyer overlap",
-            "question": "Какие товары были куплены у поставщика {{bindings.observed_supplier_candidate}} и позже проданы покупателю {{bindings.observed_customer_candidate}}",
+            "question": "Есть ли документально подтвержденная цепочка: поставщик {{bindings.observed_supplier_candidate}} -> товар {{bindings.focus_item_historical}} -> покупатель {{bindings.observed_customer_candidate}}",
            "depends_on": ["step_01_account_41_historical", "step_02_selected_item_buyer"]
          }
        ]
      }
    ]
  },
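
The `{{bindings.*}}` placeholders and the `as_of_date` correction in this hunk can be illustrated with a short sketch. How the runtime actually expands placeholders is not shown in the diff, so `renderQuestion` and `asOfDateMatchesBinding` are hypothetical helpers, and the `binding_<name>` reading of `analysis_context.source` is an assumption inferred from the field value.

```javascript
// Hypothetical renderer for {{bindings.<name>}} placeholders in step questions.
function renderQuestion(template, bindings) {
  return template.replace(/\{\{\s*bindings\.([A-Za-z0-9_]+)\s*\}\}/g, (placeholder, name) => {
    if (!(name in bindings)) {
      throw new Error(`Unknown binding: ${name}`);
    }
    return bindings[name];
  });
}

// Hypothetical check: when analysis_context.source is "binding_<name>",
// as_of_date must equal that binding, otherwise the question text and the
// analysis context point at different dates.
function asOfDateMatchesBinding(step, bindings) {
  const source = step.analysis_context?.source ?? "";
  if (!source.startsWith("binding_")) return true;
  const name = source.slice("binding_".length);
  return step.analysis_context.as_of_date === bindings[name];
}
```

With `target_date_historical` bound to `2020-03-31`, the old `as_of_date` of `2019-03-31` fails this check and the corrected value passes.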
  "agent_audit_expectations": {
    "direct_answer_first": true,
    "business_utility_required": true,
    "state_continuity_required": true,
    "selected_object_memory_required": true,
    "field_truth_checks": [
      "supplier_vs_organization",
      "buyer_vs_organization"
    ],
    "reusable_answer_object_expectations": [
      "current_item",
      "current_as_of_date",
      "current_provenance_trace",
      "current_sale_trace"
    ]
  },
  "acceptance_contract": {
    "acceptance_unit": "scenario_tree",
    "do_not_accept_if": [
--- a/docs/orchestration/schemas/domain_loop_analyst_verdict.schema.json
+++ b/docs/orchestration/schemas/domain_loop_analyst_verdict.schema.json
@ -5,13 +5,26 @@
  "additionalProperties": false,
  "required": [
    "summary",
    "user_intent_summary",
    "expected_direct_answer",
    "actual_direct_answer",
    "quality_score",
    "direct_answer_ok",
    "business_usefulness_ok",
    "business_utility_score",
    "direct_answer_priority_score",
    "state_continuity_score",
    "answer_shape_score",
    "evidence_clarity_score",
    "loop_decision",
    "requires_user_decision",
    "user_decision_type",
    "user_decision_prompt",
    "unresolved_p0_count",
    "regression_detected",
    "root_cause_layers",
    "broken_edge_ids",
    "violated_invariants",
    "priority_targets",
    "acceptance_criteria",
    "notes"
@ -20,11 +33,51 @@
    "summary": {
      "type": "string"
    },
    "user_intent_summary": {
      "type": "string"
    },
    "expected_direct_answer": {
      "type": "string"
    },
    "actual_direct_answer": {
      "type": ["string", "null"]
    },
    "quality_score": {
      "type": "integer",
      "minimum": 0,
      "maximum": 100
    },
    "direct_answer_ok": {
      "type": "boolean"
    },
    "business_usefulness_ok": {
      "type": "boolean"
    },
    "business_utility_score": {
      "type": "integer",
      "minimum": 0,
      "maximum": 100
    },
    "direct_answer_priority_score": {
      "type": "integer",
      "minimum": 0,
      "maximum": 100
    },
    "state_continuity_score": {
      "type": "integer",
      "minimum": 0,
      "maximum": 100
    },
    "answer_shape_score": {
      "type": "integer",
      "minimum": 0,
      "maximum": 100
    },
    "evidence_clarity_score": {
      "type": "integer",
      "minimum": 0,
      "maximum": 100
    },
    "loop_decision": {
      "type": "string",
      "enum": ["accepted", "continue", "partial", "blocked", "needs_exact_capability"]
@ -35,7 +88,17 @@
    },
    "user_decision_type": {
      "type": "string",
-      "enum": ["none", "architecture_fork", "important_business_question", "scope_tradeoff", "data_truth_gap", "missing_required_observation", "risky_workaround", "risky_complexity", "other"],
+      "enum": [
        "none",
        "architecture_fork",
        "important_business_question",
        "scope_tradeoff",
        "data_truth_gap",
        "missing_required_observation",
        "risky_workaround",
        "risky_complexity",
        "other"
      ],
      "description": "Explain why the loop needs user input. Use none when requires_user_decision is false."
    },
    "user_decision_prompt": {
@ -49,6 +112,37 @@
    "regression_detected": {
      "type": "boolean"
    },
    "root_cause_layers": {
      "type": "array",
      "items": {
        "type": "string",
        "enum": [
          "semantic_understanding_gap",
          "runtime_capability_gap",
          "edge_carryover_gap",
          "object_memory_gap",
          "field_mapping_gap",
          "answer_shape_mismatch",
          "ordering_semantics_mismatch",
          "business_utility_gap",
          "loop_coverage_gap",
          "domain_anchor_gap",
          "other"
        ]
      }
    },
    "broken_edge_ids": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "violated_invariants": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
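
As a rough illustration of what the business-first schema fields demand from a verdict, the sketch below hand-checks presence, score ranges, and the `root_cause_layers` enum. It is not a replacement for a real JSON Schema validator; `verdictProblems` is a hypothetical helper, and the field lists are copied from this schema diff rather than from any runtime code.

```javascript
// Business-first fields from the schema's "required" list.
const BUSINESS_FIRST_FIELDS = [
  "user_intent_summary", "expected_direct_answer", "actual_direct_answer",
  "direct_answer_ok", "business_usefulness_ok", "business_utility_score",
  "direct_answer_priority_score", "state_continuity_score", "answer_shape_score",
  "evidence_clarity_score", "root_cause_layers", "broken_edge_ids", "violated_invariants",
];
const SCORE_FIELDS = BUSINESS_FIRST_FIELDS.filter((name) => name.endsWith("_score"));
const ROOT_CAUSE_LAYERS = new Set([
  "semantic_understanding_gap", "runtime_capability_gap", "edge_carryover_gap",
  "object_memory_gap", "field_mapping_gap", "answer_shape_mismatch",
  "ordering_semantics_mismatch", "business_utility_gap", "loop_coverage_gap",
  "domain_anchor_gap", "other",
]);

// Hypothetical checker: returns a list of human-readable problems.
function verdictProblems(verdict) {
  const problems = [];
  for (const field of BUSINESS_FIRST_FIELDS) {
    if (!(field in verdict)) problems.push(`missing:${field}`);
  }
  for (const field of SCORE_FIELDS) {
    const value = verdict[field];
    const inRange = Number.isInteger(value) && value >= 0 && value <= 100;
    if (field in verdict && !inRange) problems.push(`out_of_range:${field}`);
  }
  for (const layer of verdict.root_cause_layers ?? []) {
    if (!ROOT_CAUSE_LAYERS.has(layer)) problems.push(`unknown_layer:${layer}`);
  }
  return problems;
}
```

Note that `actual_direct_answer` may be `null` under the schema, so the sketch checks only that the key is present.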
    "priority_targets": {
      "type": "array",
      "items": {
@ -68,7 +162,23 @@
          },
          "problem_type": {
            "type": "string",
-            "enum": ["route_gap", "capability_gap", "evidence_gap", "presentation_gap", "regression", "other"]
+            "enum": [
              "route_gap",
              "capability_gap",
              "evidence_gap",
              "presentation_gap",
              "semantic_understanding_gap",
              "edge_carryover_gap",
              "object_memory_gap",
              "field_mapping_gap",
              "answer_shape_mismatch",
              "ordering_semantics_mismatch",
              "business_utility_gap",
              "loop_coverage_gap",
              "domain_anchor_gap",
              "regression",
              "other"
            ]
          },
          "fix_goal": {
            "type": "string"
--- a/llm_normalizer/backend/dist/services/addressIntentResolver.js
+++ b/llm_normalizer/backend/dist/services/addressIntentResolver.js
@ -1340,6 +1340,17 @@ function hasInventoryPurchaseDocumentsSignal(text) {
 function hasInventorySaleTraceSignal(text) {
    return /(?:продаж|покупател|buyer|sale trace|purchase[\s-]?to[\s-]?sale|purchase -> warehouse -> sale|закупка.*продаж)/iu.test(text);
 }
 function hasSelectedObjectInventoryCue(text) {
    return /(?:по\s+выбранному\s+объекту|selected\s+object)/iu.test(text);
 }
 function hasSelectedObjectInventoryProvenanceSignal(text) {
    return (hasSelectedObjectInventoryCue(text) &&
        /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(text));
 }
 function hasSelectedObjectInventoryPurchaseDocumentsSignal(text) {
    return (hasSelectedObjectInventoryCue(text) &&
        /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(text));
 }
 function hasInventoryProvenanceSignalV2(text) {
    const hasItemCue = /(?:товар|номенклатур|sku|item|product|остат(?:ок|ки)|склад)/iu.test(text);
    const hasSupplierCue = /(?:от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|кто\s+(?:нам\s+)?поставил|кем\s+поставлен|поставщик|supplier|vendor)/iu.test(text);
@ -1541,6 +1552,13 @@ function resolveAddressIntent(userMessage) {
            reasons: ["inventory_aging_signal_detected"]
        };
    }
    if (hasSelectedObjectInventoryProvenanceSignal(text)) {
        return {
            intent: "inventory_purchase_provenance_for_item",
            confidence: "medium",
            reasons: ["inventory_selected_object_provenance_signal_detected"]
        };
    }
    if (hasInventoryProvenanceSignalV2(text)) {
        return {
            intent: "inventory_purchase_provenance_for_item",
@ -1555,6 +1573,13 @@ function resolveAddressIntent(userMessage) {
            reasons: ["inventory_purchase_date_signal_detected"]
        };
    }
    if (hasSelectedObjectInventoryPurchaseDocumentsSignal(text)) {
        return {
            intent: "inventory_purchase_documents_for_item",
            confidence: "medium",
            reasons: ["inventory_selected_object_purchase_documents_signal_detected"]
        };
    }
    if (hasInventoryPurchaseDocumentsSignalV2(text)) {
        return {
            intent: "inventory_purchase_documents_for_item",
--- a/llm_normalizer/backend/dist/services/addressQueryService.js
+++ b/llm_normalizer/backend/dist/services/addressQueryService.js
@ -2879,6 +2879,19 @@ class AddressQueryService {
                        const broadenedFactual = (0, composeStage_1.composeFactualReply)(intent.intent, broadenedFilteredRows, composeOptionsFromFilters(autoBroadenedFilters));
                        const broadenedLimitations = [...filters.warnings, "period_window_auto_broadened_to_available_data"];
                        const broadenedReasons = [...baseReasons, "period_window_auto_broadened_to_available_data"];
                        const broadenedResultSemantics = mergeAddressResultSemantics(deriveAddressResultSemantics({
                            intent: intent.intent,
                            selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
                            filters: filters.extracted_filters,
                            responseType: broadenedFactual.responseType,
                            rowsMatched: broadenedFilteredRows.length
                        }), broadenedFactual.semantics);
                        const broadenedRouteExpectationAudit = buildRouteExpectationAudit({
                            intent: routeExpectationIntent,
                            selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
                            requestedResultMode,
                            resultMode: broadenedResultSemantics.result_mode
                        });
                        return {
                            handled: true,
                            reply_text: injectNoticeAfterLeadLine(broadenedFactual.text, broadenedPrefix),
@ -2921,13 +2934,20 @@ class AddressQueryService {
                                runtime_readiness: "LIVE_QUERYABLE_WITH_LIMITS",
                                limited_reason_category: null,
                                response_type: broadenedFactual.responseType,
-                                ...mergeAddressResultSemantics(deriveAddressResultSemantics({
-                                    intent: intent.intent,
-                                    selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
-                                    filters: filters.extracted_filters,
-                                    responseType: broadenedFactual.responseType,
-                                    rowsMatched: broadenedFilteredRows.length
-                                }), broadenedFactual.semantics),
+                                capability_id: capabilityAudit.capabilityId,
+                                capability_layer: capabilityAudit.layer,
+                                capability_route_mode: capabilityAudit.routeMode,
+                                capability_route_enabled: capabilityAudit.enabled,
+                                capability_route_reason: capabilityAudit.reason,
+                                shadow_route_intent: shadowRouteAudit.intent,
+                                shadow_route_selected_recipe: shadowRouteAudit.selectedRecipe,
                                shadow_route_status: shadowRouteAudit.status,
                                route_expectation_status: broadenedRouteExpectationAudit.status,
                                route_expectation_reason: broadenedRouteExpectationAudit.reason,
                                route_expectation_expected_selected_recipes: broadenedRouteExpectationAudit.expectedSelectedRecipes,
                                route_expectation_expected_requested_result_modes: broadenedRouteExpectationAudit.expectedRequestedResultModes,
                                route_expectation_expected_result_modes: broadenedRouteExpectationAudit.expectedResultModes,
                                ...broadenedResultSemantics,
                                limitations: broadenedLimitations,
                                reasons: withConfirmedBalanceFallbackReason(broadenedReasons, requestedResultMode, broadenedFactual.semantics)
                            }
--- a/llm_normalizer/backend/dist/services/address_runtime/decomposeStage.js
+++ b/llm_normalizer/backend/dist/services/address_runtime/decomposeStage.js
@ -244,6 +244,24 @@ function mapCounterpartyIntentToContractIntent(intent) {
    }
    return null;
 }
 function isInventoryIntent(intent) {
    return (intent === "inventory_on_hand_as_of_date" ||
        intent === "inventory_purchase_provenance_for_item" ||
        intent === "inventory_purchase_documents_for_item" ||
        intent === "inventory_supplier_stock_overlap_as_of_date" ||
        intent === "inventory_sale_trace_for_item" ||
        intent === "inventory_purchase_to_sale_chain" ||
        intent === "inventory_aging_by_purchase_date");
 }
 function hasSelectedObjectInventorySignal(text) {
    return /(?:по\s+выбранному\s+объекту|for\s+selected\s+object)/iu.test(String(text ?? ""));
 }
 function hasInventorySupplierFollowupCue(text) {
    return /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(String(text ?? ""));
 }
 function hasInventoryPurchaseDocumentsFollowupCue(text) {
    return /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(String(text ?? ""));
 }
 function hasAddressFollowupContextSignal(text) {
    const normalized = String(text ?? "").trim();
    if (!normalized) {
@ -612,6 +630,32 @@ function deriveIntentWithFollowupContext(detectedIntent, userMessage, followupCo
            reasons: [...detectedIntent.reasons, "intent_adjusted_to_balance_followup_context"]
        };
    }
    const previousIsInventoryFamily = isInventoryIntent(previousIntent);
    const inventorySelectedObjectFollowup = hasSelectedObjectInventorySignal(normalizedMessage) || (previousIsInventoryFamily && hasFollowupSignal);
    if (inventorySelectedObjectFollowup && hasInventorySupplierFollowupCue(normalizedMessage)) {
        if (detectedIntent.intent === "unknown" ||
            detectedIntent.intent === "inventory_on_hand_as_of_date" ||
            detectedIntent.intent === previousIntent) {
            return {
                intent: "inventory_purchase_provenance_for_item",
                confidence: "low",
                reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
            };
        }
    }
    if (inventorySelectedObjectFollowup && hasInventoryPurchaseDocumentsFollowupCue(normalizedMessage)) {
        if (detectedIntent.intent === "unknown" ||
            detectedIntent.intent === "list_documents_by_counterparty" ||
            detectedIntent.intent === "list_documents_by_contract" ||
            detectedIntent.intent === "inventory_on_hand_as_of_date" ||
            detectedIntent.intent === previousIntent) {
            return {
                intent: "inventory_purchase_documents_for_item",
                confidence: "low",
                reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
            };
        }
    }
    if (hasPreviousContract) {
        if (detectedIntent.intent === "list_contracts_by_counterparty") {
            if (hasBankSignal(normalizedMessage)) {
--- a/llm_normalizer/backend/src/services/addressIntentResolver.ts
+++ b/llm_normalizer/backend/src/services/addressIntentResolver.ts
@ -1603,6 +1603,28 @@ function hasInventorySaleTraceSignal(text: string): boolean {
  );
 }
 function hasSelectedObjectInventoryCue(text: string): boolean {
  return /(?:по\s+выбранному\s+объекту|selected\s+object)/iu.test(text);
 }
 function hasSelectedObjectInventoryProvenanceSignal(text: string): boolean {
  return (
    hasSelectedObjectInventoryCue(text) &&
    /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(
      text
    )
  );
 }
 function hasSelectedObjectInventoryPurchaseDocumentsSignal(text: string): boolean {
  return (
    hasSelectedObjectInventoryCue(text) &&
    /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(
      text
    )
  );
 }
 function hasInventoryProvenanceSignalV2(text: string): boolean {
  const hasItemCue = /(?:товар|номенклатур|sku|item|product|остат(?:ок|ки)|склад)/iu.test(text);
  const hasSupplierCue =
@ -1871,6 +1893,14 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
    };
  }
  if (hasSelectedObjectInventoryProvenanceSignal(text)) {
    return {
      intent: "inventory_purchase_provenance_for_item",
      confidence: "medium",
      reasons: ["inventory_selected_object_provenance_signal_detected"]
    };
  }
  if (hasInventoryProvenanceSignalV2(text)) {
    return {
      intent: "inventory_purchase_provenance_for_item",
@ -1887,6 +1917,14 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
    };
  }
  if (hasSelectedObjectInventoryPurchaseDocumentsSignal(text)) {
    return {
      intent: "inventory_purchase_documents_for_item",
      confidence: "medium",
      reasons: ["inventory_selected_object_purchase_documents_signal_detected"]
    };
  }
  if (hasInventoryPurchaseDocumentsSignalV2(text)) {
    return {
      intent: "inventory_purchase_documents_for_item",
--- a/llm_normalizer/backend/src/services/addressQueryService.ts
+++ b/llm_normalizer/backend/src/services/addressQueryService.ts
@ -3498,6 +3498,22 @@ export class AddressQueryService {
            );
            const broadenedLimitations = [...filters.warnings, "period_window_auto_broadened_to_available_data"];
            const broadenedReasons = [...baseReasons, "period_window_auto_broadened_to_available_data"];
            const broadenedResultSemantics = mergeAddressResultSemantics(
              deriveAddressResultSemantics({
                intent: intent.intent,
                selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
                filters: filters.extracted_filters,
                responseType: broadenedFactual.responseType,
                rowsMatched: broadenedFilteredRows.length
              }),
              broadenedFactual.semantics
            );
            const broadenedRouteExpectationAudit = buildRouteExpectationAudit({
              intent: routeExpectationIntent,
              selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
              requestedResultMode,
              resultMode: broadenedResultSemantics.result_mode
            });
            return {
              handled: true,
              reply_text: injectNoticeAfterLeadLine(broadenedFactual.text, broadenedPrefix),
@ -3540,16 +3556,21 @@ export class AddressQueryService {
                runtime_readiness: "LIVE_QUERYABLE_WITH_LIMITS",
                limited_reason_category: null,
                response_type: broadenedFactual.responseType,
-                ...mergeAddressResultSemantics(
-                  deriveAddressResultSemantics({
-                    intent: intent.intent,
-                    selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
-                    filters: filters.extracted_filters,
-                    responseType: broadenedFactual.responseType,
-                    rowsMatched: broadenedFilteredRows.length
-                  }),
-                  broadenedFactual.semantics
-                ),
+                capability_id: capabilityAudit.capabilityId,
+                capability_layer: capabilityAudit.layer,
+                capability_route_mode: capabilityAudit.routeMode,
+                capability_route_enabled: capabilityAudit.enabled,
+                capability_route_reason: capabilityAudit.reason,
+                shadow_route_intent: shadowRouteAudit.intent,
+                shadow_route_selected_recipe: shadowRouteAudit.selectedRecipe,
+                shadow_route_status: shadowRouteAudit.status,
+                route_expectation_status: broadenedRouteExpectationAudit.status,
+                route_expectation_reason: broadenedRouteExpectationAudit.reason,
                route_expectation_expected_selected_recipes: broadenedRouteExpectationAudit.expectedSelectedRecipes,
                route_expectation_expected_requested_result_modes:
                  broadenedRouteExpectationAudit.expectedRequestedResultModes,
                route_expectation_expected_result_modes: broadenedRouteExpectationAudit.expectedResultModes,
                ...broadenedResultSemantics,
                limitations: broadenedLimitations,
                reasons: withConfirmedBalanceFallbackReason(
                  broadenedReasons,
--- a/llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts
+++ b/llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts
@ -306,6 +306,34 @@ function mapCounterpartyIntentToContractIntent(intent: AddressIntent): AddressIn
  return null;
 }
 function isInventoryIntent(intent: AddressIntent | undefined): boolean {
  return (
    intent === "inventory_on_hand_as_of_date" ||
    intent === "inventory_purchase_provenance_for_item" ||
    intent === "inventory_purchase_documents_for_item" ||
    intent === "inventory_supplier_stock_overlap_as_of_date" ||
    intent === "inventory_sale_trace_for_item" ||
    intent === "inventory_purchase_to_sale_chain" ||
    intent === "inventory_aging_by_purchase_date"
  );
 }
 function hasSelectedObjectInventorySignal(text: string): boolean {
  return /(?:по\s+выбранному\s+объекту|for\s+selected\s+object)/iu.test(String(text ?? ""));
 }
 function hasInventorySupplierFollowupCue(text: string): boolean {
  return /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(
    String(text ?? "")
  );
 }
 function hasInventoryPurchaseDocumentsFollowupCue(text: string): boolean {
  return /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(
    String(text ?? "")
  );
 }
 export function hasAddressFollowupContextSignal(text: string): boolean {
  const normalized = String(text ?? "").trim();
  if (!normalized) {
@ -752,6 +780,39 @@ function deriveIntentWithFollowupContext(
    };
  }
  const previousIsInventoryFamily = isInventoryIntent(previousIntent);
  const inventorySelectedObjectFollowup =
    hasSelectedObjectInventorySignal(normalizedMessage) || (previousIsInventoryFamily && hasFollowupSignal);
  if (inventorySelectedObjectFollowup && hasInventorySupplierFollowupCue(normalizedMessage)) {
    if (
      detectedIntent.intent === "unknown" ||
      detectedIntent.intent === "inventory_on_hand_as_of_date" ||
      detectedIntent.intent === previousIntent
    ) {
      return {
        intent: "inventory_purchase_provenance_for_item",
        confidence: "low",
        reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
      };
    }
  }
  if (inventorySelectedObjectFollowup && hasInventoryPurchaseDocumentsFollowupCue(normalizedMessage)) {
    if (
      detectedIntent.intent === "unknown" ||
      detectedIntent.intent === "list_documents_by_counterparty" ||
      detectedIntent.intent === "list_documents_by_contract" ||
      detectedIntent.intent === "inventory_on_hand_as_of_date" ||
      detectedIntent.intent === previousIntent
    ) {
      return {
        intent: "inventory_purchase_documents_for_item",
        confidence: "low",
        reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
      };
    }
  }
  if (hasPreviousContract) {
    if (detectedIntent.intent === "list_contracts_by_counterparty") {
      if (hasBankSignal(normalizedMessage)) {
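Reviewer note: the supplier branch added to `deriveIntentWithFollowupContext` promotes the intent only when the resolver's own result is weak (`unknown`), generic (`inventory_on_hand_as_of_date`), or unchanged from the previous turn. The sketch below is a Python port for illustration, not the commit's code. The function name `adjust_supplier_followup` is hypothetical, the supplier regex is shortened, and the `previousIsInventoryFamily && hasFollowupSignal` alternative is omitted because `hasFollowupSignal` is defined outside the visible hunk.

```python
import re

# Ported from the TypeScript cue regexes in the hunk above.
# The supplier pattern is shortened for illustration.
SELECTED_OBJECT = re.compile(
    r"(?:по\s+выбранному\s+объекту|for\s+selected\s+object)", re.IGNORECASE
)
SUPPLIER_CUE = re.compile(
    r"(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил"
    r"|от\s+какого\s+поставщика|supplier|vendor|поставщик)",
    re.IGNORECASE,
)


def adjust_supplier_followup(detected_intent: str, previous_intent: str, message: str) -> str:
    """Promote a selected-object supplier follow-up to the provenance intent."""
    is_selected_object_followup = bool(SELECTED_OBJECT.search(message))
    # Only weak, generic, or unchanged intents may be overridden.
    overridable = detected_intent in (
        "unknown",
        "inventory_on_hand_as_of_date",
        previous_intent,
    )
    if is_selected_object_followup and SUPPLIER_CUE.search(message) and overridable:
        return "inventory_purchase_provenance_for_item"
    return detected_intent


message = 'По выбранному объекту "Столешница": кто это поставил нам'
print(adjust_supplier_followup("unknown", "inventory_on_hand_as_of_date", message))
```

The `overridable` guard is the design point: a different, explicit intent from the resolver is left alone even when the supplier cue appears in the message.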
--- a/llm_normalizer/backend/tests/addressInventorySelectedObjectFollowup.test.ts
+++ b/llm_normalizer/backend/tests/addressInventorySelectedObjectFollowup.test.ts
@ -103,6 +103,8 @@ describe("inventory selected-object follow-up", () => {
    expect(result?.debug.extracted_filters?.as_of_date).toBe("2021-03-31");
    expect(result?.debug.extracted_filters?.period_from).toBe("2021-03-01");
    expect(result?.debug.extracted_filters?.period_to).toBe("2021-03-31");
    expect(result?.debug.capability_id).toBe("inventory_inventory_purchase_provenance_for_item");
    expect(result?.debug.capability_route_mode).toBe("exact");
    expect(result?.debug.reasons).toContain("period_window_auto_broadened_to_available_data");
    expect(result?.debug.limitations).toContain("period_window_auto_broadened_to_available_data");
    const replyLines = String(result?.reply_text ?? "").split("\n");
@ -111,4 +113,97 @@ describe("inventory selected-object follow-up", () => {
    expect(replyLines[1]).toContain("По окну 2021-03-01..2021-03-31 строк не найдено");
    expect(executeAddressMcpQueryMock).toHaveBeenCalledTimes(2);
  });
  it("handles selected-object supplier slang 'кто это поставил нам' as provenance follow-up", async () => {
    executeAddressMcpQueryMock.mockResolvedValueOnce({
      fetched_rows: 1,
      matched_rows: 1,
      raw_rows: [
        {
          Period: "2019-02-11T00:00:00Z",
          Registrator: "Поступление товаров и услуг 00000000077 от 11.02.2019 0:00:00",
          AccountDt: "41.01",
          AccountKt: "60.01",
          Amount: 3724.17,
          SubcontoDt1: "Столешница 600*3050*26 дуб ниагара",
          SubcontoDt3: "Основной склад",
          SubcontoKt1: "Торговый дом \\Союз МСК\\",
          SubcontoKt2: "Договор поставки № 12 от 01.02.2019",
          Organization: "ООО \\Альтернатива Плюс\\"
        }
      ],
      rows: [],
      error: null
    });
    const service = new AddressQueryService();
    const result = await service.tryHandle('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам', {
      followupContext: {
        previous_intent: "inventory_on_hand_as_of_date",
        previous_filters: {
          as_of_date: "2019-03-31",
          period_from: "2019-03-01",
          period_to: "2019-03-31",
          warehouse: "Основной склад",
          organization: "ООО \\Альтернатива Плюс\\"
        },
        previous_anchor_type: "unknown",
        previous_anchor_value: null
      }
    });
    expect(result?.handled).toBe(true);
    expect(result?.response_type).toBe("FACTUAL_SUMMARY");
    expect(result?.debug.detected_intent).toBe("inventory_purchase_provenance_for_item");
    expect(result?.debug.extracted_filters?.item).toBe("Столешница 600*3050*26 дуб ниагара");
    expect(result?.debug.extracted_filters?.as_of_date).toBe("2019-03-31");
    expect(String(result?.reply_text ?? "")).toContain("Торговый дом \\Союз МСК\\");
  });
  it("handles selected-object purchase-doc slang 'по каким документам это купили' as exact purchase-doc follow-up", async () => {
    executeAddressMcpQueryMock.mockResolvedValueOnce({
      fetched_rows: 1,
      matched_rows: 1,
      raw_rows: [
        {
          Period: "2019-02-11T00:00:00Z",
          Registrator: "Поступление товаров и услуг 00000000077 от 11.02.2019 0:00:00",
          AccountDt: "41.01",
          AccountKt: "60.01",
          Amount: 3724.17,
          SubcontoDt1: "Столешница 600*3050*26 дуб ниагара",
          SubcontoDt3: "Основной склад",
          SubcontoKt1: "Торговый дом \\Союз МСК\\",
          SubcontoKt2: "Договор поставки № 12 от 01.02.2019",
          Organization: "ООО \\Альтернатива Плюс\\"
        }
      ],
      rows: [],
      error: null
    });
    const service = new AddressQueryService();
    const result = await service.tryHandle('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили', {
      followupContext: {
        previous_intent: "inventory_purchase_provenance_for_item",
        previous_filters: {
          as_of_date: "2019-03-31",
          period_from: "2019-03-01",
          period_to: "2019-03-31",
          item: "Столешница 600*3050*26 дуб ниагара",
          warehouse: "Основной склад"
        },
        previous_anchor_type: "unknown",
        previous_anchor_value: null
      }
    });
    expect(result?.handled).toBe(true);
    expect(result?.response_type).toBe("FACTUAL_LIST");
    expect(result?.debug.detected_intent).toBe("inventory_purchase_documents_for_item");
    expect(result?.debug.selected_recipe).toBe("address_inventory_purchase_documents_for_item_v1");
    expect(result?.debug.extracted_filters?.item).toBe("Столешница 600*3050*26 дуб ниагара");
    expect(result?.debug.extracted_filters?.as_of_date).toBe("2019-03-31");
    expect(String(result?.reply_text ?? "")).toContain("Поступление товаров и услуг 00000000077");
  });
 });
--- a/llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts
+++ b/llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts
@ -173,6 +173,14 @@ describe("address query shape classifier", () => {
    expect(filters.item).toBe("Кромка с клеем 33 альмандин 137 м");
  });
  it("extracts item anchor from selected-object purchase-doc follow-up without explicit word товар", () => {
    const filters = extractAddressFilters(
      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили',
      "inventory_purchase_documents_for_item"
    ).extracted_filters;
    expect(filters.item).toBe("Столешница 600*3050*26 дуб ниагара");
  });
  it("keeps colloquial selected-object supplier follow-up in inventory provenance intent", () => {
    const mode = detectAddressQuestionMode(
      'По выбранному объекту "Кромка с клеем 33 альмандин 137 м": кто поставил этот товар'
@ -184,6 +192,28 @@ describe("address query shape classifier", () => {
    expect(result.intent).toBe("inventory_purchase_provenance_for_item");
  });
  it("keeps selected-object supplier slang with 'кто это поставил нам' in inventory provenance intent", () => {
    const mode = detectAddressQuestionMode(
      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам'
    );
    const result = resolveAddressIntent(
      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам'
    );
    expect(mode.mode).toBe("address_query");
    expect(result.intent).toBe("inventory_purchase_provenance_for_item");
  });
  it("keeps selected-object purchase-doc slang with 'по каким документам это купили' in purchase-doc intent", () => {
    const mode = detectAddressQuestionMode(
      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили'
    );
    const result = resolveAddressIntent(
      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили'
    );
    expect(mode.mode).toBe("address_query");
    expect(result.intent).toBe("inventory_purchase_documents_for_item");
  });
  it("keeps full supplier anchor with comma suffix for stock-overlap questions", () => {
    const filters = extractAddressFilters(
      "Какие товары от поставщика Гамма-мебель, ООО сейчас еще лежат на складе Основной склад?",
@ -3874,6 +3904,49 @@ describe("address query limited taxonomy and stage diagnostics", { timeout: 1500
 });
 describe("address decompose stage follow-up carryover", () => {
  it("promotes selected-object supplier slang follow-up into inventory provenance with inherited date context", () => {
    const result = runAddressDecomposeStage('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам', {
      previous_intent: "inventory_on_hand_as_of_date",
      previous_filters: {
        as_of_date: "2019-03-31",
        period_from: "2019-03-01",
        period_to: "2019-03-31",
        warehouse: "Основной склад"
      },
      previous_anchor_type: "unknown",
      previous_anchor_value: null
    });
    expect(result).not.toBeNull();
    expect(result?.intent.intent).toBe("inventory_purchase_provenance_for_item");
    expect(result?.filters.extracted_filters.as_of_date).toBe("2019-03-31");
    expect(
      result?.baseReasons?.includes("intent_adjusted_to_inventory_followup_context") ||
        result?.intent.reasons.includes("inventory_selected_object_provenance_signal_detected")
    ).toBe(true);
  });
  it("promotes selected-object purchase-doc slang follow-up into inventory purchase documents with inherited date context", () => {
    const result = runAddressDecomposeStage('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили', {
      previous_intent: "inventory_purchase_provenance_for_item",
      previous_filters: {
        as_of_date: "2019-03-31",
        period_from: "2019-03-01",
        period_to: "2019-03-31",
        item: "Столешница 600*3050*26 дуб ниагара"
      },
      previous_anchor_type: "unknown",
      previous_anchor_value: null
    });
    expect(result).not.toBeNull();
    expect(result?.intent.intent).toBe("inventory_purchase_documents_for_item");
    expect(result?.filters.extracted_filters.item).toBe("Столешница 600*3050*26 дуб ниагара");
    expect(result?.filters.extracted_filters.as_of_date).toBe("2019-03-31");
    expect(
      result?.baseReasons?.includes("intent_adjusted_to_inventory_followup_context") ||
        result?.intent.reasons.includes("inventory_selected_object_purchase_documents_signal_detected")
    ).toBe(true);
  });
  it("keeps slang all-customers-all-time wording in address lane via resolved intent fallback", () => {
    const result = runAddressDecomposeStage("выведи всех заков за все время", null);
    expect(result).not.toBeNull();
--- a/scripts/domain_case_loop.py
+++ b/scripts/domain_case_loop.py
@ -2120,6 +2120,7 @@ def build_analyst_loop_prompt(
        - `.codex/agents/domain_analyst.toml`
        - `.codex/skills/domain-case-loop/SKILL.md`
        - `.codex/skills/domain-case-loop/references/verdict_template.md`
        - `.codex/skills/domain-case-loop/references/business_first_analyst_rubric.md`
        Current loop context:
        - loop_dir: `{loop_dir}`
@ -2135,11 +2136,13 @@ def build_analyst_loop_prompt(
        Goal:
        - evaluate current domain-pack correctness for business meaning, route/capability quality, evidence quality, and absence of silent heuristic masking;
        - evaluate business usefulness, direct-answer-first behavior, state continuity, and field truthfulness, not only technical groundedness;
        - determine whether the gate `quality_score >= {target_score}` is reached;
        - if not, provide the smallest high-value fix targets for the coder.
        Rules:
        - `accepted` is allowed only if quality_score >= {target_score}, unresolved_p0_count = 0, and regression_detected = false;
        - `accepted` also requires `direct_answer_ok = true` and `business_usefulness_ok = true`;
        - `partial` means the pack is usable but exactness, routing, or coverage is still insufficient;
        - `needs_exact_capability` means the primary blocker is a missing exact route or capability, but the loop should still continue autonomously unless a user decision is required;
        - `continue` means there is a clear next patch cycle;
@ -2152,6 +2155,10 @@ def build_analyst_loop_prompt(
        - if `requires_user_decision = true`, fill `user_decision_type` and `user_decision_prompt`;
        - if the pack is below {target_score} but there is still safe autonomous implementation work, keep `requires_user_decision = false`;
        - do not request user input merely because the score is still below {target_score}; request it only when the loop would otherwise guess, overfit, or risk architecture drift.
        - return machine-readable fields for: `user_intent_summary`, `expected_direct_answer`, `actual_direct_answer`, `direct_answer_ok`, `business_usefulness_ok`, `business_utility_score`, `direct_answer_priority_score`, `state_continuity_score`, `answer_shape_score`, `evidence_clarity_score`, `root_cause_layers`, `broken_edge_ids`, `violated_invariants`;
        - if the product found the evidence but failed to retain the selected object, provenance bundle, or another reusable resolved object across turns, classify that as `object_memory_gap` or `edge_carryover_gap`, not as a generic route problem;
        - if the surfaced business field looks mislabeled, for example supplier vs organization, classify that as `field_mapping_gap`;
        - if the answer is technically grounded but still weak for a manager/accountant/operator, classify that as `business_utility_gap`.
        Use this UTF-8 evidence bundle as the source of truth for artifact contents. Do not treat shell rendering artifacts as file corruption if the embedded bundle is readable.
@ -2196,6 +2203,9 @@ def build_coder_loop_prompt(
        - do not present heuristic answers as confirmed;
        - do not touch unrelated files;
        - preserve already successful baseline flows.
        - use `root_cause_layers`, `broken_edge_ids`, `violated_invariants`, and business-utility scores from the analyst verdict to choose the smallest fix;
        - prioritize state continuity, selected-object persistence, direct-answer-first behavior, and field-truth mapping when those are the blocking layers;
        - do not broaden scope when the analyst says the defect is mainly `object_memory_gap`, `field_mapping_gap`, `answer_shape_mismatch`, or `business_utility_gap`.
        Required outputs:
        - create `{iteration_dir / 'coder_plan.md'}` with a short plan;
@ -2217,12 +2227,21 @@ def evaluate_analyst_gate(
    quality_score = int(verdict.get("quality_score") or 0)
    unresolved_p0_count = int(verdict.get("unresolved_p0_count") or 0)
    regression_detected = bool(verdict.get("regression_detected"))
    direct_answer_ok = bool(verdict.get("direct_answer_ok", True))
    business_usefulness_ok = bool(verdict.get("business_usefulness_ok", True))
    loop_decision = str(verdict.get("loop_decision") or "").strip() or "continue"
    requires_user_decision = bool(verdict.get("requires_user_decision"))
    user_decision_type = str(verdict.get("user_decision_type") or "").strip() or "none"
    user_decision_prompt_raw = verdict.get("user_decision_prompt")
    user_decision_prompt = str(user_decision_prompt_raw).strip() if user_decision_prompt_raw else None
-    accepted = quality_score >= target_score and unresolved_p0_count == 0 and not regression_detected and loop_decision == "accepted"
+    accepted = (
+        quality_score >= target_score
+        and unresolved_p0_count == 0
+        and not regression_detected
+        and direct_answer_ok
+        and business_usefulness_ok
+        and loop_decision == "accepted"
+    )
    return accepted, loop_decision, requires_user_decision, user_decision_type, user_decision_prompt
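
A minimal sketch of how the tightened gate in this hunk behaves, reduced to the acceptance check only. The helper name `gate_accepts`, the sample verdicts, and the target score of 85 are assumptions made for illustration; they are not part of the commit. The sketch keeps the same defaults as the diff, so a verdict that omits `direct_answer_ok` or `business_usefulness_ok` is treated as if both were `True`.

```python
from typing import Any, Mapping


def gate_accepts(verdict: Mapping[str, Any], target_score: int) -> bool:
    """Hypothetical stand-in for the acceptance part of evaluate_analyst_gate."""
    quality_score = int(verdict.get("quality_score") or 0)
    unresolved_p0_count = int(verdict.get("unresolved_p0_count") or 0)
    regression_detected = bool(verdict.get("regression_detected"))
    # Same defaults as the diff: a missing flag counts as True.
    direct_answer_ok = bool(verdict.get("direct_answer_ok", True))
    business_usefulness_ok = bool(verdict.get("business_usefulness_ok", True))
    loop_decision = str(verdict.get("loop_decision") or "").strip() or "continue"
    return (
        quality_score >= target_score
        and unresolved_p0_count == 0
        and not regression_detected
        and direct_answer_ok
        and business_usefulness_ok
        and loop_decision == "accepted"
    )


# Illustrative verdicts (made-up values, not real analyst output).
TARGET = 85
good = {
    "quality_score": 90,
    "unresolved_p0_count": 0,
    "regression_detected": False,
    "direct_answer_ok": True,
    "business_usefulness_ok": True,
    "loop_decision": "accepted",
}
weak_answer = {**good, "direct_answer_ok": False}
low_utility = {**good, "business_usefulness_ok": False}
legacy = {
    key: value
    for key, value in good.items()
    if key not in ("direct_answer_ok", "business_usefulness_ok")
}

results = {
    "good": gate_accepts(good, TARGET),
    "weak_answer": gate_accepts(weak_answer, TARGET),
    "low_utility": gate_accepts(low_utility, TARGET),
    "legacy": gate_accepts(legacy, TARGET),
}
print(results)
```

Under these assumptions, a high score alone no longer passes the gate: the verdict with `direct_answer_ok = False` and the one with `business_usefulness_ok = False` are both rejected even though `loop_decision` is `accepted`. The `legacy` case shows the effect of the `True` defaults: a verdict produced before the new fields existed still passes, which keeps old verdicts compatible but also means an analyst that omits the fields is not blocked by them.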