# Scenario-tree acceptance canon

## Core idea

For follow-up-heavy business domains, the unit of acceptance is not a flat list of isolated questions.

The unit of acceptance is a **scenario tree**:

- a root business question;
- one or more critical child drilldowns;
- explicit transitions between steps;
- explicit semantic carryover between steps.

If the root works but a critical child transition breaks, the domain is **not** hardened.

## Model the domain as a tree

For each scenario, define:

- `root node`
- `critical child nodes`
- `critical edges`
- `primary user path`

Example for inventory:

- root: stock snapshot on date
- child: selected item -> supplier provenance
- child: selected item -> purchase documents
- child: selected item -> aging on the same date
- child: selected item -> sale trace

The primary user path is the path a real user is most likely to take first, not the prettiest canonical wording.

## Node acceptance

A node is considered covered only if all of these are true:

- the business meaning is understood correctly;
- the expected intent / capability is selected;
- the answer shape matches the requested business object;
- the answer begins with a direct user-facing answer when such an answer is expected;
- the answer is evidence-backed rather than heuristic-masked.

Examples:

- asking for supplier provenance must answer with the supplier first, not only with raw documents;
- asking for old stock must answer with item-level old-stock positions, not with a raw document dump;
- asking for residues/items/contracts must not silently downgrade to lower-level movements.

## Edge acceptance

Each critical edge must define its required carryover invariants.

Typical invariants:

- selected object survives from previous assistant output
- originating date / period survives into follow-up filters
- warehouse survives if the follow-up still targets the same stock slice
- organization survives if the previous slice was organization-bound
- route family remains in the same business contour unless the user clearly changed intent

If an edge loses a required invariant, that is a real regression even if the target node works in isolation.

## Mandatory paraphrase families

Every critical node or edge must be validated against a small paraphrase family, not against a single curated wording.

Minimum family:

- `canonical`
- `colloquial`
- `ui_selected_object`

Examples:

- canonical: `От какого поставщика куплен товар X` (English: which supplier was item X purchased from)
- colloquial: `Кто поставил этот товар` (English: who supplied this item)
- ui_selected_object: `По выбранному объекту "X": кто это поставил нам` (English: for the selected object "X": who supplied this to us)

If canonical works but the colloquial or UI-generated follow-up fails, the node/edge is not accepted.

## Acceptance matrix

The analyst must produce or update a `scenario_acceptance_matrix.md` artifact for every multi-step scenario or pack.

Minimum matrix columns:

- scenario id
- node id or edge id
- user path role (`root`, `critical_child`, `supporting`)
- wording family (`canonical`, `colloquial`, `ui_selected_object`)
- expected business meaning
- expected intent
- expected capability / recipe
- required carryover invariants
- expected answer shape
- actual outcome
- status (`pass`, `partial`, `fail`)
- defect class

## Defect classes

Use these classes explicitly:

- `semantic_understanding_gap`
- `edge_carryover_gap`
- `answer_shape_mismatch`
- `ordering_semantics_mismatch`
- `runtime_capability_gap`
- `loop_coverage_gap`

Definitions:

- `semantic_understanding_gap`: the system did not understand the real user meaning
- `edge_carryover_gap`: the follow-up lost date / object / scope across steps
- `answer_shape_mismatch`: the business object in the answer does not match the requested object
- `ordering_semantics_mismatch`: ranking / chronology semantics are wrong
- `runtime_capability_gap`: the product contour truly lacks the route / intent / capability / extractor / recipe
- `loop_coverage_gap`: the runtime could support the path or nearly support it, but the analyst/orchestrator never treated that path as mandatory acceptance coverage

## Analyst responsibilities

The analyst must:

- review the scenario tree, not just individual turns;
- compare expected and actual user path transitions;
- call out broken edges explicitly;
- verify colloquial and UI-generated variants as first-class coverage;
- verify direct-answer-first behavior where the user asked a direct lookup question;
- verify answer granularity and ordering semantics;
- lower the score when any critical edge or paraphrase family is broken.

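The matrix rows and defect classes described above can be sketched in code to make the paraphrase-family check mechanical. This is a minimal illustration, not a prescribed implementation: `MatrixRow`, `DefectClass`, and `targets_missing_coverage` are hypothetical names, only a subset of the matrix columns is modeled, and treating every non-`supporting` row as requiring full family coverage is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Set


class DefectClass(str, Enum):
    """The six defect classes named in this document."""
    SEMANTIC_UNDERSTANDING_GAP = "semantic_understanding_gap"
    EDGE_CARRYOVER_GAP = "edge_carryover_gap"
    ANSWER_SHAPE_MISMATCH = "answer_shape_mismatch"
    ORDERING_SEMANTICS_MISMATCH = "ordering_semantics_mismatch"
    RUNTIME_CAPABILITY_GAP = "runtime_capability_gap"
    LOOP_COVERAGE_GAP = "loop_coverage_gap"


# Minimum paraphrase family required for critical nodes and edges.
WORDING_FAMILIES = {"canonical", "colloquial", "ui_selected_object"}


@dataclass
class MatrixRow:
    """Hypothetical, reduced view of one acceptance-matrix row."""
    scenario_id: str
    target_id: str        # node id or edge id
    path_role: str        # root | critical_child | supporting
    wording_family: str   # canonical | colloquial | ui_selected_object
    status: str           # pass | partial | fail
    defect_class: Optional[DefectClass] = None


def targets_missing_coverage(rows: List[MatrixRow]) -> List[str]:
    """Return ids of non-supporting targets that lack a `pass`
    in at least one wording family (assumed coverage rule)."""
    passed: Dict[str, Set[str]] = {}
    for row in rows:
        if row.path_role == "supporting":
            continue
        families = passed.setdefault(row.target_id, set())
        if row.status == "pass":
            families.add(row.wording_family)
    return sorted(t for t, fams in passed.items() if not WORDING_FAMILIES <= fams)


# Usage: the colloquial variant fails, so the edge is not accepted.
rows = [
    MatrixRow("inventory", "edge:item->supplier", "critical_child", "canonical", "pass"),
    MatrixRow("inventory", "edge:item->supplier", "critical_child", "colloquial", "fail",
              DefectClass.SEMANTIC_UNDERSTANDING_GAP),
    MatrixRow("inventory", "edge:item->supplier", "critical_child", "ui_selected_object", "pass"),
]
print(targets_missing_coverage(rows))
```

A `partial` status is deliberately not counted as coverage here, which matches the rule that a node or edge is not accepted when any family variant fails.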
## Orchestrator responsibilities

The orchestrator must:

- define the tree before iterating deeply;
- prioritize the primary user path first;
- rerun at least one colloquial variant and one UI-selected-object variant for each critical branch;
- treat a broken critical edge as an unfinished scenario even if the root node works;
- route coder work to the narrowest broken edge or node rather than issuing broad “improve the domain” tasks.

## Stop and acceptance rules

Do not accept a domain when:

- only the root node works;
- only one curated phrasing works;
- selected-object follow-up is broken;
- `на эту дату` / `на ту дату` (English: as of this date / as of that date) loses the originating date;
- the answer shape is wrong for the business question;
- chronology / ranking semantics are inverted.

Acceptance requires:

- score >= 80
- no unresolved P0
- critical path edges pass
- canonical + colloquial + UI-selected-object variants pass for critical branches
- no silent heuristic masking
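
The acceptance requirements above can be expressed as a small gate that reports every unmet condition rather than a bare yes/no. This is a sketch under stated assumptions: `DomainReview`, its field names, and `acceptance_blockers` are hypothetical, and the inputs are assumed to have been computed elsewhere (for example, from the acceptance matrix).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DomainReview:
    """Hypothetical summary of one domain review; field names are illustrative."""
    score: int
    unresolved_p0: int
    critical_edges_pass: bool
    paraphrase_families_pass: bool  # canonical + colloquial + ui_selected_object
    heuristic_masking_found: bool


def acceptance_blockers(review: DomainReview) -> List[str]:
    """List each acceptance requirement that the review does not meet."""
    blockers: List[str] = []
    if review.score < 80:
        blockers.append("score below 80")
    if review.unresolved_p0 > 0:
        blockers.append("unresolved P0")
    if not review.critical_edges_pass:
        blockers.append("critical path edge broken")
    if not review.paraphrase_families_pass:
        blockers.append("paraphrase family broken on a critical branch")
    if review.heuristic_masking_found:
        blockers.append("silent heuristic masking")
    return blockers


def is_accepted(review: DomainReview) -> bool:
    """A domain is accepted only when no blocker remains."""
    return not acceptance_blockers(review)


# Usage: a high score does not compensate for a broken critical edge.
review = DomainReview(score=92, unresolved_p0=0, critical_edges_pass=False,
                      paraphrase_families_pass=True, heuristic_masking_found=False)
print(is_accepted(review), acceptance_blockers(review))
```

Returning the list of blockers keeps the gate aligned with the routing rule: each blocker points at a specific edge or node to fix instead of a broad domain-level task.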