# Scenario-tree acceptance canon

## Core idea

For follow-up-heavy business domains, the unit of acceptance is not a flat list of isolated questions.

The unit of acceptance is a **scenario tree**:
- a root business question;
- one or more critical child drilldowns;
- explicit transitions between steps;
- explicit semantic carryover between steps.

If the root works but a critical child transition breaks, the domain is **not** hardened.

## Business-first framing

Every accepted node or edge must be both technically grounded and business-useful.

This means:
- the direct answer is surfaced first when the user asked a direct lookup question;
- the answer stays on the requested business object;
- evidence and caveats support the answer instead of replacing it;
- field labels are truthful for business entities such as supplier, buyer, organization, warehouse, and document.

## Model the domain as a tree

For each scenario, define:
- `root node`
- `critical child nodes`
- `critical edges`
- `primary user path`

Example for inventory:
- root: stock snapshot on date
- child: selected item -> supplier provenance
- child: selected item -> purchase documents
- child: selected item -> aging on the same date
- child: selected item -> sale trace
- child: selected item -> purchase documents reached through a pronoun follow-up

The primary user path is the path a real user is most likely to take first, not the prettiest canonical wording.
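One way to make the tree explicit is to write it down as data before any iteration starts. The sketch below is a minimal illustration, not a prescribed schema: the class names, node ids, and the choice of which child sits on the primary user path are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    node_id: str
    description: str
    role: str  # "root", "critical_child", or "supporting"


@dataclass
class Edge:
    source: str
    target: str
    critical: bool = True


@dataclass
class ScenarioTree:
    scenario_id: str
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)
    primary_user_path: List[str] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, source: str, target: str, critical: bool = True) -> None:
        # An edge is only meaningful between nodes that are already defined.
        for node_id in (source, target):
            if node_id not in self.nodes:
                raise KeyError(f"unknown node: {node_id}")
        self.edges.append(Edge(source, target, critical))

    def critical_edges(self) -> List[Edge]:
        return [edge for edge in self.edges if edge.critical]


def build_inventory_tree() -> ScenarioTree:
    """Build the inventory example; all ids here are hypothetical."""
    tree = ScenarioTree(scenario_id="inventory_stock_snapshot")
    tree.add_node(Node("stock_snapshot", "stock snapshot on date", "root"))
    children = {
        "supplier_provenance": "selected item -> supplier provenance",
        "purchase_documents": "selected item -> purchase documents",
        "aging_same_date": "selected item -> aging on the same date",
        "sale_trace": "selected item -> sale trace",
        "pronoun_purchase_documents": "selected item -> purchase documents via pronoun follow-up",
    }
    for node_id, description in children.items():
        tree.add_node(Node(node_id, description, "critical_child"))
        tree.add_edge("stock_snapshot", node_id)
    # Assumption: the first drilldown on the primary path is illustrative only.
    tree.primary_user_path = ["stock_snapshot", "supplier_provenance"]
    return tree
```

Keeping the edges as first-class records makes it harder to report a scenario as covered when only its nodes were exercised.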

## Node acceptance

A node is considered covered only if all of these are true:
- the business meaning is understood correctly;
- the expected intent / capability is selected;
- the answer shape matches the requested business object;
- the answer begins with a direct user-facing answer when such an answer is expected;
- the answer is evidence-backed rather than heuristic-masked;
- the surfaced business fields are truthful and not mislabeled.

Examples:
- asking for supplier provenance must answer with the supplier first, not only with raw documents;
- asking for old stock must answer with item-level old-stock positions, not with a raw document dump;
- asking for stock balances, items, or contracts must not silently downgrade to lower-level movements.

## Edge acceptance

Each critical edge must define its required carryover invariants.

Typical invariants:
- selected object survives from previous assistant output
- stable focus object survives as the active business object
- originating date / period survives into follow-up filters
- warehouse survives if the follow-up still targets the same stock slice
- organization survives if the previous slice was organization-bound
- route family remains in the same business contour unless the user clearly changed intent
- reusable resolved-object state survives when the previous turn already answered a closely related lookup
- pronoun references can reuse the active focus object when the wording supports it
- follow-up action resolution stays on the same business object, for example item -> purchase documents rather than counterparty -> documents

If an edge loses a required invariant, that is a real regression even if the target node works in isolation.
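A carryover check can be expressed as a comparison between the state established by the previous turn and the state the follow-up actually ran with. The following is a minimal sketch; the function name and the flat dictionary representation of turn state are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List


def missing_invariants(
    required: Iterable[str],
    previous_state: Dict[str, Any],
    followup_state: Dict[str, Any],
) -> List[str]:
    """Return the required invariants that did not survive into the follow-up."""
    lost = []
    for key in required:
        if key not in previous_state:
            # Nothing was established on the previous turn, so nothing can be lost.
            continue
        if followup_state.get(key) != previous_state[key]:
            lost.append(key)
    return lost


# Hypothetical states for the edge "stock snapshot -> aging on the same date".
previous = {"selected_item": "item-42", "as_of_date": "2024-03-31", "warehouse": "main"}
followup = {"selected_item": "item-42", "as_of_date": None, "warehouse": "main"}

lost = missing_invariants(["selected_item", "as_of_date", "warehouse"], previous, followup)
```

Here `lost` contains `as_of_date`: the follow-up kept the item and the warehouse but dropped the originating date, so the edge fails even though the aging node might answer correctly when asked with an explicit date.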

## Resolved answer-object continuity

For follow-up-heavy domains, the analyst should treat resolved business objects as reusable state, not as disposable one-turn artifacts.

Examples:
- selected inventory item
- resolved supplier provenance bundle
- resolved buyer bundle
- resolved purchase document bundle

If turn N already resolved such an object and turn N+1 asks a natural follow-up about the same object, the system should reuse that state instead of demanding the same anchor again.
If turn N already resolved supplier/date/document provenance and turn N+1 asks for one adjacent field, such as `когда купили ее` ("when was it bought") or `покажи документы по этой позиции` ("show the documents for this item"), the system should prefer bundle reuse before re-entering a broad generic router.
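The reuse-before-router preference can be sketched as a small dispatch step. This is an illustration under stated assumptions: `ResolvedBundle`, `answer_followup`, and the field names are hypothetical, and the generic router is represented by a plain callable.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class ResolvedBundle:
    object_id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def answer_followup(
    requested_field: str,
    active_bundle: Optional[ResolvedBundle],
    generic_router: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Prefer the already resolved bundle; fall back to the router otherwise."""
    if active_bundle is not None and requested_field in active_bundle.fields:
        return "bundle_reuse", active_bundle.fields[requested_field]
    # The bundle does not hold the requested field, so a broader route is needed.
    return "generic_router", generic_router(requested_field)


# Hypothetical bundle resolved on turn N for the selected item.
bundle = ResolvedBundle(
    object_id="item-42",
    fields={"supplier": "ACME", "purchase_date": "2024-02-10"},
)
router_calls = []


def fake_router(requested: str) -> None:
    router_calls.append(requested)
    return None


source, value = answer_followup("purchase_date", bundle, fake_router)
```

In this example the purchase date comes from the bundle and the router is never called. Returning the source label alongside the value gives the analyst something concrete to assert on when checking for a `bundle_reuse_gap`.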

## Mandatory paraphrase families

Every critical node or edge must be validated in a small paraphrase family instead of one curated wording only.

Minimum family:
- `canonical`
- `colloquial`
- `ui_selected_object`
- `pronoun_followup` when the UX already established a selected object or active item

If canonical works but colloquial, UI-generated, or pronoun-only follow-up fails, the node/edge is not accepted.
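The family rule reduces to an all-or-nothing check over the required variants. A minimal sketch, assuming per-variant results are recorded as booleans and that a missing variant counts as a failure (the function names are hypothetical):

```python
from typing import Dict, Tuple

BASE_FAMILY = ("canonical", "colloquial", "ui_selected_object")


def required_variants(has_selected_object: bool) -> Tuple[str, ...]:
    """pronoun_followup is required only when the UX established a selected object."""
    if has_selected_object:
        return BASE_FAMILY + ("pronoun_followup",)
    return BASE_FAMILY


def family_accepted(results: Dict[str, bool], has_selected_object: bool) -> bool:
    # A variant that was never run is treated the same as a variant that failed.
    return all(results.get(v) is True for v in required_variants(has_selected_object))


results = {"canonical": True, "colloquial": True, "ui_selected_object": True}
accepted_without_selection = family_accepted(results, has_selected_object=False)
accepted_with_selection = family_accepted(results, has_selected_object=True)
```

With these results the family passes for a node without a selected object but fails once a selected object exists, because the pronoun follow-up was never exercised.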

## Acceptance matrix

The analyst must produce or update a `scenario_acceptance_matrix.md` artifact for every multi-step scenario or pack.

Minimum matrix columns:
- scenario id
- node id or edge id
- user path role (`root`, `critical_child`, `supporting`)
- wording family (`canonical`, `colloquial`, `ui_selected_object`, `pronoun_followup`)
- expected business meaning
- expected intent
- expected capability / recipe
- required carryover invariants
- expected answer shape
- expected direct answer
- business usefulness expectation
- actual outcome
- status (`pass`, `partial`, `fail`)
- defect class
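The matrix can be generated rather than hand-edited, which keeps the column set stable across scenarios. The sketch below renders rows as a Markdown table; the snake_case column keys and the `render_matrix` helper are assumptions chosen for the example, not a mandated file layout.

```python
from typing import Dict, List

COLUMNS = [
    "scenario_id",
    "node_or_edge_id",
    "user_path_role",
    "wording_family",
    "expected_business_meaning",
    "expected_intent",
    "expected_capability",
    "required_carryover_invariants",
    "expected_answer_shape",
    "expected_direct_answer",
    "business_usefulness_expectation",
    "actual_outcome",
    "status",
    "defect_class",
]
ALLOWED_STATUS = {"pass", "partial", "fail"}


def _cell(value: object) -> str:
    # Escape pipes so a cell value cannot break the table layout.
    return str(value).replace("|", "\\|")


def render_matrix(rows: List[Dict[str, object]]) -> str:
    """Render rows as a Markdown table; missing cells are left empty."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "| " + " | ".join("---" for _ in COLUMNS) + " |",
    ]
    for row in rows:
        if row.get("status") not in ALLOWED_STATUS:
            raise ValueError(f"invalid status: {row.get('status')!r}")
        lines.append("| " + " | ".join(_cell(row.get(c, "")) for c in COLUMNS) + " |")
    return "\n".join(lines)


# Hypothetical row for one edge of the inventory scenario.
example_row = {
    "scenario_id": "inventory_stock_snapshot",
    "node_or_edge_id": "stock_snapshot->aging_same_date",
    "user_path_role": "critical_child",
    "wording_family": "colloquial",
    "required_carryover_invariants": "selected_item, as_of_date",
    "status": "fail",
    "defect_class": "edge_carryover_gap",
}
matrix_md = render_matrix([example_row])
```

The resulting string can be written to `scenario_acceptance_matrix.md` as is.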

## Defect classes

Use these classes explicitly:
- `semantic_understanding_gap`
- `edge_carryover_gap`
- `object_memory_gap`
- `followup_action_resolution_gap`
- `bundle_reuse_gap`
- `field_mapping_gap`
- `answer_shape_mismatch`
- `ordering_semantics_mismatch`
- `runtime_capability_gap`
- `business_utility_gap`
- `domain_anchor_gap`
- `loop_coverage_gap`

Definitions:
- `semantic_understanding_gap`: the system did not understand the real user meaning
- `edge_carryover_gap`: the follow-up lost date / object / scope across steps
- `object_memory_gap`: the system resolved the object once but failed to retain it for the next follow-up
- `followup_action_resolution_gap`: the system kept the business object but resolved the wrong action over that object, for example item-follow-up -> counterparty-documents
- `bundle_reuse_gap`: the system resolved a reusable supplier/date/document bundle once but failed to reuse it for an adjacent follow-up
- `field_mapping_gap`: the answer surfaced the wrong business field or mislabeled a field
- `answer_shape_mismatch`: the business object in the answer does not match the requested object
- `ordering_semantics_mismatch`: ranking / chronology semantics are wrong
- `runtime_capability_gap`: the product contour truly lacks the route / intent / capability / extractor / recipe
- `business_utility_gap`: the answer may be grounded but is still not useful as a user-facing result
- `domain_anchor_gap`: the scenario uses a weak or wrong observed anchor, so the tree is semantically mis-specified
- `loop_coverage_gap`: the runtime could support the path or nearly support it, but the analyst/orchestrator never treated that path as mandatory acceptance coverage
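To keep the labels from drifting in reports, the class list can be held as a closed enumeration. This is a minimal sketch; the `DefectClass` enum and `parse_defect_class` helper are hypothetical names, while the label strings are the ones listed above.

```python
from enum import Enum


class DefectClass(str, Enum):
    SEMANTIC_UNDERSTANDING_GAP = "semantic_understanding_gap"
    EDGE_CARRYOVER_GAP = "edge_carryover_gap"
    OBJECT_MEMORY_GAP = "object_memory_gap"
    FOLLOWUP_ACTION_RESOLUTION_GAP = "followup_action_resolution_gap"
    BUNDLE_REUSE_GAP = "bundle_reuse_gap"
    FIELD_MAPPING_GAP = "field_mapping_gap"
    ANSWER_SHAPE_MISMATCH = "answer_shape_mismatch"
    ORDERING_SEMANTICS_MISMATCH = "ordering_semantics_mismatch"
    RUNTIME_CAPABILITY_GAP = "runtime_capability_gap"
    BUSINESS_UTILITY_GAP = "business_utility_gap"
    DOMAIN_ANCHOR_GAP = "domain_anchor_gap"
    LOOP_COVERAGE_GAP = "loop_coverage_gap"


def parse_defect_class(label: str) -> DefectClass:
    """Map a report label to a known class, rejecting anything outside the list."""
    try:
        return DefectClass(label.strip())
    except ValueError:
        raise ValueError(f"unknown defect class: {label!r}") from None
```

Rejecting unknown labels at parse time surfaces ad hoc classes such as "misc" before they reach the matrix.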

## Analyst responsibilities

The analyst must:
- review the scenario tree, not just individual turns;
- compare expected and actual user path transitions;
- call out broken edges explicitly;
- verify colloquial and UI-generated variants as first-class coverage;
- verify direct-answer-first behavior where the user asked a direct lookup question;
- verify business usefulness explicitly, not only technical validity;
- verify field truthfulness for surfaced supplier / buyer / organization labels;
- verify selected-object continuity and reusable object memory;
- verify focus-object continuity, pronoun follow-up continuity, and follow-up action resolution on the active business object;
- verify answer granularity and ordering semantics;
- lower the score when any critical edge or paraphrase family is broken.

## Orchestrator responsibilities

The orchestrator must:
- define the tree before iterating deeply;
- prioritize the primary user path first;
- rerun at least one colloquial variant and one UI-selected-object variant for each critical branch;
- rerun at least one short pronoun follow-up such as `по ней` ("for it") / `по этой позиции` ("for this item") when the product UX already established a selected object;
- treat a broken critical edge as an unfinished scenario even if the root node works;
- route coder work to the narrowest broken edge or node rather than issuing broad “improve the domain” tasks.

## Stop and acceptance rules

Do not accept a domain when:
- only the root node works;
- only one curated phrasing works;
- selected-object follow-up is broken;
- pronoun-only selected-object follow-up is broken or misrouted to another business object;
- `на эту дату` ("as of this date") / `на ту дату` ("as of that date") loses the originating date;
- the answer shape is wrong for the business question;
- chronology / ranking semantics are inverted;
- the direct answer is not surfaced first on direct lookup questions;
- the answer is technically grounded but still business-useless.

Acceptance requires all of the following:
- score >= 80
- no unresolved P0
- critical path edges pass
- canonical + colloquial + UI-selected-object variants pass for critical branches, plus the pronoun follow-up variant where a selected object is already established
- no silent heuristic masking
- `direct_answer_ok = true`
- `business_usefulness_ok = true`
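The acceptance conditions combine into a single gate that reports every blocker instead of stopping at the first one. The sketch below is one possible encoding: the `AcceptanceInputs` fields and function names are assumptions, and `pronoun_followup` can be added to `required_variants` for branches where a selected object is established.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

DEFAULT_VARIANTS = ("canonical", "colloquial", "ui_selected_object")


@dataclass
class AcceptanceInputs:
    score: float
    unresolved_p0: int
    critical_edges_pass: bool
    variant_results: Dict[str, bool]
    heuristic_masking_found: bool
    direct_answer_ok: bool
    business_usefulness_ok: bool


def acceptance_blockers(
    inputs: AcceptanceInputs,
    required_variants: Sequence[str] = DEFAULT_VARIANTS,
) -> List[str]:
    """Return every reason the domain cannot be accepted; empty means accepted."""
    blockers = []
    if inputs.score < 80:
        blockers.append("score below 80")
    if inputs.unresolved_p0 > 0:
        blockers.append("unresolved P0")
    if not inputs.critical_edges_pass:
        blockers.append("critical edge broken")
    for variant in required_variants:
        # A variant without a recorded result counts as not passing.
        if not inputs.variant_results.get(variant, False):
            blockers.append(f"variant failed or missing: {variant}")
    if inputs.heuristic_masking_found:
        blockers.append("silent heuristic masking")
    if not inputs.direct_answer_ok:
        blockers.append("direct_answer_ok is false")
    if not inputs.business_usefulness_ok:
        blockers.append("business_usefulness_ok is false")
    return blockers


def is_accepted(inputs: AcceptanceInputs, required_variants: Sequence[str] = DEFAULT_VARIANTS) -> bool:
    return not acceptance_blockers(inputs, required_variants)


# Hypothetical run: high score, but the colloquial variant failed.
run = AcceptanceInputs(
    score=91,
    unresolved_p0=0,
    critical_edges_pass=True,
    variant_results={"canonical": True, "colloquial": False, "ui_selected_object": True},
    heuristic_masking_found=False,
    direct_answer_ok=True,
    business_usefulness_ok=True,
)
blockers = acceptance_blockers(run)
```

In this run the score clears the threshold, yet the domain is not accepted because one required variant failed.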