ORCHESTRATION - Strengthen the domain loop with a business-first canon for the analyst and orchestrator
parent f6a2c8e0a3
commit cb0eb450d7
|
|
@ -17,21 +17,25 @@ You read:
|
||||||
|
|
||||||
Your job is to produce a detailed verdict in Russian with strong business focus.
|
Your job is to produce a detailed verdict in Russian with strong business focus.
|
||||||
|
|
||||||
Always answer in a strict structure:
|
When the caller asks for prose, use this strict structure:
|
||||||
1. Смысл вопроса
|
1. Question meaning
|
||||||
2. Главный пользовательский путь и дерево сценария
|
2. Primary user path and scenario tree
|
||||||
3. Что реально посчитано
|
3. Expected direct answer
|
||||||
4. Где расхождение по бизнес-смыслу
|
4. What the system actually computed
|
||||||
5. Где route / capability mismatch
|
5. Business mismatch
|
||||||
6. Evidence quality
|
6. Route / capability mismatch
|
||||||
7. P0 defects
|
7. State continuity and selected-object memory
|
||||||
8. P1 defects
|
8. Field truth and evidence quality
|
||||||
9. P2 defects
|
9. P0 defects
|
||||||
10. Minimal patch directions
|
10. P1 defects
|
||||||
11. Acceptance matrix for rerun
|
11. P2 defects
|
||||||
12. Acceptance criteria for rerun
|
12. Minimal patch directions
|
||||||
13. Quality score
|
13. Acceptance matrix for rerun
|
||||||
14. Loop decision
|
14. Acceptance criteria for rerun
|
||||||
|
15. Quality score
|
||||||
|
16. Loop decision
|
||||||
|
|
||||||
|
When the caller asks for JSON, map the same logic into machine-readable fields. Do not collapse the business analysis into one generic summary.
|
||||||
|
|
||||||
Rules:
|
Rules:
|
||||||
- Call out non-business garbage explicitly.
|
- Call out non-business garbage explicitly.
|
||||||
|
|
@ -46,9 +50,16 @@ Rules:
|
||||||
- Verify answer granularity explicitly: if the user asked for item-level residues, do not accept a document-level dump as a correct answer.
|
- Verify answer granularity explicitly: if the user asked for item-level residues, do not accept a document-level dump as a correct answer.
|
||||||
- Verify sort/order semantics when the wording implies chronology or ranking, for example `старые закупки` should be oldest-first.
|
- Verify sort/order semantics when the wording implies chronology or ranking, for example `старые закупки` should be oldest-first.
|
||||||
- Treat the acceptance unit as a scenario tree, not a flat list of prompts.
|
- Treat the acceptance unit as a scenario tree, not a flat list of prompts.
|
||||||
- Under `Главный пользовательский путь и дерево сценария`, explicitly name the root node, critical child nodes, critical edges, and the primary user path.
|
- Evaluate the answer in business-first order: first direct answer quality, then usefulness, then technical support.
|
||||||
- Under `Acceptance matrix for rerun`, list at least the critical nodes/edges and mark each one by wording family: `canonical`, `colloquial`, `ui_selected_object`.
|
- Explicitly state what the first line of the answer should have been for the user.
|
||||||
- Distinguish these defect classes explicitly when relevant: `semantic_understanding_gap`, `edge_carryover_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, `runtime_capability_gap`, `loop_coverage_gap`.
|
- If the answer is technically grounded but business-useless, say so directly and lower the score.
|
||||||
|
- Treat selected-object continuity and reusable answer-object memory as first-class analysis objects.
|
||||||
|
- Call out when the runtime found the underlying document/trace but failed to retain the resolved business object for the next follow-up.
|
||||||
|
- Distinguish `object_memory_gap`, `field_mapping_gap`, `business_utility_gap`, and `domain_anchor_gap` from pure route gaps.
|
||||||
|
- Check field truth explicitly: supplier must not be mislabeled as organization, buyer must not be mislabeled as organization, and document-side fields must not be presented as business truth without evidence.
|
||||||
|
- Under the scenario-tree section, explicitly name the root node, critical child nodes, critical edges, and the primary user path.
|
||||||
|
- Under the acceptance matrix, list at least the critical nodes/edges and mark each one by wording family: `canonical`, `colloquial`, `ui_selected_object`.
|
||||||
|
- Distinguish these defect classes explicitly when relevant: `semantic_understanding_gap`, `edge_carryover_gap`, `object_memory_gap`, `field_mapping_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, `runtime_capability_gap`, `business_utility_gap`, `loop_coverage_gap`, `domain_anchor_gap`.
|
||||||
- If the root node works but the primary user path is broken at the first selected-object drilldown, treat that as a real failure of domain hardening.
|
- If the root node works but the primary user path is broken at the first selected-object drilldown, treat that as a real failure of domain hardening.
|
||||||
- If the runtime nearly supports the path but the loop never validated the realistic wording family, call it `loop_coverage_gap`, not product success.
|
- If the runtime nearly supports the path but the loop never validated the realistic wording family, call it `loop_coverage_gap`, not product success.
|
||||||
|
|
||||||
|
|
@ -56,6 +67,6 @@ Quality score:
|
||||||
- Output one integer score from 0 to 100.
|
- Output one integer score from 0 to 100.
|
||||||
- Score >= 80 means the case can be accepted only if there is no unresolved P0.
|
- Score >= 80 means the case can be accepted only if there is no unresolved P0.
|
||||||
- Score >= 80 also requires the primary user path and its critical edges to be green across canonical, colloquial, and UI-selected-object coverage where applicable.
|
- Score >= 80 also requires the primary user path and its critical edges to be green across canonical, colloquial, and UI-selected-object coverage where applicable.
|
||||||
- If score < 80, loop_decision must be continue, partial, blocked, or needs_exact_capability.
|
- Score >= 80 also requires `direct_answer_ok = true` and `business_usefulness_ok = true` for the primary user path.
|
||||||
"""
|
"""
|
||||||
nickname_candidates = ["Lens", "Vector", "Delta"]
|
nickname_candidates = ["Lens", "Vector", "Delta"]
|
||||||
|
|
|
||||||
|
|
@ -1,5 +1,5 @@
|
||||||
name = "orchestrator"
|
name = "orchestrator"
|
||||||
description = "Coordinates a repo-native domain-case or scenario loop for NDC_1C: baseline or scenario capture, analyst verdict, minimal domain patch, rerun, and 80-point acceptance gate."
|
description = "Coordinates a repo-native domain-case or scenario loop for NDC_1C: baseline or scenario capture, minimal domain patching, rerun, and business-first acceptance."
|
||||||
model = "gpt-5.4"
|
model = "gpt-5.4"
|
||||||
model_reasoning_effort = "high"
|
model_reasoning_effort = "high"
|
||||||
sandbox_mode = "workspace-write"
|
sandbox_mode = "workspace-write"
|
||||||
|
|
@ -46,6 +46,10 @@ Hard rules:
|
||||||
- For cascading date-sensitive scenarios, rerun at least one `на эту дату` / `на ту дату` follow-up and verify that the originating date or period survives into debug filters.
|
- For cascading date-sensitive scenarios, rerun at least one `на эту дату` / `на ту дату` follow-up and verify that the originating date or period survives into debug filters.
|
||||||
- If the business question asks for residues/items/contracts but the answer switched to raw documents or movements, treat that as a real defect, not as acceptable detail.
|
- If the business question asks for residues/items/contracts but the answer switched to raw documents or movements, treat that as a real defect, not as acceptable detail.
|
||||||
- If the wording implies chronology or ranking such as `старые закупки`, verify oldest-first ordering explicitly.
|
- If the wording implies chronology or ranking such as `старые закупки`, verify oldest-first ordering explicitly.
|
||||||
|
- Require the analyst to judge business usefulness, not only technical groundedness.
|
||||||
|
- Require the analyst to judge whether the direct answer appears in the first line when the user asked a direct lookup question.
|
||||||
|
- Treat selected-object continuity, pronoun resolution, and reusable resolved-object state as mandatory audit targets for follow-up-heavy domains.
|
||||||
|
- Distinguish runtime capability gaps from state-layer continuity gaps and from business-presentation gaps before choosing coder tasks.
|
||||||
- If the root node works but the first critical selected-object or drilldown edge is still broken, do not treat the scenario as hardened.
|
- If the root node works but the first critical selected-object or drilldown edge is still broken, do not treat the scenario as hardened.
|
||||||
- Require an explicit `scenario_acceptance_matrix.md` artifact for follow-up-heavy domains and packs.
|
- Require an explicit `scenario_acceptance_matrix.md` artifact for follow-up-heavy domains and packs.
|
||||||
- Use the matrix to drive coder tasks: patch the narrowest broken edge or wording family first, not the whole domain at once.
|
- Use the matrix to drive coder tasks: patch the narrowest broken edge or wording family first, not the whole domain at once.
|
||||||
|
|
@ -57,6 +61,7 @@ Acceptance gate:
|
||||||
- accepted requires no business-critical regression in rerun
|
- accepted requires no business-critical regression in rerun
|
||||||
- accepted requires green critical edges on the primary user path
|
- accepted requires green critical edges on the primary user path
|
||||||
- accepted requires green coverage for canonical + colloquial + UI-selected-object variants on critical branches when those branches exist in the product UX
|
- accepted requires green coverage for canonical + colloquial + UI-selected-object variants on critical branches when those branches exist in the product UX
|
||||||
|
- accepted requires `direct_answer_ok = true` and `business_usefulness_ok = true` on the primary user path
|
||||||
|
|
||||||
Required artifacts per cycle:
|
Required artifacts per cycle:
|
||||||
- case_brief.md
|
- case_brief.md
|
||||||
|
|
|
||||||
|
|
@ -28,6 +28,7 @@ This skill packages the standard workflow for iterating on one concrete domain c
|
||||||
|
|
||||||
Read `references/repo_runtime_map.md` before the first real cycle.
|
Read `references/repo_runtime_map.md` before the first real cycle.
|
||||||
For follow-up-heavy domains, also read `references/scenario_tree_acceptance_canon.md` before scenario mode, pack mode, or autonomous pack-loop mode.
|
For follow-up-heavy domains, also read `references/scenario_tree_acceptance_canon.md` before scenario mode, pack mode, or autonomous pack-loop mode.
|
||||||
|
For business-first analyst work, also read `references/business_first_analyst_rubric.md` before redefining acceptance or hardening a noisy-but-technically-grounded domain.
|
||||||
If `docs/orchestration/active_domain_contract.json` exists, treat it as the single mutable source of truth for the current domain and prefer it over older scattered pool/pack prose docs.
|
If `docs/orchestration/active_domain_contract.json` exists, treat it as the single mutable source of truth for the current domain and prefer it over older scattered pool/pack prose docs.
|
||||||
|
|
||||||
Use these repo-native capture paths:
|
Use these repo-native capture paths:
|
||||||
|
|
@ -136,6 +137,7 @@ The verdict must explicitly say whether the case is:
|
||||||
- a missing route/intent/capability inside project scope;
|
- a missing route/intent/capability inside project scope;
|
||||||
- a true out-of-scope request.
|
- a true out-of-scope request.
|
||||||
- a `runtime_capability_gap`, `semantic_understanding_gap`, `edge_carryover_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, or `loop_coverage_gap`.
|
- a `runtime_capability_gap`, `semantic_understanding_gap`, `edge_carryover_gap`, `answer_shape_mismatch`, `ordering_semantics_mismatch`, or `loop_coverage_gap`.
|
||||||
|
- an `object_memory_gap`, `field_mapping_gap`, `business_utility_gap`, or `domain_anchor_gap` when that is the real blocker.
|
||||||
|
|
||||||
### Step 4 - Domain patch
|
### Step 4 - Domain patch
|
||||||
|
|
||||||
|
|
@ -208,6 +210,9 @@ Accepted requires:
|
||||||
- Treat answer-shape mismatch as a scoring defect: if the user asked for items / residues / contracts, do not accept an answer that switched to raw documents, movements, or another lower-level object without saying so explicitly.
|
- Treat answer-shape mismatch as a scoring defect: if the user asked for items / residues / contracts, do not accept an answer that switched to raw documents, movements, or another lower-level object without saying so explicitly.
|
||||||
- Treat ordering semantics as part of correctness when the wording implies ranking or chronology, for example `старые закупки` => oldest-first rather than newest-first.
|
- Treat ordering semantics as part of correctness when the wording implies ranking or chronology, for example `старые закупки` => oldest-first rather than newest-first.
|
||||||
- Treat primary user-path failures as more important than supporting-path polish: if the user cannot go from root list -> selected object -> first drilldown, the scenario is not accepted.
|
- Treat primary user-path failures as more important than supporting-path polish: if the user cannot go from root list -> selected object -> first drilldown, the scenario is not accepted.
|
||||||
|
- Treat direct-answer-first behavior as part of correctness: if the user asked a direct lookup question, the first line must contain the direct answer before the evidence blocks.
|
||||||
|
- Treat business usefulness as part of correctness: factual-but-business-useless output is not acceptance-quality output.
|
||||||
|
- Treat stable follow-up object memory as part of correctness: when the prior turn already resolved the relevant item/object, the next turn must not re-ask for it.
|
||||||
|
|
||||||
## Domain-specific framing
|
## Domain-specific framing
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,128 @@
|
||||||
|
# Business-first analyst rubric
|
||||||
|
|
||||||
|
Use this rubric when evaluating one domain case, one multi-step scenario, or one full domain pack.
|
||||||
|
|
||||||
|
The analyst must not stop at route/debug correctness. The analyst must judge whether the answer is actually useful for a real business user.
|
||||||
|
|
||||||
|
## Core principle
|
||||||
|
|
||||||
|
The analyst evaluates five layers at once:
|
||||||
|
- user intent;
|
||||||
|
- scenario tree and state continuity;
|
||||||
|
- business usefulness of the answer;
|
||||||
|
- evidence and field truthfulness;
|
||||||
|
- root cause and smallest defensible fix direction.
|
||||||
|
|
||||||
|
## Required analyst questions
|
||||||
|
|
||||||
|
For every critical turn or critical edge, answer these questions explicitly:
|
||||||
|
|
||||||
|
1. What did the user really ask?
|
||||||
|
- State the business meaning in one short sentence.
|
||||||
|
- Name the minimum direct answer the user expected.
|
||||||
|
|
||||||
|
2. What should the first line of the answer have been?
|
||||||
|
- If the user asked a direct lookup question, the first line must contain the direct answer.
|
||||||
|
- Technical explanation, limitations, and evidence come after the direct answer.
|
||||||
|
|
||||||
|
3. What object and scope had to survive from previous turns?
|
||||||
|
- selected item / selected contract / selected counterparty;
|
||||||
|
- originating date or period;
|
||||||
|
- warehouse or organization scope when still relevant;
|
||||||
|
- reusable resolved bundle, for example provenance trace or sale trace.
|
||||||
|
|
||||||
|
4. Did the answer stay on the same business object?
|
||||||
|
- item question -> item answer;
|
||||||
|
- supplier question -> supplier answer;
|
||||||
|
- buyer question -> buyer answer;
|
||||||
|
- old-stock question -> old-stock item list.
|
||||||
|
|
||||||
|
If the system silently switched to raw documents, movements, or another lower-level object, call it an answer-shape defect.
|
||||||
|
|
||||||
|
5. Are the surfaced fields truthful and correctly labeled?
|
||||||
|
- do not confuse supplier with organization;
|
||||||
|
- do not confuse buyer with organization;
|
||||||
|
- do not present a document-side technical field as a business truth unless that mapping is proven.
|
||||||
|
|
||||||
|
## Business usefulness rules
|
||||||
|
|
||||||
|
An answer is not accepted as business-useful when any of these are true:
|
||||||
|
- the direct answer is not placed first;
|
||||||
|
- the answer opens with technical hedging instead of the user-facing result;
|
||||||
|
- a weaker question is answered than the one the user asked;
|
||||||
|
- the answer requires the user to reconstruct the conclusion from low-level evidence;
|
||||||
|
- the answer uses ambiguous field labels for business-critical entities.
|
||||||
|
|
||||||
|
## State continuity rules
|
||||||
|
|
||||||
|
Follow-up continuity is a first-class acceptance object.
|
||||||
|
|
||||||
|
The analyst must verify:
|
||||||
|
- selected object continuity;
|
||||||
|
- date/period continuity;
|
||||||
|
- reusable evidence continuity;
|
||||||
|
- pronoun resolution continuity.
|
||||||
|
|
||||||
|
Important pronoun examples:
|
||||||
|
- `эту позицию`
|
||||||
|
- `этот товар`
|
||||||
|
- `его`
|
||||||
|
- `по нему`
|
||||||
|
- `по этой позиции`
|
||||||
|
|
||||||
|
If the previous turn already resolved a concrete object, the next turn must reuse it instead of asking for the anchor again.
|
||||||
|
|
||||||
|
## Reusable answer-object cache
|
||||||
|
|
||||||
|
For follow-up-heavy domains, the analyst should explicitly look for evidence that the product behaves as if it had a reusable resolved object bundle.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
- `current_item`
|
||||||
|
- `current_as_of_date`
|
||||||
|
- `current_provenance_trace`
|
||||||
|
- `current_sale_trace`
|
||||||
|
- `first_purchase_date`
|
||||||
|
- `supplier_if_known`
|
||||||
|
- `source_document_if_known`
|
||||||
|
|
||||||
|
If the runtime recomputes everything from scratch and loses the already resolved object, call that out as a state-layer defect.
|
||||||
|
|
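As a rough illustration of the continuity rules and the reusable bundle described above, the following Python sketch keeps a resolved object between turns and reuses it when a follow-up contains one of the listed pronoun references. The key names come from the rubric's examples; the `FocusState` class, the `resolve_item` helper, the field types, and the matching strategy are assumptions made for illustration and do not describe the actual runtime.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pronoun-style references copied from the rubric's examples.
PRONOUN_REFERENCES = ("эту позицию", "этот товар", "его", "по нему", "по этой позиции")


@dataclass
class FocusState:
    """Hypothetical resolved-object bundle; key names follow the rubric's examples."""
    current_item: Optional[str] = None
    current_as_of_date: Optional[str] = None  # assumed to be an ISO date string
    current_provenance_trace: Optional[dict] = None
    current_sale_trace: Optional[dict] = None
    first_purchase_date: Optional[str] = None
    supplier_if_known: Optional[str] = None
    source_document_if_known: Optional[str] = None


def resolve_item(question: str, state: FocusState) -> Optional[str]:
    """Return the item a follow-up refers to, or None if an anchor is still needed."""
    if state.current_item is None:
        return None
    text = question.lower()
    for phrase in PRONOUN_REFERENCES:
        # Match whole words only, so that "его" does not fire inside "всего".
        if re.search(rf"(?<!\w){re.escape(phrase)}(?!\w)", text):
            return state.current_item  # reuse the object resolved in a previous turn
    return None


state = FocusState(current_item="Болт М8", current_as_of_date="2024-03-31")
print(resolve_item("Кто поставил этот товар?", state))
```

An analyst could compare this expected behavior with the observed one: if the previous turn resolved the item and the runtime still asks for it again, that points to a state-layer defect rather than a route gap.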
||||||
|
## Root-cause layers
|
||||||
|
|
||||||
|
Use one or more of these root-cause layers explicitly:
|
||||||
|
- `semantic_understanding_gap`
|
||||||
|
- `runtime_capability_gap`
|
||||||
|
- `edge_carryover_gap`
|
||||||
|
- `object_memory_gap`
|
||||||
|
- `field_mapping_gap`
|
||||||
|
- `answer_shape_mismatch`
|
||||||
|
- `ordering_semantics_mismatch`
|
||||||
|
- `business_utility_gap`
|
||||||
|
- `loop_coverage_gap`
|
||||||
|
- `domain_anchor_gap`
|
||||||
|
|
||||||
|
## Minimum machine-readable verdict fields
|
||||||
|
|
||||||
|
The analyst verdict should expose at least:
|
||||||
|
- `user_intent_summary`
|
||||||
|
- `expected_direct_answer`
|
||||||
|
- `actual_direct_answer`
|
||||||
|
- `direct_answer_ok`
|
||||||
|
- `business_usefulness_ok`
|
||||||
|
- `business_utility_score`
|
||||||
|
- `direct_answer_priority_score`
|
||||||
|
- `state_continuity_score`
|
||||||
|
- `answer_shape_score`
|
||||||
|
- `evidence_clarity_score`
|
||||||
|
- `root_cause_layers`
|
||||||
|
- `broken_edge_ids`
|
||||||
|
- `violated_invariants`
|
||||||
|
|
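The minimum fields listed above could be carried in a small structure such as the one sketched below. The field names are taken from the list; the `AnalystVerdict` class name, the Python types, and the sample values are assumptions for illustration only, since the rubric does not fix a schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AnalystVerdict:
    """Hypothetical container for the minimum verdict fields (types are assumed)."""
    user_intent_summary: str
    expected_direct_answer: str
    actual_direct_answer: str
    direct_answer_ok: bool
    business_usefulness_ok: bool
    business_utility_score: int
    direct_answer_priority_score: int
    state_continuity_score: int
    answer_shape_score: int
    evidence_clarity_score: int
    root_cause_layers: list[str] = field(default_factory=list)
    broken_edge_ids: list[str] = field(default_factory=list)
    violated_invariants: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps Russian wording readable in the output.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


# Sample values are invented for illustration.
verdict = AnalystVerdict(
    user_intent_summary="Which supplier delivered the selected item",
    expected_direct_answer="Supplier name in the first line",
    actual_direct_answer="List of raw receipt documents",
    direct_answer_ok=False,
    business_usefulness_ok=False,
    business_utility_score=30,
    direct_answer_priority_score=20,
    state_continuity_score=70,
    answer_shape_score=25,
    evidence_clarity_score=60,
    root_cause_layers=["answer_shape_mismatch", "business_utility_gap"],
    broken_edge_ids=["root->selected_item->supplier"],
)
print(verdict.to_json())
```

Keeping `direct_answer_ok` and `business_usefulness_ok` as separate booleans, rather than folding them into one score, mirrors the requirement that the business analysis must not collapse into a single generic summary.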
||||||
|
## Inventory-specific reminders
|
||||||
|
|
||||||
|
For inventory follow-up chains, verify all of these:
|
||||||
|
- the selected item remains the current focus object after the user clicks a result;
|
||||||
|
- provenance questions answer supplier/date/document first, not only raw movement rows;
|
||||||
|
- `когда купили` can reuse the already resolved provenance bundle;
|
||||||
|
- supplier and organization are not mixed up in the surfaced answer;
|
||||||
|
- `на эту дату` keeps the original stock date unless the user explicitly changed it.
|
||||||
|
|
@ -9,6 +9,10 @@
|
||||||
## Expected business meaning
|
## Expected business meaning
|
||||||
- ...
|
- ...
|
||||||
|
|
||||||
|
## Expected direct answer
|
||||||
|
- first line should say:
|
||||||
|
- minimum acceptable business answer:
|
||||||
|
|
||||||
## Expected capability
|
## Expected capability
|
||||||
- ...
|
- ...
|
||||||
|
|
||||||
|
|
@ -31,6 +35,13 @@
|
||||||
- warehouse if relevant
|
- warehouse if relevant
|
||||||
- organization if relevant
|
- organization if relevant
|
||||||
- expected answer shape
|
- expected answer shape
|
||||||
|
- direct-answer-first when the user asked a direct lookup question
|
||||||
|
- reusable resolved-object continuity when the user asks a follow-up about the same selected object
|
||||||
|
|
||||||
|
## Field truth constraints
|
||||||
|
- do not confuse supplier with organization
|
||||||
|
- do not confuse buyer with organization
|
||||||
|
- do not surface technical document-side fields as business truth without proof
|
||||||
|
|
||||||
## Contour status
|
## Contour status
|
||||||
- in_contour / outside_current_contour / unknown
|
- in_contour / outside_current_contour / unknown
|
||||||
|
|
@ -53,3 +64,5 @@
|
||||||
- root node works
|
- root node works
|
||||||
- critical edges on the primary user path work
|
- critical edges on the primary user path work
|
||||||
- colloquial and UI-generated follow-up variants work
|
- colloquial and UI-generated follow-up variants work
|
||||||
|
- direct answer is placed first where expected
|
||||||
|
- output is business-useful, not only technically grounded
|
||||||
|
|
|
||||||
|
|
@ -12,6 +12,16 @@ The unit of acceptance is a **scenario tree**:
|
||||||
|
|
||||||
If the root works but a critical child transition breaks, the domain is **not** hardened.
|
If the root works but a critical child transition breaks, the domain is **not** hardened.
|
||||||
|
|
||||||
|
## Business-first framing
|
||||||
|
|
||||||
|
Every accepted node or edge must be both technically grounded and business-useful.
|
||||||
|
|
||||||
|
This means:
|
||||||
|
- the direct answer is surfaced first when the user asked a direct lookup question;
|
||||||
|
- the answer stays on the requested business object;
|
||||||
|
- evidence and caveats support the answer instead of replacing it;
|
||||||
|
- field labels are truthful for business entities such as supplier, buyer, organization, warehouse, and document.
|
||||||
|
|
||||||
## Model the domain as a tree
|
## Model the domain as a tree
|
||||||
|
|
||||||
For each scenario, define:
|
For each scenario, define:
|
||||||
|
|
@ -36,7 +46,8 @@ A node is considered covered only if all of these are true:
|
||||||
- the expected intent / capability is selected;
|
- the expected intent / capability is selected;
|
||||||
- the answer shape matches the requested business object;
|
- the answer shape matches the requested business object;
|
||||||
- the answer begins with a direct user-facing answer when such an answer is expected;
|
- the answer begins with a direct user-facing answer when such an answer is expected;
|
||||||
- the answer is evidence-backed rather than heuristic-masked.
|
- the answer is evidence-backed rather than heuristic-masked;
|
||||||
|
- the surfaced business fields are truthful and not mislabeled.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
- asking for supplier provenance must answer with the supplier first, not only with raw documents;
|
- asking for supplier provenance must answer with the supplier first, not only with raw documents;
|
||||||
|
|
@ -53,9 +64,23 @@ Typical invariants:
|
||||||
- warehouse survives if the follow-up still targets the same stock slice
|
- warehouse survives if the follow-up still targets the same stock slice
|
||||||
- organization survives if the previous slice was organization-bound
|
- organization survives if the previous slice was organization-bound
|
||||||
- route family remains in the same business contour unless the user clearly changed intent
|
- route family remains in the same business contour unless the user clearly changed intent
|
||||||
|
- reusable resolved-object state survives when the previous turn already answered a closely related lookup
|
||||||
|
- pronoun references can reuse the active focus object when the wording supports it
|
||||||
|
|
||||||
If an edge loses a required invariant, that is a real regression even if the target node works in isolation.
|
If an edge loses a required invariant, that is a real regression even if the target node works in isolation.
|
||||||
|
|
||||||
|
## Resolved answer-object continuity
|
||||||
|
|
||||||
|
For follow-up-heavy domains, the analyst should treat resolved business objects as reusable state, not as disposable one-turn artifacts.
|
||||||
|
|
||||||
|
Examples:
|
||||||
|
- selected inventory item
|
||||||
|
- resolved supplier provenance bundle
|
||||||
|
- resolved buyer bundle
|
||||||
|
- resolved purchase document bundle
|
||||||
|
|
||||||
|
If turn N already resolved such an object and turn N+1 asks a natural follow-up about the same object, the system should reuse that state instead of demanding the same anchor again.
|
||||||
|
|
||||||
## Mandatory paraphrase families
|
## Mandatory paraphrase families
|
||||||
|
|
||||||
Every critical node or edge must be validated in a small paraphrase family instead of one curated wording only.
|
Every critical node or edge must be validated in a small paraphrase family instead of one curated wording only.
|
||||||
|
|
@ -65,11 +90,6 @@ Minimum family:
|
||||||
- `colloquial`
|
- `colloquial`
|
||||||
- `ui_selected_object`
|
- `ui_selected_object`
|
||||||
|
|
||||||
Examples:
|
|
||||||
- canonical: `От какого поставщика куплен товар X`
|
|
||||||
- colloquial: `Кто поставил этот товар`
|
|
||||||
- ui_selected_object: `По выбранному объекту "X": кто это поставил нам`
|
|
||||||
|
|
||||||
If canonical works but colloquial or UI-generated follow-up fails, the node/edge is not accepted.
|
If canonical works but colloquial or UI-generated follow-up fails, the node/edge is not accepted.
|
||||||
|
|
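The family rule above can be sketched as a small check. The three family names come from the canon and the status values match the matrix's `pass` / `partial` / `fail` column; the `edge_is_accepted` function and the dictionary layout are assumptions made for illustration.

```python
REQUIRED_FAMILIES = ("canonical", "colloquial", "ui_selected_object")


def edge_is_accepted(results: dict[str, str]) -> bool:
    """results maps a wording family to its status: 'pass', 'partial' or 'fail'.

    A family that is missing from the mapping counts as not validated.
    """
    return all(results.get(family) == "pass" for family in REQUIRED_FAMILIES)


# Canonical wording works, but the colloquial follow-up fails: not accepted.
print(edge_is_accepted({
    "canonical": "pass",
    "colloquial": "fail",
    "ui_selected_object": "pass",
}))
```

Treating a missing family the same as a failed one reflects the `loop_coverage_gap` idea: a path that was never validated in a realistic wording is not counted as a success.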
||||||
## Acceptance matrix
|
## Acceptance matrix
|
||||||
|
|
@ -86,6 +106,8 @@ Minimum matrix columns:
|
||||||
- expected capability / recipe
|
- expected capability / recipe
|
||||||
- required carryover invariants
|
- required carryover invariants
|
||||||
- expected answer shape
|
- expected answer shape
|
||||||
|
- expected direct answer
|
||||||
|
- business usefulness expectation
|
||||||
- actual outcome
|
- actual outcome
|
||||||
- status (`pass`, `partial`, `fail`)
|
- status (`pass`, `partial`, `fail`)
|
||||||
- defect class
|
- defect class
|
||||||
|
|
@ -95,17 +117,25 @@ Minimum matrix columns:
|
||||||
Use these classes explicitly:
|
Use these classes explicitly:
|
||||||
- `semantic_understanding_gap`
|
- `semantic_understanding_gap`
|
||||||
- `edge_carryover_gap`
|
- `edge_carryover_gap`
|
||||||
|
- `object_memory_gap`
|
||||||
|
- `field_mapping_gap`
|
||||||
- `answer_shape_mismatch`
|
- `answer_shape_mismatch`
|
||||||
- `ordering_semantics_mismatch`
|
- `ordering_semantics_mismatch`
|
||||||
- `runtime_capability_gap`
|
- `runtime_capability_gap`
|
||||||
|
- `business_utility_gap`
|
||||||
|
- `domain_anchor_gap`
|
||||||
- `loop_coverage_gap`
|
- `loop_coverage_gap`
|
||||||
|
|
||||||
Definitions:
|
Definitions:
|
||||||
- `semantic_understanding_gap`: the system did not understand the real user meaning
|
- `semantic_understanding_gap`: the system did not understand the real user meaning
|
||||||
- `edge_carryover_gap`: the follow-up lost date / object / scope across steps
|
- `edge_carryover_gap`: the follow-up lost date / object / scope across steps
|
||||||
|
- `object_memory_gap`: the system resolved the object once but failed to retain it for the next follow-up
|
||||||
|
- `field_mapping_gap`: the answer surfaced the wrong business field or mislabeled a field
|
||||||
- `answer_shape_mismatch`: the business object in the answer does not match the requested object
|
- `answer_shape_mismatch`: the business object in the answer does not match the requested object
|
||||||
- `ordering_semantics_mismatch`: ranking / chronology semantics are wrong
|
- `ordering_semantics_mismatch`: ranking / chronology semantics are wrong
|
||||||
- `runtime_capability_gap`: the product contour truly lacks the route / intent / capability / extractor / recipe
|
- `runtime_capability_gap`: the product contour truly lacks the route / intent / capability / extractor / recipe
|
||||||
|
- `business_utility_gap`: the answer may be grounded but is still not useful as a user-facing result
|
||||||
|
- `domain_anchor_gap`: the scenario uses a weak or wrong observed anchor, so the tree is semantically mis-specified
|
||||||
- `loop_coverage_gap`: the runtime could support the path or nearly support it, but the analyst/orchestrator never treated that path as mandatory acceptance coverage
|
- `loop_coverage_gap`: the runtime could support the path or nearly support it, but the analyst/orchestrator never treated that path as mandatory acceptance coverage
|
||||||
|
|
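One possible way to keep these class names consistent across verdicts is a closed enumeration, sketched below. The ten values are the classes defined above; the `DefectClass` enum and the `parse_defect_classes` helper are hypothetical names introduced for this example.

```python
from enum import Enum


class DefectClass(str, Enum):
    """The defect classes named in the canon, as a closed set."""
    SEMANTIC_UNDERSTANDING_GAP = "semantic_understanding_gap"
    EDGE_CARRYOVER_GAP = "edge_carryover_gap"
    OBJECT_MEMORY_GAP = "object_memory_gap"
    FIELD_MAPPING_GAP = "field_mapping_gap"
    ANSWER_SHAPE_MISMATCH = "answer_shape_mismatch"
    ORDERING_SEMANTICS_MISMATCH = "ordering_semantics_mismatch"
    RUNTIME_CAPABILITY_GAP = "runtime_capability_gap"
    BUSINESS_UTILITY_GAP = "business_utility_gap"
    DOMAIN_ANCHOR_GAP = "domain_anchor_gap"
    LOOP_COVERAGE_GAP = "loop_coverage_gap"


def parse_defect_classes(labels: list[str]) -> list[DefectClass]:
    """Convert raw labels to DefectClass members; an unknown label raises ValueError."""
    return [DefectClass(label) for label in labels]


print(parse_defect_classes(["object_memory_gap", "field_mapping_gap"]))
```

Rejecting unknown labels early makes it harder for a verdict to report a vague catch-all such as a generic route gap when one of the narrower classes applies.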
||||||
## Analyst responsibilities
|
## Analyst responsibilities
|
||||||
|
|
@ -116,6 +146,9 @@ The analyst must:
|
||||||
- call out broken edges explicitly;
|
- call out broken edges explicitly;
|
||||||
- verify colloquial and UI-generated variants as first-class coverage;
|
- verify colloquial and UI-generated variants as first-class coverage;
|
||||||
- verify direct-answer-first behavior where the user asked a direct lookup question;
|
- verify direct-answer-first behavior where the user asked a direct lookup question;
|
||||||
|
- verify business usefulness explicitly, not only technical validity;
|
||||||
|
- verify field truthfulness for surfaced supplier / buyer / organization labels;
|
||||||
|
- verify selected-object continuity and reusable object memory;
|
||||||
- verify answer granularity and ordering semantics;
|
- verify answer granularity and ordering semantics;
|
||||||
- lower the score when any critical edge or paraphrase family is broken.
|
- lower the score when any critical edge or paraphrase family is broken.
|
||||||
|
|
||||||
|
|
@ -136,7 +169,9 @@ Do not accept a domain when:
|
||||||
- selected-object follow-up is broken;
|
- selected-object follow-up is broken;
|
||||||
- `на эту дату` / `на ту дату` loses the originating date;
|
- `на эту дату` / `на ту дату` loses the originating date;
|
||||||
- the answer shape is wrong for the business question;
|
- the answer shape is wrong for the business question;
|
||||||
- chronology / ranking semantics are inverted.
|
- chronology / ranking semantics are inverted;
|
||||||
|
- the direct answer is not surfaced first on direct lookup questions;
|
||||||
|
- the answer is technically grounded but still business-useless.
|
||||||
|
|
||||||
Accepted requires:
|
Accepted requires:
|
||||||
- score >= 80
|
- score >= 80
|
||||||
|
|
@ -144,3 +179,5 @@ Accepted requires:
|
||||||
- critical path edges pass
|
- critical path edges pass
|
||||||
- canonical + colloquial + UI-selected-object variants pass for critical branches
|
- canonical + colloquial + UI-selected-object variants pass for critical branches
|
||||||
- no silent heuristic masking
|
- no silent heuristic masking
|
||||||
|
- `direct_answer_ok = true`
|
||||||
|
- `business_usefulness_ok = true`
|
||||||
|
|
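The acceptance conditions above can be read as one conjunction. The sketch below covers only the conditions visible in this diff, plus the "no unresolved P0" rule from the analyst's quality-score section; part of the list is elided in the hunk, so the real gate may include more. The `is_accepted` function and its parameter names are assumptions for illustration.

```python
def is_accepted(
    score: int,
    unresolved_p0: int,
    critical_edges_pass: bool,
    family_coverage_pass: bool,
    heuristic_masking: bool,
    direct_answer_ok: bool,
    business_usefulness_ok: bool,
) -> bool:
    """Return True only when every visible acceptance condition holds."""
    return (
        score >= 80
        and unresolved_p0 == 0
        and critical_edges_pass
        and family_coverage_pass
        and not heuristic_masking
        and direct_answer_ok
        and business_usefulness_ok
    )


# A high score does not rescue an answer that buries the direct answer.
print(is_accepted(
    score=92,
    unresolved_p0=0,
    critical_edges_pass=True,
    family_coverage_pass=True,
    heuristic_masking=False,
    direct_answer_ok=False,
    business_usefulness_ok=True,
))
```

The point of the conjunction is that the numeric score alone never decides acceptance: a technically grounded but business-useless answer fails the gate regardless of its score.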
|
||||||
|
|
@ -1,55 +1,74 @@
|
||||||
# Verdict
|
# Verdict
|
||||||
|
|
||||||
## 1. Смысл вопроса
|
## 1. Question meaning
|
||||||
...
|
...
|
||||||
|
|
||||||
## 2. Главный пользовательский путь и дерево сценария
|
## 2. Primary user path and scenario tree
|
||||||
- root:
|
- root:
|
||||||
- critical child nodes:
|
- critical child nodes:
|
||||||
- critical edges:
|
- critical edges:
|
||||||
- primary user path:
|
- primary user path:
|
||||||
|
|
||||||
## 3. Что реально посчитано
|
## 3. Expected direct answer
|
||||||
|
- what the first line should say:
|
||||||
|
- minimum acceptable business answer:
|
||||||
|
|
||||||
|
## 4. What the system actually computed
|
||||||
...
|
...
|
||||||
|
|
||||||
## 4. Где расхождение по бизнес-смыслу
|
## 5. Business mismatch
|
||||||
|
- did the answer solve the user's real question:
|
||||||
|
- did the direct answer appear first:
|
||||||
|
- is the answer usable for an operator/accountant/manager:
|
||||||
|
|
||||||
|
## 6. Route / capability mismatch
|
||||||
...
|
...
|
||||||
|
|
||||||
## 5. Где route / capability mismatch
|
## 7. State continuity and selected-object memory
|
||||||
...
|
- selected object continuity:
|
||||||
|
- date/period continuity:
|
||||||
|
- reusable answer-object continuity:
|
||||||
|
- pronoun resolution continuity:
|
||||||
|
|
||||||
## 6. Evidence quality
|
## 8. Field truth and evidence quality
|
||||||
- exact / partial / heuristic / technical insufficiency
|
- supplier vs organization:
|
||||||
- why
|
- buyer vs organization:
|
||||||
|
- exact / partial / heuristic / technical insufficiency:
|
||||||
|
- why:
|
||||||
|
|
||||||
## 7. P0 defects
|
## 9. P0 defects
|
||||||
- ...
|
- ...
|
||||||
|
|
||||||
## 8. P1 defects
|
## 10. P1 defects
|
||||||
- ...
|
- ...
|
||||||
|
|
||||||
## 9. P2 defects
|
## 11. P2 defects
|
||||||
- ...
|
- ...
|
||||||
|
|
||||||
## 10. Minimal patch directions
|
## 12. Minimal patch directions
|
||||||
- ...
|
- ...
|
||||||
|
|
||||||
## 11. Acceptance matrix for rerun
|
## 13. Acceptance matrix for rerun
|
||||||
- Node / edge coverage:
|
- Node / edge coverage:
|
||||||
- Canonical wording:
|
- Canonical wording:
|
||||||
- Colloquial wording:
|
- Colloquial wording:
|
||||||
- UI-generated selected-object wording:
|
- UI-generated selected-object wording:
|
||||||
- Carryover invariants:
|
- Carryover invariants:
|
||||||
- Expected answer shape:
|
- Expected answer shape:
|
||||||
|
- Expected direct answer:
|
||||||
|
- Business usefulness:
|
||||||
- Defect class:
|
- Defect class:
|
||||||
|
|
||||||
## 12. Acceptance criteria for rerun
|
## 14. Acceptance criteria for rerun
|
||||||
- ...
|
- ...
|
||||||
- Include colloquial/slang variants and UI-generated selected-object follow-up variants when they are part of the business flow.
|
- Include colloquial/slang variants and UI-generated selected-object follow-up variants when they are part of the business flow.
|
||||||
- Require the primary user path to pass end-to-end, not only the root node.
|
- Require the primary user path to pass end-to-end, not only the root node.
|
||||||
|
- Require direct-answer-first behavior on direct lookup questions.
|
||||||
|
- Require business-useful output rather than technically-grounded-but-noisy output.
|
||||||
|
- Require selected-object continuity and reusable answer-object continuity on follow-up chains.
|
||||||
|
|
||||||
## 13. Quality score
|
## 15. Quality score
|
||||||
- integer from 0 to 100
|
- integer from 0 to 100
|
||||||
|
|
||||||
## 14. Loop decision
|
## 16. Loop decision
|
||||||
- accepted / continue / partial / blocked / needs_exact_capability
|
- accepted / continue / partial / blocked / needs_exact_capability
|
||||||
|
|
|
||||||
|
|
@ -26,6 +26,7 @@ Rules:
|
||||||
- Do not accept a domain when only the root snapshot works but selected-object or drilldown follow-up edges still fail.
|
- Do not accept a domain when only the root snapshot works but selected-object or drilldown follow-up edges still fail.
|
||||||
- For critical branches, validate at least canonical wording, colloquial wording, and UI-generated selected-object wording when that UX exists.
|
- For critical branches, validate at least canonical wording, colloquial wording, and UI-generated selected-object wording when that UX exists.
|
||||||
- Treat temporal carryover, selected-object carryover, answer-shape match, and ordering semantics as first-class acceptance invariants rather than optional polish.
|
- Treat temporal carryover, selected-object carryover, answer-shape match, and ordering semantics as first-class acceptance invariants rather than optional polish.
|
||||||
|
- Treat direct-answer-first behavior, business usefulness, selected-object memory, and field truthfulness as first-class analyst criteria rather than optional presentation polish.
|
||||||
- If a case falls outside the current routed contour because the route/intent/capability is not wired yet, treat it as domain enablement work for this project, not as automatic out-of-scope rejection.
|
- If a case falls outside the current routed contour because the route/intent/capability is not wired yet, treat it as domain enablement work for this project, not as automatic out-of-scope rejection.
|
||||||
- For new unmarked domains, `needs_exact_capability` means "bootstrap or extend the contour" rather than "close the case as unsupported".
|
- For new unmarked domains, `needs_exact_capability` means "bootstrap or extend the contour" rather than "close the case as unsupported".
|
||||||
- A case can be marked `accepted` only when analyst verdict is at least `80/100`, no unresolved `P0` remains, and the rerun does not mask heuristic output as confirmed.
|
- A case can be marked `accepted` only when analyst verdict is at least `80/100`, no unresolved `P0` remains, and the rerun does not mask heuristic output as confirmed.
|
||||||
|
|
|
||||||
|
|
@ -780,6 +780,12 @@
|
||||||
"required_paraphrase_families": ["canonical", "ui_selected_object"],
|
"required_paraphrase_families": ["canonical", "ui_selected_object"],
|
||||||
"required_carryover_invariants": ["selected_object", "date_scope", "answer_shape"]
|
"required_carryover_invariants": ["selected_object", "date_scope", "answer_shape"]
|
||||||
},
|
},
|
||||||
|
"bindings": {
|
||||||
|
"target_date_historical": "2020-03-31",
|
||||||
|
"focus_item_historical": "Шкаф картотечный 1000*400*2100",
|
||||||
|
"observed_supplier_candidate": "Гамма-мебель, ООО",
|
||||||
|
"observed_customer_candidate": "Департамент капитального ремонта города Москвы"
|
||||||
|
},
|
||||||
"steps": [
|
"steps": [
|
||||||
{
|
{
|
||||||
"step_id": "step_01_account_41_historical",
|
"step_id": "step_01_account_41_historical",
|
||||||
|
|
@ -790,7 +796,7 @@
|
||||||
"title": "Historical account 41 anchor",
|
"title": "Historical account 41 anchor",
|
||||||
"question": "Какие товары числятся на 41 счете на дату {{bindings.target_date_historical}}",
|
"question": "Какие товары числятся на 41 счете на дату {{bindings.target_date_historical}}",
|
||||||
"analysis_context": {
|
"analysis_context": {
|
||||||
"as_of_date": "2019-03-31",
|
"as_of_date": "2020-03-31",
|
||||||
"source": "binding_target_date_historical"
|
"source": "binding_target_date_historical"
|
||||||
},
|
},
|
||||||
"expected_capability": "confirmed_inventory_on_hand_as_of_date",
|
"expected_capability": "confirmed_inventory_on_hand_as_of_date",
|
||||||
|
|
@ -823,13 +829,29 @@
|
||||||
"node_role": "supporting_child",
|
"node_role": "supporting_child",
|
||||||
"paraphrase_family": "canonical",
|
"paraphrase_family": "canonical",
|
||||||
"title": "Supplier to buyer overlap",
|
"title": "Supplier to buyer overlap",
|
||||||
"question": "Какие товары были куплены у поставщика {{bindings.observed_supplier_candidate}} и позже проданы покупателю {{bindings.observed_customer_candidate}}",
|
"question": "Есть ли документально подтвержденная цепочка: поставщик {{bindings.observed_supplier_candidate}} -> товар {{bindings.focus_item_historical}} -> покупатель {{bindings.observed_customer_candidate}}",
|
||||||
"depends_on": ["step_01_account_41_historical", "step_02_selected_item_buyer"]
|
"depends_on": ["step_01_account_41_historical", "step_02_selected_item_buyer"]
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
"agent_audit_expectations": {
|
||||||
|
"direct_answer_first": true,
|
||||||
|
"business_utility_required": true,
|
||||||
|
"state_continuity_required": true,
|
||||||
|
"selected_object_memory_required": true,
|
||||||
|
"field_truth_checks": [
|
||||||
|
"supplier_vs_organization",
|
||||||
|
"buyer_vs_organization"
|
||||||
|
],
|
||||||
|
"reusable_answer_object_expectations": [
|
||||||
|
"current_item",
|
||||||
|
"current_as_of_date",
|
||||||
|
"current_provenance_trace",
|
||||||
|
"current_sale_trace"
|
||||||
|
]
|
||||||
|
},
|
||||||
"acceptance_contract": {
|
"acceptance_contract": {
|
||||||
"acceptance_unit": "scenario_tree",
|
"acceptance_unit": "scenario_tree",
|
||||||
"do_not_accept_if": [
|
"do_not_accept_if": [
|
||||||
|
|
|
||||||
|
|
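The scenario above references `bindings` through `{{bindings.<key>}}` placeholders in each `question`. The diff does not show how the harness expands them, so the following is only a sketch under that assumption. `renderQuestion` and the `Bindings` type are hypothetical names.

```typescript
type Bindings = { [key: string]: string };

// Expands {{bindings.<key>}} placeholders. An unknown key throws, so a typo in
// a scenario file is not silently sent to the system as literal text.
function renderQuestion(template: string, bindings: Bindings): string {
  return template.replace(
    /\{\{\s*bindings\.([A-Za-z0-9_]+)\s*\}\}/g,
    (_match: string, key: string): string => {
      if (!Object.prototype.hasOwnProperty.call(bindings, key)) {
        throw new Error("Missing binding: " + key);
      }
      return bindings[key];
    }
  );
}
```

With `target_date_historical` bound to `2020-03-31`, the step 1 question ends in `на дату 2020-03-31`. This matches the diff's correction of `analysis_context.as_of_date` to the same date, so the question text and the analysis context no longer disagree.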
@ -5,13 +5,26 @@
|
||||||
"additionalProperties": false,
|
"additionalProperties": false,
|
||||||
"required": [
|
"required": [
|
||||||
"summary",
|
"summary",
|
||||||
|
"user_intent_summary",
|
||||||
|
"expected_direct_answer",
|
||||||
|
"actual_direct_answer",
|
||||||
"quality_score",
|
"quality_score",
|
||||||
|
"direct_answer_ok",
|
||||||
|
"business_usefulness_ok",
|
||||||
|
"business_utility_score",
|
||||||
|
"direct_answer_priority_score",
|
||||||
|
"state_continuity_score",
|
||||||
|
"answer_shape_score",
|
||||||
|
"evidence_clarity_score",
|
||||||
"loop_decision",
|
"loop_decision",
|
||||||
"requires_user_decision",
|
"requires_user_decision",
|
||||||
"user_decision_type",
|
"user_decision_type",
|
||||||
"user_decision_prompt",
|
"user_decision_prompt",
|
||||||
"unresolved_p0_count",
|
"unresolved_p0_count",
|
||||||
"regression_detected",
|
"regression_detected",
|
||||||
|
"root_cause_layers",
|
||||||
|
"broken_edge_ids",
|
||||||
|
"violated_invariants",
|
||||||
"priority_targets",
|
"priority_targets",
|
||||||
"acceptance_criteria",
|
"acceptance_criteria",
|
||||||
"notes"
|
"notes"
|
||||||
|
|
@ -20,11 +33,51 @@
|
||||||
"summary": {
|
"summary": {
|
||||||
"type": "string"
|
"type": "string"
|
||||||
},
|
},
|
||||||
|
"user_intent_summary": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"expected_direct_answer": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"actual_direct_answer": {
|
||||||
|
"type": ["string", "null"]
|
||||||
|
},
|
||||||
"quality_score": {
|
"quality_score": {
|
||||||
"type": "integer",
|
"type": "integer",
|
||||||
"minimum": 0,
|
"minimum": 0,
|
||||||
"maximum": 100
|
"maximum": 100
|
||||||
},
|
},
|
||||||
|
"direct_answer_ok": {
|
||||||
|
"type": "boolean"
|
||||||
|
},
|
||||||
|
"business_usefulness_ok": {
|
||||||
|
"type": "boolean"
|
||||||
|
},
|
||||||
|
"business_utility_score": {
|
||||||
|
"type": "integer",
|
||||||
|
"minimum": 0,
|
||||||
|
"maximum": 100
|
||||||
|
},
|
||||||
|
"direct_answer_priority_score": {
|
||||||
|
"type": "integer",
|
||||||
|
"minimum": 0,
|
||||||
|
"maximum": 100
|
||||||
|
},
|
||||||
|
"state_continuity_score": {
|
||||||
|
"type": "integer",
|
||||||
|
"minimum": 0,
|
||||||
|
"maximum": 100
|
||||||
|
},
|
||||||
|
"answer_shape_score": {
|
||||||
|
"type": "integer",
|
||||||
|
"minimum": 0,
|
||||||
|
"maximum": 100
|
||||||
|
},
|
||||||
|
"evidence_clarity_score": {
|
||||||
|
"type": "integer",
|
||||||
|
"minimum": 0,
|
||||||
|
"maximum": 100
|
||||||
|
},
|
||||||
"loop_decision": {
|
"loop_decision": {
|
||||||
"type": "string",
|
"type": "string",
|
||||||
"enum": ["accepted", "continue", "partial", "blocked", "needs_exact_capability"]
|
"enum": ["accepted", "continue", "partial", "blocked", "needs_exact_capability"]
|
||||||
|
|
@ -35,7 +88,17 @@
|
||||||
},
|
},
|
||||||
"user_decision_type": {
|
"user_decision_type": {
|
||||||
"type": "string",
|
"type": "string",
|
||||||
"enum": ["none", "architecture_fork", "important_business_question", "scope_tradeoff", "data_truth_gap", "missing_required_observation", "risky_workaround", "risky_complexity", "other"],
|
"enum": [
|
||||||
|
"none",
|
||||||
|
"architecture_fork",
|
||||||
|
"important_business_question",
|
||||||
|
"scope_tradeoff",
|
||||||
|
"data_truth_gap",
|
||||||
|
"missing_required_observation",
|
||||||
|
"risky_workaround",
|
||||||
|
"risky_complexity",
|
||||||
|
"other"
|
||||||
|
],
|
||||||
"description": "Explain why the loop needs user input. Use none when requires_user_decision is false."
|
"description": "Explain why the loop needs user input. Use none when requires_user_decision is false."
|
||||||
},
|
},
|
||||||
"user_decision_prompt": {
|
"user_decision_prompt": {
|
||||||
|
|
@ -49,6 +112,37 @@
|
||||||
"regression_detected": {
|
"regression_detected": {
|
||||||
"type": "boolean"
|
"type": "boolean"
|
||||||
},
|
},
|
||||||
|
"root_cause_layers": {
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string",
|
||||||
|
"enum": [
|
||||||
|
"semantic_understanding_gap",
|
||||||
|
"runtime_capability_gap",
|
||||||
|
"edge_carryover_gap",
|
||||||
|
"object_memory_gap",
|
||||||
|
"field_mapping_gap",
|
||||||
|
"answer_shape_mismatch",
|
||||||
|
"ordering_semantics_mismatch",
|
||||||
|
"business_utility_gap",
|
||||||
|
"loop_coverage_gap",
|
||||||
|
"domain_anchor_gap",
|
||||||
|
"other"
|
||||||
|
]
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"broken_edge_ids": {
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
},
|
||||||
|
"violated_invariants": {
|
||||||
|
"type": "array",
|
||||||
|
"items": {
|
||||||
|
"type": "string"
|
||||||
|
}
|
||||||
|
},
|
||||||
"priority_targets": {
|
"priority_targets": {
|
||||||
"type": "array",
|
"type": "array",
|
||||||
"items": {
|
"items": {
|
||||||
|
|
@ -68,7 +162,23 @@
|
||||||
},
|
},
|
||||||
"problem_type": {
|
"problem_type": {
|
||||||
"type": "string",
|
"type": "string",
|
||||||
"enum": ["route_gap", "capability_gap", "evidence_gap", "presentation_gap", "regression", "other"]
|
"enum": [
|
||||||
|
"route_gap",
|
||||||
|
"capability_gap",
|
||||||
|
"evidence_gap",
|
||||||
|
"presentation_gap",
|
||||||
|
"semantic_understanding_gap",
|
||||||
|
"edge_carryover_gap",
|
||||||
|
"object_memory_gap",
|
||||||
|
"field_mapping_gap",
|
||||||
|
"answer_shape_mismatch",
|
||||||
|
"ordering_semantics_mismatch",
|
||||||
|
"business_utility_gap",
|
||||||
|
"loop_coverage_gap",
|
||||||
|
"domain_anchor_gap",
|
||||||
|
"regression",
|
||||||
|
"other"
|
||||||
|
]
|
||||||
},
|
},
|
||||||
"fix_goal": {
|
"fix_goal": {
|
||||||
"type": "string"
|
"type": "string"
|
||||||
|
|
|
||||||
|
|
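The schema hunks above add several 0 to 100 integer scores, two boolean gates, and a nullable `actual_direct_answer`. As a rough illustration of what those constraints mean for a producer, here is a hand-rolled check over that subset only. It is not a JSON Schema validator and does not cover the full `required` list; `checkVerdictSubset` is a hypothetical helper.

```typescript
const LOOP_DECISIONS: string[] = [
  "accepted", "continue", "partial", "blocked", "needs_exact_capability"
];
const SCORE_FIELDS: string[] = [
  "quality_score",
  "business_utility_score",
  "direct_answer_priority_score",
  "state_continuity_score",
  "answer_shape_score",
  "evidence_clarity_score"
];
const BOOLEAN_FIELDS: string[] = ["direct_answer_ok", "business_usefulness_ok"];

// Checks only the subset of the schema listed above and returns error strings.
function checkVerdictSubset(verdict: { [key: string]: unknown }): string[] {
  const errors: string[] = [];
  for (const field of SCORE_FIELDS) {
    const value = verdict[field];
    if (typeof value !== "number" || Math.floor(value) !== value || value < 0 || value > 100) {
      errors.push(field + ": expected integer in [0, 100]");
    }
  }
  for (const field of BOOLEAN_FIELDS) {
    if (typeof verdict[field] !== "boolean") {
      errors.push(field + ": expected boolean");
    }
  }
  const decision = verdict["loop_decision"];
  if (typeof decision !== "string" || LOOP_DECISIONS.indexOf(decision) === -1) {
    errors.push("loop_decision: not in enum");
  }
  // The schema types this field as ["string", "null"]; a missing key is still
  // an error because the field is listed under "required".
  const actual = verdict["actual_direct_answer"];
  if (typeof actual !== "string" && actual !== null) {
    errors.push("actual_direct_answer: expected string or null");
  }
  return errors;
}
```

Allowing `null` for `actual_direct_answer` lets the analyst record that the system produced no direct answer at all, which is different from omitting the field.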
@ -1340,6 +1340,17 @@ function hasInventoryPurchaseDocumentsSignal(text) {
|
||||||
function hasInventorySaleTraceSignal(text) {
|
function hasInventorySaleTraceSignal(text) {
|
||||||
return /(?:продаж|покупател|buyer|sale trace|purchase[\s-]?to[\s-]?sale|purchase -> warehouse -> sale|закупка.*продаж)/iu.test(text);
|
return /(?:продаж|покупател|buyer|sale trace|purchase[\s-]?to[\s-]?sale|purchase -> warehouse -> sale|закупка.*продаж)/iu.test(text);
|
||||||
}
|
}
|
||||||
|
function hasSelectedObjectInventoryCue(text) {
|
||||||
|
return /(?:по\s+выбранному\s+объекту|selected\s+object)/iu.test(text);
|
||||||
|
}
|
||||||
|
function hasSelectedObjectInventoryProvenanceSignal(text) {
|
||||||
|
return (hasSelectedObjectInventoryCue(text) &&
|
||||||
|
/(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(text));
|
||||||
|
}
|
||||||
|
function hasSelectedObjectInventoryPurchaseDocumentsSignal(text) {
|
||||||
|
return (hasSelectedObjectInventoryCue(text) &&
|
||||||
|
/(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(text));
|
||||||
|
}
|
||||||
function hasInventoryProvenanceSignalV2(text) {
|
function hasInventoryProvenanceSignalV2(text) {
|
||||||
const hasItemCue = /(?:товар|номенклатур|sku|item|product|остат(?:ок|ки)|склад)/iu.test(text);
|
const hasItemCue = /(?:товар|номенклатур|sku|item|product|остат(?:ок|ки)|склад)/iu.test(text);
|
||||||
const hasSupplierCue = /(?:от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|кто\s+(?:нам\s+)?поставил|кем\s+поставлен|поставщик|supplier|vendor)/iu.test(text);
|
const hasSupplierCue = /(?:от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|кто\s+(?:нам\s+)?поставил|кем\s+поставлен|поставщик|supplier|vendor)/iu.test(text);
|
||||||
|
|
@ -1541,6 +1552,13 @@ function resolveAddressIntent(userMessage) {
|
||||||
reasons: ["inventory_aging_signal_detected"]
|
reasons: ["inventory_aging_signal_detected"]
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
if (hasSelectedObjectInventoryProvenanceSignal(text)) {
|
||||||
|
return {
|
||||||
|
intent: "inventory_purchase_provenance_for_item",
|
||||||
|
confidence: "medium",
|
||||||
|
reasons: ["inventory_selected_object_provenance_signal_detected"]
|
||||||
|
};
|
||||||
|
}
|
||||||
if (hasInventoryProvenanceSignalV2(text)) {
|
if (hasInventoryProvenanceSignalV2(text)) {
|
||||||
return {
|
return {
|
||||||
intent: "inventory_purchase_provenance_for_item",
|
intent: "inventory_purchase_provenance_for_item",
|
||||||
|
|
@ -1555,6 +1573,13 @@ function resolveAddressIntent(userMessage) {
|
||||||
reasons: ["inventory_purchase_date_signal_detected"]
|
reasons: ["inventory_purchase_date_signal_detected"]
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
if (hasSelectedObjectInventoryPurchaseDocumentsSignal(text)) {
|
||||||
|
return {
|
||||||
|
intent: "inventory_purchase_documents_for_item",
|
||||||
|
confidence: "medium",
|
||||||
|
reasons: ["inventory_selected_object_purchase_documents_signal_detected"]
|
||||||
|
};
|
||||||
|
}
|
||||||
if (hasInventoryPurchaseDocumentsSignalV2(text)) {
|
if (hasInventoryPurchaseDocumentsSignalV2(text)) {
|
||||||
return {
|
return {
|
||||||
intent: "inventory_purchase_documents_for_item",
|
intent: "inventory_purchase_documents_for_item",
|
||||||
|
|
|
||||||
|
|
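The resolver hunks above gate the new selected-object branches on two cues at once: a "selected object" marker and a supplier (or purchase-documents) phrase. The sketch below shows that conjunction in isolation. The regular expressions are abbreviated versions of the ones in the diff, the function names are hypothetical, and returning `null` stands in for "fall through to the later generic checks", whose exact logic the diff does not show.

```typescript
interface IntentResolution {
  intent: string;
  confidence: "medium";
  reasons: string[];
}

// The source uses the /iu flags; /i is enough for these abbreviated patterns.
function hasSelectedObjectCue(text: string): boolean {
  return /(?:по\s+выбранному\s+объекту|selected\s+object)/i.test(text);
}

function hasSupplierCue(text: string): boolean {
  return /(?:кто\s+(?:нам\s+)?поставил|от\s+какого\s+поставщика|supplier|vendor|поставщик)/i.test(text);
}

// Both cues must be present. Otherwise the message is left to later checks.
function resolveSelectedObjectProvenance(text: string): IntentResolution | null {
  if (hasSelectedObjectCue(text) && hasSupplierCue(text)) {
    return {
      intent: "inventory_purchase_provenance_for_item",
      confidence: "medium",
      reasons: ["inventory_selected_object_provenance_signal_detected"]
    };
  }
  return null;
}
```

Requiring the selected-object marker keeps this branch narrow: a bare supplier question without the marker does not trigger it and is still handled by whatever generic provenance check follows.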
@ -2879,6 +2879,19 @@ class AddressQueryService {
|
||||||
const broadenedFactual = (0, composeStage_1.composeFactualReply)(intent.intent, broadenedFilteredRows, composeOptionsFromFilters(autoBroadenedFilters));
|
const broadenedFactual = (0, composeStage_1.composeFactualReply)(intent.intent, broadenedFilteredRows, composeOptionsFromFilters(autoBroadenedFilters));
|
||||||
const broadenedLimitations = [...filters.warnings, "period_window_auto_broadened_to_available_data"];
|
const broadenedLimitations = [...filters.warnings, "period_window_auto_broadened_to_available_data"];
|
||||||
const broadenedReasons = [...baseReasons, "period_window_auto_broadened_to_available_data"];
|
const broadenedReasons = [...baseReasons, "period_window_auto_broadened_to_available_data"];
|
||||||
|
const broadenedResultSemantics = mergeAddressResultSemantics(deriveAddressResultSemantics({
|
||||||
|
intent: intent.intent,
|
||||||
|
selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
|
||||||
|
filters: filters.extracted_filters,
|
||||||
|
responseType: broadenedFactual.responseType,
|
||||||
|
rowsMatched: broadenedFilteredRows.length
|
||||||
|
}), broadenedFactual.semantics);
|
||||||
|
const broadenedRouteExpectationAudit = buildRouteExpectationAudit({
|
||||||
|
intent: routeExpectationIntent,
|
||||||
|
selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
|
||||||
|
requestedResultMode,
|
||||||
|
resultMode: broadenedResultSemantics.result_mode
|
||||||
|
});
|
||||||
return {
|
return {
|
||||||
handled: true,
|
handled: true,
|
||||||
reply_text: injectNoticeAfterLeadLine(broadenedFactual.text, broadenedPrefix),
|
reply_text: injectNoticeAfterLeadLine(broadenedFactual.text, broadenedPrefix),
|
||||||
|
|
@ -2921,13 +2934,20 @@ class AddressQueryService {
|
||||||
runtime_readiness: "LIVE_QUERYABLE_WITH_LIMITS",
|
runtime_readiness: "LIVE_QUERYABLE_WITH_LIMITS",
|
||||||
limited_reason_category: null,
|
limited_reason_category: null,
|
||||||
response_type: broadenedFactual.responseType,
|
response_type: broadenedFactual.responseType,
|
||||||
...mergeAddressResultSemantics(deriveAddressResultSemantics({
|
capability_id: capabilityAudit.capabilityId,
|
||||||
intent: intent.intent,
|
capability_layer: capabilityAudit.layer,
|
||||||
selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
|
capability_route_mode: capabilityAudit.routeMode,
|
||||||
filters: filters.extracted_filters,
|
capability_route_enabled: capabilityAudit.enabled,
|
||||||
responseType: broadenedFactual.responseType,
|
capability_route_reason: capabilityAudit.reason,
|
||||||
rowsMatched: broadenedFilteredRows.length
|
shadow_route_intent: shadowRouteAudit.intent,
|
||||||
}), broadenedFactual.semantics),
|
shadow_route_selected_recipe: shadowRouteAudit.selectedRecipe,
|
||||||
|
shadow_route_status: shadowRouteAudit.status,
|
||||||
|
route_expectation_status: broadenedRouteExpectationAudit.status,
|
||||||
|
route_expectation_reason: broadenedRouteExpectationAudit.reason,
|
||||||
|
route_expectation_expected_selected_recipes: broadenedRouteExpectationAudit.expectedSelectedRecipes,
|
||||||
|
route_expectation_expected_requested_result_modes: broadenedRouteExpectationAudit.expectedRequestedResultModes,
|
||||||
|
route_expectation_expected_result_modes: broadenedRouteExpectationAudit.expectedResultModes,
|
||||||
|
...broadenedResultSemantics,
|
||||||
limitations: broadenedLimitations,
|
limitations: broadenedLimitations,
|
||||||
reasons: withConfirmedBalanceFallbackReason(broadenedReasons, requestedResultMode, broadenedFactual.semantics)
|
reasons: withConfirmedBalanceFallbackReason(broadenedReasons, requestedResultMode, broadenedFactual.semantics)
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -244,6 +244,24 @@ function mapCounterpartyIntentToContractIntent(intent) {
|
||||||
}
|
}
|
||||||
return null;
|
return null;
|
||||||
}
|
}
|
||||||
|
function isInventoryIntent(intent) {
|
||||||
|
return (intent === "inventory_on_hand_as_of_date" ||
|
||||||
|
intent === "inventory_purchase_provenance_for_item" ||
|
||||||
|
intent === "inventory_purchase_documents_for_item" ||
|
||||||
|
intent === "inventory_supplier_stock_overlap_as_of_date" ||
|
||||||
|
intent === "inventory_sale_trace_for_item" ||
|
||||||
|
intent === "inventory_purchase_to_sale_chain" ||
|
||||||
|
intent === "inventory_aging_by_purchase_date");
|
||||||
|
}
|
||||||
|
function hasSelectedObjectInventorySignal(text) {
|
||||||
|
return /(?:по\s+выбранному\s+объекту|for\s+selected\s+object)/iu.test(String(text ?? ""));
|
||||||
|
}
|
||||||
|
function hasInventorySupplierFollowupCue(text) {
|
||||||
|
return /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(String(text ?? ""));
|
||||||
|
}
|
||||||
|
function hasInventoryPurchaseDocumentsFollowupCue(text) {
|
||||||
|
return /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(String(text ?? ""));
|
||||||
|
}
|
||||||
function hasAddressFollowupContextSignal(text) {
|
function hasAddressFollowupContextSignal(text) {
|
||||||
const normalized = String(text ?? "").trim();
|
const normalized = String(text ?? "").trim();
|
||||||
if (!normalized) {
|
if (!normalized) {
|
||||||
|
|
@ -612,6 +630,32 @@ function deriveIntentWithFollowupContext(detectedIntent, userMessage, followupCo
|
||||||
reasons: [...detectedIntent.reasons, "intent_adjusted_to_balance_followup_context"]
|
reasons: [...detectedIntent.reasons, "intent_adjusted_to_balance_followup_context"]
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
const previousIsInventoryFamily = isInventoryIntent(previousIntent);
|
||||||
|
const inventorySelectedObjectFollowup = hasSelectedObjectInventorySignal(normalizedMessage) || (previousIsInventoryFamily && hasFollowupSignal);
|
||||||
|
if (inventorySelectedObjectFollowup && hasInventorySupplierFollowupCue(normalizedMessage)) {
|
||||||
|
if (detectedIntent.intent === "unknown" ||
|
||||||
|
detectedIntent.intent === "inventory_on_hand_as_of_date" ||
|
||||||
|
detectedIntent.intent === previousIntent) {
|
||||||
|
return {
|
||||||
|
intent: "inventory_purchase_provenance_for_item",
|
||||||
|
confidence: "low",
|
||||||
|
reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (inventorySelectedObjectFollowup && hasInventoryPurchaseDocumentsFollowupCue(normalizedMessage)) {
|
||||||
|
if (detectedIntent.intent === "unknown" ||
|
||||||
|
detectedIntent.intent === "list_documents_by_counterparty" ||
|
||||||
|
detectedIntent.intent === "list_documents_by_contract" ||
|
||||||
|
detectedIntent.intent === "inventory_on_hand_as_of_date" ||
|
||||||
|
detectedIntent.intent === previousIntent) {
|
||||||
|
return {
|
||||||
|
intent: "inventory_purchase_documents_for_item",
|
||||||
|
confidence: "low",
|
||||||
|
reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
|
||||||
|
};
|
||||||
|
}
|
||||||
|
}
|
||||||
if (hasPreviousContract) {
|
if (hasPreviousContract) {
|
||||||
if (detectedIntent.intent === "list_contracts_by_counterparty") {
|
if (detectedIntent.intent === "list_contracts_by_counterparty") {
|
||||||
if (hasBankSignal(normalizedMessage)) {
|
if (hasBankSignal(normalizedMessage)) {
|
||||||
|
|
|
||||||
|
|
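The follow-up hunk above overrides the detected intent only when three things hold: the turn looks like an inventory follow-up, the message carries a supplier cue, and the detected intent is one of a small set of overridable values. The sketch below isolates the supplier branch under simplifying assumptions: the patterns are abbreviated, `hasFollowupSignal` is passed in because the diff computes it elsewhere, and `adjustForInventoryFollowup` is a hypothetical name.

```typescript
interface DetectedIntent {
  intent: string;
  confidence: "high" | "medium" | "low";
  reasons: string[];
}

const INVENTORY_INTENTS: string[] = [
  "inventory_on_hand_as_of_date",
  "inventory_purchase_provenance_for_item",
  "inventory_purchase_documents_for_item",
  "inventory_supplier_stock_overlap_as_of_date",
  "inventory_sale_trace_for_item",
  "inventory_purchase_to_sale_chain",
  "inventory_aging_by_purchase_date"
];

function isInventoryIntent(intent: string | undefined): boolean {
  return intent !== undefined && INVENTORY_INTENTS.indexOf(intent) !== -1;
}

function adjustForInventoryFollowup(
  detected: DetectedIntent,
  message: string,
  previousIntent: string | undefined,
  hasFollowupSignal: boolean
): DetectedIntent {
  const selectedObject = /(?:по\s+выбранному\s+объекту|for\s+selected\s+object)/i.test(message);
  const inventoryFollowup =
    selectedObject || (isInventoryIntent(previousIntent) && hasFollowupSignal);
  const supplierCue =
    /(?:кто\s+(?:нам\s+)?поставил|от\s+какого\s+поставщика|supplier|vendor|поставщик)/i.test(message);
  // Only weak or same-as-previous detections are overridden.
  const overridable =
    detected.intent === "unknown" ||
    detected.intent === "inventory_on_hand_as_of_date" ||
    detected.intent === previousIntent;
  if (inventoryFollowup && supplierCue && overridable) {
    return {
      intent: "inventory_purchase_provenance_for_item",
      confidence: "low",
      reasons: detected.reasons.concat(["intent_adjusted_to_inventory_followup_context"])
    };
  }
  return detected;
}
```

The adjusted intent is reported with `low` confidence and an extra reason, so downstream audit can tell a context-derived intent from one detected directly in the message.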
@ -1603,6 +1603,28 @@ function hasInventorySaleTraceSignal(text: string): boolean {
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function hasSelectedObjectInventoryCue(text: string): boolean {
|
||||||
|
return /(?:по\s+выбранному\s+объекту|selected\s+object)/iu.test(text);
|
||||||
|
}
|
||||||
|
|
||||||
|
function hasSelectedObjectInventoryProvenanceSignal(text: string): boolean {
|
||||||
|
return (
|
||||||
|
hasSelectedObjectInventoryCue(text) &&
|
||||||
|
/(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(
|
||||||
|
text
|
||||||
|
)
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function hasSelectedObjectInventoryPurchaseDocumentsSignal(text: string): boolean {
|
||||||
|
return (
|
||||||
|
hasSelectedObjectInventoryCue(text) &&
|
||||||
|
/(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(
|
||||||
|
text
|
||||||
|
)
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
function hasInventoryProvenanceSignalV2(text: string): boolean {
|
function hasInventoryProvenanceSignalV2(text: string): boolean {
|
||||||
const hasItemCue = /(?:товар|номенклатур|sku|item|product|остат(?:ок|ки)|склад)/iu.test(text);
|
const hasItemCue = /(?:товар|номенклатур|sku|item|product|остат(?:ок|ки)|склад)/iu.test(text);
|
||||||
const hasSupplierCue =
|
const hasSupplierCue =
|
||||||
|
|
@ -1871,6 +1893,14 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (hasSelectedObjectInventoryProvenanceSignal(text)) {
|
||||||
|
return {
|
||||||
|
intent: "inventory_purchase_provenance_for_item",
|
||||||
|
confidence: "medium",
|
||||||
|
reasons: ["inventory_selected_object_provenance_signal_detected"]
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
if (hasInventoryProvenanceSignalV2(text)) {
|
if (hasInventoryProvenanceSignalV2(text)) {
|
||||||
return {
|
return {
|
||||||
intent: "inventory_purchase_provenance_for_item",
|
intent: "inventory_purchase_provenance_for_item",
|
||||||
|
|
@ -1887,6 +1917,14 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (hasSelectedObjectInventoryPurchaseDocumentsSignal(text)) {
|
||||||
|
return {
|
||||||
|
intent: "inventory_purchase_documents_for_item",
|
||||||
|
confidence: "medium",
|
||||||
|
reasons: ["inventory_selected_object_purchase_documents_signal_detected"]
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
if (hasInventoryPurchaseDocumentsSignalV2(text)) {
|
if (hasInventoryPurchaseDocumentsSignalV2(text)) {
|
||||||
return {
|
return {
|
||||||
intent: "inventory_purchase_documents_for_item",
|
intent: "inventory_purchase_documents_for_item",
|
||||||
|
|
|
||||||
|
|
@ -3498,6 +3498,22 @@ export class AddressQueryService {
|
||||||
);
|
);
|
||||||
const broadenedLimitations = [...filters.warnings, "period_window_auto_broadened_to_available_data"];
|
const broadenedLimitations = [...filters.warnings, "period_window_auto_broadened_to_available_data"];
|
||||||
const broadenedReasons = [...baseReasons, "period_window_auto_broadened_to_available_data"];
|
const broadenedReasons = [...baseReasons, "period_window_auto_broadened_to_available_data"];
|
||||||
|
const broadenedResultSemantics = mergeAddressResultSemantics(
|
||||||
|
deriveAddressResultSemantics({
|
||||||
|
intent: intent.intent,
|
||||||
|
selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
|
||||||
|
filters: filters.extracted_filters,
|
||||||
|
responseType: broadenedFactual.responseType,
|
||||||
|
rowsMatched: broadenedFilteredRows.length
|
||||||
|
}),
|
||||||
|
broadenedFactual.semantics
|
||||||
|
);
|
||||||
|
const broadenedRouteExpectationAudit = buildRouteExpectationAudit({
|
||||||
|
intent: routeExpectationIntent,
|
||||||
|
selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
|
||||||
|
requestedResultMode,
|
||||||
|
resultMode: broadenedResultSemantics.result_mode
|
||||||
|
});
|
||||||
return {
|
return {
|
||||||
handled: true,
|
handled: true,
|
||||||
reply_text: injectNoticeAfterLeadLine(broadenedFactual.text, broadenedPrefix),
|
reply_text: injectNoticeAfterLeadLine(broadenedFactual.text, broadenedPrefix),
|
||||||
|
|
@@ -3540,16 +3556,21 @@ export class AddressQueryService {
           runtime_readiness: "LIVE_QUERYABLE_WITH_LIMITS",
           limited_reason_category: null,
           response_type: broadenedFactual.responseType,
-          ...mergeAddressResultSemantics(
-            deriveAddressResultSemantics({
-              intent: intent.intent,
-              selectedRecipe: broadenedSelection.selected_recipe.recipe_id,
-              filters: filters.extracted_filters,
-              responseType: broadenedFactual.responseType,
-              rowsMatched: broadenedFilteredRows.length
-            }),
-            broadenedFactual.semantics
-          ),
+          capability_id: capabilityAudit.capabilityId,
+          capability_layer: capabilityAudit.layer,
+          capability_route_mode: capabilityAudit.routeMode,
+          capability_route_enabled: capabilityAudit.enabled,
+          capability_route_reason: capabilityAudit.reason,
+          shadow_route_intent: shadowRouteAudit.intent,
+          shadow_route_selected_recipe: shadowRouteAudit.selectedRecipe,
+          shadow_route_status: shadowRouteAudit.status,
+          route_expectation_status: broadenedRouteExpectationAudit.status,
+          route_expectation_reason: broadenedRouteExpectationAudit.reason,
+          route_expectation_expected_selected_recipes: broadenedRouteExpectationAudit.expectedSelectedRecipes,
+          route_expectation_expected_requested_result_modes:
+            broadenedRouteExpectationAudit.expectedRequestedResultModes,
+          route_expectation_expected_result_modes: broadenedRouteExpectationAudit.expectedResultModes,
+          ...broadenedResultSemantics,
           limitations: broadenedLimitations,
           reasons: withConfirmedBalanceFallbackReason(
             broadenedReasons,
@@ -306,6 +306,34 @@ function mapCounterpartyIntentToContractIntent(intent: AddressIntent): AddressIn
   return null;
 }
 
+function isInventoryIntent(intent: AddressIntent | undefined): boolean {
+  return (
+    intent === "inventory_on_hand_as_of_date" ||
+    intent === "inventory_purchase_provenance_for_item" ||
+    intent === "inventory_purchase_documents_for_item" ||
+    intent === "inventory_supplier_stock_overlap_as_of_date" ||
+    intent === "inventory_sale_trace_for_item" ||
+    intent === "inventory_purchase_to_sale_chain" ||
+    intent === "inventory_aging_by_purchase_date"
+  );
+}
+
+function hasSelectedObjectInventorySignal(text: string): boolean {
+  return /(?:по\s+выбранному\s+объекту|for\s+selected\s+object)/iu.test(String(text ?? ""));
+}
+
+function hasInventorySupplierFollowupCue(text: string): boolean {
+  return /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu.test(
+    String(text ?? "")
+  );
+}
+
+function hasInventoryPurchaseDocumentsFollowupCue(text: string): boolean {
+  return /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu.test(
+    String(text ?? "")
+  );
+}
+
 export function hasAddressFollowupContextSignal(text: string): boolean {
   const normalized = String(text ?? "").trim();
   if (!normalized) {
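The cue helpers in the hunk above can be exercised in isolation. The sketch below copies the three patterns verbatim and wraps them in `classifyInventoryFollowupCue`, a hypothetical helper that is not part of the commit; it checks the supplier cue before the purchase-documents cue, mirroring the order in which `deriveIntentWithFollowupContext` consults them.

```typescript
// Patterns copied from the hunk above; the constant names are illustrative only.
const SELECTED_OBJECT_SIGNAL = /(?:по\s+выбранному\s+объекту|for\s+selected\s+object)/iu;
const SUPPLIER_CUE =
  /(?:кто\s+(?:(?:это|этот\s+товар|эту\s+позицию)\s+)?(?:нам\s+)?поставил|кто\s+(?:нам\s+)?поставил\s+(?:это|этот\s+товар|эту\s+позицию)|от\s+какого\s+поставщика|у\s+какого\s+поставщика|от\s+кого\s+куплен|supplier|vendor|поставщик)/iu;
const PURCHASE_DOCUMENTS_CUE =
  /(?:по\s+каким\s+документам\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|по\s+каким\s+документам\s+(?:был\s+)?куплен|какими\s+документами\s+(?:это|его|этот\s+товар|эту\s+позицию)\s+купили|какими\s+документами\s+(?:был\s+)?куплен|purchase\s+documents|documents\s+of\s+purchase|through\s+which\s+documents)/iu;

type InventoryFollowupCue = "supplier" | "purchase_documents" | "none";

// Hypothetical helper (an assumption for illustration): reports which cue fires first.
function classifyInventoryFollowupCue(text: string): InventoryFollowupCue {
  const value = String(text ?? "");
  if (SUPPLIER_CUE.test(value)) {
    return "supplier";
  }
  if (PURCHASE_DOCUMENTS_CUE.test(value)) {
    return "purchase_documents";
  }
  return "none";
}

// The two phrasings come from the tests in this commit; the third is an unrelated question.
const sampleMessages: string[] = [
  'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам',
  'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили',
  "сколько стоит доставка"
];
for (const message of sampleMessages) {
  console.log(SELECTED_OBJECT_SIGNAL.test(message), classifyInventoryFollowupCue(message));
}
```

None of the patterns uses the `g` flag, so repeated `.test()` calls carry no `lastIndex` state between messages.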
@@ -752,6 +780,39 @@ function deriveIntentWithFollowupContext(
     };
   }
 
+  const previousIsInventoryFamily = isInventoryIntent(previousIntent);
+  const inventorySelectedObjectFollowup =
+    hasSelectedObjectInventorySignal(normalizedMessage) || (previousIsInventoryFamily && hasFollowupSignal);
+  if (inventorySelectedObjectFollowup && hasInventorySupplierFollowupCue(normalizedMessage)) {
+    if (
+      detectedIntent.intent === "unknown" ||
+      detectedIntent.intent === "inventory_on_hand_as_of_date" ||
+      detectedIntent.intent === previousIntent
+    ) {
+      return {
+        intent: "inventory_purchase_provenance_for_item",
+        confidence: "low",
+        reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
+      };
+    }
+  }
+
+  if (inventorySelectedObjectFollowup && hasInventoryPurchaseDocumentsFollowupCue(normalizedMessage)) {
+    if (
+      detectedIntent.intent === "unknown" ||
+      detectedIntent.intent === "list_documents_by_counterparty" ||
+      detectedIntent.intent === "list_documents_by_contract" ||
+      detectedIntent.intent === "inventory_on_hand_as_of_date" ||
+      detectedIntent.intent === previousIntent
+    ) {
+      return {
+        intent: "inventory_purchase_documents_for_item",
+        confidence: "low",
+        reasons: [...detectedIntent.reasons, "intent_adjusted_to_inventory_followup_context"]
+      };
+    }
+  }
+
   if (hasPreviousContract) {
     if (detectedIntent.intent === "list_contracts_by_counterparty") {
       if (hasBankSignal(normalizedMessage)) {
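The promotion rule added to `deriveIntentWithFollowupContext` can be read as a small pure function. The following is a minimal sketch under stated assumptions: `DetectedIntent`, `FollowupSignals`, and `promoteInventoryFollowup` are hypothetical names, the cue checks are reduced to precomputed booleans, and the real function continues with further rules (for example the `hasPreviousContract` branch) where this sketch simply returns the detected intent unchanged.

```typescript
// Hypothetical shapes for illustration; only the string literals come from the hunk above.
interface DetectedIntent {
  intent: string;
  confidence: string; // the hunk only ever assigns "low"; other values are assumed
  reasons: string[];
}

interface FollowupSignals {
  selectedObject: boolean; // result of hasSelectedObjectInventorySignal
  previousIsInventory: boolean; // result of isInventoryIntent(previousIntent)
  hasFollowupSignal: boolean;
  supplierCue: boolean; // result of hasInventorySupplierFollowupCue
  purchaseDocumentsCue: boolean; // result of hasInventoryPurchaseDocumentsFollowupCue
}

function isOneOf(value: string, candidates: Array<string | undefined>): boolean {
  return candidates.some((candidate) => candidate === value);
}

function promoteInventoryFollowup(
  detected: DetectedIntent,
  previousIntent: string | undefined,
  signals: FollowupSignals
): DetectedIntent {
  const inventoryFollowup =
    signals.selectedObject || (signals.previousIsInventory && signals.hasFollowupSignal);
  if (!inventoryFollowup) {
    return detected;
  }
  // Promotion keeps the earlier reasons and appends one marker, with low confidence.
  const adjusted = (intent: string): DetectedIntent => ({
    intent,
    confidence: "low",
    reasons: [...detected.reasons, "intent_adjusted_to_inventory_followup_context"]
  });
  // Only weak or stale detections are overridden; a confident unrelated intent survives.
  if (
    signals.supplierCue &&
    isOneOf(detected.intent, ["unknown", "inventory_on_hand_as_of_date", previousIntent])
  ) {
    return adjusted("inventory_purchase_provenance_for_item");
  }
  if (
    signals.purchaseDocumentsCue &&
    isOneOf(detected.intent, [
      "unknown",
      "list_documents_by_counterparty",
      "list_documents_by_contract",
      "inventory_on_hand_as_of_date",
      previousIntent
    ])
  ) {
    return adjusted("inventory_purchase_documents_for_item");
  }
  return detected;
}
```

The allow-list of overridable intents is what keeps the rule narrow: a purchase-documents cue may override a generic `list_documents_by_counterparty` detection, while a supplier cue may not.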
@@ -103,6 +103,8 @@ describe("inventory selected-object follow-up", () => {
     expect(result?.debug.extracted_filters?.as_of_date).toBe("2021-03-31");
     expect(result?.debug.extracted_filters?.period_from).toBe("2021-03-01");
     expect(result?.debug.extracted_filters?.period_to).toBe("2021-03-31");
+    expect(result?.debug.capability_id).toBe("inventory_inventory_purchase_provenance_for_item");
+    expect(result?.debug.capability_route_mode).toBe("exact");
     expect(result?.debug.reasons).toContain("period_window_auto_broadened_to_available_data");
     expect(result?.debug.limitations).toContain("period_window_auto_broadened_to_available_data");
     const replyLines = String(result?.reply_text ?? "").split("\n");
@@ -111,4 +113,97 @@ describe("inventory selected-object follow-up", () => {
     expect(replyLines[1]).toContain("По окну 2021-03-01..2021-03-31 строк не найдено");
     expect(executeAddressMcpQueryMock).toHaveBeenCalledTimes(2);
   });
+
+  it("handles selected-object supplier slang 'кто это поставил нам' as provenance follow-up", async () => {
+    executeAddressMcpQueryMock.mockResolvedValueOnce({
+      fetched_rows: 1,
+      matched_rows: 1,
+      raw_rows: [
+        {
+          Period: "2019-02-11T00:00:00Z",
+          Registrator: "Поступление товаров и услуг 00000000077 от 11.02.2019 0:00:00",
+          AccountDt: "41.01",
+          AccountKt: "60.01",
+          Amount: 3724.17,
+          SubcontoDt1: "Столешница 600*3050*26 дуб ниагара",
+          SubcontoDt3: "Основной склад",
+          SubcontoKt1: "Торговый дом \\Союз МСК\\",
+          SubcontoKt2: "Договор поставки № 12 от 01.02.2019",
+          Organization: "ООО \\Альтернатива Плюс\\"
+        }
+      ],
+      rows: [],
+      error: null
+    });
+
+    const service = new AddressQueryService();
+    const result = await service.tryHandle('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам', {
+      followupContext: {
+        previous_intent: "inventory_on_hand_as_of_date",
+        previous_filters: {
+          as_of_date: "2019-03-31",
+          period_from: "2019-03-01",
+          period_to: "2019-03-31",
+          warehouse: "Основной склад",
+          organization: "ООО \\Альтернатива Плюс\\"
+        },
+        previous_anchor_type: "unknown",
+        previous_anchor_value: null
+      }
+    });
+
+    expect(result?.handled).toBe(true);
+    expect(result?.response_type).toBe("FACTUAL_SUMMARY");
+    expect(result?.debug.detected_intent).toBe("inventory_purchase_provenance_for_item");
+    expect(result?.debug.extracted_filters?.item).toBe("Столешница 600*3050*26 дуб ниагара");
+    expect(result?.debug.extracted_filters?.as_of_date).toBe("2019-03-31");
+    expect(String(result?.reply_text ?? "")).toContain("Торговый дом \\Союз МСК\\");
+  });
+
+  it("handles selected-object purchase-doc slang 'по каким документам это купили' as exact purchase-doc follow-up", async () => {
+    executeAddressMcpQueryMock.mockResolvedValueOnce({
+      fetched_rows: 1,
+      matched_rows: 1,
+      raw_rows: [
+        {
+          Period: "2019-02-11T00:00:00Z",
+          Registrator: "Поступление товаров и услуг 00000000077 от 11.02.2019 0:00:00",
+          AccountDt: "41.01",
+          AccountKt: "60.01",
+          Amount: 3724.17,
+          SubcontoDt1: "Столешница 600*3050*26 дуб ниагара",
+          SubcontoDt3: "Основной склад",
+          SubcontoKt1: "Торговый дом \\Союз МСК\\",
+          SubcontoKt2: "Договор поставки № 12 от 01.02.2019",
+          Organization: "ООО \\Альтернатива Плюс\\"
+        }
+      ],
+      rows: [],
+      error: null
+    });
+
+    const service = new AddressQueryService();
+    const result = await service.tryHandle('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили', {
+      followupContext: {
+        previous_intent: "inventory_purchase_provenance_for_item",
+        previous_filters: {
+          as_of_date: "2019-03-31",
+          period_from: "2019-03-01",
+          period_to: "2019-03-31",
+          item: "Столешница 600*3050*26 дуб ниагара",
+          warehouse: "Основной склад"
+        },
+        previous_anchor_type: "unknown",
+        previous_anchor_value: null
+      }
+    });
+
+    expect(result?.handled).toBe(true);
+    expect(result?.response_type).toBe("FACTUAL_LIST");
+    expect(result?.debug.detected_intent).toBe("inventory_purchase_documents_for_item");
+    expect(result?.debug.selected_recipe).toBe("address_inventory_purchase_documents_for_item_v1");
+    expect(result?.debug.extracted_filters?.item).toBe("Столешница 600*3050*26 дуб ниагара");
+    expect(result?.debug.extracted_filters?.as_of_date).toBe("2019-03-31");
+    expect(String(result?.reply_text ?? "")).toContain("Поступление товаров и услуг 00000000077");
+  });
 });
@@ -173,6 +173,14 @@ describe("address query shape classifier", () => {
     expect(filters.item).toBe("Кромка с клеем 33 альмандин 137 м");
   });
 
+  it("extracts item anchor from selected-object purchase-doc follow-up without explicit word товар", () => {
+    const filters = extractAddressFilters(
+      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили',
+      "inventory_purchase_documents_for_item"
+    ).extracted_filters;
+    expect(filters.item).toBe("Столешница 600*3050*26 дуб ниагара");
+  });
+
   it("keeps colloquial selected-object supplier follow-up in inventory provenance intent", () => {
     const mode = detectAddressQuestionMode(
       'По выбранному объекту "Кромка с клеем 33 альмандин 137 м": кто поставил этот товар'
@@ -184,6 +192,28 @@ describe("address query shape classifier", () => {
     expect(result.intent).toBe("inventory_purchase_provenance_for_item");
   });
 
+  it("keeps selected-object supplier slang with 'кто это поставил нам' in inventory provenance intent", () => {
+    const mode = detectAddressQuestionMode(
+      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам'
+    );
+    const result = resolveAddressIntent(
+      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам'
+    );
+    expect(mode.mode).toBe("address_query");
+    expect(result.intent).toBe("inventory_purchase_provenance_for_item");
+  });
+
+  it("keeps selected-object purchase-doc slang with 'по каким документам это купили' in purchase-doc intent", () => {
+    const mode = detectAddressQuestionMode(
+      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили'
+    );
+    const result = resolveAddressIntent(
+      'По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили'
+    );
+    expect(mode.mode).toBe("address_query");
+    expect(result.intent).toBe("inventory_purchase_documents_for_item");
+  });
+
   it("keeps full supplier anchor with comma suffix for stock-overlap questions", () => {
     const filters = extractAddressFilters(
       "Какие товары от поставщика Гамма-мебель, ООО сейчас еще лежат на складе Основной склад?",
@@ -3874,6 +3904,49 @@ describe("address query limited taxonomy and stage diagnostics", { timeout: 1500
 });
 
 describe("address decompose stage follow-up carryover", () => {
+  it("promotes selected-object supplier slang follow-up into inventory provenance with inherited date context", () => {
+    const result = runAddressDecomposeStage('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": кто это поставил нам', {
+      previous_intent: "inventory_on_hand_as_of_date",
+      previous_filters: {
+        as_of_date: "2019-03-31",
+        period_from: "2019-03-01",
+        period_to: "2019-03-31",
+        warehouse: "Основной склад"
+      },
+      previous_anchor_type: "unknown",
+      previous_anchor_value: null
+    });
+    expect(result).not.toBeNull();
+    expect(result?.intent.intent).toBe("inventory_purchase_provenance_for_item");
+    expect(result?.filters.extracted_filters.as_of_date).toBe("2019-03-31");
+    expect(
+      result?.baseReasons?.includes("intent_adjusted_to_inventory_followup_context") ||
+        result?.intent.reasons.includes("inventory_selected_object_provenance_signal_detected")
+    ).toBe(true);
+  });
+
+  it("promotes selected-object purchase-doc slang follow-up into inventory purchase documents with inherited date context", () => {
+    const result = runAddressDecomposeStage('По выбранному объекту "Столешница 600*3050*26 дуб ниагара": по каким документам это купили', {
+      previous_intent: "inventory_purchase_provenance_for_item",
+      previous_filters: {
+        as_of_date: "2019-03-31",
+        period_from: "2019-03-01",
+        period_to: "2019-03-31",
+        item: "Столешница 600*3050*26 дуб ниагара"
+      },
+      previous_anchor_type: "unknown",
+      previous_anchor_value: null
+    });
+    expect(result).not.toBeNull();
+    expect(result?.intent.intent).toBe("inventory_purchase_documents_for_item");
+    expect(result?.filters.extracted_filters.item).toBe("Столешница 600*3050*26 дуб ниагара");
+    expect(result?.filters.extracted_filters.as_of_date).toBe("2019-03-31");
+    expect(
+      result?.baseReasons?.includes("intent_adjusted_to_inventory_followup_context") ||
+        result?.intent.reasons.includes("inventory_selected_object_purchase_documents_signal_detected")
+    ).toBe(true);
+  });
+
   it("keeps slang all-customers-all-time wording in address lane via resolved intent fallback", () => {
     const result = runAddressDecomposeStage("выведи всех заков за все время", null);
     expect(result).not.toBeNull();
@@ -2120,6 +2120,7 @@ def build_analyst_loop_prompt(
 - `.codex/agents/domain_analyst.toml`
 - `.codex/skills/domain-case-loop/SKILL.md`
 - `.codex/skills/domain-case-loop/references/verdict_template.md`
+- `.codex/skills/domain-case-loop/references/business_first_analyst_rubric.md`
 
 Current loop context:
 - loop_dir: `{loop_dir}`
@@ -2135,11 +2136,13 @@ def build_analyst_loop_prompt(
 
 Goal:
 - evaluate current domain-pack correctness for business meaning, route/capability quality, evidence quality, and absence of silent heuristic masking;
+- evaluate business usefulness, direct-answer-first behavior, state continuity, and field truthfulness, not only technical groundedness;
 - determine whether the gate `quality_score >= {target_score}` is reached;
 - if not, provide the smallest high-value fix targets for the coder.
 
 Rules:
 - `accepted` is allowed only if quality_score >= {target_score}, unresolved_p0_count = 0, and regression_detected = false;
+- `accepted` also requires `direct_answer_ok = true` and `business_usefulness_ok = true`;
 - `partial` means the pack is usable but exactness, routing, or coverage is still insufficient;
 - `needs_exact_capability` means the primary blocker is a missing exact route or capability, but the loop should still continue autonomously unless a user decision is required;
 - `continue` means there is a clear next patch cycle;
@@ -2152,6 +2155,10 @@ def build_analyst_loop_prompt(
 - if `requires_user_decision = true`, fill `user_decision_type` and `user_decision_prompt`;
 - if the pack is below {target_score} but there is still safe autonomous implementation work, keep `requires_user_decision = false`;
 - do not request user input merely because the score is still below {target_score}; request it only when the loop would otherwise guess, overfit, or risk architecture drift.
+- return machine-readable fields for: `user_intent_summary`, `expected_direct_answer`, `actual_direct_answer`, `direct_answer_ok`, `business_usefulness_ok`, `business_utility_score`, `direct_answer_priority_score`, `state_continuity_score`, `answer_shape_score`, `evidence_clarity_score`, `root_cause_layers`, `broken_edge_ids`, `violated_invariants`;
+- if the product found the evidence but failed to retain the selected object, provenance bundle, or another reusable resolved object across turns, classify that as `object_memory_gap` or `edge_carryover_gap`, not as a generic route problem;
+- if the surfaced business field looks mislabeled, for example supplier vs organization, classify that as `field_mapping_gap`;
+- if the answer is technically grounded but still weak for a manager/accountant/operator, classify that as `business_utility_gap`.
 
 Use this UTF-8 evidence bundle as the source of truth for artifact contents. Do not treat shell rendering artifacts as file corruption if the embedded bundle is readable.
 
@@ -2196,6 +2203,9 @@ def build_coder_loop_prompt(
 - do not present heuristic answers as confirmed;
 - do not touch unrelated files;
 - preserve already successful baseline flows.
+- use `root_cause_layers`, `broken_edge_ids`, `violated_invariants`, and business-utility scores from the analyst verdict to choose the smallest fix;
+- prioritize state continuity, selected-object persistence, direct-answer-first behavior, and field-truth mapping when those are the blocking layers;
+- do not broaden scope when the analyst says the defect is mainly `object_memory_gap`, `field_mapping_gap`, `answer_shape_mismatch`, or `business_utility_gap`.
 
 Required outputs:
 - create `{iteration_dir / 'coder_plan.md'}` with a short plan;
@@ -2217,12 +2227,21 @@ def evaluate_analyst_gate(
     quality_score = int(verdict.get("quality_score") or 0)
     unresolved_p0_count = int(verdict.get("unresolved_p0_count") or 0)
     regression_detected = bool(verdict.get("regression_detected"))
+    direct_answer_ok = bool(verdict.get("direct_answer_ok", True))
+    business_usefulness_ok = bool(verdict.get("business_usefulness_ok", True))
     loop_decision = str(verdict.get("loop_decision") or "").strip() or "continue"
     requires_user_decision = bool(verdict.get("requires_user_decision"))
     user_decision_type = str(verdict.get("user_decision_type") or "").strip() or "none"
     user_decision_prompt_raw = verdict.get("user_decision_prompt")
     user_decision_prompt = str(user_decision_prompt_raw).strip() if user_decision_prompt_raw else None
-    accepted = quality_score >= target_score and unresolved_p0_count == 0 and not regression_detected and loop_decision == "accepted"
+    accepted = (
+        quality_score >= target_score
+        and unresolved_p0_count == 0
+        and not regression_detected
+        and direct_answer_ok
+        and business_usefulness_ok
+        and loop_decision == "accepted"
+    )
     return accepted, loop_decision, requires_user_decision, user_decision_type, user_decision_prompt
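The acceptance gate in `evaluate_analyst_gate` is Python; the sketch below is an approximate TypeScript port, written only to make its defaulting behavior easy to try out. `Verdict`, `flagWithDefault`, and `isAccepted` are hypothetical names, and the numeric coercion is looser than Python's `int()`, which raises on non-numeric strings. The point to notice is that `direct_answer_ok` and `business_usefulness_ok` default to true when the key is absent, so a verdict produced before these fields existed can still pass, while an explicit `false` or `null` blocks acceptance.

```typescript
// Hypothetical verdict shape; the field names are the ones read by the Python gate.
type Verdict = Record<string, unknown>;

// Mirrors bool(verdict.get(key, fallback)): a missing key uses the fallback,
// a present key is coerced to a boolean (null and false both become false).
function flagWithDefault(verdict: Verdict, key: string, fallback: boolean): boolean {
  return key in verdict ? Boolean(verdict[key]) : fallback;
}

// Approximates int(value or 0); non-numeric input becomes 0 here instead of raising.
function toInt(value: unknown): number {
  return Math.trunc(Number(value || 0)) || 0;
}

function isAccepted(verdict: Verdict, targetScore: number): boolean {
  const qualityScore = toInt(verdict["quality_score"]);
  const unresolvedP0Count = toInt(verdict["unresolved_p0_count"]);
  const regressionDetected = Boolean(verdict["regression_detected"]);
  const directAnswerOk = flagWithDefault(verdict, "direct_answer_ok", true);
  const businessUsefulnessOk = flagWithDefault(verdict, "business_usefulness_ok", true);
  const loopDecision = String(verdict["loop_decision"] || "").trim() || "continue";
  return (
    qualityScore >= targetScore &&
    unresolvedP0Count === 0 &&
    !regressionDetected &&
    directAnswerOk &&
    businessUsefulnessOk &&
    loopDecision === "accepted"
  );
}

// A verdict without the two new flags still passes; an explicit false does not.
const legacyVerdict: Verdict = {
  quality_score: 90,
  unresolved_p0_count: 0,
  regression_detected: false,
  loop_decision: "accepted"
};
console.log(isAccepted(legacyVerdict, 85));
console.log(isAccepted({ ...legacyVerdict, direct_answer_ok: false }, 85));
```

If the intent is to require the analyst to always emit both flags, the fallback would need to be `false` instead; the commit as shown keeps the permissive default.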