# Verdict

## 1. Question meaning
...

## 2. Primary user path and scenario tree
- root:
- critical child nodes:
- critical edges:
- primary user path:

## 3. Expected direct answer
- what the first line should say:
- minimum acceptable business answer:

## 4. What the system actually computed
...

## 5. Business mismatch
- did the answer solve the user's real question:
- did the direct answer appear first:
- is the answer usable for an operator/accountant/manager:

## 6. Route / capability mismatch
...

## 7. State continuity and selected-object memory
- selected object continuity:
- focus object continuity:
- date/period continuity:
- reusable answer-object continuity:
- provenance or sale bundle reuse:
- pronoun resolution continuity:
- follow-up action resolution continuity:

## 8. Field truth and evidence quality
- supplier vs organization:
- buyer vs organization:
- exact / partial / heuristic / technical insufficiency:
- why:

## 9. P0 defects
- ...

## 10. P1 defects
- ...

## 11. P2 defects
- ...

## 12. Minimal patch directions
- ...

## 13. Acceptance matrix for rerun
- Node / edge coverage:
- Canonical wording:
- Colloquial wording:
- UI-generated selected-object wording:
- Pronoun-only follow-up wording:
- Carryover invariants:
- Expected answer shape:
- Expected direct answer:
- Business usefulness:
- Recommended state objects:
- Defect class:

## 14. Acceptance criteria for rerun
- ...
- Include colloquial/slang variants and UI-generated selected-object follow-up variants when they are part of the business flow.
- Require the primary user path to pass end-to-end, not only the root node.
- Require direct-answer-first behavior on direct lookup questions.
- Require business-useful output rather than technically grounded but noisy output.
- Require selected-object continuity and reusable answer-object continuity on follow-up chains.
- Require focus-object continuity, bundle reuse, and correct action resolution for short follow-ups like `по ней` ("for it") / `по этой позиции` ("for this line item") when they are part of the business flow.

## 15. Quality score
- integer from 0 to 100

## 16. Loop decision
- accepted / continue / partial / blocked / needs_exact_capability