Verdict
1. Question meaning
...
2. Primary user path and scenario tree
- root:
- critical child nodes:
- critical edges:
- primary user path:
3. Expected direct answer
- what the first line should say:
- minimum acceptable business answer:
4. What the system actually computed
...
5. Business mismatch
- did the answer solve the user's real question:
- did the direct answer appear first:
- is the answer usable for an operator/accountant/manager:
6. Route / capability mismatch
...
7. State continuity and selected-object memory
- selected object continuity:
- focus object continuity:
- date/period continuity:
- reusable answer-object continuity:
- provenance or sale bundle reuse:
- pronoun resolution continuity:
- follow-up action resolution continuity:
8. Field truth and evidence quality
- supplier vs organization:
- buyer vs organization:
- exact / partial / heuristic / technical insufficiency:
- why:
9. P0 defects
- ...
10. P1 defects
- ...
11. P2 defects
- ...
12. Minimal patch directions
- ...
13. Acceptance matrix for rerun
- Node / edge coverage:
- Canonical wording:
- Colloquial wording:
- UI-generated selected-object wording:
- Pronoun-only follow-up wording:
- Carryover invariants:
- Expected answer shape:
- Expected direct answer:
- Business usefulness:
- Recommended state objects:
- Defect class:
14. Acceptance criteria for rerun
- ...
- Include colloquial/slang variants and UI-generated selected-object follow-up variants when they are part of the business flow.
- Require the primary user path to pass end-to-end, not only the root node.
- Require direct-answer-first behavior on direct lookup questions.
- Require business-useful output rather than technically-grounded-but-noisy output.
- Require selected-object continuity and reusable answer-object continuity on follow-up chains.
- Require focus-object continuity, bundle reuse, and correct action resolution for short follow-ups like "по ней" / "по этой позиции" ("for it" / "for this item") when they are part of the business flow.
15. Quality score
- integer from 0 to 100
16. Loop decision
- accepted / continue / partial / blocked / needs_exact_capability
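
Items 15 and 16 are the only fields in this template with a closed set of valid values, so they can be checked mechanically before a verdict is recorded. The following is a minimal sketch of such a check, not part of the template itself: the names `VerdictTail` and `LOOP_DECISIONS` are hypothetical, and only the 0 to 100 range and the five decision labels come from the text above.

```python
from dataclasses import dataclass

# Allowed loop decisions, copied from item 16 of the template.
LOOP_DECISIONS = (
    "accepted",
    "continue",
    "partial",
    "blocked",
    "needs_exact_capability",
)


@dataclass(frozen=True)
class VerdictTail:
    """Hypothetical container for items 15 and 16 of a filled-in verdict."""

    quality_score: int
    loop_decision: str

    def __post_init__(self):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.quality_score, bool) or not isinstance(self.quality_score, int):
            raise ValueError("quality score must be an integer")
        if not 0 <= self.quality_score <= 100:
            raise ValueError("quality score must be between 0 and 100")
        if self.loop_decision not in LOOP_DECISIONS:
            raise ValueError(f"unknown loop decision: {self.loop_decision!r}")
```

Constructing `VerdictTail(85, "continue")` succeeds, while a score outside 0 to 100, a non-integer score, or a decision label outside the list raises `ValueError`.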