diff --git a/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md b/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md
index b0f4aef..726abf5 100644
--- a/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md
+++ b/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md
@@ -29,7 +29,7 @@ Current reporting baseline:
 - Open-World Business Overview implementation breadth: `~99%` through Slice 25.
 - Open-World Semantic Control Gate: accepted critical subset after EHMO/W5/W7 hardening; fat GUI pack review remains a broad human-pressure gate.
 - Route-Candidate-Driven Enablement Loop: `100%`, now regression-gated by phase91-phase98 canaries.
-- Open-World Schema/Primitive Discovery: `38%`, financial-counterparty slice accepted live at `4/4` and limit-honesty/business-language gate accepted live at `6/6`; next schema/primitive candidate should come from real live/manual replay evidence.
+- Open-World Schema/Primitive Discovery: `95%`, phases97-105 accepted live and saved as user-runnable AGENT autoruns; latest closure replay `phase105_mixed_schema_primitive_closure_live3` accepted `13/13`. Remaining gate is manual GUI/fat-pack review before final closure or a narrow phase106 repair if that review reveals a new semantic defect.
 
 ## Archived Execution Snapshot (2026-04-17)
 
diff --git a/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md b/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md
index 6913f08..8a9750b 100644
--- a/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md
+++ b/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md
@@ -6,7 +6,7 @@ This note is the short source of truth for current module wording after the Post
 
 It exists to prevent stale percentage drift in planning discussions.
 
-If another document says `78%`, `87%`, `92%`, or `85%` for a module that is now closed, read that value as a historical snapshot unless this note explicitly repeats it as current.
+If another document says `78%`, `84%`, `87%`, `92%`, or `85%` for a module that is now closed or has since advanced, read that value as a historical snapshot unless this note explicitly repeats it as current.
 
 ## 2026-05-05 Overlay - EHMO Manual QA Gate
 
@@ -87,11 +87,18 @@ Fresh validation cut:
 - Completed autonomy slice inside that loop: `Inventory Reserve/Liquidation Quality Reviewed Route`: `inventory_reserve_liquidation_quality` now promotes to reviewed inventory quality-event evidence from posted write-off, receipt-adjustment, stocktaking, and revaluation documents; phase96 live replay is accepted.
 - Completed broader schema/primitive discovery slice: `Financial Counterparty Flow Hints`: bank-document money-flow recipes expose operation/purpose/comment fields, ranked value-flow buckets carry `financial_flow_hint`, explicit `СБЕРБАНК` wording is not swallowed by supplier/customer tails, and bank-like leaders are bounded away from ordinary supplier/customer overclaim; phase97 live replay is accepted.
 - Completed broader schema/primitive discovery support slice: `Limit Honesty And Business Language Gate`: compact business-overview replies sanitize route/proxy/MCP-style wording, keep row-limit disclosure relevant to the asked contour, and preserve debt/VAT/bank/inventory/supplier canaries; phase98 live replay is accepted.
-- Current live canary: `phase98_limit_honesty_business_language_live3` accepted `6/6`.
-- Current accepted autorun: `AGENT | Phase 98 limit honesty and business-language replay` (`gen-ag05122315-f1e27c`).
+- Completed broader schema/primitive discovery support slice: `Large-Query Budget And Continuation Policy`: explicit-year `business_overview` now receives the chunked monthly recovery budget already used by value-flow routes, yearly money-flow coverage can recover from broad-row caps without fake limit refusal, and profit follow-ups answer direct-first that cash-flow net is not clean profit while still surfacing checked accounting close evidence when present; phase99 live replay is accepted.
+- Completed broader schema/primitive discovery support slice: `Large-Query Continuation UX`: all-time broad `business_overview` row-cap disclosure now becomes a safe year/quarter continuation path, while narrowed explicit-year follow-ups keep company scope instead of falling into placeholder counterparty wording; phase100 live replay is accepted.
+- Completed broader schema/primitive discovery support slice: `Inventory Root Scope Without Warehouse Clarification`: broad stock-on-hand root wording now has replay proof that the assistant asks only for company when organization scope is ambiguous, resumes the all-warehouse company snapshot after the company choice, and does not invent warehouse/item/category/material requirements for root inventory questions; phase101 live replay is accepted.
+- Completed broader schema/primitive discovery support slice: `Debt Mirror Clean-Scope Polarity`: fresh bare organization-name turns can bind scope from a live data-scope probe, confirmed payables/receivables keep the selected organization, short mirror follow-ups override stale/open-items LLM expansion, and mirrored 76.09 financial-security rows are disclosed as offset evidence rather than counted as clean debt in both directions; phase102 live replay is accepted.
+- Completed broader schema/primitive discovery support slice: `Financial Role/Purpose Arbitration`: grounded exact `bank_operations_*` answers now win over generic value-flow discovery when bank-like counterparties need role/purpose classification; compact bank answers summarize incoming/outgoing rows and do not overclaim ordinary customer revenue or supplier dependency without operation/purpose/contract evidence; phase103 live replay is accepted.
+- Completed broader schema/primitive discovery support slice: `Generic Role-Tail Anchor Hygiene`: broad role wording such as `не обычный клиент или поставщик` no longer leaks `или поставщик` into counterparty filters, selected objects, or discovery predecompose input, while explicit supplier-payment wording still keeps real counterparties and routes to `supplier_payouts_profile`; phase104 live replay is accepted.
+- Completed broader schema/primitive discovery closure slice: `Mixed Schema/Primitive Closure Replay`: phase105 validates the combined current module surface across inventory root scope, historical inventory carryover, role-tail hygiene, bank role/purpose, supplier payout, bidirectional SVK value-flow, clean debt polarity, VAT tax-period continuity, and cash-flow/profit boundary; phase105 live replay is accepted.
+- Current live canary: `phase105_mixed_schema_primitive_closure_live3` accepted `13/13`.
+- Current accepted autorun: `AGENT | Phase 105 mixed schema/primitive closure replay` (`gen-ag05131312-2d0445`).
 - Implementation breadth: `~99% (Open-World Bounded Autonomy Breadth through Slice 25)`.
-- Active broader autonomy module: `Open-World Schema/Primitive Discovery`, with `Financial Counterparty Flow Hints` and `Limit Honesty And Business Language Gate` accepted and saved.
-- Next active slice: select the next unfamiliar 1C ask from live/manual replay evidence, then continue broader schema/primitive discovery while using phase91-phase98 as regression canaries.
+- Active broader autonomy module: `Open-World Schema/Primitive Discovery`, with phases97-105 accepted and saved; the module is now at manual-review readiness rather than another blind coding slice.
+- Next active slice: run/review the phase105 GUI autorun or the user's fat manual pack; if it stays clean, close this module, otherwise convert the next observed failure into a narrow phase106 repair/replay.
 - Operating-layer progress: `~99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)`.
 
 ## Reporting Rule
@@ -104,7 +111,7 @@ Use these labels when reporting progress:
 - `Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)` when discussing the current development-loop operating layer.
 - `Прогресс модуля: 100% (Open-World Route Candidate Promotion, declared phase90 slice accepted)` when discussing the route-candidate handoff slice itself.
 - `Прогресс модуля: 100% (Route-Candidate-Driven Enablement Loop, final reviewed proof-family route accepted; use as regression gate)` when discussing the current candidate-driven enablement loop.
-- `Прогресс модуля: 38% (Open-World Schema/Primitive Discovery, phase97 financial-counterparty slice and phase98 business-language gate accepted; next schema/primitive candidate pending)` when discussing the current broader schema/primitive discovery module.
+- `Прогресс модуля: 95% (Open-World Schema/Primitive Discovery, phases97-105 accepted; phase105 GUI/manual checkpoint pending before final closure)` when discussing the current broader schema/primitive discovery module.
 - `Open-World Business Overview implementation breadth: ~99%, Semantic Control Gate critical subset accepted, fat GUI pack still pending` when discussing only the already wired Slice 25 breadth.
 - `Прогресс модуля: X% (Open-World Bounded Autonomy Breadth, active slice: )` for later breadth work after the Semantic Control Gate is accepted.
 
@@ -170,12 +177,18 @@ For current planning, read:
 9. `27 - proof_family_enablement_candidates_2026-05-10.md`
 10. `26 - route_candidate_driven_enablement_loop_2026-05-10.md`
 11. `25 - open_world_route_candidate_promotion_2026-05-10.md`
-12. `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md`
-13. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
-14. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
-15. `20 - planner_autonomy_consolidation_2026-05-01.md`
-16. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md`
-17. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
-18. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`
+12. `34 - large_query_budget_continuation_2026-05-13.md`
+13. `35 - large_query_continuation_ux_2026-05-13.md`
+14. `36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md`
+15. `37 - debt_mirror_clean_scope_polarity_2026-05-13.md`
+16. `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md`
+17. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
+18. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
+19. `20 - planner_autonomy_consolidation_2026-05-01.md`
+20. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md`
+21. `40 - mixed_schema_primitive_closure_replay_2026-05-13.md`
+22. `39 - generic_role_tail_anchor_hygiene_2026-05-13.md`
+23. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
+24. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`
 
 Documents `01` through `15` remain valuable, but mostly as the historical architecture trail.
diff --git a/docs/ARCH/11 - architecture_turnaround/34 - large_query_budget_continuation_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/34 - large_query_budget_continuation_2026-05-13.md
new file mode 100644
index 0000000..f383711
--- /dev/null
+++ b/docs/ARCH/11 - architecture_turnaround/34 - large_query_budget_continuation_2026-05-13.md
@@ -0,0 +1,75 @@
+# 34 - Large Query Budget And Continuation Policy (2026-05-13)
+
+## Purpose
+
+This slice hardens explicit-year large business questions inside the broader `Open-World Schema/Primitive Discovery` module.
+
+The triggering user concern was not only answer wording. It was the runtime behavior where a broad yearly business question could hit the MCP row cap and then answer as if coverage was materially limited, even though the value-flow runtime already had a safe monthly recovery mechanism.
+
+## Runtime Change
+
+The planner now grants the existing chunked coverage budget to explicit-year `business_overview` routes, not only to direct value-flow routes.
+
+Key boundary:
+
+- this is not a global unlimited query mode;
+- this does not remove evidence gates;
+- this does not turn all-time broad analysis into a full accounting audit;
+- it enables bounded monthly recovery for explicit-year business overview money-flow evidence when the broad money probe reaches the row cap.
+
+The pilot executor now also records successful business-overview monthly recovery in reason codes:
+
+- `pilot_business_overview_incoming_monthly_period_chunking_recovered_coverage`
+- `pilot_business_overview_outgoing_monthly_period_chunking_recovered_coverage`
+
+The answer layer was tightened so profit follow-ups answer direct-first:
+
+- money-flow net is not clean profit;
+- if accounting close evidence exists, the answer may add the checked 90/91/99 result;
+- user-facing text avoids route jargon such as "бухгалтерский маршрут" and account-specific margin phrasing like `90.01`.
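Reviewer sketch (not part of this diff): the bounded monthly recovery described under "Runtime Change" can be pictured roughly as below. Only the two reason-code strings come from the document; `Probe`, `ProbeResult`, `Coverage`, `monthWindows`, and `recoverYearCoverage` are hypothetical names, and the real planner/executor interfaces may look quite different.

```typescript
// Hypothetical shapes; the actual MCP probe contract is not shown in this diff.
interface ProbeResult { rows: number; capped: boolean; }
type Probe = (from: string, to: string) => ProbeResult;
interface Coverage { totalRows: number; fullyCovered: boolean; reasonCodes: string[]; }

function two(n: number): string {
  return (n < 10 ? "0" : "") + n;
}

// Build the twelve calendar-month windows of one explicit year.
function monthWindows(year: number): string[][] {
  const windows: string[][] = [];
  for (let m = 1; m <= 12; m++) {
    // Day 0 of the next month is the last day of month m.
    const lastDay = new Date(Date.UTC(year, m, 0)).getUTCDate();
    windows.push([year + "-" + two(m) + "-01", year + "-" + two(m) + "-" + two(lastDay)]);
  }
  return windows;
}

// Try the broad yearly probe first; fall back to monthly chunks only if it hits the row cap.
function recoverYearCoverage(year: number, direction: "incoming" | "outgoing", probe: Probe): Coverage {
  const broad = probe(year + "-01-01", year + "-12-31");
  if (!broad.capped) {
    return { totalRows: broad.rows, fullyCovered: true, reasonCodes: [] };
  }
  let total = 0;
  let everyMonthComplete = true;
  monthWindows(year).forEach(function (w) {
    const r = probe(w[0], w[1]);
    total += r.rows;
    if (r.capped) { everyMonthComplete = false; }
  });
  // The reason code is recorded only when the chunks really recovered full coverage.
  const reasonCodes = everyMonthComplete
    ? ["pilot_business_overview_" + direction + "_monthly_period_chunking_recovered_coverage"]
    : [];
  return { totalRows: total, fullyCovered: everyMonthComplete, reasonCodes: reasonCodes };
}
```

The sketch stays bounded on purpose: it issues at most thirteen probes per direction and never widens beyond the explicit year, which matches the "not a global unlimited query mode" boundary above.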
+
+## Validation
+
+Local validation:
+
+- `npm.cmd test -- assistantMcpDiscoveryPlanner.test.ts assistantMcpDiscoveryPilotExecutor.test.ts assistantMcpDiscoveryAnswerAdapter.test.ts assistantMcpDiscoveryResponseCandidate.test.ts`
+- result: `141 passed, 1 skipped`
+- `npm.cmd run build`
+- result: passed
+
+Live semantic replay:
+
+- spec: `docs/orchestration/address_truth_harness_phase99_large_query_budget_continuation.json`
+- accepted run: `artifacts/domain_runs/phase99_large_query_budget_continuation_live2`
+- result: `4/4`, `final_status=accepted`
+
+Saved user-runnable AGENT autorun:
+
+- title: `AGENT | Phase 99 large-query budget and continuation policy replay`
+- generation id: `gen-ag05131009-f08174`
+- validation status: `accepted_live_replay`
+
+## Acceptance Meaning
+
+Accepted for this slice:
+
+- explicit-year company business overview can recover yearly money-flow coverage through monthly probes when the broad money query hits the row cap;
+- yearly overview answers remain direct-first and preserve bank/counterparty role boundaries;
+- profit follow-up after money overview does not equate cash-flow net to clean profit;
+- if accounting-result evidence exists, it is framed as checked 1C period-close evidence, not external audit or legal reporting.
+
+Still not accepted as universal:
+
+- arbitrary all-time large 1C scans;
+- automatic long-running user confirmation UX;
+- complete accounting audit, statutory P&L, or legal reporting;
+- unlimited primitive exploration outside reviewed route candidates.
+
+## Current Status
+
+`Open-World Schema/Primitive Discovery` moves to `52%` after this slice.
+
+Why not higher:
+
+- phase97, phase98, and phase99 prove important runtime breadth/support behavior;
+- but broader dynamic schema/primitive discovery still needs more unfamiliar 1C asks, more primitive descriptors where live evidence demands them, and continuation UX for genuinely large all-time questions.
diff --git a/docs/ARCH/11 - architecture_turnaround/35 - large_query_continuation_ux_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/35 - large_query_continuation_ux_2026-05-13.md
new file mode 100644
index 0000000..68a4b8c
--- /dev/null
+++ b/docs/ARCH/11 - architecture_turnaround/35 - large_query_continuation_ux_2026-05-13.md
@@ -0,0 +1,65 @@
+# 35 - Large Query Continuation UX (2026-05-13)
+
+## Purpose
+
+This slice hardens the user-facing behavior after genuinely broad all-time business questions.
+
+The previous phase99 slice made explicit-year `business_overview` safe by giving it the existing monthly recovery budget. Phase100 covers the adjacent UX seam: when an all-time or very wide question still reaches row caps, the assistant must not pretend that the checked slice is a complete accounting answer, and it must not leave the user at a dead end.
+
+## Runtime Change
+
+The compact business-overview answer now turns a row-limit disclosure into an actionable continuation policy:
+
+- state that the current result is an expanded checked slice, not a guaranteed full accounting turnover;
+- tell the user to choose a concrete year or quarter for a safe follow-up;
+- avoid promising unlimited scans or unreviewed full exports;
+- keep the answer business-readable and free of MCP/route/probe jargon.
+
+The bidirectional value-flow compact answer also now respects organization scope when no counterparty is selected. If the turn is scoped to `ООО Альтернатива Плюс` and the derived value-flow has no explicit counterparty, the answer says `по компании ООО Альтернатива Плюс`, not `по контрагенту запрошенному контрагенту`.
+
+## Validation
+
+Local validation:
+
+- `npm.cmd test -- assistantMcpDiscoveryResponseCandidate.test.ts`
+- result: `28 passed`
+- `npm.cmd run build`
+- result: passed
+
+Live semantic replay:
+
+- spec: `docs/orchestration/address_truth_harness_phase100_large_query_continuation_ux.json`
+- first run: `artifacts/domain_runs/phase100_large_query_continuation_ux_live1`, partial because the initial spec over-forbade a correct negative profit-boundary phrase and exposed the company-vs-counterparty wording defect in step 2;
+- accepted run: `artifacts/domain_runs/phase100_large_query_continuation_ux_live2`
+- result: `3/3`, `final_status=accepted`
+
+Saved user-runnable AGENT autorun:
+
+- title: `AGENT | Phase 100 large-query continuation UX replay`
+- generation id: `gen-ag05131028-234e5e`
+- validation status: `accepted_live_replay`
+
+## Acceptance Meaning
+
+Accepted for this slice:
+
+- all-time company business overview can answer from the checked slice while honestly warning about row caps;
+- the answer includes a safe continuation path through a concrete year or quarter instead of a technical dead end;
+- explicit-year follow-up after the all-time answer recovers the 2020 money-flow numbers and labels the scope as company scope;
+- profit follow-up still separates operating cash-flow net from clean profit/accounting financial result.
+
+Still not accepted as universal:
+
+- automatic long-running confirmation UX;
+- arbitrary all-time full accounting export;
+- unlimited primitive exploration outside reviewed route candidates;
+- legal or external-audit profit claims.
+
+## Current Status
+
+`Open-World Schema/Primitive Discovery` moves to `60%` after this slice.
+
+Why not higher:
+
+- phase97 through phase100 now cover financial counterparty hints, language/limit hygiene, explicit-year large-query recovery, and all-time continuation UX;
+- but broader dynamic schema/primitive discovery still needs more unfamiliar 1C asks, more primitive descriptors where live evidence proves real gaps, and more continuation behavior for non-money domains.
diff --git a/docs/ARCH/11 - architecture_turnaround/36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md
new file mode 100644
index 0000000..79d71b9
--- /dev/null
+++ b/docs/ARCH/11 - architecture_turnaround/36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md
@@ -0,0 +1,75 @@
+# 36 - Inventory Root Scope Without Warehouse Clarification (2026-05-13)
+
+## Purpose
+
+This slice hardens a real manual-GUI signal from `assistant-stage1-hyh1A1WR3j`.
+
+The problematic seam was not a missing inventory route. The route already knew how to answer stock-on-hand snapshots. The semantic risk was narrower:
+
+- user asks a broad root question such as `что там на складе по остаткам?`;
+- assistant correctly asks for company because several organizations exist;
+- user selects `АЛЬТЕРНАТИВА`;
+- assistant must resume the broad inventory snapshot for the selected company instead of inventing a new requirement for a concrete warehouse, item, category, or material.
+
+For root inventory, `склад` in user language means the stock contour, not necessarily a warehouse filter. A warehouse filter is valid only when the user names or asks for a specific warehouse.
+
+## Runtime Meaning
+
+No new code patch was required in this slice.
+
+The live replay proved the current runtime already behaves correctly after the latest scope/continuation work:
+
+- ambiguous broad inventory root asks only for organization;
+- bare company selection resumes the pending inventory intent;
+- extracted filters contain organization and date, but not `warehouse`, `item`, or `category`;
+- historical capability follow-up stays human and does not ask for a warehouse;
+- month-only follow-ups keep the inventory contour and the selected organization.
+
+This is still important because it converts a manual concern into a saved regression gate rather than leaving it as a vague GUI memory.
+
+## Validation
+
+Live semantic replay:
+
+- spec: `docs/orchestration/address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json`
+- accepted run: `artifacts/domain_runs/phase101_inventory_root_scope_no_warehouse_clarification_live1`
+- result: `7/7`, `final_status=accepted`
+
+Critical step evidence:
+
+- `step_03_inventory_root_requires_company_only`: asks for organization only.
+- `step_04_company_choice_resumes_inventory_without_warehouse`: answers `На 13.05.2026 на складе подтверждено 11 позиций...`, with `organization=ООО Альтернатива Плюс` and no warehouse/item/category filters.
+- `step_06_inventory_june_2017_after_capability`: answers the June 2017 snapshot with organization/date carryover and no warehouse filter.
+- `step_07_inventory_march_2016_stays_root`: answers the March 2016 snapshot with organization/date carryover and no warehouse filter.
+
+Saved user-runnable AGENT autorun:
+
+- title: `AGENT | Phase 101 inventory root scope without warehouse clarification replay`
+- generation id: `gen-ag05131044-cbe2ff`
+- validation status: `accepted_live_replay`
+
+## Acceptance Meaning
+
+Accepted for this slice:
+
+- broad inventory root wording may ask for company when organization scope is ambiguous;
+- after company choice, the assistant answers across available company stock evidence instead of asking for a warehouse;
+- historical inventory follow-ups keep company/date carryover;
+- root inventory does not promote natural Russian `на складе` into a literal warehouse anchor by itself.
+
+Still not accepted as universal:
+
+- arbitrary warehouse-specific analytics without a named warehouse;
+- inventory turnover/FIFO/liquidity beyond already reviewed proxy contours;
+- item-level provenance unless an item or selected object is present;
+- external audit-grade stock valuation outside confirmed 1C evidence.
+
+## Current Status
+
+`Open-World Schema/Primitive Discovery` moves to `68%` after this slice.
+
+Why not higher:
+
+- this was a control/semantic-proof slice rather than new primitive expansion;
+- the live replay covers the manual warehouse-clarification seam, but broader unfamiliar inventory asks can still reveal new primitive gaps;
+- the next slice should continue from real manual/live evidence, especially where a broad business noun could be misread as a required technical axis.
diff --git a/docs/ARCH/11 - architecture_turnaround/37 - debt_mirror_clean_scope_polarity_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/37 - debt_mirror_clean_scope_polarity_2026-05-13.md
new file mode 100644
index 0000000..c3335f6
--- /dev/null
+++ b/docs/ARCH/11 - architecture_turnaround/37 - debt_mirror_clean_scope_polarity_2026-05-13.md
@@ -0,0 +1,83 @@
+# 37 - Debt Mirror Clean-Scope Polarity (2026-05-13)
+
+## Purpose
+
+This slice hardens a real debt-polarity question from manual replay `assistant-stage1-87gHJCwTI9`.
+
+The dangerous symptom was semantic, not just routing noise:
+
+- the same `Комитет государственных услуг` amount could appear in both "мы должны" and "нам должны";
+- mirrored financial-security rows on account `76.09` could be mistaken for clean payable or receivable debt;
+- a short follow-up such as `а мы кому?` after receivables could be expanded by LLM predecompose into an open-items route instead of the clean payables route;
+- a fresh-session bare company name such as `Альтернатива Плюс` could fall into generic living chat instead of becoming the selected organization scope.
+
+The target behavior is direct and conservative: clean debt answers may disclose mirrored/offset evidence, but must not count it as real debt in both directions.
+
+## Runtime Changes
+
+Implemented:
+
+- `assistantTransitionPolicy` now lets a detected debt-role swap override an LLM open-items expansion for short mirror follow-ups.
+- `assistantLivingChatRuntimeAdapter` now probes live data scope for short bare organization-name turns when no organization is active yet.
+- If the probe resolves a known organization, living chat emits the deterministic organization-selection reply and stores the selected/active organization before subsequent address turns.
+- The debug contract now records `living_chat_bare_scope_probe_attempted` and `living_chat_bare_scope_probe_matched_organization`.
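Reviewer sketch (not part of this diff): one way to read the first "Implemented" bullet is as a small precedence rule in which a detected debt-role swap beats the LLM-proposed route. The code below is a guess at that shape, not the actual `assistantTransitionPolicy` logic. Only `payables_confirmed_as_of_date` and the `а мы кому?` utterance appear in the document; `receivables_confirmed_as_of_date`, the `нам кто` cue, the four-word limit, and every function and field name are assumptions made for illustration.

```typescript
// Hypothetical turn context; the real transition-policy input is not shown in this diff.
interface TurnContext {
  previousRoute: string | null;  // route that answered the previous turn
  llmProposedRoute: string;      // route suggested by LLM predecompose
  userText: string;              // raw follow-up text
}

const PAYABLES = "payables_confirmed_as_of_date";       // named in the phase102 step evidence
const RECEIVABLES = "receivables_confirmed_as_of_date"; // assumed mirror name, not from the source

// Detect a short mirror follow-up that flips the debt direction of the previous answer.
function detectDebtRoleSwap(ctx: TurnContext): string | null {
  const words = ctx.userText.trim().split(/\s+/);
  if (words.length > 4) { return null; } // only short follow-ups qualify (assumed threshold)
  const text = ctx.userText.toLowerCase();
  if (ctx.previousRoute === RECEIVABLES && text.indexOf("мы кому") >= 0) { return PAYABLES; }
  if (ctx.previousRoute === PAYABLES && text.indexOf("нам кто") >= 0) { return RECEIVABLES; }
  return null;
}

// A detected swap takes precedence over the LLM expansion; otherwise the LLM route stands.
function resolveRoute(ctx: TurnContext): string {
  const swap = detectDebtRoleSwap(ctx);
  return swap !== null ? swap : ctx.llmProposedRoute;
}
```

For example, `resolveRoute({ previousRoute: RECEIVABLES, llmProposedRoute: "open_items", userText: "а мы кому?" })` yields the clean payables route in this sketch, which mirrors the behavior asserted by `step_06_payables_mirror_followup_keeps_clean_scope`.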
+
+This keeps the fix bounded:
+
+- normal data/meta/capability/safety questions are not reinterpreted as organization selection;
+- unmatched bare text still falls back to the previous living-chat path;
+- clean payables/receivables still run through reviewed exact debt recipes.
+
+## Validation
+
+Live semantic replay:
+
+- spec: `docs/orchestration/address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json`
+- accepted run: `artifacts/domain_runs/phase102_debt_mirror_clean_scope_polarity_live3`
+- result: `6/6`, `final_status=accepted`
+
+Critical step evidence:
+
+- `step_01_choose_company_scope`: bare `Альтернатива Плюс` selects `ООО Альтернатива Плюс` through deterministic data-scope selection, not LLM chat.
+- `step_02_clean_payables_today`: clean payables keep `organization=ООО Альтернатива Плюс`.
+- `step_03_committee_payable_is_not_clean_debt`: payable debt to the Committee is confirmed as `0,00`.
+- `step_04_committee_receivable_is_not_clean_debt`: confirmed receivable debt from the Committee is not found.
+- `step_05_clean_receivables_today`: clean receivables keep the selected organization.
+- `step_06_payables_mirror_followup_keeps_clean_scope`: `а мы кому?` resolves to `payables_confirmed_as_of_date`, not open-items, and keeps the organization scope.
+
+Targeted verification:
+
+- `npm.cmd test -- assistantLivingChatRuntimeAdapter.test.ts assistantLivingChatAttemptInputBuilder.test.ts assistantLivingChatAttemptRuntimeInputBuilder.test.ts assistantTransitionPolicy.test.ts assistantAddressFollowupContext.test.ts`
+- `npm.cmd run build`
+
+Saved user-runnable AGENT autorun:
+
+- title: `AGENT | Phase 102 debt mirror clean-scope polarity replay`
+- generation id: `gen-ag05131121-8c41ab`
+- validation status: `accepted_live_replay`
+
+## Acceptance Meaning
+
+Accepted for this slice:
+
+- fresh bare organization labels can bind organization scope from live known organizations;
+- clean debt root questions keep selected organization scope;
+- short receivables-to-payables mirror follow-ups stay in the confirmed clean-debt route;
+- mirrored financial-security evidence may be disclosed separately, but is not counted as clean debt in both directions.
+
+Still not accepted as universal:
+
+- arbitrary legal/commercial classification of every account `76.*` without reviewed semantics;
+- external audit-grade netting rules outside confirmed 1C evidence;
+- full contractual delinquency analysis beyond the reviewed due-date aging route;
+- organization/counterparty names that cannot be matched by the live data-scope probe.
+
+## Current Status
+
+`Open-World Schema/Primitive Discovery` moves to `72%` after this slice.
+
+Why not higher:
+
+- this slice fixes an important polarity/scope integrity seam, but does not expand the primitive catalog to arbitrary 1C facts;
+- account-role semantics beyond the reviewed debt recipes still need evidence-driven expansion;
+- the next slice should again be selected from real manual/live replay evidence where user business wording exposes an unsupported schema primitive or wrong semantic arbitration.
diff --git a/docs/ARCH/11 - architecture_turnaround/38 - financial_role_purpose_arbitration_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/38 - financial_role_purpose_arbitration_2026-05-13.md
new file mode 100644
index 0000000..9096d5e
--- /dev/null
+++ b/docs/ARCH/11 - architecture_turnaround/38 - financial_role_purpose_arbitration_2026-05-13.md
@@ -0,0 +1,86 @@
+# 38 - Financial Role/Purpose Arbitration (2026-05-13)
+
+## Purpose
+
+This slice hardens a real schema/semantic seam around bank-like counterparties.
+
+The dangerous symptom was not that `СБЕРБАНК` could be found. It was that the same evidence could be misread through an ordinary counterparty lens:
+
+- incoming bank rows could be overclaimed as customer revenue;
+- outgoing bank rows could be overclaimed as supplier payments;
+- generic `value_flow` discovery could replace the more precise exact `bank_operations_*` route;
+- long bank operation lists could bury the business answer in raw rows.
+
+The target behavior is conservative: bank-like counterparties are first treated as financial-operation evidence, and ordinary customer/supplier/revenue meaning must be proven from operation kind, payment purpose, contract, and checked rows rather than inferred from the name or movement direction alone.
+
+## Runtime Changes
+
+Implemented:
+
+- `assistantMcpDiscoveryResponsePolicy` now keeps confirmed exact `bank_operations_by_counterparty` / `bank_operations_by_contract` replies over generic value-flow discovery candidates.
+- The guard is intentionally narrow: it requires an exact bank operation intent, matching bank recipe, address-lane source, and grounded/matched/exact runtime evidence.
+- Bank operation answer composition now includes a compact direction summary over incoming/outgoing/unknown rows.
+- Bank role boundary wording now says directly when a bank is not proven as an ordinary customer, supplier, or client revenue source.
+- Bank operation drilldown answers now show a bounded first-page sample instead of dumping every row into direct business answers.
+- Evidence selection prefers the direction implied by the user question when showing the short 1C basis line.
+
+This keeps the fix bounded:
+
+- ordinary customer/supplier value-flow routes are unchanged;
+- generic value-flow candidates can still replace stale or unsupported replies outside exact bank-operation evidence;
+- the assistant does not classify bank flows as credit, deposit, commission, or revenue without operation/purpose/contract evidence.
+
+## Validation
+
+Live semantic replay:
+
+- spec: `docs/orchestration/address_truth_harness_phase103_financial_role_purpose_arbitration.json`
+- accepted run: `artifacts/domain_runs/phase103_financial_role_purpose_arbitration_live3`
+- result: `6/6`, `final_status=accepted`
+
+Critical step evidence:
+
+- `step_01_choose_company_scope`: fresh `Альтернатива Плюс` binds `ООО Альтернатива Плюс`.
+- `step_02_sberbank_role_purpose_summary`: exact bank route wins over generic value-flow and answers the role/purpose question.
+- `step_03_sberbank_incoming_customer_boundary`: incoming bank evidence is not overclaimed as customer revenue.
+- `step_04_sberbank_outgoing_supplier_boundary`: outgoing bank evidence is not overclaimed as ordinary supplier dependency.
+- `step_05_business_overview_excludes_bank_role_overclaim`: broad company overview keeps bank role boundaries in top-flow discussion.
+- `step_06_svk_value_flow_canary_after_bank_context`: normal non-bank counterparty value-flow canary remains healthy after the bank context.
+
+Targeted verification:
+
+- `npm.cmd test -- assistantMcpDiscoveryResponsePolicy.test.ts`
+- `npm.cmd test -- addressQueryRuntimeM23.test.ts assistantMcpDiscoveryResponsePolicy.test.ts`
+- `npm.cmd run build`
+
+Saved user-runnable AGENT autorun:
+
+- title: `AGENT | Phase 103 financial role and purpose arbitration replay`
+- generation id: `gen-ag05131200-0ed59a`
+- validation status: `accepted_live_replay`
+
+## Acceptance Meaning
+
+Accepted for this slice:
+
+- exact bank-operation evidence is preferred over generic value-flow when the exact route is grounded;
+- bank-like counterparties are not classified as ordinary customers, suppliers, or revenue sources without purpose/operation/contract evidence;
+- direct bank boundary answers stay compact enough for business review;
+- the fix preserves normal counterparty value-flow behavior.
+
+Still not accepted as universal:
+
+- full bank-product accounting classification across every bank and account family;
+- external audit-grade loan/deposit/commission categorization without reviewed 1C document semantics;
+- arbitrary financial-instrument semantics outside the currently materialized operation/purpose/contract fields;
+- replacement of all value-flow logic with bank-specific heuristics.
+
+## Current Status
+
+`Open-World Schema/Primitive Discovery` moves to `78%` after this slice.
+
+Why not higher:
+
+- the slice closes a high-value role/purpose arbitration gap, but it is still one family of financial-operation semantics;
+- broader schema/primitive discovery still needs additional real evidence around unfamiliar 1C asks;
+- the next slice should come from another live/manual replay seam where the assistant can reach evidence but still lacks the right primitive, role, or answer shape.
diff --git a/docs/ARCH/11 - architecture_turnaround/39 - generic_role_tail_anchor_hygiene_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/39 - generic_role_tail_anchor_hygiene_2026-05-13.md new file mode 100644 index 0000000..9790aef --- /dev/null +++ b/docs/ARCH/11 - architecture_turnaround/39 - generic_role_tail_anchor_hygiene_2026-05-13.md @@ -0,0 +1,80 @@ +# 39 - Generic Role-Tail Anchor Hygiene (2026-05-13) + +## Purpose + +This slice hardens a subtle semantic-integrity seam found after the financial role/purpose work. + +The user may describe ordinary business roles inside a broad company question, for example `не обычный клиент или поставщик`. Those words are business semantics, not entity anchors. The dangerous failure mode was that the tail `или поставщик` could become a fake `counterparty` in exact filters, selected objects, and MCP discovery input. + +The target behavior is narrow: + +- role words such as `клиент`, `поставщик`, or `или поставщик` must not become counterparty anchors by themselves; +- real explicit role-prefixed counterparties such as `по поставщику Группа СВК` must still keep the actual counterparty; +- broad business overview can discuss customer/supplier/bank roles without poisoning later bank or value-flow drilldowns. + +## Runtime Changes + +Implemented: + +- `addressFilterExtractor` now treats standalone generic role tails as low-quality counterparty anchors. +- `address_runtime/decomposeStage` clears inherited low-quality role-tail anchors from follow-up state before recipe selection. +- `assistantMcpDiscoveryTurnInputAdapter` rejects generic role tails from LLM predecompose before they become discovery entity candidates. +- `addressIntentResolver` now recognizes `сколько мы ему заплатили` / `заплатили` supplier-payment wording as `supplier_payouts_profile`, preventing a supplier payment question from first falling into `counterparty_population_and_roles`. 
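The role-tail rule above can be illustrated with a minimal sketch. This is not the real `addressFilterExtractor` code: the word lists and the names `isGenericRoleTail` and `counterpartyAnchorOrNull` are hypothetical, and the real extractor presumably handles far more morphology. The idea is that a candidate made up only of role words and connectors is rejected, while any candidate carrying a real name token is kept.

```typescript
// Hypothetical sketch of role-tail hygiene; the word lists and function names
// are illustrative assumptions, not the real addressFilterExtractor code.
const GENERIC_ROLE_WORDS: Set<string> = new Set([
  "клиент", "клиента", "клиенту",
  "поставщик", "поставщика", "поставщику",
]);

// Connectors and qualifiers that may surround a role word without naming an entity.
const ROLE_TAIL_FILLERS: Set<string> = new Set(["или", "и", "не", "обычный"]);

// Lowercase and split on whitespace and basic punctuation.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[\s,.;:!?"'()]+/)
    .filter((t) => t.length > 0);
}

// True when the candidate contains a role word and nothing except role words and fillers.
function isGenericRoleTail(candidate: string): boolean {
  const tokens = tokenize(candidate);
  const hasRoleWord = tokens.some((t) => GENERIC_ROLE_WORDS.has(t));
  const onlyGeneric = tokens.every(
    (t) => GENERIC_ROLE_WORDS.has(t) || ROLE_TAIL_FILLERS.has(t),
  );
  return hasRoleWord && onlyGeneric;
}

// Returns null for a generic role tail; otherwise keeps the trimmed candidate
// for downstream resolution (this sketch does not strip the role prefix).
function counterpartyAnchorOrNull(candidate: string): string | null {
  return isGenericRoleTail(candidate) ? null : candidate.trim();
}
```

Under these assumptions, `или поставщик` yields no anchor, while `по поставщику Группа СВК` survives because `Группа` and `СВК` are not generic role tokens.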
+ +The fix is bounded: + +- `по поставщику Альфа` and `по поставщику Группа СВК` remain valid explicit counterparty anchors; +- supplier count / role split questions such as `скока поставщиков` remain in counterparty population; +- MCP discovery can still recover unsupported value-flow asks, but no longer receives a fake role-tail entity from the accepted replay seam. + +## Validation + +Live semantic replay: + +- spec: `docs/orchestration/address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json` +- accepted run: `artifacts/domain_runs/phase104_generic_role_tail_anchor_hygiene_live2` +- result: `4/4`, `final_status=accepted` + +Critical step evidence: + +- `step_01_choose_company_scope`: fresh `Альтернатива Плюс` binds `ООО Альтернатива Плюс`. +- `step_02_business_overview_role_tail_not_counterparty`: broad company overview has no `counterparty` filter while still explaining bank/customer/supplier boundaries. +- `step_03_sberbank_role_after_role_tail_overview`: SBERBANK bank-role drilldown remains grounded after the role-tail overview. +- `step_04_real_supplier_prefix_still_keeps_counterparty`: real supplier-prefixed `Группа СВК` stays a counterparty and routes through `supplier_payouts_profile` instead of population. 
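The harness specs in this change set describe each step with `required_answer_patterns_all` and `forbidden_answer_patterns`, written with a leading `(?i)` inline flag. How the real harness evaluates them is not shown here. The following is a hedged sketch of one way such a check could work in TypeScript, assuming the leading `(?i)` is translated to the `i` flag, since JavaScript `RegExp` does not accept that prefix. The names `StepPatterns`, `compilePattern`, and `checkAnswer` are hypothetical.

```typescript
// Hypothetical shape; the two field names mirror the keys used in the spec JSON.
interface StepPatterns {
  required_answer_patterns_all?: string[];
  forbidden_answer_patterns?: string[];
}

interface CheckResult {
  ok: boolean;
  missingRequired: string[];
  hitForbidden: string[];
}

// Assumption: a leading "(?i)" means case-insensitive matching for the whole pattern.
function compilePattern(source: string): RegExp {
  const inlineFlag = "(?i)";
  if (source.startsWith(inlineFlag)) {
    return new RegExp(source.slice(inlineFlag.length), "i");
  }
  return new RegExp(source);
}

// Every required pattern must match and no forbidden pattern may match.
function checkAnswer(answer: string, step: StepPatterns): CheckResult {
  const required = step.required_answer_patterns_all ?? [];
  const forbidden = step.forbidden_answer_patterns ?? [];
  const missingRequired = required.filter((p) => !compilePattern(p).test(answer));
  const hitForbidden = forbidden.filter((p) => compilePattern(p).test(answer));
  return {
    ok: missingRequired.length === 0 && hitForbidden.length === 0,
    missingRequired,
    hitForbidden,
  };
}
```

Returning the missing and forbidden pattern lists, rather than a bare boolean, keeps a failed step explainable in review.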
+ +Targeted verification: + +- `npm.cmd test -- addressQueryRuntimeM23.test.ts assistantMcpDiscoveryTurnInputAdapter.test.ts` +- `npm.cmd run build` + +Saved user-runnable AGENT autorun: + +- title: `AGENT | Phase 104 generic role-tail anchor hygiene replay` +- generation id: `gen-ag05131226-630ddf` +- validation status: `accepted_live_replay` + +## Acceptance Meaning + +Accepted for this slice: + +- generic role-tail anchors are filtered across exact extraction, follow-up merge, and discovery predecompose input; +- bank role/purpose context does not contaminate later normal counterparty value-flow questions; +- explicit supplier-payment wording is routed as supplier payout, not as a population/role count question; +- the accepted replay checks both user-facing answer quality and machine-level forbidden-filter hygiene. + +Still not accepted as universal: + +- every possible Russian business-role paraphrase outside the covered role-tail family; +- full natural-language role taxonomy; +- automatic proof that every role word in every sentence is entity-free; +- broad arbitrary 1C schema discovery outside the currently accepted phase97-phase104 seams. + +## Current Status + +`Open-World Schema/Primitive Discovery` moves to `84%` after this slice. + +Why not higher: + +- the slice closes a real semantic poisoning seam, but it is still a targeted role-tail/intent-arbitration hardening cut; +- the module still needs at least one broader mixed replay pass over unfamiliar asks before a `95%+` claim is honest; +- future slices should continue coming from live/manual replay evidence where the assistant can reach evidence but still risks wrong primitive, stale state, wrong role, or noisy answer shape. 
diff --git a/docs/ARCH/11 - architecture_turnaround/40 - mixed_schema_primitive_closure_replay_2026-05-13.md b/docs/ARCH/11 - architecture_turnaround/40 - mixed_schema_primitive_closure_replay_2026-05-13.md new file mode 100644 index 0000000..662749f --- /dev/null +++ b/docs/ARCH/11 - architecture_turnaround/40 - mixed_schema_primitive_closure_replay_2026-05-13.md @@ -0,0 +1,43 @@ +# 40 - Mixed Schema/Primitive Closure Replay (2026-05-13) + +## Status + +Accepted. + +This slice closes the current `Open-World Schema/Primitive Discovery` pass to manual-review readiness. It is intentionally not a new proof-family expansion: it is a mixed semantic closure replay across the routes and integrity seams already hardened in phases 97-104. + +Validated replay: + +- spec: `docs/orchestration/address_truth_harness_phase105_mixed_schema_primitive_closure.json` +- accepted run: `artifacts/domain_runs/phase105_mixed_schema_primitive_closure_live3` +- result: `13/13`, `final_status=accepted` +- saved autorun: `AGENT | Phase 105 mixed schema/primitive closure replay` +- generation id: `gen-ag05131312-2d0445` + +## What It Proves + +- Broad inventory root wording such as `что там на складе по остаткам` answers from the resolved organization scope without inventing a warehouse, item, category, or material clarification. +- Historical inventory follow-ups keep organization and temporal continuity. +- Business-overview wording with generic role tails does not leak `или поставщик` or similar role text into counterparty scope. +- Bank-like counterparties such as `СБЕРБАНК` are classified from bank-operation evidence and are not overclaimed as ordinary clients or suppliers. +- Explicit supplier-payment wording for `Группа СВК` routes to supplier payout evidence. +- Bidirectional counterparty value-flow keeps the full multi-token subject `Группа СВК` through raw-scope discovery and answers incoming, outgoing, and net money flow directly. 
+- Broad debt-polarity questions switch cleanly between payables and receivables without inheriting stale counterparty scope. +- VAT tax-period questions survive after inventory, role, value-flow, and debt pivots and keep the quarter tax-period basis explicit. +- Earnings/cash-flow wording routes to business overview and does not equate operational cash net with clean accounting profit. + +## Code-Level Corrections From The Replay + +- `address_runtime/decomposeStage.ts` suppresses stale counterparty carryover for broad debt-polarity questions such as `кому мы должны` and `а нам кто должен`, while preserving explicit referential counterparty follow-ups such as `по нему`. +- `addressFilterExtractor.ts` treats generic inventory wording like `по остаткам` as a low-quality warehouse pseudo-anchor. +- `assistantMcpDiscoveryTurnInputAdapter.ts` preserves multi-token raw scoped counterparties in discovery input, so `по Группа СВК за 2020` does not degrade to `Группа`. + +## Verification + +- `npm.cmd test -- addressQueryRuntimeM23.test.ts assistantMcpDiscoveryTurnInputAdapter.test.ts --testTimeout=45000` passed: `537 passed`, `7 skipped`. +- `npm.cmd run build` passed. +- `phase105_mixed_schema_primitive_closure_live3` passed `13/13`. + +## Remaining Risk + +The module is now ready for a fat manual GUI check rather than more blind implementation. Remaining risk is not that phases 97-105 are absent; the risk is that a human mixed run may still expose a new unfamiliar 1C phrasing, another entity alias edge, or an answer-shape expectation outside the current replay. diff --git a/docs/ARCH/11 - architecture_turnaround/README.md b/docs/ARCH/11 - architecture_turnaround/README.md index 0268086..65da3e0 100644 --- a/docs/ARCH/11 - architecture_turnaround/README.md +++ b/docs/ARCH/11 - architecture_turnaround/README.md @@ -51,6 +51,13 @@ This package answers the next question: 31. 
[31 - inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md](./31%20-%20inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md) 32. [32 - financial_counterparty_flow_hints_2026-05-13.md](./32%20-%20financial_counterparty_flow_hints_2026-05-13.md) 33. [33 - limit_honesty_business_language_2026-05-13.md](./33%20-%20limit_honesty_business_language_2026-05-13.md) +34. [34 - large_query_budget_continuation_2026-05-13.md](./34%20-%20large_query_budget_continuation_2026-05-13.md) +35. [35 - large_query_continuation_ux_2026-05-13.md](./35%20-%20large_query_continuation_ux_2026-05-13.md) +36. [36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md](./36%20-%20inventory_root_scope_no_warehouse_clarification_2026-05-13.md) +37. [37 - debt_mirror_clean_scope_polarity_2026-05-13.md](./37%20-%20debt_mirror_clean_scope_polarity_2026-05-13.md) +38. [38 - financial_role_purpose_arbitration_2026-05-13.md](./38%20-%20financial_role_purpose_arbitration_2026-05-13.md) +39. [39 - generic_role_tail_anchor_hygiene_2026-05-13.md](./39%20-%20generic_role_tail_anchor_hygiene_2026-05-13.md) +40. [40 - mixed_schema_primitive_closure_replay_2026-05-13.md](./40%20-%20mixed_schema_primitive_closure_replay_2026-05-13.md) ## Current Status Snapshot (2026-05-13) @@ -112,6 +119,20 @@ Status canon for planning: - The accepted user-runnable autorun for that slice is `AGENT | Phase 97 financial counterparty flow hints replay` (`gen-ag05122250-4451a8`). - The second broader schema/primitive discovery support slice is now accepted: `limit honesty and business-language gate` sanitizes route/proxy/MCP-style answer wording, keeps row-limit disclosure relevant to the asked business contour, and preserves debt/VAT/bank/inventory/supplier canaries; `phase98_limit_honesty_business_language_live3` passed `6/6`. - The accepted user-runnable autorun for that slice is `AGENT | Phase 98 limit honesty and business-language replay` (`gen-ag05122315-f1e27c`). 
+- The third broader schema/primitive discovery support slice is now accepted: `large-query budget and continuation policy` grants explicit-year `business_overview` the existing monthly recovery budget, avoids artificial row-limit refusal when yearly money-flow coverage can be chunked safely, and fixes the profit follow-up answer shape so cash-flow net is not equated with clean profit; `phase99_large_query_budget_continuation_live2` passed `4/4`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 99 large-query budget and continuation policy replay` (`gen-ag05131009-f08174`). +- The fourth broader schema/primitive discovery support slice is now accepted: `large-query continuation UX` turns all-time row-cap disclosure into a safe year/quarter continuation path, keeps broad answers honest about checked-slice coverage, and fixes organization-scoped bidirectional value-flow wording after continuation; `phase100_large_query_continuation_ux_live2` passed `3/3`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 100 large-query continuation UX replay` (`gen-ag05131028-234e5e`). +- The fifth broader schema/primitive discovery support slice is now accepted: `inventory root scope without warehouse clarification` proves a broad stock-on-hand root query resumes after company clarification as an all-warehouse company snapshot instead of asking for a warehouse, item, category, or material; `phase101_inventory_root_scope_no_warehouse_clarification_live1` passed `7/7`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 101 inventory root scope without warehouse clarification replay` (`gen-ag05131044-cbe2ff`). 
+- The sixth broader schema/primitive discovery support slice is now accepted: `debt mirror clean-scope polarity` proves a bare company-name turn in a fresh session can bind organization scope from live data-scope probe, confirmed payables/receivables keep the selected organization, short mirror follow-ups such as `а мы кому?` stay in the clean debt route instead of drifting into open-items, and mirrored 76.09 financial-security rows are disclosed as offset/mirror evidence rather than counted as debt in both directions; `phase102_debt_mirror_clean_scope_polarity_live3` passed `6/6`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 102 debt mirror clean-scope polarity replay` (`gen-ag05131121-8c41ab`). +- The seventh broader schema/primitive discovery support slice is now accepted: `financial role/purpose arbitration` keeps grounded exact `bank_operations_*` evidence over generic value-flow candidates, summarizes incoming/outgoing bank rows compactly, and prevents bank-like counterparties from being classified as ordinary customer revenue or supplier dependency without operation/purpose/contract evidence; `phase103_financial_role_purpose_arbitration_live3` passed `6/6`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 103 financial role and purpose arbitration replay` (`gen-ag05131200-0ed59a`). +- The eighth broader schema/primitive discovery support slice is now accepted: `generic role-tail anchor hygiene` prevents wording such as `или поставщик` from becoming a fake counterparty in exact filters, selected objects, or discovery predecompose input, while preserving real role-prefixed counterparties such as `по поставщику Группа СВК`; `phase104_generic_role_tail_anchor_hygiene_live2` passed `4/4`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 104 generic role-tail anchor hygiene replay` (`gen-ag05131226-630ddf`). 
+- The ninth broader schema/primitive discovery closure slice is now accepted: `mixed schema/primitive closure replay` validates inventory scope, historical inventory carryover, role-tail hygiene, bank role/purpose, supplier payout wording, bidirectional SVK value-flow, clean payables/receivables polarity, VAT tax-period continuity, and cash-flow-vs-profit answer shape together; `phase105_mixed_schema_primitive_closure_live3` passed `13/13`. +- The accepted user-runnable autorun for that slice is `AGENT | Phase 105 mixed schema/primitive closure replay` (`gen-ag05131312-2d0445`). - The phase94 replay spec was repaired to real UTF-8 Russian before autorun persistence, so the saved user-runnable pack does not repeat the earlier GUI mojibake/card-text regression. - The short source of truth for status wording is [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md). - The current execution spine after EHMO is [23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md). @@ -125,6 +146,13 @@ Status canon for planning: - The fourth/final reviewed proof-family route slice is [31 - inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md](./31%20-%20inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md). - The first broader schema/primitive discovery slice is [32 - financial_counterparty_flow_hints_2026-05-13.md](./32%20-%20financial_counterparty_flow_hints_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. - The second broader schema/primitive discovery support slice is [33 - limit_honesty_business_language_2026-05-13.md](./33%20-%20limit_honesty_business_language_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. 
+- The third broader schema/primitive discovery support slice is [34 - large_query_budget_continuation_2026-05-13.md](./34%20-%20large_query_budget_continuation_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. +- The fourth broader schema/primitive discovery support slice is [35 - large_query_continuation_ux_2026-05-13.md](./35%20-%20large_query_continuation_ux_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. +- The fifth broader schema/primitive discovery support slice is [36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md](./36%20-%20inventory_root_scope_no_warehouse_clarification_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. +- The sixth broader schema/primitive discovery support slice is [37 - debt_mirror_clean_scope_polarity_2026-05-13.md](./37%20-%20debt_mirror_clean_scope_polarity_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. +- The seventh broader schema/primitive discovery support slice is [38 - financial_role_purpose_arbitration_2026-05-13.md](./38%20-%20financial_role_purpose_arbitration_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. +- The eighth broader schema/primitive discovery support slice is [39 - generic_role_tail_anchor_hygiene_2026-05-13.md](./39%20-%20generic_role_tail_anchor_hygiene_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. +- The mixed schema/primitive closure replay is [40 - mixed_schema_primitive_closure_replay_2026-05-13.md](./40%20-%20mixed_schema_primitive_closure_replay_2026-05-13.md), now accepted live and saved as a user-runnable AGENT autorun. 
It now documents a turnaround that is already operational in code, already materially past the acute regression breakpoint, and already moved through bounded MCP autonomy, Post-F hardening, inventory breadth proof, and the declared Planner Autonomy slice: @@ -187,7 +215,7 @@ Current honest status: - exit-from-danger-zone readiness: `~97%` - pre-multidomain readiness: `~90%` - bounded-autonomy foundation readiness: `~89%` -- open-world bounded-autonomy readiness: `~87%` +- legacy open-world bounded-autonomy parent-readiness snapshot: `~87%` before the later route-candidate/schema-primitive closure; use the active module line below for the current `95%` schema/primitive discovery status - active Open-World Bounded Autonomy Breadth implementation breadth: `~99%`, with business-overview evidence fusion, the reviewed `business_overview` catalog/data-need/planner route-fabric slice, the fresh multi-probe runtime bridge, the explicit-period VAT/tax fact-family bridge, the explicit-period debt-position bridge, the explicit-date inventory-position bridge, the open-settlement quality bridge accepted by live semantic replay, selected-item profitability bridged by local semantic/runtime regression tests, contract-date debt age bridged locally, debt staleness-risk proxy bridged locally, debt due-date boundary arbitration bridged locally, inventory reserve/liquidation boundary arbitration bridged locally, supplier/procurement-quality boundary arbitration bridged locally, supplier concentration proxy bridged locally, document/account-section activity profile bridged locally, counterparty population/roles and contract usage profiles bridged locally, yearly operating-flow proxy bridged locally, earnings/best-year wording arbitration bridged locally, profit/margin wording boundary arbitration bridged locally, analyst synthesis added to business-overview answer drafting, company-period trading margin proxy bridged locally, inventory sales-to-stock proxy bridged locally, inventory 
staleness-risk proxy bridged locally, gap-specific answer shaping bridged locally, missing proof families recorded as runtime evidence ledger, exact accounting profit/margin promoted into a reviewed 90/91/99 route by phase93, debt due-date aging promoted into a reviewed payment-term/open-balance route by phase94, vendor/procurement quality promoted into reviewed procurement-concentration evidence by phase95, inventory reserve/write-off/liquidation quality promoted into reviewed inventory quality-event evidence by phase96, and bank-like financial counterparty role/purpose hints accepted by phase97 - active Open-World Bounded Autonomy Breadth accepted-module progress: `~99%`, because the EHMO-derived `Open-World Semantic Control Gate` critical subset accepts live at `21/21` after W5/W7 hardening; full closure is still held back for the fat manual GUI pack and remaining answer-shape residual review - Post-F semantic integrity module progress: `~99%` operationally closed, with remaining risk now treated as next-slice discovery rather than an open blocker inside the closed slice @@ -195,7 +223,8 @@ Current honest status: - Planner Autonomy Consolidation progress: `100%` for the declared module, with catalog-fabric, value-flow arbitration, lifecycle bounded inference, broad-evaluation bridge, inventory catalog templates, inventory runtime-boundary honesty, exact inventory recipe bridging, unambiguous metadata-surface lane inference, catalog chain-template scoring, structured chain-match contract exposure, runtime/debug propagation, subject-aware bidirectional comparison arbitration, structured catalog-alignment verdicts, representative alignment regression guard, catalog-alignment reason-code telemetry, explicit `alignment_status` propagation, truth-harness/acceptance-matrix surfacing, soft divergence warning, `catalog_alignment_ok` acceptance invariant, step-level expected catalog-alignment assertions, phase66 and phase32 spec alignment expectations, AGENT 
source-catalog surfacing, generated phase83 mixed planner-brain replay spec, checked-source user-facing error sanitation, surface-grounded catalog promotion, and guarded live phase83 acceptance validated. Broader unfamiliar 1C asks are now next-module breadth work rather than an open blocker inside this declared slice - Open-World Route Candidate Promotion progress: `100%` for the declared phase90 slice, with structured `route_candidate` runtime contract, artifact propagation, live semantic replay accepted at `5/5`, and accepted AGENT autorun persistence; broader autonomous route enablement remains the next active slice - Route-Candidate-Driven Enablement Loop progress: `100%`, with deterministic repair-target grouping, Lead Codex handoff surfacing, local tooling tests, live phase91 canary acceptance, phase92 proof-family candidates accepted/saved as a user-runnable AGENT autorun, `accounting_profit_margin` promoted into reviewed 90/91/99 execution by phase93 live replay, `debt_due_date_aging_quality` promoted into reviewed payment-term/open-balance execution by phase94 live replay, `vendor_risk_procurement_quality` promoted into reviewed procurement-concentration evidence by phase95 live replay, and `inventory_reserve_liquidation_quality` promoted into reviewed inventory quality-event evidence by phase96 live replay; the declared route-candidate-driven enablement loop is now closed and should be used as a regression gate for the next broader autonomy slice -- Open-World Schema/Primitive Discovery progress: `38%`, with `financial counterparty flow hints` accepted live at `4/4` and `limit honesty/business language` accepted live at `6/6`; bank-document money-flow recipes expose operation/purpose/comment fields, ranked value-flow buckets carry `financial_flow_hint`, Sberbank-like leaders are bounded away from name-only supplier/customer overclaim, route/proxy/MCP-style answer wording is sanitized, and the next slice should be selected from real unfamiliar 1C asks 
rather than synthetic domain wish lists. +- Open-World Schema/Primitive Discovery progress: `95%`, with phases97-105 accepted live and saved as user-runnable AGENT autoruns; the latest closure replay `phase105_mixed_schema_primitive_closure_live3` passed `13/13` across inventory scope, historical inventory carryover, business overview role-tail hygiene, bank role/purpose, supplier payout wording, bidirectional SVK value-flow, clean payables/receivables polarity, VAT tax-period continuity, and cash-flow-vs-profit answer shape. +- Current manual checkpoint for this module: run `AGENT | Phase 105 mixed schema/primitive closure replay` (`gen-ag05131312-2d0445`) from GUI autoruns before moving the module from `95%` to final closure. - graph snapshot after latest rebuild: see `graphify-out/GRAPH_REPORT.md` - current regression-gate breakpoint: - the validated hot paths are no longer structurally broken; @@ -288,6 +317,11 @@ Latest live proof now includes: - inventory reserve/liquidation quality reviewed route accepted locally/live: answer/runtime/candidate tests passed `84/84` with `1` skipped, pilot-executor tests passed `34/34`, build passed; direct MCP query for `address_inventory_quality_events_for_organization_v1` returned `fetched_rows=0`, `matched_rows=0`, `error=null`; `phase96_inventory_reserve_liquidation_quality_rerun` accepted `2/2`; `inventory_reserve_liquidation_quality` now derives reviewed evidence from posted write-off, receipt-adjustment, stocktaking, and revaluation documents, removes the proof family from `missing_proof_families` when this reviewed route executes, anchors the organization in the direct answer, and can promote `inventory_reserve_boundary` route candidates to `ready_for_reviewed_execution`; the accepted autorun is `AGENT | Phase 96 inventory reserve/liquidation quality-events` (`gen-ag05122057-c9786e`). 
- financial counterparty flow hints accepted locally/live: targeted bank-flow/intent/turn-input/answer tests passed `554/554` with `7` skipped, build passed, graphify rebuilt to `6483` nodes, `14382` edges, `143` communities; `phase97_financial_counterparty_flow_hints_live4` accepted `4/4`, proving explicit `СБЕРБАНК` wording, bank-operation purpose/direction disclosure, incoming-bank no-overclaim, business-overview bank boundaries, and `Группа СВК` net-flow canary continuity; the accepted autorun is `AGENT | Phase 97 financial counterparty flow hints replay` (`gen-ag05122250-4451a8`). - limit honesty and business-language gate accepted locally/live: response-candidate/answer-adapter/pilot-executor/M23 tests passed `519/519` with `1` skipped, build passed, graphify rebuilt to `6484` nodes, `14385` edges, `142` communities; `phase98_limit_honesty_business_language_live3` accepted `6/6`, proving debt due-date boundary, short follow-up directness, VAT debug hygiene, top incoming bank boundary, inventory reserve boundary language, and supplier dependency language together; the accepted autorun is `AGENT | Phase 98 limit honesty and business-language replay` (`gen-ag05122315-f1e27c`). +- large-query budget/continuation policy accepted locally/live: targeted planner/pilot/answer/candidate tests passed `141/141` with `1` skipped, build passed; `phase99_large_query_budget_continuation_live2` accepted `4/4`, proving explicit-year business overview can recover money-flow coverage through monthly probes, cash-flow net is not treated as clean profit, bank-like incoming leaders stay bounded, and supplier-dependency answers remain concentration-only unless stronger evidence exists; the accepted autorun is `AGENT | Phase 99 large-query budget and continuation policy replay` (`gen-ag05131009-f08174`). 
+- large-query continuation UX accepted locally/live: response-candidate tests passed `28/28`, build passed; `phase100_large_query_continuation_ux_live2` accepted `3/3`, proving all-time row-cap disclosure becomes a safe year/quarter continuation path, the 2020 follow-up recovers checked incoming/outgoing/net numbers under company scope, and profit follow-up remains cash-flow-vs-profit honest; the accepted autorun is `AGENT | Phase 100 large-query continuation UX replay` (`gen-ag05131028-234e5e`). +- inventory root scope without warehouse clarification accepted live: `phase101_inventory_root_scope_no_warehouse_clarification_live1` accepted `7/7`, proving the manual `assistant-stage1-hyh1A1WR3j` stock-root seam now asks only for company when organization scope is ambiguous, resumes the root stock snapshot after `АЛЬТЕРНАТИВА`, preserves organization/date carryover for June 2017 and March 2016, and does not invent warehouse/item/category/material clarification requirements; the accepted autorun is `AGENT | Phase 101 inventory root scope without warehouse clarification replay` (`gen-ag05131044-cbe2ff`). + +- debt mirror clean-scope polarity accepted locally/live: targeted living-chat/transition/follow-up tests passed `95/95` with `1` skipped, build passed; `phase102_debt_mirror_clean_scope_polarity_live3` accepted `6/6`, proving a fresh bare company-name turn binds `ООО Альтернатива Плюс` through data-scope probe, clean payables/receivables keep organization scope, Committee 76.09 mirror rows are disclosed as offset evidence rather than double-counted debt, and short `а мы кому?` follow-up stays in `payables_confirmed_as_of_date` instead of drifting into open-items; the accepted autorun is `AGENT | Phase 102 debt mirror clean-scope polarity replay` (`gen-ag05131121-8c41ab`). 
Current architectural reading: @@ -319,6 +353,18 @@ For the detailed audit, current percentages, and remaining debt, read: - [26 - route_candidate_driven_enablement_loop_2026-05-10.md](./26%20-%20route_candidate_driven_enablement_loop_2026-05-10.md) - [27 - proof_family_enablement_candidates_2026-05-10.md](./27%20-%20proof_family_enablement_candidates_2026-05-10.md) - [28 - accounting_profit_margin_reviewed_route_2026-05-10.md](./28%20-%20accounting_profit_margin_reviewed_route_2026-05-10.md) +- [29 - debt_due_date_aging_reviewed_route_2026-05-10.md](./29%20-%20debt_due_date_aging_reviewed_route_2026-05-10.md) +- [30 - vendor_procurement_quality_reviewed_route_2026-05-12.md](./30%20-%20vendor_procurement_quality_reviewed_route_2026-05-12.md) +- [31 - inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md](./31%20-%20inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md) +- [32 - financial_counterparty_flow_hints_2026-05-13.md](./32%20-%20financial_counterparty_flow_hints_2026-05-13.md) +- [33 - limit_honesty_business_language_2026-05-13.md](./33%20-%20limit_honesty_business_language_2026-05-13.md) +- [34 - large_query_budget_continuation_2026-05-13.md](./34%20-%20large_query_budget_continuation_2026-05-13.md) +- [35 - large_query_continuation_ux_2026-05-13.md](./35%20-%20large_query_continuation_ux_2026-05-13.md) +- [36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md](./36%20-%20inventory_root_scope_no_warehouse_clarification_2026-05-13.md) +- [37 - debt_mirror_clean_scope_polarity_2026-05-13.md](./37%20-%20debt_mirror_clean_scope_polarity_2026-05-13.md) +- [38 - financial_role_purpose_arbitration_2026-05-13.md](./38%20-%20financial_role_purpose_arbitration_2026-05-13.md) +- [39 - generic_role_tail_anchor_hygiene_2026-05-13.md](./39%20-%20generic_role_tail_anchor_hygiene_2026-05-13.md) +- [40 - mixed_schema_primitive_closure_replay_2026-05-13.md](./40%20-%20mixed_schema_primitive_closure_replay_2026-05-13.md) ## 
Architectural Objects Of Planning @@ -370,6 +416,13 @@ Read in this order: 32. `31 - inventory_reserve_liquidation_quality_reviewed_route_2026-05-12.md` 33. `32 - financial_counterparty_flow_hints_2026-05-13.md` 34. `33 - limit_honesty_business_language_2026-05-13.md` +35. `34 - large_query_budget_continuation_2026-05-13.md` +36. `35 - large_query_continuation_ux_2026-05-13.md` +37. `36 - inventory_root_scope_no_warehouse_clarification_2026-05-13.md` +38. `37 - debt_mirror_clean_scope_polarity_2026-05-13.md` +39. `38 - financial_role_purpose_arbitration_2026-05-13.md` +40. `39 - generic_role_tail_anchor_hygiene_2026-05-13.md` +41. `40 - mixed_schema_primitive_closure_replay_2026-05-13.md` ## Planning Rules diff --git a/docs/orchestration/address_truth_harness_phase100_large_query_continuation_ux.json b/docs/orchestration/address_truth_harness_phase100_large_query_continuation_ux.json new file mode 100644 index 0000000..04ebf7e --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase100_large_query_continuation_ux.json @@ -0,0 +1,112 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase100_large_query_continuation_ux", + "domain": "address_phase100_large_query_continuation_ux", + "title": "Phase 100 large-query continuation UX replay", + "description": "Focused semantic replay for all-time or very broad business overview questions: the assistant should answer from checked evidence, disclose row-limit coverage honestly, offer a safe period-narrowing continuation path, and then recover the explicit-year follow-up through chunked evidence without leaking technical mechanics.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_all_time_overview_limit_becomes_continuation_path", + "title": "All-time overview gives checked partial answer plus safe continuation path", + "question": "Дай общий бизнес-обзор ООО Альтернатива Плюс за весь доступный период: входящие, исходящие, нетто, лучший год. 
Если срез слишком широкий, не выдумывай полный итог, а скажи как безопасно дособрать.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)альтернатива", + "(?i)входящ|поступлен", + "(?i)исходящ|списан|платеж", + "(?i)нетто|денежн", + "(?i)год|период", + "(?i)лимит строк|проверенн.*срез|не гарантия полного|не полный", + "(?i)конкретн.*год|год или квартал|дозапрос|собрать.*част" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "large_query_continuation", + "limit_honesty" + ] + }, + { + "step_id": "step_02_user_narrows_to_2020_and_gets_recovered_year", + "title": "Explicit-year continuation recovers yearly money evidence", + "question": "Ок, тогда дособери конкретно 2020: входящие, исходящие и расчетное денежное нетто.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)2020", + "(?i)входящ|поступлен", + "(?i)исходящ|списан|платеж", + "(?i)нетто|денежн", + "(?i)47\\s*628\\s*853|47,?6", + "(?i)43\\s*763\\s*351|43,?7|43,?8" + ], + "forbidden_answer_patterns": [ + "(?i)уп[её]р.*лимит", + "(?i)лимит выборки", + "(?i)лимит строк", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "large_query_budget", + "followup_continuation" + ] + }, + { + "step_id": "step_03_profit_followup_keeps_boundary_after_continuation", + "title": "Profit follow-up keeps cash-flow boundary after continuation", + "question": "Это можно считать прибылью за 2020 или нет? 
Ответь коротко и по делу.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)нет|нельзя|не стоит", + "(?i)прибыл", + "(?i)денежн|поток|нетто|поступлен", + "(?i)90/91/99|закрыт|финрезультат|себестоим|расход" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_", + "(?i)snapshot_items", + "(?i)answer_object", + "(?i)бухгалтерскому маршруту" + ], + "criticality": "critical", + "semantic_tags": [ + "profit_boundary", + "followup_directness", + "business_language" + ] + } + ] +} diff --git a/docs/orchestration/address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json b/docs/orchestration/address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json new file mode 100644 index 0000000..5494311 --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json @@ -0,0 +1,267 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification", + "domain": "address_phase101_inventory_root_scope_no_warehouse_clarification", + "title": "Phase 101 inventory root scope without warehouse clarification replay", + "description": "Focused semantic replay for the manual assistant-stage1-hyh1A1WR3j signal: after a broad stock-on-hand question and organization clarification, the assistant must resume the root inventory snapshot for the selected company across available warehouses instead of asking the user to name a warehouse, item, category, or material.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_smalltalk_entry", + "title": "Smalltalk entry stays human", + "question": "приветик - че как там дела", + "required_answer_patterns_all": [ + "(?i)привет|дела|помочь|могу" + ], + "forbidden_answer_patterns": [ + 
"(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "important", + "semantic_tags": [ + "smalltalk_entry", + "human_answer_quality" + ] + }, + { + "step_id": "step_02_capability_meta_entry", + "title": "Capability meta entry stays human and business-oriented", + "question": "расскажи что можешь интересного", + "allowed_reply_types": [ + "factual", + "factual_with_explanation" + ], + "required_direct_answer_patterns_any": [ + "(?i)могу|умею", + "(?i)ндс|документ|контрагент|долг|склад|остат" + ], + "forbidden_direct_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "important", + "semantic_tags": [ + "capability_meta", + "human_answer_quality" + ] + }, + { + "step_id": "step_03_inventory_root_requires_company_only", + "title": "Broad inventory root asks only for organization when company scope is ambiguous", + "question": "кайф - что там на складе по остаткам?", + "required_answer_patterns_all": [ + "(?i)уточни|уточнить|выбери|какую компанию|какая компания|организац", + "(?i)альтернатива плюс|лайсвуд|райм" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал" + ], + "criticality": "critical", + "semantic_tags": [ + "inventory_root", + "clarification_required", + "warehouse_not_required" + ] + }, + { + "step_id": "step_04_company_choice_resumes_inventory_without_warehouse", + "title": "Company choice resumes root inventory snapshot without asking for warehouse or item", + "question": "АЛЬТЕРНАТИВА", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + 
"expected_intents": [ + "inventory_on_hand_as_of_date" + ], + "required_filters": { + "as_of_date": "{{runtime.today_iso}}", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "warehouse", + "item", + "category" + ], + "required_direct_answer_patterns_any": [ + "(?i)на складе|остат", + "{{runtime.today_dot_regex}}" + ], + "forbidden_direct_answer_patterns": [ + "(?i)^отлично, фиксирую рабочую организацию", + "(?i)^фиксирую рабочую организацию", + "(?i)уточните организацию", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "clarification_resume", + "inventory_root", + "warehouse_not_required" + ] + }, + { + "step_id": "step_05_historical_inventory_capability_no_warehouse_reask", + "title": "Historical inventory capability follow-up stays human and does not ask for warehouse", + "question": "а исторические остатки на другие даты умеешь?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation" + ], + "required_answer_patterns_any": [ + "(?i)историческ|история", + "(?i)могу|умею" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал" + ], + "criticality": "important", + "semantic_tags": [ + "inventory_capability_meta", + "warehouse_not_required" + ] + }, + { + "step_id": "step_06_inventory_june_2017_after_capability", + "title": "Month-only follow-up keeps inventory contour and selected organization", + "question": "давай на июнь 2017", + "allowed_reply_types": [ + "factual", + 
"factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "inventory_on_hand_as_of_date" + ], + "required_filters": { + "as_of_date": "2017-06-30", + "period_from": "2017-06-01", + "period_to": "2017-06-30", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "warehouse", + "item", + "category" + ], + "required_direct_answer_patterns_any": [ + "30\\.06\\.2017", + "(?i)на складе|остат" + ], + "forbidden_direct_answer_patterns": [ + "(?i)^отлично, фиксирую рабочую организацию", + "(?i)уточните организацию", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "historical_inventory", + "date_followup", + "warehouse_not_required" + ] + }, + { + "step_id": "step_07_inventory_march_2016_stays_root", + "title": "Another short month stays in inventory root and organization scope", + "question": "март 2016", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "inventory_on_hand_as_of_date" + ], + "required_filters": { + "as_of_date": "2016-03-31", + "period_from": "2016-03-01", + "period_to": "2016-03-31", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "warehouse", + "item", + "category" + ], + "required_direct_answer_patterns_any": [ + "31\\.03\\.2016", + "(?i)на складе|остат" + ], + "forbidden_direct_answer_patterns": [ + "(?i)^отлично, фиксирую рабочую организацию", + "(?i)уточните организацию", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + 
"(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "historical_inventory", + "date_followup", + "warehouse_not_required" + ] + } + ] +} diff --git a/docs/orchestration/address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json b/docs/orchestration/address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json new file mode 100644 index 0000000..1093f8c --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json @@ -0,0 +1,207 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase102_debt_mirror_clean_scope_polarity", + "domain": "address_phase102_debt_mirror_clean_scope_polarity", + "title": "Phase 102 debt mirror clean-scope polarity replay", + "description": "Focused semantic replay for the assistant-stage1-87gHJCwTI9 concern: mirrored 76.* settlement rows around Комитет государственных услуг must not look like the same clean debt in both directions. The assistant may disclose the mirrored/offset part, but the direct business answer must keep payables and receivables polarity clean.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_choose_company_scope", + "title": "Select Альтернатива Плюс as active organization", + "question": "Альтернатива Плюс", + "required_answer_patterns_all": [ + "(?i)альтернатива плюс", + "(?i)зафиксир|работаем|организац" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "important", + "semantic_tags": [ + "company_selected", + "organization_scope" + ] + }, + { + "step_id": "step_02_clean_payables_today", + "title": "Clean payables root excludes mirrored settlement offset from direct answer", + "question": "мы кому реально должны денег на сегодня? 
коротко, чистый долг к оплате, без встречных обеспечений как основного долга", + "allowed_reply_types": [ + "factual", + "factual_with_explanation" + ], + "expected_intents": [ + "payables_confirmed_as_of_date" + ], + "required_filters": { + "as_of_date": "{{runtime.today_iso}}", + "organization": "ООО Альтернатива Плюс" + }, + "required_direct_answer_patterns_any": [ + "(?i)долг к оплате|должны|кредитор|к оплате" + ], + "forbidden_direct_answer_patterns": [ + "(?i)комитет.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454", + "(?i)3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454.*комитет" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "payables_snapshot", + "debt_mirror_clean_scope", + "direct_answer_first" + ] + }, + { + "step_id": "step_03_committee_payable_is_not_clean_debt", + "title": "Committee payable check explains mirrored/offset nature instead of clean debt overclaim", + "question": "а Комитету государственных услуг мы реально должны эти 3,6 млн или это встречное обеспечение/зачет? 
объясни именно по смыслу долга", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)комитет", + "(?i)встречн|обеспеч|зач[её]т|исключ|не.*чист|не.*основн" + ], + "forbidden_direct_answer_patterns": [ + "(?i)^.*мы\\s+должны.*комитет.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454", + "(?i)^.*комитет.*чист.*долг.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "payables_counterparty_check", + "debt_mirror_clean_scope", + "polarity_honesty" + ] + }, + { + "step_id": "step_04_committee_receivable_is_not_clean_debt", + "title": "Mirrored receivable question does not present the same amount as clean customer debt", + "question": "а нам Комитет государственных услуг тоже должен 3,6 млн? это же та же сумма — скажи честно, это дебиторка или встречная часть?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)комитет", + "(?i)встречн|обеспеч|зач[её]т|исключ|не.*чист|не.*основн|не.*дебитор" + ], + "forbidden_direct_answer_patterns": [ + "(?i)^.*нам\\s+долж.*комитет.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454", + "(?i)^.*комитет.*чист.*дебитор.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "receivables_counterparty_check", + "debt_mirror_clean_scope", + "polarity_honesty" + ] + }, + { + "step_id": "step_05_clean_receivables_today", + "title": "Clean receivables root excludes mirrored settlement offset from direct answer", + "question": "тогда кто нам реально должен денег на сегодня? 
именно чистая дебиторка", + "allowed_reply_types": [ + "factual", + "factual_with_explanation" + ], + "expected_intents": [ + "receivables_confirmed_as_of_date" + ], + "required_filters": { + "as_of_date": "{{runtime.today_iso}}", + "organization": "ООО Альтернатива Плюс" + }, + "required_direct_answer_patterns_any": [ + "(?i)нам должны|дебитор|задолж|долг" + ], + "forbidden_direct_answer_patterns": [ + "(?i)комитет.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454", + "(?i)3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454.*комитет" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "receivables_snapshot", + "debt_mirror_clean_scope", + "direct_answer_first" + ] + }, + { + "step_id": "step_06_payables_mirror_followup_keeps_clean_scope", + "title": "Short mirror follow-up returns clean payables again, not the mirrored Committee amount", + "question": "а мы кому?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation" + ], + "expected_intents": [ + "payables_confirmed_as_of_date" + ], + "required_filters": { + "as_of_date": "{{runtime.today_iso}}", + "organization": "ООО Альтернатива Плюс" + }, + "required_direct_answer_patterns_any": [ + "(?i)долг к оплате|должны|кредитор|к оплате" + ], + "forbidden_direct_answer_patterns": [ + "(?i)комитет.*3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454", + "(?i)3\\s*[\\.,]?\\s*677\\s*[\\.,]?\\s*454.*комитет" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "settlements_mirror_followup", + "payables_snapshot", + "debt_mirror_clean_scope" + ] + } + ] +} diff --git a/docs/orchestration/address_truth_harness_phase103_financial_role_purpose_arbitration.json 
b/docs/orchestration/address_truth_harness_phase103_financial_role_purpose_arbitration.json new file mode 100644 index 0000000..a6c1b8c --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase103_financial_role_purpose_arbitration.json @@ -0,0 +1,176 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase103_financial_role_purpose_arbitration", + "domain": "address_phase103_financial_role_purpose_arbitration", + "title": "Phase 103 financial role and purpose arbitration replay", + "description": "Focused semantic replay for the Open-World Schema/Primitive Discovery slice: bank-like counterparties must be explained through confirmed 1C operation/purpose evidence and must not contaminate ordinary customer, supplier, revenue, or procurement semantics. The replay starts with a bare organization choice and ends with a normal counterparty value-flow canary.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_choose_company_scope", + "title": "Bare organization choice binds active company scope", + "question": "Альтернатива Плюс", + "allowed_reply_types": [ + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)фиксир|рабоч.*организац|контур", + "(?i)альтернатива" + ], + "forbidden_answer_patterns": [ + "(?i)программ|продукт|услуг.*компан", + "(?i)не могу определить", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "bare_org_scope", + "phase102_canary" + ] + }, + { + "step_id": "step_02_sberbank_role_purpose_summary", + "title": "Sberbank role is answered by operation and purpose evidence", + "question": "По СБЕРБАНКУ за 2020 покажи коротко: сколько денег входило и уходило, и что это по смыслу в 1С — клиентская выручка, поставщик, комиссия, кредит или другой финансовый поток?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + 
"required_answer_patterns_all": [ + "(?i)сбербанк|банк|финансов", + "(?i)вход|поступ|исход|списан|ушл", + "(?i)комисс|кредит|возврат|вид операц|назначен|договор", + "(?i)не.*обычн.*клиент|не.*обычн.*поставщик|не.*клиентск.*выручк|нельзя.*читать" + ], + "forbidden_answer_patterns": [ + "(?i)обычный поставщик.*сбербанк", + "(?i)обычный клиент.*сбербанк", + "(?i)главный поставщик.*сбербанк", + "(?i)главный клиент.*сбербанк", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "financial_role_purpose", + "bank_like_counterparty", + "bank_operations_by_counterparty" + ] + }, + { + "step_id": "step_03_sberbank_incoming_customer_boundary", + "title": "Incoming Sberbank money is not overclaimed as customer revenue", + "question": "Если СБЕРБАНК есть во входящих поступлениях, можно ли считать его нашим клиентом и выручкой? Скажи по подтвержденным строкам, без притягивания.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)сбербанк|банк", + "(?i)не.*клиент|не.*выручк|нельзя.*считать|не подтвержд", + "(?i)поступ|вид операц|договор|возврат|кредит|депозит|финансов" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычный клиент", + "(?i)сбербанк.*клиентская выручка", + "(?i)точно.*выручка", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "bank_like_customer_boundary", + "customer_revenue_and_payments" + ] + }, + { + "step_id": "step_04_sberbank_outgoing_supplier_boundary", + "title": "Outgoing Sberbank money is not overclaimed as supplier dependency", + "question": "А если деньги уходили в СБЕРБАНК, это наш поставщик или финансовые списания? 
Раздели по смыслу и покажи основание.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)сбербанк|банк", + "(?i)не.*поставщик|не.*обычн.*поставщик|финансов|банковск", + "(?i)комисс|назначен|вид операц|списан|договор" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычный поставщик", + "(?i)главный поставщик.*сбербанк", + "(?i)точно.*закупк", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "bank_like_supplier_boundary", + "supplier_payouts_profile" + ] + }, + { + "step_id": "step_05_business_overview_excludes_bank_role_overclaim", + "title": "Business overview keeps bank leaders out of ordinary operating role overclaim", + "question": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "business_overview", + "expected_catalog_selected_matches_top": true, + "required_answer_patterns_all": [ + "(?i)2020|компан|организац", + "(?i)вход|поступ|исход|списан|нетто", + "(?i)банк|сбербанк|финансов", + "(?i)не.*обычн.*клиент|не.*обычн.*поставщик|не.*выручк|не.*закупк|назначен" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычный поставщик", + "(?i)сбербанк.*обычный клиент", + "(?i)чистая прибыль.*точно", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "financial_role_purpose", + "bank_like_counterparty" + ] + }, + { + "step_id": "step_06_svk_value_flow_canary_after_bank_context", + "title": "Normal counterparty net flow survives after 
bank-role context", + "question": "А теперь отдельно по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "value_flow_comparison", + "expected_catalog_selected_matches_top": true, + "required_answer_patterns_all": [ + "(?i)свк|группа", + "(?i)2020", + "(?i)получ|вход|поступ", + "(?i)заплат|исход|списан", + "(?i)нетто|сальдо|разниц" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк", + "(?i)уточните организац|какую компанию", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "counterparty_net_cash_flow", + "stale_scope_guard", + "canary" + ] + } + ] +} diff --git a/docs/orchestration/address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json b/docs/orchestration/address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json new file mode 100644 index 0000000..07045c0 --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json @@ -0,0 +1,138 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase104_generic_role_tail_anchor_hygiene", + "domain": "address_phase104_generic_role_tail_anchor_hygiene", + "title": "Phase 104 generic role-tail anchor hygiene replay", + "description": "Focused semantic replay for the Open-World Schema/Primitive Discovery slice: broad company overview wording may mention ordinary customer/supplier roles as business semantics, but generic role tails such as 'или поставщик' must not become a counterparty anchor or poison later bank/counterparty drilldowns.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_choose_company_scope", + "title": "Bare organization choice binds active company scope", + "question": 
"Альтернатива Плюс", + "allowed_reply_types": [ + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)фиксир|рабоч.*организац|контур", + "(?i)альтернатива" + ], + "forbidden_answer_patterns": [ + "(?i)не могу определить", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "bare_org_scope", + "phase102_canary" + ] + }, + { + "step_id": "step_02_business_overview_role_tail_not_counterparty", + "title": "Business overview role words do not become counterparty anchors", + "question": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "business_overview", + "expected_catalog_selected_matches_top": true, + "forbidden_filter_keys": [ + "counterparty" + ], + "forbidden_filter_values": { + "counterparty": [ + "или поставщик", + "поставщик", + "клиент", + "обычный клиент", + "обычный клиент или поставщик" + ] + }, + "required_answer_patterns_all": [ + "(?i)2020|компан|организац", + "(?i)вход|поступ|исход|списан|нетто", + "(?i)банк|сбербанк|финансов", + "(?i)не.*обычн.*клиент|не.*обычн.*поставщик|не.*выручк|не.*закупк|назначен" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычный поставщик", + "(?i)сбербанк.*обычный клиент", + "(?i)чистая прибыль.*точно", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "generic_role_tail_anchor_hygiene", + "financial_role_purpose" + ] + }, + { + "step_id": "step_03_sberbank_role_after_role_tail_overview", + "title": "Bank-role drilldown still resolves 
Sberbank after role-tail overview", + "question": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_filters": { + "counterparty": "СБЕРБАНК" + }, + "required_answer_patterns_all": [ + "(?i)сбербанк|банк", + "(?i)финансов|банковск|кредит|комисс|возврат|назначен|вид операц", + "(?i)не.*обычн.*клиент|не.*обычн.*поставщик|не.*клиентск.*выручк|нельзя.*читать" + ], + "forbidden_answer_patterns": [ + "(?i)или поставщик", + "(?i)сбербанк.*обычный поставщик", + "(?i)сбербанк.*обычный клиент", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "bank_like_counterparty", + "financial_role_purpose", + "post_overview_anchor_integrity" + ] + }, + { + "step_id": "step_04_real_supplier_prefix_still_keeps_counterparty", + "title": "Explicit supplier prefix still preserves a real counterparty", + "question": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "value_flow", + "expected_catalog_selected_matches_top": true, + "required_filters": { + "counterparty": "Группа СВК" + }, + "required_answer_patterns_all": [ + "(?i)свк|группа", + "(?i)2020", + "(?i)заплат|исход|списан|получ|поступ|нетто|смысл" + ], + "forbidden_answer_patterns": [ + "(?i)или поставщик", + "(?i)сбербанк", + "(?i)уточните организац|какую компанию", + "(?i)route_candidate|primitive|planner_|catalog_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "counterparty_net_cash_flow", + "supplier_prefix_canary", + "stale_scope_guard" + ] + } + ] +} diff --git 
a/docs/orchestration/address_truth_harness_phase105_mixed_schema_primitive_closure.json b/docs/orchestration/address_truth_harness_phase105_mixed_schema_primitive_closure.json new file mode 100644 index 0000000..979788e --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase105_mixed_schema_primitive_closure.json @@ -0,0 +1,429 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase105_mixed_schema_primitive_closure", + "domain": "address_phase105_mixed_schema_primitive_closure", + "title": "Phase 105 mixed schema/primitive closure replay", + "description": "A broad live semantic replay for the Open-World Schema/Primitive Discovery module. It intentionally crosses old and new seams: inventory root clarification, historical inventory carryover, business overview role-tail hygiene, bank role/purpose boundaries, supplier payout wording, bidirectional value-flow, clean debt polarity, VAT continuity, and cash-flow-vs-profit answer shape.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_inventory_root_requires_company_only", + "title": "Broad inventory root uses resolved organization and does not ask for warehouse", + "question": "кайф - что там на складе по остаткам?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "inventory_on_hand_as_of_date" + ], + "required_filters": { + "as_of_date": "{{runtime.today_iso}}", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "warehouse", + "item", + "category", + "counterparty" + ], + "required_direct_answer_patterns_any": [ + "(?i)на складе|остат", + "{{runtime.today_dot_regex}}" + ], + "forbidden_answer_patterns": [ + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": 
"critical", + "semantic_tags": [ + "inventory_root", + "resolved_organization_scope", + "warehouse_not_required" + ] + }, + { + "step_id": "step_02_redundant_company_choice_scope_ack", + "title": "Redundant company choice only updates organization scope", + "question": "АЛЬТЕРНАТИВА", + "allowed_reply_types": [ + "factual_with_explanation", + "partial_coverage" + ], + "forbidden_filter_keys": [ + "warehouse", + "item", + "counterparty", + "category" + ], + "required_answer_patterns_all": [ + "(?i)альтернатива плюс", + "(?i)фиксир|организац|контур|компан" + ], + "forbidden_direct_answer_patterns": [ + "(?i)уточните организацию", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "redundant_scope_selection", + "organization_scope", + "warehouse_not_required" + ] + }, + { + "step_id": "step_03_historical_inventory_meta", + "title": "Historical inventory capability follow-up stays human", + "question": "а исторические остатки на другие даты умеешь?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation" + ], + "required_answer_patterns_any": [ + "(?i)историческ|история", + "(?i)могу|умею" + ], + "forbidden_answer_patterns": [ + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "important", + "semantic_tags": [ + "inventory_capability_meta", + "warehouse_not_required" + ] + }, + { + "step_id": "step_04_inventory_june_2017_followup", + "title": "Month-only follow-up keeps inventory contour and selected organization", + "question": "давай на июнь 2017", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + 
"expected_intents": [ + "inventory_on_hand_as_of_date" + ], + "required_filters": { + "as_of_date": "2017-06-30", + "period_from": "2017-06-01", + "period_to": "2017-06-30", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "warehouse", + "item", + "category" + ], + "required_direct_answer_patterns_any": [ + "30\\.06\\.2017", + "(?i)на складе|остат" + ], + "forbidden_direct_answer_patterns": [ + "(?i)уточните организацию", + "(?i)какой\\s+конкретн.*склад", + "(?i)какой\\s+склад", + "(?i)конкретн.*товар", + "(?i)названи[ея]\\s+товар", + "(?i)категор", + "(?i)материал", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "historical_inventory", + "date_followup", + "warehouse_not_required" + ] + }, + { + "step_id": "step_05_business_overview_role_tail_hygiene", + "title": "Business overview keeps role-tail words out of counterparty filters", + "question": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "business_overview", + "expected_catalog_selected_matches_top": true, + "forbidden_filter_keys": [ + "counterparty" + ], + "required_answer_patterns_all": [ + "(?i)2020|компан|организац", + "(?i)вход|поступ|исход|списан|нетто", + "(?i)банк|сбербанк|финансов", + "(?i)не.*обычн.*клиент|не.*обычн.*поставщик|не.*выручк|не.*закупк|назначен" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычный поставщик", + "(?i)сбербанк.*обычный клиент", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "generic_role_tail_anchor_hygiene" + ] 
+ }, + { + "step_id": "step_06_sberbank_role_after_overview", + "title": "Sberbank role drilldown stays on bank-operation evidence", + "question": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_filters": { + "counterparty": "СБЕРБАНК" + }, + "required_answer_patterns_all": [ + "(?i)сбербанк|банк", + "(?i)финансов|банковск|кредит|комисс|возврат|назначен|вид операц", + "(?i)не.*обычн.*клиент|не.*обычн.*поставщик|не.*клиентск.*выручк|нельзя.*читать" + ], + "forbidden_answer_patterns": [ + "(?i)или поставщик", + "(?i)сбербанк.*обычный поставщик", + "(?i)сбербанк.*обычный клиент", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "bank_like_counterparty", + "financial_role_purpose" + ] + }, + { + "step_id": "step_07_supplier_payment_wording", + "title": "Supplier payment wording routes as supplier payout, not population", + "question": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "supplier_payouts_profile" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "value_flow", + "expected_catalog_selected_matches_top": true, + "required_answer_patterns_all": [ + "(?i)свк|группа", + "(?i)2020", + "(?i)заплат|исход|списан|платеж|платёж|денежн|смысл" + ], + "forbidden_answer_patterns": [ + "(?i)или поставщик", + "(?i)сбербанк", + "(?i)уточните организац|какую компанию", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "supplier_prefix_canary", + "value_flow" + ] + }, + { + "step_id": 
"step_08_svk_bidirectional_value_flow", + "title": "Normal counterparty bidirectional value flow survives after supplier-only ask", + "question": "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "value_flow_comparison", + "expected_catalog_selected_matches_top": true, + "required_answer_patterns_all": [ + "(?i)свк", + "(?i)2020", + "(?i)получ|вход|поступ", + "(?i)заплат|исход|списан", + "(?i)нетто|сальдо|разниц" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк", + "(?i)уточните организац|какую компанию", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "counterparty_net_cash_flow", + "stale_scope_guard" + ] + }, + { + "step_id": "step_09_clean_payables_scope", + "title": "Clean payables question keeps organization scope and debt polarity", + "question": "кому мы должны на конец 2020?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "payables_confirmed_as_of_date" + ], + "required_filters": { + "as_of_date": "2020-12-31", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "counterparty" + ], + "required_answer_patterns_all": [ + "(?i)должн|кредитор|кредиторск|поставщик|обязательств", + "(?i)2020|31\\.12\\.2020|конец" + ], + "forbidden_answer_patterns": [ + "(?i)нам должны.*мы должны", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "debt_polarity", + "payables" + ] + }, + { + "step_id": "step_10_clean_receivables_scope", + "title": "Mirror receivables follow-up switches polarity cleanly", + "question": "а нам кто должен на конец 2020?", + "allowed_reply_types": [ + 
"factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "receivables_confirmed_as_of_date" + ], + "required_filters": { + "as_of_date": "2020-12-31", + "organization": "ООО Альтернатива Плюс" + }, + "forbidden_filter_keys": [ + "counterparty" + ], + "required_answer_patterns_all": [ + "(?i)нам.*долж|дебитор|дебиторск|покупател|заказчик", + "(?i)2020|31\\.12\\.2020|конец" + ], + "forbidden_answer_patterns": [ + "(?i)мы должны.*нам должны", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "debt_polarity", + "receivables" + ] + }, + { + "step_id": "step_11_vat_tax_period_continuity", + "title": "VAT tax-period exact question survives after debt/value-flow pivots", + "question": "сколько НДС надо заплатить в налоговую за декабрь 2019?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_intents": [ + "vat_liability_confirmed_for_tax_period" + ], + "required_filters": { + "period_from": "2019-10-01", + "period_to": "2019-12-31", + "organization": "ООО Альтернатива Плюс" + }, + "required_answer_patterns_all": [ + "(?i)ндс", + "(?i)декабр|2019", + "(?i)налог|заплат|уплат|к уплате|не найден" + ], + "forbidden_answer_patterns": [ + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "vat_continuity", + "tax_period" + ] + }, + { + "step_id": "step_12_earnings_wording_to_business_overview", + "title": "Organization earnings slang routes to business overview, not customer ranking", + "question": "скока денег альтернатива заработала за 20 год?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "expected_catalog_alignment_status": "selected_matches_top", + "expected_catalog_chain_top_match": "business_overview", + "expected_catalog_selected_matches_top": true, + 
"required_answer_patterns_all": [ + "(?i)альтернатива|компан|организац", + "(?i)2020", + "(?i)поступ|вход|денег|заработ", + "(?i)не.*прибыл|не.*чист" + ], + "forbidden_answer_patterns": [ + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "earnings_wording" + ] + }, + { + "step_id": "step_13_profit_followup_boundary", + "title": "Profit follow-up distinguishes cash-flow net from clean profit", + "question": "а это чистая прибыль?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)не.*чист.*прибыл|не.*является.*прибыл|это.*не.*прибыл", + "(?i)поступ|денежн|cash|нетто|расход|себестоим|закрытие" + ], + "forbidden_answer_patterns": [ + "(?i)это чистая прибыль", + "(?i)точно.*прибыл", + "(?i)route_candidate|primitive|planner_|snapshot_items|answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "profit_boundary", + "business_overview_followup" + ] + } + ] +} diff --git a/docs/orchestration/address_truth_harness_phase99_large_query_budget_continuation.json b/docs/orchestration/address_truth_harness_phase99_large_query_budget_continuation.json new file mode 100644 index 0000000..71db079 --- /dev/null +++ b/docs/orchestration/address_truth_harness_phase99_large_query_budget_continuation.json @@ -0,0 +1,144 @@ +{ + "schema_version": "domain_truth_harness_spec_v1", + "scenario_id": "address_truth_harness_phase99_large_query_budget_continuation", + "domain": "address_phase99_large_query_budget_continuation", + "title": "Phase 99 large-query budget and continuation policy replay", + "description": "Focused semantic replay for explicit-year large business questions: the assistant should use chunked 1C evidence where available, answer direct-first, avoid fake profit claims, keep bank/counterparty boundaries, and not collapse into row-limit refusal wording when yearly money 
coverage can be recovered.", + "bindings": {}, + "steps": [ + { + "step_id": "step_01_year_business_overview_uses_chunked_money_evidence", + "title": "Explicit-year business overview gives direct money summary without row-limit refusal", + "question": "Дай взрослый бизнес-обзор ООО Альтернатива Плюс за 2020: входящие, исходящие, нетто, кто основные источники денег и где важные ограничения.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)альтернатива", + "(?i)2020", + "(?i)входящ|поступлен", + "(?i)исходящ|списан|платеж", + "(?i)нетто|денежн", + "(?i)прибыл|не.*прибыл|не.*бухгалтерск" + ], + "forbidden_answer_patterns": [ + "(?i)уп[её]р.*лимит", + "(?i)лимит выборки", + "(?i)лимит строк", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "business_overview", + "large_query_budget", + "direct_answer_first" + ] + }, + { + "step_id": "step_02_followup_profit_boundary_stays_short", + "title": "Follow-up keeps cash-flow versus profit boundary", + "question": "То есть это можно считать прибылью за 2020 или нет? 
Коротко.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)нет|нельзя|не стоит", + "(?i)прибыл", + "(?i)денежн|поступлен|поток", + "(?i)себестоим|расход|финрезультат|бухгалтерск" + ], + "forbidden_answer_patterns": [ + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_", + "(?i)snapshot_items", + "(?i)answer_object" + ], + "criticality": "critical", + "semantic_tags": [ + "profit_boundary", + "followup_directness", + "business_language" + ] + }, + { + "step_id": "step_03_top_money_source_keeps_bank_boundary", + "title": "Top money source keeps bank boundary after broad overview", + "question": "А кто за 2020 принес больше всего денег, и если там банк, не называй его обычным клиентом.", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)2020", + "(?i)деньг|поступлен|входящ", + "(?i)сбербанк|банк|финансов|не.*обычн|не.*клиент" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычн.*клиент", + "(?i)сбербанк.*главн.*клиент", + "(?i)уп[её]р.*лимит", + "(?i)лимит выборки", + "(?i)лимит строк", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_" + ], + "criticality": "critical", + "semantic_tags": [ + "financial_counterparty_flow_hint", + "large_query_budget", + "context_continuity" + ] + }, + { + "step_id": "step_04_supplier_dependency_uses_checked_outgoing_scope", + "title": "Supplier-dependency question uses checked outgoing scope and avoids hard audit overclaim", + "question": "По исходящим за 2020 есть зависимость от одного поставщика или это только денежная концентрация?", + "allowed_reply_types": [ + "factual", + "factual_with_explanation", + "partial_coverage" + ], + "required_answer_patterns_all": [ + "(?i)2020", + "(?i)исходящ|платеж|списан", + "(?i)поставщик|получател|концентрац", + 
"(?i)не.*аудит|не.*полный|не.*надежност|не.*качество" + ], + "forbidden_answer_patterns": [ + "(?i)сбербанк.*обычн.*поставщик", + "(?i)сбербанк.*главн.*поставщик", + "(?i)уп[её]р.*лимит", + "(?i)лимит выборки", + "(?i)лимит строк", + "(?i)\\bMCP\\b", + "(?i)route_candidate", + "(?i)primitive", + "(?i)planner_", + "(?i)catalog_" + ], + "criticality": "critical", + "semantic_tags": [ + "vendor_risk_procurement_quality", + "large_query_budget", + "business_language" + ] + } + ] +} diff --git a/llm_normalizer/backend/dist/services/addressFilterExtractor.js b/llm_normalizer/backend/dist/services/addressFilterExtractor.js index dca64c7..dbd6145 100644 --- a/llm_normalizer/backend/dist/services/addressFilterExtractor.js +++ b/llm_normalizer/backend/dist/services/addressFilterExtractor.js @@ -715,6 +715,33 @@ function isLowQualityCounterpartyAnchorValue(rawValue) { const isLowQualityTimeToken = (token) => lowQualityTimeTokens.has(token) || /^(?:январ|феврал|март|апрел|ма(?:й|я|е)|июн|июл|август|сентябр|октябр|ноябр|декабр)/iu.test(token); const lowQualityGenericTokens = new Set([ + "или", + "обычный", + "обычная", + "обычное", + "обычные", + "обычного", + "обычному", + "обычным", + "контрагент", + "контрагента", + "контрагенту", + "клиент", + "клиента", + "клиенту", + "клиентом", + "клиенты", + "поставщик", + "поставщика", + "поставщику", + "поставщиком", + "поставщики", + "покупатель", + "покупателя", + "покупателю", + "заказчик", + "заказчика", + "заказчику", "деньги", "денег", "деньгам", @@ -1240,6 +1267,10 @@ function isLowQualityWarehouseAnchorValue(rawValue) { "лежали", "на", "по", + "остатка", + "остаткам", + "остатками", + "остатков", "компания", "компании", "компанию", @@ -1320,7 +1351,7 @@ function extractInventoryWarehouseAnchor(text) { isLowQualityWarehouseAnchorValue(candidate) || normalizedCandidate.startsWith("по состоянию") || isTemporalWarehousePhrase(candidate) || - /^(?:сейчас|на|дату|дате|остаток|остатки)$/iu.test(candidate)) { + 
/^(?:сейчас|на|дату|дате|остат(?:ок|ки|ка|кам|ками|ков)|по\s+остат(?:кам|ки|ку|ка|ков))$/iu.test(candidate)) { continue; } return candidate; diff --git a/llm_normalizer/backend/dist/services/addressIntentResolver.js b/llm_normalizer/backend/dist/services/addressIntentResolver.js index e9bc132..525dd91 100644 --- a/llm_normalizer/backend/dist/services/addressIntentResolver.js +++ b/llm_normalizer/backend/dist/services/addressIntentResolver.js @@ -1865,7 +1865,9 @@ function resolveUnicodeAddressIntentBridge(text) { } if (/(?:поставщик|vendor|supplier|кому\s+(?:ушло|платили|заплатили)|выплат|исходящ|списан|сгрузил)/iu.test(normalized) && !/(?:аванс.*(?:не\s+)?закрыт|закрыт.*аванс)/iu.test(normalized) && - (hasMoneyCue || hasRankingCue || /плат[её]ж|оплат|выплат|outflow|payout|хвост|задержк|проблем/iu.test(normalized))) { + (hasMoneyCue || + hasRankingCue || + /заплат|платил|платили|уплат|плат[её]ж|оплат|выплат|outflow|payout|хвост|задержк|проблем/iu.test(normalized))) { return unicodeBridgeResolution(/(?:хвост|задержк|проблем)/iu.test(normalized) ? "list_payables_counterparties" : "supplier_payouts_profile", "high", /(?:хвост|задержк|проблем)/iu.test(normalized) ? 
"supplier_tail_risk_signal_detected" : "unicode_supplier_payouts_bridge_signal_detected"); @@ -2003,7 +2005,7 @@ function resolveUnicodeAddressIntentBridge(text) { return unicodeBridgeResolution("contract_usage_and_value", "high", "unicode_contract_usage_value_bridge_signal_detected"); } if (/(?:поставщик|vendor|supplier|кому\s+(?:ушло|платили|заплатили)|выплат|исходящ|списан|сгрузил)/iu.test(normalized) && - (hasMoneyCue || hasRankingCue || /плат[её]ж|оплат|выплат|outflow|payout/iu.test(normalized))) { + (hasMoneyCue || hasRankingCue || /заплат|платил|платили|уплат|плат[её]ж|оплат|выплат|outflow|payout/iu.test(normalized))) { return unicodeBridgeResolution("supplier_payouts_profile", "high", "unicode_supplier_payouts_bridge_signal_detected"); } if ((/(?:клиент|покупател|заказчик|контрагент|альтернатива|свк)/iu.test(normalized) || hasRankingCue) && diff --git a/llm_normalizer/backend/dist/services/address_runtime/composeStage.js b/llm_normalizer/backend/dist/services/address_runtime/composeStage.js index aa13e40..343bfe3 100644 --- a/llm_normalizer/backend/dist/services/address_runtime/composeStage.js +++ b/llm_normalizer/backend/dist/services/address_runtime/composeStage.js @@ -306,8 +306,43 @@ function bankOperationDirectionLabel(direction) { } return "банковская операция без надежно распознанного направления"; } -function bankOperationEvidenceLine(rows) { - const sample = rows[0]; +function summarizeBankOperationDirections(rows) { + const summary = { + incoming: { count: 0, amount: 0 }, + outgoing: { count: 0, amount: 0 }, + unknown: { count: 0, amount: 0 } + }; + for (const row of rows) { + const direction = bankOperationDirection(row); + const amount = typeof row.amount === "number" && Number.isFinite(row.amount) ? 
Math.abs(row.amount) : 0; + summary[direction].count += 1; + summary[direction].amount += amount; + } + const parts = []; + if (summary.incoming.count > 0) { + parts.push(`входящие: ${formatMoneyRub(summary.incoming.amount)} (${summary.incoming.count} строк)`); + } + if (summary.outgoing.count > 0) { + parts.push(`исходящие: ${formatMoneyRub(summary.outgoing.amount)} (${summary.outgoing.count} строк)`); + } + if (summary.unknown.count > 0) { + parts.push(`без распознанного направления: ${formatMoneyRub(summary.unknown.amount)} (${summary.unknown.count} строк)`); + } + return parts.length > 0 + ? `Сводка по направлению: ${parts.join("; ")}.` + : "Сводка по направлению: подтвержденные строки не найдены."; +} +function preferredBankEvidenceDirection(userMessage) { + if (hasBankIncomingRoleBoundaryQuestion(userMessage)) { + return "incoming"; + } + if (hasBankOutgoingRoleBoundaryQuestion(userMessage)) { + return "outgoing"; + } + return null; +} +function bankOperationEvidenceLine(rows, preferredDirection = null) { + const sample = (preferredDirection ? rows.find((row) => bankOperationDirection(row) === preferredDirection) : null) ?? rows[0]; if (!sample) { return "Проверенная строка 1С не найдена."; } @@ -341,12 +376,12 @@ function bankRoleBoundaryLine(userMessage, rows) { const hasOutgoingRow = directions.includes("outgoing"); if (incomingBoundary) { return hasIncomingRow - ? "Выручкой от обычного клиента это не называю автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; кредитный, депозитный или возвратный смысл без этих полей не исключаю и не притягиваю." + ? "Это не обычный клиент и не клиентская выручка автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; кредитный, депозитный или возвратный смысл без этих полей не исключаю и не притягиваю." : hasOutgoingRow - ? 
"В найденных строках по банку подтверждено исходящее списание, а входящее поступление от банка в этом срезе не подтверждено; клиентскую выручку, кредит или депозит по этой строке не доказываю." - : "Входящее поступление от банка в найденных строках не подтверждено; клиентскую выручку, кредитный или депозитный смысл без вида операции/назначения платежа не доказываю."; + ? "В найденных строках по банку подтверждено исходящее списание, а входящее поступление от банка в этом срезе не подтверждено; это не подтвержденная клиентская выручка, кредит или депозит." + : "Входящее поступление от банка в найденных строках не подтверждено; это не подтвержденная клиентская выручка, кредитный или депозитный смысл."; } - return "Обычным поставщиком это не называю автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; текущий срез подтверждает банковский платежный контур, а не бизнес-роль поставщика."; + return "Это не обычный поставщик автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; текущий срез подтверждает банковский платежный контур, а не бизнес-роль поставщика."; } function hasInventoryPurchaseDateActionFocus(userMessage) { const text = normalizeQuestionText(userMessage); @@ -3896,23 +3931,34 @@ function composeFactualReplyBody(intent, rows, options = {}) { .filter((item) => Boolean(item))); const counterparty = resolvePreferredCounterpartyDisplayLabel(options.counterpartyHint, rowCounterparties); const roleBoundary = bankRoleBoundaryLine(options.userMessage, rows); + const visibleRows = rows.slice(0, Math.min(rows.length, 5)); const lines = [ `Коротко: найдено банковских операций${counterparty ? ` по ${counterparty}` : " по контрагенту"} — ${rows.length}.`, + summarizeBankOperationDirections(rows), roleBoundary ?? 
"Показываю подтвержденные банковские операции из текущего среза.", - bankOperationEvidenceLine(rows), - ...formatTopRows(rows, rows.length) + bankOperationEvidenceLine(rows, preferredBankEvidenceDirection(options.userMessage)), + ...formatTopRows(visibleRows, visibleRows.length) ]; + if (rows.length > visibleRows.length) { + lines.push(`Показаны первые ${visibleRows.length} из ${rows.length}; полный список остается в подтвержденном срезе.`); + } return { responseType: "FACTUAL_LIST", text: lines.join("\n") }; } if (intent === "bank_operations_by_contract") { + const visibleRows = rows.slice(0, Math.min(rows.length, 5)); const lines = [ `Коротко: найдено банковских операций по договору — ${rows.length}.`, + summarizeBankOperationDirections(rows), "Показываю подтвержденные банковские операции из текущего среза.", - ...formatTopRows(rows, rows.length) + bankOperationEvidenceLine(rows), + ...formatTopRows(visibleRows, visibleRows.length) ]; + if (rows.length > visibleRows.length) { + lines.push(`Показаны первые ${visibleRows.length} из ${rows.length}; полный список остается в подтвержденном срезе.`); + } return { responseType: "FACTUAL_LIST", text: lines.join("\n") diff --git a/llm_normalizer/backend/dist/services/address_runtime/decomposeStage.js b/llm_normalizer/backend/dist/services/address_runtime/decomposeStage.js index 8e4f3e9..0be93c3 100644 --- a/llm_normalizer/backend/dist/services/address_runtime/decomposeStage.js +++ b/llm_normalizer/backend/dist/services/address_runtime/decomposeStage.js @@ -139,6 +139,7 @@ const FOLLOWUP_LOW_QUALITY_COUNTERPARTY_TOKENS = new Set([ "что", "все", "всё", + "или", "кроме", "помимо", "этого", @@ -157,6 +158,30 @@ const FOLLOWUP_LOW_QUALITY_COUNTERPARTY_TOKENS = new Set([ "договора", "контрагент", "контрагента", + "контрагенту", + "клиент", + "клиента", + "клиенту", + "клиентом", + "клиенты", + "поставщик", + "поставщика", + "поставщику", + "поставщиком", + "поставщики", + "покупатель", + "покупателя", + "покупателю", + 
"заказчик", + "заказчика", + "заказчику", + "обычный", + "обычная", + "обычное", + "обычные", + "обычного", + "обычному", + "обычным", "еще", "ещё", "другие", @@ -654,6 +679,19 @@ function hasBroadCounterpartyRankingCue(text) { } return /(?:\bкто\b|\bкакие\b|\bкакой\b|\bтоп\b|\bсписок\b|\bвсе\b|\bвсех\b|\bвсего\b|\bclients?\b|\bcounterpart(?:y|ies)\b|контрагент|клиент|заказчик)/iu.test(normalized); } +function isBroadDebtPolarityQuestion(intent, text) { + if (intent !== "payables_confirmed_as_of_date" && intent !== "receivables_confirmed_as_of_date") { + return false; + } + const normalized = textWithRepairedVariant(String(text ?? "")).toLowerCase().replace(/ё/g, "е"); + if (!/(?:долж|задолж|дебитор|кредитор|обязательств)/iu.test(normalized)) { + return false; + } + if (/(?:по\s+(?:нему|ней|ним|этому|этой|этому\s+контрагенту|этой\s+компании|поставщику|клиенту|покупателю|заказчику)|\bон\b|\bона\b)/iu.test(normalized)) { + return false; + } + return /(?:^|[\s,.;:!?()\-])(?:кто|кому|какие|какой|список|топ|все|всех|всего)(?=$|[\s,.;:!?()\-])/iu.test(normalized); +} function mergeFollowupFilters(current, intent, userMessage, followupContext) { const merged = { ...current }; const reasons = []; @@ -823,10 +861,15 @@ function mergeFollowupFilters(current, intent, userMessage, followupContext) { const inheritedCounterparty = previousCounterparty ?? (followupContext.previous_anchor_type === "counterparty" ? 
previousAnchorValue : null); const currentCounterparty = toNonEmptyString(merged.counterparty); - const shouldInheritCounterparty = !currentCounterparty || - (Boolean(inheritedCounterparty) && - isLowQualityCounterpartyAnchor(currentCounterparty) && - !isLowQualityCounterpartyAnchor(inheritedCounterparty)); + const suppressCounterpartyForBroadDebtQuestion = isBroadDebtPolarityQuestion(intent, userMessage) && !currentCounterparty; + const shouldInheritCounterparty = !suppressCounterpartyForBroadDebtQuestion && + (!currentCounterparty || + (Boolean(inheritedCounterparty) && + isLowQualityCounterpartyAnchor(currentCounterparty) && + !isLowQualityCounterpartyAnchor(inheritedCounterparty))); + if (inheritedCounterparty && suppressCounterpartyForBroadDebtQuestion) { + reasons.push("counterparty_carryover_suppressed_for_broad_debt_polarity_question"); + } if (inheritedCounterparty && shouldInheritCounterparty) { merged.counterparty = inheritedCounterparty; reasons.push(currentCounterparty ? "counterparty_replaced_from_followup_context" : "counterparty_from_followup_context"); diff --git a/llm_normalizer/backend/dist/services/assistantAddressAttemptRuntimeAdapter.js b/llm_normalizer/backend/dist/services/assistantAddressAttemptRuntimeAdapter.js index f76f99d..5958f1f 100644 --- a/llm_normalizer/backend/dist/services/assistantAddressAttemptRuntimeAdapter.js +++ b/llm_normalizer/backend/dist/services/assistantAddressAttemptRuntimeAdapter.js @@ -69,6 +69,7 @@ async function runAssistantAddressAttemptRuntime(input) { hasLivingChatSignal: input.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: input.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: input.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: input.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: input.resolveDataScopeProbe, applyScriptGuard: input.applyScriptGuard, applyGroundingGuard: input.applyGroundingGuard, diff --git 
a/llm_normalizer/backend/dist/services/assistantLivingChatAttemptInputBuilder.js b/llm_normalizer/backend/dist/services/assistantLivingChatAttemptInputBuilder.js index 030c0ba..2e4bf7f 100644 --- a/llm_normalizer/backend/dist/services/assistantLivingChatAttemptInputBuilder.js +++ b/llm_normalizer/backend/dist/services/assistantLivingChatAttemptInputBuilder.js @@ -26,6 +26,7 @@ function buildAssistantLivingChatAttemptRuntimeInput(input) { hasLivingChatSignal: input.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: input.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: input.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: input.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: input.resolveDataScopeProbe, applyScriptGuard: input.applyScriptGuard, applyGroundingGuard: input.applyGroundingGuard, diff --git a/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeAdapter.js b/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeAdapter.js index 79a31eb..67bb1c3 100644 --- a/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeAdapter.js +++ b/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeAdapter.js @@ -41,6 +41,7 @@ async function runAssistantLivingChatAttemptRuntime(input) { hasLivingChatSignal: input.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: input.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: input.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: input.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: input.resolveDataScopeProbe, executeLlmChat, applyScriptGuard: input.applyScriptGuard, diff --git a/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeInputBuilder.js b/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeInputBuilder.js index 4215072..525300c 100644 --- 
a/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeInputBuilder.js +++ b/llm_normalizer/backend/dist/services/assistantLivingChatAttemptRuntimeInputBuilder.js @@ -36,6 +36,7 @@ function buildAssistantLivingChatHandlerRuntimeInput(input) { hasLivingChatSignal: input.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: input.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: input.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: input.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: input.resolveDataScopeProbe, executeLlmChat: input.executeLlmChat, applyScriptGuard: input.applyScriptGuard, diff --git a/llm_normalizer/backend/dist/services/assistantLivingChatHandlerRuntimeAdapter.js b/llm_normalizer/backend/dist/services/assistantLivingChatHandlerRuntimeAdapter.js index 0fe990c..3b2361d 100644 --- a/llm_normalizer/backend/dist/services/assistantLivingChatHandlerRuntimeAdapter.js +++ b/llm_normalizer/backend/dist/services/assistantLivingChatHandlerRuntimeAdapter.js @@ -26,6 +26,7 @@ async function tryHandleAssistantLivingChatRuntime(input) { hasLivingChatSignal: input.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: input.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: input.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: input.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: input.resolveDataScopeProbe, executeLlmChat: input.executeLlmChat, applyScriptGuard: input.applyScriptGuard, diff --git a/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js b/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js index c040a77..b048186 100644 --- a/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js +++ b/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js @@ -11,6 +11,34 @@ function hasPriorAssistantTurn(items) { 
} return items.some((item) => item && typeof item === "object" && item.role === "assistant"); } +function shouldProbeBareOrganizationScopeCandidate(input) { + if (input.selectedOrganization || + input.activeOrganization || + input.dataScopeMetaQuery || + input.capabilityMetaQuery || + input.destructiveSignal || + input.dangerSignal || + input.operationalSignal) { + return false; + } + const raw = String(input.userMessage ?? "").trim(); + if (!raw || raw.length > 80 || /[?!]/u.test(raw) || /\d/u.test(raw) || !/\p{L}/u.test(raw)) { + return false; + } + const tokenCount = raw.split(/\s+/u).filter(Boolean).length; + if (tokenCount < 1 || tokenCount > 5) { + return false; + } + const normalized = raw + .toLowerCase() + .replace(/\u0451/gu, "\u0435") + .replace(/\s+/gu, " ") + .trim(); + if (/^(?:\u043f\u0440\u0438\u0432\u0435\u0442|\u0437\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439|\u0437\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435|\u0434\u0430|\u043d\u0435\u0442|\u043e\u043a|\u043e\u043a\u0435\u0439|\u0441\u043f\u0430\u0441\u0438\u0431\u043e|\u043f\u043e\u043a\u0430|\u0433\u043e|\u0434\u0430\u043b\u044c\u0448\u0435|\u043f\u043e\u043d\u044f\u043b|\u043f\u043e\u043d\u044f\u043b\u0430)(?:\s|$)/iu.test(normalized)) { + return false; + } + return 
!/(?:\u0441\u043a\u043e\u043b\u044c\u043a\u043e|\u043f\u043e\u043a\u0430\u0436\u0438|\u0434\u0430\u0439|\u0440\u0430\u0441\u0441\u043a\u0430\u0436\u0438|\u0447\u0442\u043e|\u043a\u0430\u043a|\u0433\u0434\u0435|\u043a\u043e\u0433\u0434\u0430|\u043f\u043e\u0447\u0435\u043c\u0443|\u0437\u0430\u0447\u0435\u043c|\u043c\u043e\u0436\u0435\u0448\u044c|\u0443\u043c\u0435\u0435\u0448\u044c|\u043d\u0430\u0434\u043e|\u043d\u0443\u0436\u043d\u043e|\u0445\u043e\u0447\u0443|\u043e\u0441\u0442\u0430\u0442\u043a|\u043d\u0434\u0441|\u0434\u043e\u043b\u0433|\u0434\u0435\u0431\u0438\u0442\u043e\u0440|\u043a\u0440\u0435\u0434\u0438\u0442\u043e\u0440|\u0441\u043a\u043b\u0430\u0434|\u0442\u043e\u0432\u0430\u0440|\u043a\u043e\u043d\u0442\u0440\u0430\u0433\u0435\u043d\u0442|\u043e\u0431\u043e\u0440\u043e\u0442|\u0432\u044b\u0440\u0443\u0447\u043a|\u043f\u0440\u0438\u0431\u044b\u043b)/iu.test(normalized); +} function buildDeterministicSmalltalkLeadReply() { return "\u041f\u0440\u0438\u0432\u0435\u0442! \u0412\u0441\u0451 \u043d\u043e\u0440\u043c\u0430\u043b\u044c\u043d\u043e."; } @@ -80,6 +108,8 @@ async function runAssistantLivingChatRuntime(input) { let livingChatGroundingGuardApplied = false; let livingChatGroundingGuardReason = null; let livingChatProactiveScopeOfferApplied = false; + let livingChatBareScopeProbeAttempted = false; + let livingChatBareScopeProbeMatchedOrganization = null; const continuityActiveOrganization = organizationAuthority.continuityActiveOrganization; let knownOrganizations = [...organizationAuthority.knownOrganizations]; let selectedOrganization = organizationAuthority.selectedOrganization; @@ -101,6 +131,29 @@ async function runAssistantLivingChatRuntime(input) { const lastGroundedInventoryAddressDebug = memoryRecapContext.lastGroundedInventoryAddressDebug; const lastMemoryAddressDebug = memoryRecapContext.lastMemoryAddressDebug; const lastAnswerInspectionAddressDebug = memoryRecapContext.lastAnswerInspectionAddressDebug; + if 
(shouldProbeBareOrganizationScopeCandidate({ + userMessage, + selectedOrganization, + activeOrganization, + dataScopeMetaQuery, + capabilityMetaQuery, + destructiveSignal, + dangerSignal, + operationalSignal + })) { + dataScopeProbe = await input.resolveDataScopeProbe(); + livingChatBareScopeProbeAttempted = true; + knownOrganizations = input.mergeKnownOrganizations([ + ...knownOrganizations, + ...(Array.isArray(dataScopeProbe?.organizations) ? dataScopeProbe.organizations : []) + ]); + const probedOrganization = input.resolveOrganizationSelectionFromMessage(userMessage, knownOrganizations); + if (probedOrganization) { + selectedOrganization = probedOrganization; + activeOrganization = probedOrganization; + livingChatBareScopeProbeMatchedOrganization = probedOrganization; + } + } if (capabilityMetaQuery && (destructiveSignal || dangerSignal)) { chatText = input.buildAssistantSafetyRefusalReply(); livingChatSource = "deterministic_safety_refusal"; @@ -302,6 +355,8 @@ async function runAssistantLivingChatRuntime(input) { living_chat_grounding_guard_applied: livingChatGroundingGuardApplied, living_chat_grounding_guard_reason: livingChatGroundingGuardReason, living_chat_proactive_scope_offer_applied: livingChatProactiveScopeOfferApplied, + living_chat_bare_scope_probe_attempted: livingChatBareScopeProbeAttempted, + living_chat_bare_scope_probe_matched_organization: livingChatBareScopeProbeMatchedOrganization, living_chat_data_scope_probe_status: dataScopeProbe?.status ?? null, living_chat_data_scope_probe_channel: dataScopeProbe?.channel ?? 
null, living_chat_data_scope_probe_org_count: Array.isArray(dataScopeProbe?.organizations) diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryAnswerAdapter.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryAnswerAdapter.js index e955ec8..5a7e81c 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryAnswerAdapter.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryAnswerAdapter.js @@ -505,7 +505,7 @@ function businessOverviewOutgoingLeaderLine(overview) { function businessOverviewSupplierBoundaryBasis(overview) { const leader = overview.top_suppliers?.[0] ?? null; if (!leader) { - return "есть только общий срез исходящих платежей без надежного vendor-risk профиля"; + return "есть только общий срез исходящих платежей без надежного профиля поставщицкого риска"; } const share = percentText(leader.total_amount, overview.outgoing_supplier_payout.total_amount); if (isFinancialInstitutionBucket(leader)) { @@ -540,9 +540,9 @@ function businessOverviewHeadlineMetricsLine(overview) { ? `минус ${inlineBusinessOverviewAmount(result.final_result_amount_human_ru)}` : inlineBusinessOverviewAmount(result.final_result_amount_human_ru); const margin = result.net_margin_to_revenue_pct === null - ? "маржа к выручке 90.01 не рассчитана" - : `маржа к выручке 90.01 ${result.net_margin_to_revenue_pct}%`; - parts.push(`${direction} 90/91/99 ${amount}; ${margin}`); + ? "маржа к подтвержденной выручке не рассчитана" + : `маржа к подтвержденной выручке ${result.net_margin_to_revenue_pct}%`; + parts.push(`${direction} по закрытию счетов 90/91/99 ${amount}; ${margin}`); } const strongestIncomingYear = businessOverviewStrongestIncomingYear(overview); if (strongestIncomingYear) { @@ -551,7 +551,7 @@ function businessOverviewHeadlineMetricsLine(overview) { return parts.length > 0 ? overview.accounting_financial_result ? `${parts.join("; ")}. 
Финрезультат ограничен найденными строками 1С и не является внешним аудитом или юридически подтвержденной отчетностью` - : `${parts.join("; ")}. Это operating-flow proxy по найденным строкам, не бухгалтерская прибыль и не финрезультат` + : `${parts.join("; ")}. Это операционный денежный сигнал по найденным строкам, не бухгалтерская прибыль и не финрезультат` : null; } function businessOverviewAccountingFinancialResultText(overview) { @@ -568,12 +568,12 @@ function businessOverviewAccountingFinancialResultText(overview) { ? `минус ${result.final_result_amount_human_ru}` : result.final_result_amount_human_ru; const marginText = result.net_margin_to_revenue_pct === null - ? "маржа к выручке 90.01 не рассчитана" - : `маржа к выручке 90.01 ${result.net_margin_to_revenue_pct}%`; + ? "маржа к подтвержденной выручке не рассчитана" + : `маржа к подтвержденной выручке ${result.net_margin_to_revenue_pct}%`; const basis = result.final_transfer_basis === "account_99_to_84_period_close" ? "по закрытию 99 на 84" : "по закрытию 90/91 на 99"; - return `По бухгалтерскому маршруту 90/91/99 за ${result.period_scope} подтвержден ${direction}: ${signedAmount}; ${marginText}. Основа: ${basis}, ${result.period_close_rows_with_amount} строк(и) закрытия периода с суммой. Это учетный финрезультат по найденным строкам 1С, не внешний аудит и не юридически подтвержденная отчетность.`; + return `Нет: денежное операционное нетто не стоит считать чистой прибылью. Отдельно по закрытию счетов 90/91/99 в 1С за ${result.period_scope} подтвержден ${direction}: ${signedAmount}; ${marginText}. Основа: ${basis}, ${result.period_close_rows_with_amount} строк(и) закрытия периода с суммой. Это учетный финрезультат по найденным строкам 1С, не внешний аудит и не юридически подтвержденная отчетность.`; } function businessOverviewDebtDueDateAgingText(overview) { const aging = overview.debt_due_date_aging; @@ -637,12 +637,12 @@ function businessOverviewVendorProcurementQualityText(overview) { ? 
` Договорный профиль: используется ${quality.used_contracts} договоров.` : ` Договорный профиль: используется ${quality.used_contracts}/${quality.total_contracts} договоров${quality.used_contract_share_pct === null ? "" : ` (${quality.used_contract_share_pct}%)`}.`; if (quality.evidence_status === "financial_institution_leads_outgoing_cash") { - return `Проверенный procurement-concentration route за ${period}: крупнейший получатель исходящих денег ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}. По названию это банк/финансовая организация, поэтому зависимость от обычного поставщика этим не подтверждается.${financialFlowHintTextRuFromBucket(top)}${nonFinancialText}${contractText} Надежность поставщиков, качество поставок, назначение каждого платежа и полная структура расходов этим маршрутом не доказаны.`; + return `Проверка концентрации закупок/исходящих платежей за ${period}: крупнейший получатель исходящих денег ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}. 
По названию это банк/финансовая организация, поэтому зависимость от обычного поставщика этим не подтверждается.${financialFlowHintTextRuFromBucket(top)}${nonFinancialText}${contractText} Надежность поставщиков, качество поставок, назначение каждого платежа и полная структура расходов этим срезом не доказаны.`; } if (quality.evidence_status === "reviewed_procurement_concentration") { - return `Проверенный procurement-concentration route за ${period}: крупнейший поставщик/получатель исходящих платежей ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}.${contractText} Это проверенный сигнал концентрации закупок/исходящих платежей, но не аудит надежности поставщика, качества поставок и полной структуры расходов.`; + return `Проверка концентрации закупок/исходящих платежей за ${period}: крупнейший поставщик/получатель исходящих платежей ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}.${contractText} Это проверенный сигнал концентрации закупок/исходящих платежей, но не аудит надежности поставщика, качества поставок и полной структуры расходов.`; } - return `Procurement-concentration route за ${period} отработал по исходящим платежам на ${total}, но надежной небанковской концентрации поставщика по найденным строкам не хватает.${contractText} Полный vendor-risk аудит не подтвержден.`; + return `Проверка концентрации закупок/исходящих платежей за ${period} нашла исходящие платежи на ${total}, но надежной небанковской концентрации поставщика по найденным строкам не хватает.${contractText} Полный аудит поставщицкого риска не подтвержден.`; } function businessOverviewInventoryQualityEventsText(overview) { const quality = overview.inventory_quality_events; @@ -684,7 +684,7 @@ function headlineFor(mode, pilot) { if (accountingFinancialResultText) { return accountingFinancialResultText; } - return "Нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только bounded operating-flow/trading-margin proxy, не P&L и не 
бухгалтерский финрезультат."; + return "Нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только ограниченный операционный денежный/товарный сигнал, а не полный отчет о прибыли и не бухгалтерский финрезультат."; } if (isDebtDueDateBoundaryTurn(pilot)) { const dueDateText = businessOverviewDebtDueDateAgingText(overview); @@ -1375,6 +1375,10 @@ function derivedBusinessOverviewConfirmedLines(pilot) { if (overview.yearly_breakdown?.length) { lines.push(`Годовая раскладка операционного денежного потока построена по подтвержденным строкам 1С за ${yearCountHumanRu(overview.yearly_breakdown.length)}.`); } + if (overview.incoming_customer_revenue.coverage_recovered_by_period_chunking || + overview.outgoing_supplier_payout.coverage_recovered_by_period_chunking) { + lines.push("Денежное покрытие бизнес-обзора за год восстановлено через помесячные 1С-проверки, а не только через широкий общий запрос."); + } if (overview.activity_period) { lines.push(`Окно подтвержденной активности в 1С: ${overview.activity_period.first_activity_date} — ${overview.activity_period.latest_activity_date}; ориентировочно ${overview.activity_period.duration_human_ru}.`); } @@ -1536,7 +1540,7 @@ function businessOverviewSupplierConcentrationLine(overview) { return `${base}. По названию это банк/финансовая организация, поэтому это не доказательство зависимости от обычного поставщика без проверки назначения платежа/договора.${nonFinancial ? ` Крупнейший небанковский получатель исходящих денег: ${rankedBucketAmountLabel(nonFinancial)}.` : ""}`; } return share - ? `Концентрация исходящего потока: крупнейший подтвержденный поставщик/получатель исходящих платежей ${leader.axis_value} держит около ${share} проверенных исходящих платежей (${leader.total_amount_human_ru}). Это сигнал procurement concentration по найденным строкам, а не полный vendor-risk аудит или структура всех расходов.` + ? 
`Концентрация исходящего потока: крупнейший подтвержденный поставщик/получатель исходящих платежей ${leader.axis_value} держит около ${share} проверенных исходящих платежей (${leader.total_amount_human_ru}). Это сигнал концентрации закупок/исходящих платежей по найденным строкам, а не полный аудит поставщицкого риска или структура всех расходов.` : `Крупнейший подтвержденный поставщик/получатель исходящих платежей в проверенном срезе: ${leader.axis_value} — ${leader.total_amount_human_ru}.`; } function businessOverviewYearlyOperatingLine(overview) { @@ -1561,7 +1565,7 @@ function businessOverviewYearlyOperatingLine(overview) { : `нетто в плюс ${strongestNetYear.net_amount_human_ru}`; parts.push(`лучший год по расчетному операционному нетто ${strongestNetYear.year_bucket}: ${netText}`); } - return `Годовая динамика по проверенным строкам: ${parts.join("; ")}. Это operating-flow proxy, не бухгалтерская прибыль и не финрезультат.`; + return `Годовая динамика по проверенным строкам: ${parts.join("; ")}. Это операционный денежный сигнал, не бухгалтерская прибыль и не финрезультат.`; } function businessOverviewRiskSynthesisLine(overview) { const signals = []; @@ -1587,9 +1591,9 @@ function businessOverviewRiskSynthesisLine(overview) { ? "учетный убыток" : "нулевой учетный финрезультат"; const marginText = result.net_margin_to_revenue_pct === null - ? "маржа к выручке 90.01 не рассчитана" - : `маржа к выручке 90.01 ${result.net_margin_to_revenue_pct}%`; - signals.push(`${direction} 90/91/99 ${result.final_result_amount_human_ru}, ${marginText}`); + ? 
"маржа к подтвержденной выручке не рассчитана" + : `маржа к подтвержденной выручке ${result.net_margin_to_revenue_pct}%`; + signals.push(`${direction} по закрытию счетов 90/91/99 ${result.final_result_amount_human_ru}, ${marginText}`); } if (overview.debt_position) { const debtDirection = overview.debt_position.net_debt_position_direction === "net_receivable" diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPilotExecutor.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPilotExecutor.js index 5860977..c656e3a 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPilotExecutor.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPilotExecutor.js @@ -3584,6 +3584,10 @@ function buildBusinessOverviewConfirmedFacts(derived) { if (derived.yearly_breakdown.length > 0) { facts.push(`Годовая раскладка операционного денежного потока построена по подтвержденным строкам 1С за ${yearCountHumanRu(derived.yearly_breakdown.length)}.`); } + if (derived.incoming_customer_revenue.coverage_recovered_by_period_chunking || + derived.outgoing_supplier_payout.coverage_recovered_by_period_chunking) { + facts.push("Денежное покрытие бизнес-обзора за год восстановлено через помесячные 1С-проверки, а не только через широкий общий запрос."); + } if (derived.activity_period) { facts.push(`Подтвержденное окно активности в 1С: ${derived.activity_period.first_activity_date} — ${derived.activity_period.latest_activity_date}.`); } @@ -3820,7 +3824,7 @@ function buildBusinessOverviewUnknownFacts(derived) { : null ].filter((item) => Boolean(item)); if (derived?.coverage_limited_by_probe_limit) { - unknowns.unshift("Полное покрытие бизнес-обзора не подтверждено: хотя бы один денежный probe достиг лимита строк."); + unknowns.unshift("Полное покрытие бизнес-обзора не подтверждено: хотя бы один денежный запрос достиг верхней границы выборки."); } return unknowns; } @@ -4735,6 +4739,12 @@ async function 
executeAssistantMcpDiscoveryPilot(planner, deps = DEFAULT_DEPS) { if (!incomingResult?.error || !outgoingResult?.error) { pushReason(reasonCodes, "pilot_business_overview_query_movements_mcp_executed"); } + if (incomingResult?.coverage_recovered_by_period_chunking) { + pushReason(reasonCodes, "pilot_business_overview_incoming_monthly_period_chunking_recovered_coverage"); + } + if (outgoingResult?.coverage_recovered_by_period_chunking) { + pushReason(reasonCodes, "pilot_business_overview_outgoing_monthly_period_chunking_recovered_coverage"); + } if (taxResult?.error) { pushUnique(queryLimitations, taxResult.error); pushReason(reasonCodes, "pilot_business_overview_tax_query_mcp_error"); diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js index e08fdb9..2c7ac9e 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js @@ -5,6 +5,7 @@ exports.planAssistantMcpDiscovery = planAssistantMcpDiscovery; const assistantMcpDiscoveryPolicy_1 = require("./assistantMcpDiscoveryPolicy"); const assistantMcpCatalogIndex_1 = require("./assistantMcpCatalogIndex"); exports.ASSISTANT_MCP_DISCOVERY_PLANNER_SCHEMA_VERSION = "assistant_mcp_discovery_planner_v1"; +const CHUNKED_COVERAGE_PROBE_BUDGET = 30; function toNonEmptyString(value) { if (value === null || value === undefined) { return null; @@ -385,12 +386,14 @@ function budgetOverrideFor(input, recipe) { (recipe.semanticDataNeed === "counterparty value-flow evidence" || recipe.semanticDataNeed === "bidirectional value-flow comparison evidence" || recipe.semanticDataNeed === "ranked value-flow evidence"); - if (!isValueFlowRecipe) { + const isBusinessOverviewRecipe = recipe.primitives.includes("query_movements") && + recipe.chainId === "business_overview"; + if (!isValueFlowRecipe && !isBusinessOverviewRecipe) { return {}; } if 
(requestedAggregationAxis === "month" || isYearDateScope(meaning)) { return { - maxProbeCount: 30 + maxProbeCount: CHUNKED_COVERAGE_PROBE_BUDGET }; } return {}; diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponseCandidate.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponseCandidate.js index e1e8f04..07cd323 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponseCandidate.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponseCandidate.js @@ -403,8 +403,9 @@ function businessOverviewCoverageLimitLine(overview) { if (outgoing?.coverage_limited_by_probe_limit === true) { limited.push("исходящие"); } + const continuation = "Если нужен полный сквозной ответ, безопасный следующий шаг — выбрать конкретный год или квартал для дозапроса: тогда широкий срез можно собрать частями без выдачи непроверенного итога."; return limited.length > 0 - ? `Важно: по направлению ${limited.join(" и ")} проверка достигла лимита строк; это расширенный проверенный срез найденных строк, но не гарантия полного бухгалтерского оборота без отдельной полной выгрузки.` + ? `Важно: по направлению ${limited.join(" и ")} проверка достигла лимита строк; это расширенный проверенный срез найденных строк, но не гарантия полного бухгалтерского оборота без отдельной полной выгрузки. 
${continuation}` : null; } function joinBusinessReplyLines(lines) { @@ -560,6 +561,8 @@ function bidirectionalNetLabel(direction) { return "нетто в нашу сторону"; } function buildCompactBidirectionalValueFlowReply(entryPoint, draft) { + const turnInput = toRecordObject(entryPoint.turn_input); + const turnMeaning = toRecordObject(turnInput?.turn_meaning_ref); const bridge = toRecordObject(entryPoint.bridge); const pilot = toRecordObject(bridge?.pilot); const flow = toRecordObject(pilot?.derived_bidirectional_value_flow); @@ -574,7 +577,13 @@ function buildCompactBidirectionalValueFlowReply(entryPoint, draft) { if (!incomingAmount && !outgoingAmount && !netAmount) { return null; } - const counterparty = toNonEmptyString(flow.counterparty) ?? "запрошенному контрагенту"; + const counterparty = toNonEmptyString(flow.counterparty); + const organizationScope = toNonEmptyString(turnMeaning?.explicit_organization_scope); + const subjectLead = counterparty + ? `по контрагенту ${counterparty}` + : organizationScope + ? `по компании ${organizationScope}` + : "по выбранному контуру"; const period = toNonEmptyString(flow.period_scope); const periodText = period ? ` за период ${period}` : " в проверенном окне"; const incomingRows = sideRowsText(incoming); @@ -583,7 +592,7 @@ function buildCompactBidirectionalValueFlowReply(entryPoint, draft) { const outgoingDates = sideDateText(outgoing); const netLabel = bidirectionalNetLabel(flow.net_direction); const lines = [ - `Коротко: по контрагенту ${counterparty}${periodText} по найденным строкам 1С получили ${incomingAmount ?? "0 руб."}, заплатили ${outgoingAmount ?? "0 руб."}; расчетное ${netLabel}: ${sentenceAmount(netAmount) ?? netAmount ?? "0 руб."}.` + `Коротко: ${subjectLead}${periodText} по найденным строкам 1С получили ${incomingAmount ?? "0 руб."}, заплатили ${outgoingAmount ?? "0 руб."}; расчетное ${netLabel}: ${sentenceAmount(netAmount) ?? netAmount ?? 
"0 руб."}.` ]; const basis = []; if (incomingRows) { @@ -771,7 +780,7 @@ function buildCompactBusinessOverviewReply(entryPoint, draft) { ? `минус ${amount}` : amount : "сумма не распознана"; - lines.push(`Коротко: по бухгалтерскому маршруту 90/91/99 за ${periodScope} подтвержден ${directionText}: ${amountText}${marginPct ? `; маржа к выручке 90.01 ${marginPct}` : "; маржа к выручке 90.01 не рассчитана"}.`); + lines.push(`Коротко: нет, денежное операционное нетто не стоит считать чистой прибылью. Отдельно по закрытию счетов 90/91/99 в 1С за ${periodScope} подтвержден ${directionText}: ${amountText}${marginPct ? `; маржа к подтвержденной выручке ${marginPct}` : "; маржа к подтвержденной выручке не рассчитана"}.`); lines.push("Это учетный финрезультат по найденным строкам закрытия периода в 1С, а не внешний аудит и не юридически подтвержденная отчетность."); return joinBusinessReplyLines(lines); } @@ -779,7 +788,7 @@ function buildCompactBusinessOverviewReply(entryPoint, draft) { const cleanHeadline = headline?.replace(/^Коротко:\s*/iu, "").trim(); lines.push(cleanHeadline ? 
`Коротко: ${localizeLine(cleanHeadline)}` - : "Коротко: нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только bounded operating-flow/trading-margin proxy, не P&L и не бухгалтерский финансовый результат."); + : "Коротко: нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только ограниченный операционный денежный/товарный сигнал, а не полный отчет о прибыли и не бухгалтерский финансовый результат."); const boundaryLines = userFacingLines([ ...toStringList(draft.confirmed_lines), ...toStringList(draft.inference_lines), @@ -790,7 +799,7 @@ function buildCompactBusinessOverviewReply(entryPoint, draft) { if (boundaryLines.length > 0) { lines.push(...boundaryLines.map(localizeLine)); } - lines.push("Для точного P&L нужны отдельный маршрут по себестоимости, расходам, закрытию периода и финрезультату; текущий proxy нельзя выдавать за подтвержденную чистую прибыль или маржу."); + lines.push("Для точного отчета о прибыли нужны отдельная проверка себестоимости, расходов, закрытия периода и финрезультата; текущий ограниченный сигнал нельзя выдавать за подтвержденную чистую прибыль или маржу."); if (limitLine) { lines.push(limitLine); } @@ -890,7 +899,7 @@ function buildCompactBusinessOverviewReply(entryPoint, draft) { : `крупнейший подтвержденный поставщик/получатель исходящих платежей: ${topSupplier}` : outgoingAmount ? `исходящие платежи/закупочный поток в проверенном срезе: ${outgoingAmount}` - : "есть только ограниченный срез исходящих платежей без полного vendor-risk профиля"; + : "есть только ограниченный срез исходящих платежей без полного профиля поставщицкого риска"; const proxyLabel = topSupplierLooksFinancial ? 
"сигнал концентрации исходящих денег" : "сигнал концентрации закупок/исходящих платежей"; @@ -925,7 +934,7 @@ function buildCompactBusinessOverviewReply(entryPoint, draft) { if (crossScopeExecutiveSummary && separateSubject && previousCounterpartySummary && (incomingAmount || outgoingAmount || netAmount)) { lines.push(`Коротко: по компании ${organizationScope ?? "в выбранном контуре"} ${period} подтвержден денежный срез: получили ${incomingAmount ?? "0 руб."}, исходящие платежи/списания ${outgoingAmount ?? "0 руб."}, ${netDirection} ${sentenceAmount(netAmount) ?? netAmount ?? "0 руб."}${previousCounterpartySummary.lead}; можно утверждать только эти подтвержденные срезы, нельзя называть это чистой прибылью, полным оборотом или доказанной ролью главного клиента/поставщика.`); lines.push(previousCounterpartySummary.line); - lines.push(`Можно утверждать: по компании подтвержден operating-flow proxy по найденным строкам 1С; по ${separateSubject} отдельно подтверждены входящие/исходящие строки, расчетное нетто и документы из предыдущего контрагентского среза.`); + lines.push(`Можно утверждать: по компании подтвержден операционный денежный сигнал по найденным строкам 1С; по ${separateSubject} отдельно подтверждены входящие/исходящие строки, расчетное нетто и документы из предыдущего контрагентского среза.`); lines.push(`Нельзя утверждать: это не чистая прибыль, не полный бухгалтерский оборот вне проверенного окна и не доказательство, что ${separateSubject} является главным клиентом или поставщиком как бизнес-роль.`); if (limitLine) { lines.push(limitLine); @@ -962,7 +971,7 @@ function buildCompactBusinessOverviewReply(entryPoint, draft) { } else if (incomingAmount || outgoingAmount || netAmount) { lines.push(`Коротко: ${organizationPrefix}${period} по подтвержденным строкам 1С получили ${incomingAmount ?? "0 руб."}; исходящие платежи/списания ${outgoingAmount ?? "0 руб."}; ${netDirection} ${sentenceAmount(netAmount) ?? netAmount ?? 
"0 руб"}${topCustomerLead}${topSupplierLead}${roleBoundaryLead}${separateSubjectLead}.`); - lines.push('Метод: "заработали" здесь считаю как денежный operating-flow proxy по 1С; это не чистая прибыль и не финрезультат.'); + lines.push('Метод: "заработали" здесь считаю как операционный денежный показатель по 1С; это не чистая прибыль и не финрезультат.'); if (!directMoneyAnswer && customerName && customerAmount) { lines.push(topCustomerLooksFinancial ? `Крупнейший входящий денежный источник в этом срезе: ${customerName} — ${sentenceAmount(customerAmount) ?? customerAmount}. По названию это банк/финансовая организация, поэтому без назначения платежа не называю это клиентской выручкой.${nonFinancialCustomer ? ` Крупнейший небанковский входящий контрагент: ${nonFinancialCustomer}.` : ""}` diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponsePolicy.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponsePolicy.js index f102680..2300503 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponsePolicy.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryResponsePolicy.js @@ -317,6 +317,34 @@ function hasExactValueFlowReplyForBusinessOverviewDirectMoneyNeed(input, entryPo hasConfirmedAddressExecution(input) && hasBusinessOverviewDirectMoneyClarification(entryPoint)); } +function hasExactBankOperationsAddressReply(input, entryPoint) { + if (!isDiscoveryReadyAddressCandidate(input, entryPoint)) { + return false; + } + if (!hasEffectivelyFactualAddressReply(input)) { + return false; + } + const source = String(input.currentReplySource ?? input.livingChatSource ?? 
"").trim().toLowerCase(); + if (source !== "address_query_runtime_v1" && source !== "address_exact" && source !== "address_lane") { + return false; + } + const detectedIntent = toNonEmptyString(input.addressRuntimeMeta?.detected_intent); + const selectedRecipe = toNonEmptyString(input.addressRuntimeMeta?.selected_recipe); + const isBankIntent = detectedIntent === "bank_operations_by_counterparty" || detectedIntent === "bank_operations_by_contract"; + const isBankRecipe = selectedRecipe === "address_bank_operations_by_counterparty_v1" || + selectedRecipe === "address_bank_operations_by_contract_v1"; + if (!isBankIntent || !isBankRecipe) { + return false; + } + const grounding = toRecordObject(input.addressRuntimeMeta?.answer_grounding_check); + const groundingStatus = toNonEmptyString(grounding?.status); + const mcpCallStatus = toNonEmptyString(input.addressRuntimeMeta?.mcp_call_status); + const routeMode = toNonEmptyString(input.addressRuntimeMeta?.capability_route_mode); + return Boolean(mcpCallStatus === "matched_non_empty" || + groundingStatus === "grounded" || + routeMode === "exact" || + hasFullConfirmedTruth(input)); +} function hasValueFlowActionConflictWithDiscoveryTurnMeaning(input, entryPoint) { if (!isDiscoveryReadyAddressCandidate(input, entryPoint)) { return false; @@ -330,6 +358,9 @@ function hasValueFlowActionConflictWithDiscoveryTurnMeaning(input, entryPoint) { if (askedDomain !== "counterparty_value") { return false; } + if (hasExactBankOperationsAddressReply(input, entryPoint)) { + return false; + } const detectedIntent = toNonEmptyString(input.addressRuntimeMeta?.detected_intent); if (askedAction === "payout") { return detectedIntent !== "supplier_payouts_profile"; @@ -470,6 +501,9 @@ function hasSemanticConflictWithDiscoveryTurnMeaning(input, entryPoint) { if (hasRuntimeMatchedExactReply(input, entryPoint)) { return false; } + if (hasExactBankOperationsAddressReply(input, entryPoint)) { + return false; + } const detectedIntent = 
toNonEmptyString(input.addressRuntimeMeta?.detected_intent); const turnMeaning = readDiscoveryTurnMeaning(entryPoint); const askedDomain = toNonEmptyString(turnMeaning?.asked_domain_family); @@ -566,6 +600,7 @@ function applyAssistantMcpDiscoveryResponsePolicy(input) { const runtimeMatchedExactReply = hasRuntimeMatchedExactReply(input, entryPoint); const staleMetadataDiscoveryFallbackAgainstExactAddressReply = hasStaleMetadataDiscoveryFallbackAgainstExactAddressReply(input, entryPoint); const exactValueFlowReplyForBusinessOverviewDirectMoneyNeed = hasExactValueFlowReplyForBusinessOverviewDirectMoneyNeed(input, entryPoint); + const exactBankOperationsAddressReply = hasExactBankOperationsAddressReply(input, entryPoint); const openScopeValueFlowDiscoveryPriority = hasOpenScopeValueFlowDiscoveryPriority(input, entryPoint); const metadataDiscoveryPriority = hasMetadataDiscoveryPriority(input, entryPoint); const valueFlowActionConflictWithDiscoveryTurnMeaning = hasValueFlowActionConflictWithDiscoveryTurnMeaning(input, entryPoint); @@ -627,6 +662,9 @@ function applyAssistantMcpDiscoveryResponsePolicy(input) { if (exactValueFlowReplyForBusinessOverviewDirectMoneyNeed) { pushReason(reasonCodes, "mcp_discovery_response_policy_keep_exact_value_flow_reply_over_business_overview_direct_money_clarification"); } + if (exactBankOperationsAddressReply) { + pushReason(reasonCodes, "mcp_discovery_response_policy_keep_exact_bank_operations_address_reply"); + } if (deterministicBroadBusinessEvaluationReply && candidate.candidate_status === "clarification_candidate") { pushReason(reasonCodes, "mcp_discovery_response_policy_keep_broad_business_summary_over_clarification_candidate"); } @@ -653,6 +691,7 @@ function applyAssistantMcpDiscoveryResponsePolicy(input) { !runtimeMatchedExactReply && !staleMetadataDiscoveryFallbackAgainstExactAddressReply && !exactValueFlowReplyForBusinessOverviewDirectMoneyNeed && + !exactBankOperationsAddressReply && !(deterministicBroadBusinessEvaluationReply 
&& candidate.candidate_status === "clarification_candidate") && ALLOWED_CANDIDATE_STATUSES.has(candidate.candidate_status) && candidate.eligible_for_future_hot_runtime && diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryTurnInputAdapter.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryTurnInputAdapter.js index 36cb592..6913407 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryTurnInputAdapter.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryTurnInputAdapter.js @@ -168,6 +168,7 @@ function isGarbageSemanticAnchorCandidate(value) { } if (/^(?:по\s+)?(?:этим|этими)\s+данн\p{L}*$/iu.test(text) || /^(?:и\s+)?кто\s+(?:главн\p{L}*|основн\p{L}*|крупн\p{L}*)\s+(?:клиент|покупател|поставщик|контрагент)(?:\s+в)?$/iu.test(text) || + /^(?:или\s+)?(?:обычн\p{L}*\s+)?(?:клиент|поставщик|покупател\p{L}*|заказчик|контрагент)(?:\s+или\s+(?:клиент|поставщик|покупател\p{L}*|заказчик|контрагент))?$/iu.test(text) || /^(?:что|чего)\s+(?:подтвержден\p{L}*|не\s+хватает)/iu.test(text) || /^(?:можно\s+ли|если\s+нет|дай\s+proxy|дай\s+прокси)/iu.test(text)) { return true; @@ -899,6 +900,7 @@ function rawEntityResolutionCandidate(text) { function rawScopedEntityCandidateFromText(text) { const source = (0, addressTextRepair_1.repairAddressMojibakeText)(String(text ?? 
"")); const patterns = [ + /(?:^|[\s,.;:!?])(?:по|у|для|for|by)\s+(.+?)(?=$|[,.;:!?]|\s+(?:за|на|в|во|к|по|сколько|скок|как|какое|какой|какая|какие|получ\p{L}*|заплат\p{L}*|нетто|документ\p{L}*|движени\p{L}*|операц\p{L}*|плат[её]ж\p{L}*)(?=$|[\s,.;:!?]))/iu, /(?:^|[\s,.;:!?])(?:по|у|для|for|by)\s+([\p{L}\d._-]{2,})(?=$|[\s,.;:!?])/iu, /(?:документ(?:ам|ы)?|движени(?:ям|я)?|операци(?:ям|и)?|плат[её]ж(?:ам|и)?)\s+([\p{L}\d._-]{2,})(?=$|[\s,.;:!?])/iu ]; diff --git a/llm_normalizer/backend/dist/services/assistantService.js b/llm_normalizer/backend/dist/services/assistantService.js index 43c7ca9..006ad7b 100644 --- a/llm_normalizer/backend/dist/services/assistantService.js +++ b/llm_normalizer/backend/dist/services/assistantService.js @@ -4955,6 +4955,7 @@ class AssistantService { hasLivingChatSignal, shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: () => resolveAssistantDataScopeProbe(), applyScriptGuard: applyLivingChatScriptGuardFromPolicy, applyGroundingGuard: applyLivingChatGroundingGuardFromPolicy, diff --git a/llm_normalizer/backend/dist/services/assistantTransitionPolicy.js b/llm_normalizer/backend/dist/services/assistantTransitionPolicy.js index 4535dff..21858f9 100644 --- a/llm_normalizer/backend/dist/services/assistantTransitionPolicy.js +++ b/llm_normalizer/backend/dist/services/assistantTransitionPolicy.js @@ -955,7 +955,8 @@ function createAssistantTransitionPolicy(deps) { hasInventoryRootRestatementAlternate || hasSelectedObjectInventorySignalPrimary || hasSelectedObjectInventorySignalAlternate)); - const carryoverTargetIntent = (0, assistantContinuityPolicy_1.resolveFollowupTargetIntent)(inventoryPurchaseDateVatBridge, selectedObjectRetargetIntent, explicitIntent, sourceIntent, followupSelectionMode, deps.toNonEmptyString(inventoryRootFrame?.intent), displayedEntityTargetIntent, previousIntent, explicitInventorySameDatePivot); + const explicitIntentForCarryover 
= debtRoleSwapIntent ? debtRoleSwapIntent : explicitIntent; + const carryoverTargetIntent = (0, assistantContinuityPolicy_1.resolveFollowupTargetIntent)(inventoryPurchaseDateVatBridge, selectedObjectRetargetIntent, explicitIntentForCarryover, sourceIntent, followupSelectionMode, deps.toNonEmptyString(inventoryRootFrame?.intent), displayedEntityTargetIntent, previousIntent, explicitInventorySameDatePivot); return { followupContext: { previous_intent: previousIntent ?? undefined, diff --git a/llm_normalizer/backend/dist/services/assistantTurnRuntimeInputBuilder.js b/llm_normalizer/backend/dist/services/assistantTurnRuntimeInputBuilder.js index 54b60b1..7fbea6c 100644 --- a/llm_normalizer/backend/dist/services/assistantTurnRuntimeInputBuilder.js +++ b/llm_normalizer/backend/dist/services/assistantTurnRuntimeInputBuilder.js @@ -54,6 +54,7 @@ function buildAssistantAddressAttemptRuntimeInput(runtimeInput, deps) { hasLivingChatSignal: deps.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: deps.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: deps.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: deps.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: deps.resolveDataScopeProbe, applyScriptGuard: deps.applyScriptGuard, applyGroundingGuard: deps.applyGroundingGuard, diff --git a/llm_normalizer/backend/src/services/addressFilterExtractor.ts b/llm_normalizer/backend/src/services/addressFilterExtractor.ts index 0ffaf9b..22c1b55 100644 --- a/llm_normalizer/backend/src/services/addressFilterExtractor.ts +++ b/llm_normalizer/backend/src/services/addressFilterExtractor.ts @@ -818,6 +818,33 @@ function isLowQualityCounterpartyAnchorValue(rawValue: string): boolean { lowQualityTimeTokens.has(token) || /^(?:январ|феврал|март|апрел|ма(?:й|я|е)|июн|июл|август|сентябр|октябр|ноябр|декабр)/iu.test(token); const lowQualityGenericTokens = new Set([ + "или", + "обычный", + "обычная", + "обычное", + "обычные", + 
"обычного", + "обычному", + "обычным", + "контрагент", + "контрагента", + "контрагенту", + "клиент", + "клиента", + "клиенту", + "клиентом", + "клиенты", + "поставщик", + "поставщика", + "поставщику", + "поставщиком", + "поставщики", + "покупатель", + "покупателя", + "покупателю", + "заказчик", + "заказчика", + "заказчику", "деньги", "денег", "деньгам", @@ -1427,6 +1454,10 @@ function isLowQualityWarehouseAnchorValue(rawValue: string): boolean { "лежали", "на", "по", + "остатка", + "остаткам", + "остатками", + "остатков", "компания", "компании", "компанию", @@ -1524,7 +1555,7 @@ function extractInventoryWarehouseAnchor(text: string): string | undefined { isLowQualityWarehouseAnchorValue(candidate) || normalizedCandidate.startsWith("по состоянию") || isTemporalWarehousePhrase(candidate) || - /^(?:сейчас|на|дату|дате|остаток|остатки)$/iu.test(candidate) + /^(?:сейчас|на|дату|дате|остат(?:ок|ки|ка|кам|ками|ков)|по\s+остат(?:кам|ки|ку|ка|ков))$/iu.test(candidate) ) { continue; } diff --git a/llm_normalizer/backend/src/services/addressIntentResolver.ts b/llm_normalizer/backend/src/services/addressIntentResolver.ts index 63c7be3..08c1277 100644 --- a/llm_normalizer/backend/src/services/addressIntentResolver.ts +++ b/llm_normalizer/backend/src/services/addressIntentResolver.ts @@ -2586,7 +2586,11 @@ function resolveUnicodeAddressIntentBridge(text: string): AddressIntentResolutio if ( /(?:поставщик|vendor|supplier|кому\s+(?:ушло|платили|заплатили)|выплат|исходящ|списан|сгрузил)/iu.test(normalized) && !/(?:аванс.*(?:не\s+)?закрыт|закрыт.*аванс)/iu.test(normalized) && - (hasMoneyCue || hasRankingCue || /плат[её]ж|оплат|выплат|outflow|payout|хвост|задержк|проблем/iu.test(normalized)) + (hasMoneyCue || + hasRankingCue || + /заплат|платил|платили|уплат|плат[её]ж|оплат|выплат|outflow|payout|хвост|задержк|проблем/iu.test( + normalized + )) ) { return unicodeBridgeResolution( /(?:хвост|задержк|проблем)/iu.test(normalized) ? 
"list_payables_counterparties" : "supplier_payouts_profile", @@ -2955,7 +2959,7 @@ function resolveUnicodeAddressIntentBridge(text: string): AddressIntentResolutio if ( /(?:поставщик|vendor|supplier|кому\s+(?:ушло|платили|заплатили)|выплат|исходящ|списан|сгрузил)/iu.test(normalized) && - (hasMoneyCue || hasRankingCue || /плат[её]ж|оплат|выплат|outflow|payout/iu.test(normalized)) + (hasMoneyCue || hasRankingCue || /заплат|платил|платили|уплат|плат[её]ж|оплат|выплат|outflow|payout/iu.test(normalized)) ) { return unicodeBridgeResolution( "supplier_payouts_profile", diff --git a/llm_normalizer/backend/src/services/address_runtime/composeStage.ts b/llm_normalizer/backend/src/services/address_runtime/composeStage.ts index 4244234..ab81064 100644 --- a/llm_normalizer/backend/src/services/address_runtime/composeStage.ts +++ b/llm_normalizer/backend/src/services/address_runtime/composeStage.ts @@ -456,8 +456,51 @@ function bankOperationDirectionLabel(direction: "incoming" | "outgoing" | "unkno return "банковская операция без надежно распознанного направления"; } -function bankOperationEvidenceLine(rows: ComposeStageRow[]): string { - const sample = rows[0]; +function summarizeBankOperationDirections(rows: ComposeStageRow[]): string { + const summary = { + incoming: { count: 0, amount: 0 }, + outgoing: { count: 0, amount: 0 }, + unknown: { count: 0, amount: 0 } + }; + for (const row of rows) { + const direction = bankOperationDirection(row); + const amount = typeof row.amount === "number" && Number.isFinite(row.amount) ? 
Math.abs(row.amount) : 0; + summary[direction].count += 1; + summary[direction].amount += amount; + } + const parts: string[] = []; + if (summary.incoming.count > 0) { + parts.push(`входящие: ${formatMoneyRub(summary.incoming.amount)} (${summary.incoming.count} строк)`); + } + if (summary.outgoing.count > 0) { + parts.push(`исходящие: ${formatMoneyRub(summary.outgoing.amount)} (${summary.outgoing.count} строк)`); + } + if (summary.unknown.count > 0) { + parts.push(`без распознанного направления: ${formatMoneyRub(summary.unknown.amount)} (${summary.unknown.count} строк)`); + } + return parts.length > 0 + ? `Сводка по направлению: ${parts.join("; ")}.` + : "Сводка по направлению: подтвержденные строки не найдены."; +} + +function preferredBankEvidenceDirection( + userMessage: string | null | undefined +): "incoming" | "outgoing" | null { + if (hasBankIncomingRoleBoundaryQuestion(userMessage)) { + return "incoming"; + } + if (hasBankOutgoingRoleBoundaryQuestion(userMessage)) { + return "outgoing"; + } + return null; +} + +function bankOperationEvidenceLine( + rows: ComposeStageRow[], + preferredDirection: "incoming" | "outgoing" | null = null +): string { + const sample = + (preferredDirection ? rows.find((row) => bankOperationDirection(row) === preferredDirection) : null) ?? rows[0]; if (!sample) { return "Проверенная строка 1С не найдена."; } @@ -494,13 +537,13 @@ function bankRoleBoundaryLine(userMessage: string | null | undefined, rows: Comp if (incomingBoundary) { return hasIncomingRow - ? "Выручкой от обычного клиента это не называю автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; кредитный, депозитный или возвратный смысл без этих полей не исключаю и не притягиваю." + ? "Это не обычный клиент и не клиентская выручка автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; кредитный, депозитный или возвратный смысл без этих полей не исключаю и не притягиваю." : hasOutgoingRow - ? 
"В найденных строках по банку подтверждено исходящее списание, а входящее поступление от банка в этом срезе не подтверждено; клиентскую выручку, кредит или депозит по этой строке не доказываю." - : "Входящее поступление от банка в найденных строках не подтверждено; клиентскую выручку, кредитный или депозитный смысл без вида операции/назначения платежа не доказываю."; + ? "В найденных строках по банку подтверждено исходящее списание, а входящее поступление от банка в этом срезе не подтверждено; это не подтвержденная клиентская выручка, кредит или депозит." + : "Входящее поступление от банка в найденных строках не подтверждено; это не подтвержденная клиентская выручка, кредитный или депозитный смысл."; } - return "Обычным поставщиком это не называю автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; текущий срез подтверждает банковский платежный контур, а не бизнес-роль поставщика."; + return "Это не обычный поставщик автоматически: для банка/финорганизации нужен вид операции, назначение платежа и договор; текущий срез подтверждает банковский платежный контур, а не бизнес-роль поставщика."; } function hasInventoryPurchaseDateActionFocus(userMessage: string | null | undefined): boolean { @@ -4970,12 +5013,17 @@ function composeFactualReplyBody( ); const counterparty = resolvePreferredCounterpartyDisplayLabel(options.counterpartyHint, rowCounterparties); const roleBoundary = bankRoleBoundaryLine(options.userMessage, rows); + const visibleRows = rows.slice(0, Math.min(rows.length, 5)); const lines = [ `Коротко: найдено банковских операций${counterparty ? ` по ${counterparty}` : " по контрагенту"} — ${rows.length}.`, + summarizeBankOperationDirections(rows), roleBoundary ?? 
"Показываю подтвержденные банковские операции из текущего среза.", - bankOperationEvidenceLine(rows), - ...formatTopRows(rows, rows.length) + bankOperationEvidenceLine(rows, preferredBankEvidenceDirection(options.userMessage)), + ...formatTopRows(visibleRows, visibleRows.length) ]; + if (rows.length > visibleRows.length) { + lines.push(`Показаны первые ${visibleRows.length} из ${rows.length}; полный список остается в подтвержденном срезе.`); + } return { responseType: "FACTUAL_LIST", text: lines.join("\n") @@ -4983,11 +5031,17 @@ function composeFactualReplyBody( } if (intent === "bank_operations_by_contract") { + const visibleRows = rows.slice(0, Math.min(rows.length, 5)); const lines = [ `Коротко: найдено банковских операций по договору — ${rows.length}.`, + summarizeBankOperationDirections(rows), "Показываю подтвержденные банковские операции из текущего среза.", - ...formatTopRows(rows, rows.length) + bankOperationEvidenceLine(rows), + ...formatTopRows(visibleRows, visibleRows.length) ]; + if (rows.length > visibleRows.length) { + lines.push(`Показаны первые ${visibleRows.length} из ${rows.length}; полный список остается в подтвержденном срезе.`); + } return { responseType: "FACTUAL_LIST", text: lines.join("\n") diff --git a/llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts b/llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts index 597d613..a80f66b 100644 --- a/llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts +++ b/llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts @@ -233,6 +233,7 @@ const FOLLOWUP_LOW_QUALITY_COUNTERPARTY_TOKENS = new Set([ "что", "все", "всё", + "или", "кроме", "помимо", "этого", @@ -251,6 +252,30 @@ const FOLLOWUP_LOW_QUALITY_COUNTERPARTY_TOKENS = new Set([ "договора", "контрагент", "контрагента", + "контрагенту", + "клиент", + "клиента", + "клиенту", + "клиентом", + "клиенты", + "поставщик", + "поставщика", + "поставщику", + "поставщиком", + "поставщики", + 
"покупатель", + "покупателя", + "покупателю", + "заказчик", + "заказчика", + "заказчику", + "обычный", + "обычная", + "обычное", + "обычные", + "обычного", + "обычному", + "обычным", "еще", "ещё", "другие", @@ -853,6 +878,22 @@ function hasBroadCounterpartyRankingCue(text: string): boolean { ); } +function isBroadDebtPolarityQuestion(intent: AddressIntent, text: string): boolean { + if (intent !== "payables_confirmed_as_of_date" && intent !== "receivables_confirmed_as_of_date") { + return false; + } + const normalized = textWithRepairedVariant(String(text ?? "")).toLowerCase().replace(/ё/g, "е"); + if (!/(?:долж|задолж|дебитор|кредитор|обязательств)/iu.test(normalized)) { + return false; + } + if (/(?:по\s+(?:нему|ней|ним|этому|этой|этому\s+контрагенту|этой\s+компании|поставщику|клиенту|покупателю|заказчику)|\bон\b|\bона\b)/iu.test(normalized)) { + return false; + } + return /(?:^|[\s,.;:!?()\-])(?:кто|кому|какие|какой|список|топ|все|всех|всего)(?=$|[\s,.;:!?()\-])/iu.test( + normalized + ); +} + function mergeFollowupFilters( current: AddressFilterSet, intent: AddressIntent, @@ -1062,11 +1103,16 @@ function mergeFollowupFilters( previousCounterparty ?? (followupContext.previous_anchor_type === "counterparty" ? 
previousAnchorValue : null); const currentCounterparty = toNonEmptyString(merged.counterparty); + const suppressCounterpartyForBroadDebtQuestion = isBroadDebtPolarityQuestion(intent, userMessage) && !currentCounterparty; const shouldInheritCounterparty = - !currentCounterparty || - (Boolean(inheritedCounterparty) && - isLowQualityCounterpartyAnchor(currentCounterparty) && - !isLowQualityCounterpartyAnchor(inheritedCounterparty)); + !suppressCounterpartyForBroadDebtQuestion && + (!currentCounterparty || + (Boolean(inheritedCounterparty) && + isLowQualityCounterpartyAnchor(currentCounterparty) && + !isLowQualityCounterpartyAnchor(inheritedCounterparty))); + if (inheritedCounterparty && suppressCounterpartyForBroadDebtQuestion) { + reasons.push("counterparty_carryover_suppressed_for_broad_debt_polarity_question"); + } if (inheritedCounterparty && shouldInheritCounterparty) { merged.counterparty = inheritedCounterparty; reasons.push(currentCounterparty ? "counterparty_replaced_from_followup_context" : "counterparty_from_followup_context"); diff --git a/llm_normalizer/backend/src/services/assistantAddressAttemptRuntimeAdapter.ts b/llm_normalizer/backend/src/services/assistantAddressAttemptRuntimeAdapter.ts index 9c2c66c..01b52e5 100644 --- a/llm_normalizer/backend/src/services/assistantAddressAttemptRuntimeAdapter.ts +++ b/llm_normalizer/backend/src/services/assistantAddressAttemptRuntimeAdapter.ts @@ -65,6 +65,7 @@ export interface RunAssistantAddressAttemptRuntimeInput hasLivingChatSignal: RunAssistantLivingChatAttemptRuntimeInput["hasLivingChatSignal"]; shouldEmitOrganizationSelectionReply: RunAssistantLivingChatAttemptRuntimeInput["shouldEmitOrganizationSelectionReply"]; hasAssistantCapabilityQuestionSignal: RunAssistantLivingChatAttemptRuntimeInput["hasAssistantCapabilityQuestionSignal"]; + resolveOrganizationSelectionFromMessage: RunAssistantLivingChatAttemptRuntimeInput["resolveOrganizationSelectionFromMessage"]; resolveDataScopeProbe: 
RunAssistantLivingChatAttemptRuntimeInput["resolveDataScopeProbe"]; applyScriptGuard: RunAssistantLivingChatAttemptRuntimeInput["applyScriptGuard"]; applyGroundingGuard: RunAssistantLivingChatAttemptRuntimeInput["applyGroundingGuard"]; @@ -185,6 +186,7 @@ export async function runAssistantAddressAttemptRuntime( hasLivingChatSignal: input.hasLivingChatSignal, shouldEmitOrganizationSelectionReply: input.shouldEmitOrganizationSelectionReply, hasAssistantCapabilityQuestionSignal: input.hasAssistantCapabilityQuestionSignal, + resolveOrganizationSelectionFromMessage: input.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: input.resolveDataScopeProbe, applyScriptGuard: input.applyScriptGuard, applyGroundingGuard: input.applyGroundingGuard, diff --git a/llm_normalizer/backend/src/services/assistantLivingChatAttemptInputBuilder.ts b/llm_normalizer/backend/src/services/assistantLivingChatAttemptInputBuilder.ts index 094b2fd..c7281b3 100644 --- a/llm_normalizer/backend/src/services/assistantLivingChatAttemptInputBuilder.ts +++ b/llm_normalizer/backend/src/services/assistantLivingChatAttemptInputBuilder.ts @@ -25,6 +25,7 @@ export interface BuildAssistantLivingChatAttemptRuntimeInputInput["hasLivingChatSignal"]; shouldEmitOrganizationSelectionReply: RunAssistantLivingChatAttemptRuntimeInput["shouldEmitOrganizationSelectionReply"]; hasAssistantCapabilityQuestionSignal: RunAssistantLivingChatAttemptRuntimeInput["hasAssistantCapabilityQuestionSignal"]; + resolveOrganizationSelectionFromMessage: RunAssistantLivingChatAttemptRuntimeInput["resolveOrganizationSelectionFromMessage"]; resolveDataScopeProbe: RunAssistantLivingChatAttemptRuntimeInput["resolveDataScopeProbe"]; applyScriptGuard: RunAssistantLivingChatAttemptRuntimeInput["applyScriptGuard"]; applyGroundingGuard: RunAssistantLivingChatAttemptRuntimeInput["applyGroundingGuard"]; @@ -79,6 +80,7 @@ export function buildAssistantLivingChatAttemptRuntimeInput boolean; shouldEmitOrganizationSelectionReply: (message: 
string, activeOrganization: string | null) => boolean; hasAssistantCapabilityQuestionSignal: (message: string) => boolean; + resolveOrganizationSelectionFromMessage: (message: string, knownOrganizations: unknown[]) => string | null; resolveDataScopeProbe: () => Promise | null>; executeLlmChat: () => Promise; applyScriptGuard: (chatText: string, userMessage: string) => { @@ -78,6 +79,54 @@ function hasPriorAssistantTurn(items: unknown[]): boolean { return items.some((item) => item && typeof item === "object" && (item as { role?: string }).role === "assistant"); } +function shouldProbeBareOrganizationScopeCandidate(input: { + userMessage: string; + selectedOrganization: string | null; + activeOrganization: string | null; + dataScopeMetaQuery: boolean; + capabilityMetaQuery: boolean; + destructiveSignal: boolean; + dangerSignal: boolean; + operationalSignal: boolean; +}): boolean { + if ( + input.selectedOrganization || + input.activeOrganization || + input.dataScopeMetaQuery || + input.capabilityMetaQuery || + input.destructiveSignal || + input.dangerSignal || + input.operationalSignal + ) { + return false; + } + + const raw = String(input.userMessage ?? 
"").trim(); + if (!raw || raw.length > 80 || /[?!]/u.test(raw) || /\d/u.test(raw) || !/\p{L}/u.test(raw)) { + return false; + } + const tokenCount = raw.split(/\s+/u).filter(Boolean).length; + if (tokenCount < 1 || tokenCount > 5) { + return false; + } + + const normalized = raw + .toLowerCase() + .replace(/\u0451/gu, "\u0435") + .replace(/\s+/gu, " ") + .trim(); + if ( + /^(?:\u043f\u0440\u0438\u0432\u0435\u0442|\u0437\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439|\u0437\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435|\u0434\u0430|\u043d\u0435\u0442|\u043e\u043a|\u043e\u043a\u0435\u0439|\u0441\u043f\u0430\u0441\u0438\u0431\u043e|\u043f\u043e\u043a\u0430|\u0433\u043e|\u0434\u0430\u043b\u044c\u0448\u0435|\u043f\u043e\u043d\u044f\u043b|\u043f\u043e\u043d\u044f\u043b\u0430)(?:\s|$)/iu.test( + normalized + ) + ) { + return false; + } + return !/(?:\u0441\u043a\u043e\u043b\u044c\u043a\u043e|\u043f\u043e\u043a\u0430\u0436\u0438|\u0434\u0430\u0439|\u0440\u0430\u0441\u0441\u043a\u0430\u0436\u0438|\u0447\u0442\u043e|\u043a\u0430\u043a|\u0433\u0434\u0435|\u043a\u043e\u0433\u0434\u0430|\u043f\u043e\u0447\u0435\u043c\u0443|\u0437\u0430\u0447\u0435\u043c|\u043c\u043e\u0436\u0435\u0448\u044c|\u0443\u043c\u0435\u0435\u0448\u044c|\u043d\u0430\u0434\u043e|\u043d\u0443\u0436\u043d\u043e|\u0445\u043e\u0447\u0443|\u043e\u0441\u0442\u0430\u0442\u043a|\u043d\u0434\u0441|\u0434\u043e\u043b\u0433|\u0434\u0435\u0431\u0438\u0442\u043e\u0440|\u043a\u0440\u0435\u0434\u0438\u0442\u043e\u0440|\u0441\u043a\u043b\u0430\u0434|\u0442\u043e\u0432\u0430\u0440|\u043a\u043e\u043d\u0442\u0440\u0430\u0433\u0435\u043d\u0442|\u043e\u0431\u043e\u0440\u043e\u0442|\u0432\u044b\u0440\u0443\u0447\u043a|\u043f\u0440\u0438\u0431\u044b\u043b)/iu.test( + normalized + ); +} + function buildDeterministicSmalltalkLeadReply(): string { return "\u041f\u0440\u0438\u0432\u0435\u0442! 
\u0412\u0441\u0451 \u043d\u043e\u0440\u043c\u0430\u043b\u044c\u043d\u043e."; } @@ -160,6 +209,8 @@ export async function runAssistantLivingChatRuntime( let livingChatGroundingGuardApplied = false; let livingChatGroundingGuardReason: string | null = null; let livingChatProactiveScopeOfferApplied = false; + let livingChatBareScopeProbeAttempted = false; + let livingChatBareScopeProbeMatchedOrganization: string | null = null; const continuityActiveOrganization = organizationAuthority.continuityActiveOrganization; let knownOrganizations = [...organizationAuthority.knownOrganizations]; let selectedOrganization = organizationAuthority.selectedOrganization; @@ -186,6 +237,32 @@ export async function runAssistantLivingChatRuntime( const lastMemoryAddressDebug = memoryRecapContext.lastMemoryAddressDebug; const lastAnswerInspectionAddressDebug = memoryRecapContext.lastAnswerInspectionAddressDebug; + if ( + shouldProbeBareOrganizationScopeCandidate({ + userMessage, + selectedOrganization, + activeOrganization, + dataScopeMetaQuery, + capabilityMetaQuery, + destructiveSignal, + dangerSignal, + operationalSignal + }) + ) { + dataScopeProbe = await input.resolveDataScopeProbe(); + livingChatBareScopeProbeAttempted = true; + knownOrganizations = input.mergeKnownOrganizations([ + ...knownOrganizations, + ...(Array.isArray(dataScopeProbe?.organizations) ? 
(dataScopeProbe.organizations as unknown[]) : []) + ]); + const probedOrganization = input.resolveOrganizationSelectionFromMessage(userMessage, knownOrganizations); + if (probedOrganization) { + selectedOrganization = probedOrganization; + activeOrganization = probedOrganization; + livingChatBareScopeProbeMatchedOrganization = probedOrganization; + } + } + if (capabilityMetaQuery && (destructiveSignal || dangerSignal)) { chatText = input.buildAssistantSafetyRefusalReply(); livingChatSource = "deterministic_safety_refusal"; @@ -388,6 +465,8 @@ export async function runAssistantLivingChatRuntime( living_chat_grounding_guard_applied: livingChatGroundingGuardApplied, living_chat_grounding_guard_reason: livingChatGroundingGuardReason, living_chat_proactive_scope_offer_applied: livingChatProactiveScopeOfferApplied, + living_chat_bare_scope_probe_attempted: livingChatBareScopeProbeAttempted, + living_chat_bare_scope_probe_matched_organization: livingChatBareScopeProbeMatchedOrganization, living_chat_data_scope_probe_status: dataScopeProbe?.status ?? null, living_chat_data_scope_probe_channel: dataScopeProbe?.channel ?? null, living_chat_data_scope_probe_org_count: Array.isArray(dataScopeProbe?.organizations) diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryAnswerAdapter.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryAnswerAdapter.ts index 0e66e83..d7cd584 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryAnswerAdapter.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryAnswerAdapter.ts @@ -633,7 +633,7 @@ function businessOverviewOutgoingLeaderLine(overview: BusinessOverview): string function businessOverviewSupplierBoundaryBasis(overview: BusinessOverview): string { const leader = overview.top_suppliers?.[0] ?? 
null; if (!leader) { - return "есть только общий срез исходящих платежей без надежного vendor-risk профиля"; + return "есть только общий срез исходящих платежей без надежного профиля поставщицкого риска"; } const share = percentText(leader.total_amount, overview.outgoing_supplier_payout.total_amount); if (isFinancialInstitutionBucket(leader)) { @@ -672,9 +672,9 @@ function businessOverviewHeadlineMetricsLine(overview: BusinessOverview): string : inlineBusinessOverviewAmount(result.final_result_amount_human_ru); const margin = result.net_margin_to_revenue_pct === null - ? "маржа к выручке 90.01 не рассчитана" - : `маржа к выручке 90.01 ${result.net_margin_to_revenue_pct}%`; - parts.push(`${direction} 90/91/99 ${amount}; ${margin}`); + ? "маржа к подтвержденной выручке не рассчитана" + : `маржа к подтвержденной выручке ${result.net_margin_to_revenue_pct}%`; + parts.push(`${direction} по закрытию счетов 90/91/99 ${amount}; ${margin}`); } const strongestIncomingYear = businessOverviewStrongestIncomingYear(overview); if (strongestIncomingYear) { @@ -685,7 +685,7 @@ function businessOverviewHeadlineMetricsLine(overview: BusinessOverview): string return parts.length > 0 ? overview.accounting_financial_result ? `${parts.join("; ")}. Финрезультат ограничен найденными строками 1С и не является внешним аудитом или юридически подтвержденной отчетностью` - : `${parts.join("; ")}. Это operating-flow proxy по найденным строкам, не бухгалтерская прибыль и не финрезультат` + : `${parts.join("; ")}. Это операционный денежный сигнал по найденным строкам, не бухгалтерская прибыль и не финрезультат` : null; } @@ -706,13 +706,13 @@ function businessOverviewAccountingFinancialResultText(overview: BusinessOvervie : result.final_result_amount_human_ru; const marginText = result.net_margin_to_revenue_pct === null - ? "маржа к выручке 90.01 не рассчитана" - : `маржа к выручке 90.01 ${result.net_margin_to_revenue_pct}%`; + ? 
"маржа к подтвержденной выручке не рассчитана" + : `маржа к подтвержденной выручке ${result.net_margin_to_revenue_pct}%`; const basis = result.final_transfer_basis === "account_99_to_84_period_close" ? "по закрытию 99 на 84" : "по закрытию 90/91 на 99"; - return `По бухгалтерскому маршруту 90/91/99 за ${result.period_scope} подтвержден ${direction}: ${signedAmount}; ${marginText}. Основа: ${basis}, ${result.period_close_rows_with_amount} строк(и) закрытия периода с суммой. Это учетный финрезультат по найденным строкам 1С, не внешний аудит и не юридически подтвержденная отчетность.`; + return `Нет: денежное операционное нетто не стоит считать чистой прибылью. Отдельно по закрытию счетов 90/91/99 в 1С за ${result.period_scope} подтвержден ${direction}: ${signedAmount}; ${marginText}. Основа: ${basis}, ${result.period_close_rows_with_amount} строк(и) закрытия периода с суммой. Это учетный финрезультат по найденным строкам 1С, не внешний аудит и не юридически подтвержденная отчетность.`; } function businessOverviewDebtDueDateAgingText(overview: BusinessOverview): string | null { @@ -780,12 +780,12 @@ function businessOverviewVendorProcurementQualityText(overview: BusinessOverview ? ` Договорный профиль: используется ${quality.used_contracts} договоров.` : ` Договорный профиль: используется ${quality.used_contracts}/${quality.total_contracts} договоров${quality.used_contract_share_pct === null ? "" : ` (${quality.used_contract_share_pct}%)`}.`; if (quality.evidence_status === "financial_institution_leads_outgoing_cash") { - return `Проверенный procurement-concentration route за ${period}: крупнейший получатель исходящих денег ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}. 
По названию это банк/финансовая организация, поэтому зависимость от обычного поставщика этим не подтверждается.${financialFlowHintTextRuFromBucket(top)}${nonFinancialText}${contractText} Надежность поставщиков, качество поставок, назначение каждого платежа и полная структура расходов этим маршрутом не доказаны.`; + return `Проверка концентрации закупок/исходящих платежей за ${period}: крупнейший получатель исходящих денег ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}. По названию это банк/финансовая организация, поэтому зависимость от обычного поставщика этим не подтверждается.${financialFlowHintTextRuFromBucket(top)}${nonFinancialText}${contractText} Надежность поставщиков, качество поставок, назначение каждого платежа и полная структура расходов этим срезом не доказаны.`; } if (quality.evidence_status === "reviewed_procurement_concentration") { - return `Проверенный procurement-concentration route за ${period}: крупнейший поставщик/получатель исходящих платежей ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}.${contractText} Это проверенный сигнал концентрации закупок/исходящих платежей, но не аудит надежности поставщика, качества поставок и полной структуры расходов.`; + return `Проверка концентрации закупок/исходящих платежей за ${period}: крупнейший поставщик/получатель исходящих платежей ${topName}${topShare}${topAmount}, всего исходящих платежей ${total}.${contractText} Это проверенный сигнал концентрации закупок/исходящих платежей, но не аудит надежности поставщика, качества поставок и полной структуры расходов.`; } - return `Procurement-concentration route за ${period} отработал по исходящим платежам на ${total}, но надежной небанковской концентрации поставщика по найденным строкам не хватает.${contractText} Полный vendor-risk аудит не подтвержден.`; + return `Проверка концентрации закупок/исходящих платежей за ${period} нашла исходящие платежи на ${total}, но надежной небанковской концентрации поставщика по 
найденным строкам не хватает.${contractText} Полный аудит поставщицкого риска не подтвержден.`; } function businessOverviewInventoryQualityEventsText(overview: BusinessOverview): string | null { @@ -831,7 +831,7 @@ function headlineFor(mode: AssistantMcpDiscoveryAnswerMode, pilot: AssistantMcpD if (accountingFinancialResultText) { return accountingFinancialResultText; } - return "Нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только bounded operating-flow/trading-margin proxy, не P&L и не бухгалтерский финрезультат."; + return "Нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только ограниченный операционный денежный/товарный сигнал, а не полный отчет о прибыли и не бухгалтерский финрезультат."; } if (isDebtDueDateBoundaryTurn(pilot)) { const dueDateText = businessOverviewDebtDueDateAgingText(overview); @@ -1588,6 +1588,14 @@ function derivedBusinessOverviewConfirmedLines(pilot: AssistantMcpDiscoveryPilot `Годовая раскладка операционного денежного потока построена по подтвержденным строкам 1С за ${yearCountHumanRu(overview.yearly_breakdown.length)}.` ); } + if ( + overview.incoming_customer_revenue.coverage_recovered_by_period_chunking || + overview.outgoing_supplier_payout.coverage_recovered_by_period_chunking + ) { + lines.push( + "Денежное покрытие бизнес-обзора за год восстановлено через помесячные 1С-проверки, а не только через широкий общий запрос." + ); + } if (overview.activity_period) { lines.push( `Окно подтвержденной активности в 1С: ${overview.activity_period.first_activity_date} — ${overview.activity_period.latest_activity_date}; ориентировочно ${overview.activity_period.duration_human_ru}.` @@ -1782,7 +1790,7 @@ function businessOverviewSupplierConcentrationLine(overview: BusinessOverview): return `${base}. По названию это банк/финансовая организация, поэтому это не доказательство зависимости от обычного поставщика без проверки назначения платежа/договора.${nonFinancial ? 
` Крупнейший небанковский получатель исходящих денег: ${rankedBucketAmountLabel(nonFinancial)}.` : ""}`; } return share - ? `Концентрация исходящего потока: крупнейший подтвержденный поставщик/получатель исходящих платежей ${leader.axis_value} держит около ${share} проверенных исходящих платежей (${leader.total_amount_human_ru}). Это сигнал procurement concentration по найденным строкам, а не полный vendor-risk аудит или структура всех расходов.` + ? `Концентрация исходящего потока: крупнейший подтвержденный поставщик/получатель исходящих платежей ${leader.axis_value} держит около ${share} проверенных исходящих платежей (${leader.total_amount_human_ru}). Это сигнал концентрации закупок/исходящих платежей по найденным строкам, а не полный аудит поставщицкого риска или структура всех расходов.` : `Крупнейший подтвержденный поставщик/получатель исходящих платежей в проверенном срезе: ${leader.axis_value} — ${leader.total_amount_human_ru}.`; } @@ -1808,7 +1816,7 @@ function businessOverviewYearlyOperatingLine(overview: BusinessOverview): string : `нетто в плюс ${strongestNetYear.net_amount_human_ru}`; parts.push(`лучший год по расчетному операционному нетто ${strongestNetYear.year_bucket}: ${netText}`); } - return `Годовая динамика по проверенным строкам: ${parts.join("; ")}. Это operating-flow proxy, не бухгалтерская прибыль и не финрезультат.`; + return `Годовая динамика по проверенным строкам: ${parts.join("; ")}. Это операционный денежный сигнал, не бухгалтерская прибыль и не финрезультат.`; } function businessOverviewRiskSynthesisLine(overview: BusinessOverview): string | null { @@ -1838,9 +1846,9 @@ function businessOverviewRiskSynthesisLine(overview: BusinessOverview): string | : "нулевой учетный финрезультат"; const marginText = result.net_margin_to_revenue_pct === null - ? 
"маржа к выручке 90.01 не рассчитана" - : `маржа к выручке 90.01 ${result.net_margin_to_revenue_pct}%`; - signals.push(`${direction} 90/91/99 ${result.final_result_amount_human_ru}, ${marginText}`); + ? "маржа к подтвержденной выручке не рассчитана" + : `маржа к подтвержденной выручке ${result.net_margin_to_revenue_pct}%`; + signals.push(`${direction} по закрытию счетов 90/91/99 ${result.final_result_amount_human_ru}, ${marginText}`); } if (overview.debt_position) { const debtDirection = diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryPilotExecutor.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryPilotExecutor.ts index 453f4d6..566e1a4 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryPilotExecutor.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryPilotExecutor.ts @@ -4879,6 +4879,14 @@ function buildBusinessOverviewConfirmedFacts(derived: AssistantMcpDiscoveryDeriv `Годовая раскладка операционного денежного потока построена по подтвержденным строкам 1С за ${yearCountHumanRu(derived.yearly_breakdown.length)}.` ); } + if ( + derived.incoming_customer_revenue.coverage_recovered_by_period_chunking || + derived.outgoing_supplier_payout.coverage_recovered_by_period_chunking + ) { + facts.push( + "Денежное покрытие бизнес-обзора за год восстановлено через помесячные 1С-проверки, а не только через широкий общий запрос." 
+ ); + } if (derived.activity_period) { facts.push( `Подтвержденное окно активности в 1С: ${derived.activity_period.first_activity_date} — ${derived.activity_period.latest_activity_date}.` @@ -5159,7 +5167,7 @@ function buildBusinessOverviewUnknownFacts(derived: AssistantMcpDiscoveryDerived : null ].filter((item): item is string => Boolean(item)); if (derived?.coverage_limited_by_probe_limit) { - unknowns.unshift("Полное покрытие бизнес-обзора не подтверждено: хотя бы один денежный probe достиг лимита строк."); + unknowns.unshift("Полное покрытие бизнес-обзора не подтверждено: хотя бы один денежный запрос достиг верхней границы выборки."); } return unknowns; } @@ -6183,6 +6191,12 @@ export async function executeAssistantMcpDiscoveryPilot( if (!incomingResult?.error || !outgoingResult?.error) { pushReason(reasonCodes, "pilot_business_overview_query_movements_mcp_executed"); } + if (incomingResult?.coverage_recovered_by_period_chunking) { + pushReason(reasonCodes, "pilot_business_overview_incoming_monthly_period_chunking_recovered_coverage"); + } + if (outgoingResult?.coverage_recovered_by_period_chunking) { + pushReason(reasonCodes, "pilot_business_overview_outgoing_monthly_period_chunking_recovered_coverage"); + } if (taxResult?.error) { pushUnique(queryLimitations, taxResult.error); pushReason(reasonCodes, "pilot_business_overview_tax_query_mcp_error"); diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts index a5b8940..b836120 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts @@ -110,6 +110,8 @@ interface PlannerBudgetOverride { maxProbeCount?: number; } +const CHUNKED_COVERAGE_PROBE_BUDGET = 30; + function toNonEmptyString(value: unknown): string | null { if (value === null || value === undefined) { return null; @@ -607,12 +609,15 @@ function 
budgetOverrideFor(input: AssistantMcpDiscoveryPlannerInput, recipe: Pla (recipe.semanticDataNeed === "counterparty value-flow evidence" || recipe.semanticDataNeed === "bidirectional value-flow comparison evidence" || recipe.semanticDataNeed === "ranked value-flow evidence"); - if (!isValueFlowRecipe) { + const isBusinessOverviewRecipe = + recipe.primitives.includes("query_movements") && + recipe.chainId === "business_overview"; + if (!isValueFlowRecipe && !isBusinessOverviewRecipe) { return {}; } if (requestedAggregationAxis === "month" || isYearDateScope(meaning)) { return { - maxProbeCount: 30 + maxProbeCount: CHUNKED_COVERAGE_PROBE_BUDGET }; } return {}; diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponseCandidate.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponseCandidate.ts index 540781c..aabed08 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponseCandidate.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponseCandidate.ts @@ -471,8 +471,10 @@ function businessOverviewCoverageLimitLine(overview: Record): s if (outgoing?.coverage_limited_by_probe_limit === true) { limited.push("исходящие"); } + const continuation = + "Если нужен полный сквозной ответ, безопасный следующий шаг — выбрать конкретный год или квартал для дозапроса: тогда широкий срез можно собрать частями без выдачи непроверенного итога."; return limited.length > 0 - ? `Важно: по направлению ${limited.join(" и ")} проверка достигла лимита строк; это расширенный проверенный срез найденных строк, но не гарантия полного бухгалтерского оборота без отдельной полной выгрузки.` + ? `Важно: по направлению ${limited.join(" и ")} проверка достигла лимита строк; это расширенный проверенный срез найденных строк, но не гарантия полного бухгалтерского оборота без отдельной полной выгрузки. 
${continuation}` : null; } @@ -649,6 +651,8 @@ function buildCompactBidirectionalValueFlowReply( entryPoint: AssistantMcpDiscoveryRuntimeEntryPointContract, draft: Record ): string | null { + const turnInput = toRecordObject(entryPoint.turn_input); + const turnMeaning = toRecordObject(turnInput?.turn_meaning_ref); const bridge = toRecordObject(entryPoint.bridge); const pilot = toRecordObject(bridge?.pilot); const flow = toRecordObject(pilot?.derived_bidirectional_value_flow); @@ -665,7 +669,13 @@ function buildCompactBidirectionalValueFlowReply( return null; } - const counterparty = toNonEmptyString(flow.counterparty) ?? "запрошенному контрагенту"; + const counterparty = toNonEmptyString(flow.counterparty); + const organizationScope = toNonEmptyString(turnMeaning?.explicit_organization_scope); + const subjectLead = counterparty + ? `по контрагенту ${counterparty}` + : organizationScope + ? `по компании ${organizationScope}` + : "по выбранному контуру"; const period = toNonEmptyString(flow.period_scope); const periodText = period ? ` за период ${period}` : " в проверенном окне"; const incomingRows = sideRowsText(incoming); @@ -674,7 +684,7 @@ function buildCompactBidirectionalValueFlowReply( const outgoingDates = sideDateText(outgoing); const netLabel = bidirectionalNetLabel(flow.net_direction); const lines = [ - `Коротко: по контрагенту ${counterparty}${periodText} по найденным строкам 1С получили ${incomingAmount ?? "0 руб."}, заплатили ${outgoingAmount ?? "0 руб."}; расчетное ${netLabel}: ${sentenceAmount(netAmount) ?? netAmount ?? "0 руб."}.` + `Коротко: ${subjectLead}${periodText} по найденным строкам 1С получили ${incomingAmount ?? "0 руб."}, заплатили ${outgoingAmount ?? "0 руб."}; расчетное ${netLabel}: ${sentenceAmount(netAmount) ?? netAmount ?? 
"0 руб."}.` ]; const basis: string[] = []; @@ -904,7 +914,7 @@ function buildCompactBusinessOverviewReply( : amount : "сумма не распознана"; lines.push( - `Коротко: по бухгалтерскому маршруту 90/91/99 за ${periodScope} подтвержден ${directionText}: ${amountText}${marginPct ? `; маржа к выручке 90.01 ${marginPct}` : "; маржа к выручке 90.01 не рассчитана"}.` + `Коротко: нет, денежное операционное нетто не стоит считать чистой прибылью. Отдельно по закрытию счетов 90/91/99 в 1С за ${periodScope} подтвержден ${directionText}: ${amountText}${marginPct ? `; маржа к подтвержденной выручке ${marginPct}` : "; маржа к подтвержденной выручке не рассчитана"}.` ); lines.push( "Это учетный финрезультат по найденным строкам закрытия периода в 1С, а не внешний аудит и не юридически подтвержденная отчетность." @@ -916,7 +926,7 @@ function buildCompactBusinessOverviewReply( lines.push( cleanHeadline ? `Коротко: ${localizeLine(cleanHeadline)}` - : "Коротко: нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только bounded operating-flow/trading-margin proxy, не P&L и не бухгалтерский финансовый результат." + : "Коротко: нельзя точно подтвердить чистую прибыль и маржу по текущему срезу 1С; есть только ограниченный операционный денежный/товарный сигнал, а не полный отчет о прибыли и не бухгалтерский финансовый результат." ); const boundaryLines = userFacingLines([ ...toStringList(draft.confirmed_lines), @@ -929,7 +939,7 @@ function buildCompactBusinessOverviewReply( lines.push(...boundaryLines.map(localizeLine)); } lines.push( - "Для точного P&L нужны отдельный маршрут по себестоимости, расходам, закрытию периода и финрезультату; текущий proxy нельзя выдавать за подтвержденную чистую прибыль или маржу." + "Для точного отчета о прибыли нужны отдельная проверка себестоимости, расходов, закрытия периода и финрезультата; текущий ограниченный сигнал нельзя выдавать за подтвержденную чистую прибыль или маржу." 
); if (limitLine) { lines.push(limitLine); @@ -1056,7 +1066,7 @@ function buildCompactBusinessOverviewReply( : `крупнейший подтвержденный поставщик/получатель исходящих платежей: ${topSupplier}` : outgoingAmount ? `исходящие платежи/закупочный поток в проверенном срезе: ${outgoingAmount}` - : "есть только ограниченный срез исходящих платежей без полного vendor-risk профиля"; + : "есть только ограниченный срез исходящих платежей без полного профиля поставщицкого риска"; const proxyLabel = topSupplierLooksFinancial ? "сигнал концентрации исходящих денег" : "сигнал концентрации закупок/исходящих платежей"; @@ -1106,7 +1116,7 @@ function buildCompactBusinessOverviewReply( ); lines.push(previousCounterpartySummary.line); lines.push( - `Можно утверждать: по компании подтвержден operating-flow proxy по найденным строкам 1С; по ${separateSubject} отдельно подтверждены входящие/исходящие строки, расчетное нетто и документы из предыдущего контрагентского среза.` + `Можно утверждать: по компании подтвержден операционный денежный сигнал по найденным строкам 1С; по ${separateSubject} отдельно подтверждены входящие/исходящие строки, расчетное нетто и документы из предыдущего контрагентского среза.` ); lines.push( `Нельзя утверждать: это не чистая прибыль, не полный бухгалтерский оборот вне проверенного окна и не доказательство, что ${separateSubject} является главным клиентом или поставщиком как бизнес-роль.` @@ -1150,7 +1160,7 @@ function buildCompactBusinessOverviewReply( lines.push( `Коротко: ${organizationPrefix}${period} по подтвержденным строкам 1С получили ${incomingAmount ?? "0 руб."}; исходящие платежи/списания ${outgoingAmount ?? "0 руб."}; ${netDirection} ${sentenceAmount(netAmount) ?? netAmount ?? 
"0 руб"}${topCustomerLead}${topSupplierLead}${roleBoundaryLead}${separateSubjectLead}.` ); - lines.push('Метод: "заработали" здесь считаю как денежный operating-flow proxy по 1С; это не чистая прибыль и не финрезультат.'); + lines.push('Метод: "заработали" здесь считаю как операционный денежный показатель по 1С; это не чистая прибыль и не финрезультат.'); if (!directMoneyAnswer && customerName && customerAmount) { lines.push( topCustomerLooksFinancial diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponsePolicy.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponsePolicy.ts index 7d5bc4d..839ccad 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponsePolicy.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryResponsePolicy.ts @@ -455,6 +455,42 @@ function hasExactValueFlowReplyForBusinessOverviewDirectMoneyNeed( ); } +function hasExactBankOperationsAddressReply( + input: ApplyAssistantMcpDiscoveryResponsePolicyInput, + entryPoint: AssistantMcpDiscoveryRuntimeEntryPointContract | null +): boolean { + if (!isDiscoveryReadyAddressCandidate(input, entryPoint)) { + return false; + } + if (!hasEffectivelyFactualAddressReply(input)) { + return false; + } + const source = String(input.currentReplySource ?? input.livingChatSource ?? 
"").trim().toLowerCase(); + if (source !== "address_query_runtime_v1" && source !== "address_exact" && source !== "address_lane") { + return false; + } + const detectedIntent = toNonEmptyString(input.addressRuntimeMeta?.detected_intent); + const selectedRecipe = toNonEmptyString(input.addressRuntimeMeta?.selected_recipe); + const isBankIntent = + detectedIntent === "bank_operations_by_counterparty" || detectedIntent === "bank_operations_by_contract"; + const isBankRecipe = + selectedRecipe === "address_bank_operations_by_counterparty_v1" || + selectedRecipe === "address_bank_operations_by_contract_v1"; + if (!isBankIntent || !isBankRecipe) { + return false; + } + const grounding = toRecordObject(input.addressRuntimeMeta?.answer_grounding_check); + const groundingStatus = toNonEmptyString(grounding?.status); + const mcpCallStatus = toNonEmptyString(input.addressRuntimeMeta?.mcp_call_status); + const routeMode = toNonEmptyString(input.addressRuntimeMeta?.capability_route_mode); + return Boolean( + mcpCallStatus === "matched_non_empty" || + groundingStatus === "grounded" || + routeMode === "exact" || + hasFullConfirmedTruth(input) + ); +} + function hasValueFlowActionConflictWithDiscoveryTurnMeaning( input: ApplyAssistantMcpDiscoveryResponsePolicyInput, entryPoint: AssistantMcpDiscoveryRuntimeEntryPointContract | null @@ -471,6 +507,9 @@ function hasValueFlowActionConflictWithDiscoveryTurnMeaning( if (askedDomain !== "counterparty_value") { return false; } + if (hasExactBankOperationsAddressReply(input, entryPoint)) { + return false; + } const detectedIntent = toNonEmptyString(input.addressRuntimeMeta?.detected_intent); if (askedAction === "payout") { return detectedIntent !== "supplier_payouts_profile"; @@ -647,6 +686,9 @@ function hasSemanticConflictWithDiscoveryTurnMeaning( if (hasRuntimeMatchedExactReply(input, entryPoint)) { return false; } + if (hasExactBankOperationsAddressReply(input, entryPoint)) { + return false; + } const detectedIntent = 
toNonEmptyString(input.addressRuntimeMeta?.detected_intent); const turnMeaning = readDiscoveryTurnMeaning(entryPoint); const askedDomain = toNonEmptyString(turnMeaning?.asked_domain_family); @@ -771,6 +813,7 @@ export function applyAssistantMcpDiscoveryResponsePolicy( input, entryPoint ); + const exactBankOperationsAddressReply = hasExactBankOperationsAddressReply(input, entryPoint); const openScopeValueFlowDiscoveryPriority = hasOpenScopeValueFlowDiscoveryPriority(input, entryPoint); const metadataDiscoveryPriority = hasMetadataDiscoveryPriority(input, entryPoint); const valueFlowActionConflictWithDiscoveryTurnMeaning = hasValueFlowActionConflictWithDiscoveryTurnMeaning( @@ -851,6 +894,9 @@ export function applyAssistantMcpDiscoveryResponsePolicy( "mcp_discovery_response_policy_keep_exact_value_flow_reply_over_business_overview_direct_money_clarification" ); } + if (exactBankOperationsAddressReply) { + pushReason(reasonCodes, "mcp_discovery_response_policy_keep_exact_bank_operations_address_reply"); + } if (deterministicBroadBusinessEvaluationReply && candidate.candidate_status === "clarification_candidate") { pushReason( reasonCodes, @@ -882,6 +928,7 @@ export function applyAssistantMcpDiscoveryResponsePolicy( !runtimeMatchedExactReply && !staleMetadataDiscoveryFallbackAgainstExactAddressReply && !exactValueFlowReplyForBusinessOverviewDirectMoneyNeed && + !exactBankOperationsAddressReply && !(deterministicBroadBusinessEvaluationReply && candidate.candidate_status === "clarification_candidate") && ALLOWED_CANDIDATE_STATUSES.has(candidate.candidate_status) && candidate.eligible_for_future_hot_runtime && diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryTurnInputAdapter.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryTurnInputAdapter.ts index 4d6430f..e1353b3 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryTurnInputAdapter.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryTurnInputAdapter.ts @@ 
-222,6 +222,9 @@ function isGarbageSemanticAnchorCandidate(value: string | null): boolean {
 /^(?:и\s+)?кто\s+(?:главн\p{L}*|основн\p{L}*|крупн\p{L}*)\s+(?:клиент|покупател|поставщик|контрагент)(?:\s+в)?$/iu.test(
 text
 ) ||
+ /^(?:или\s+)?(?:обычн\p{L}*\s+)?(?:клиент|поставщик|покупател\p{L}*|заказчик|контрагент)(?:\s+или\s+(?:клиент|поставщик|покупател\p{L}*|заказчик|контрагент))?$/iu.test(
+ text
+ ) ||
 /^(?:что|чего)\s+(?:подтвержден\p{L}*|не\s+хватает)/iu.test(text) ||
 /^(?:можно\s+ли|если\s+нет|дай\s+proxy|дай\s+прокси)/iu.test(text)
 ) {
@@ -1303,6 +1306,7 @@ function rawEntityResolutionCandidate(text: string): string | null {
 function rawScopedEntityCandidateFromText(text: string): string | null {
 const source = repairAddressMojibakeText(String(text ?? ""));
 const patterns = [
+ /(?:^|[\s,.;:!?])(?:по|у|для|for|by)\s+(.+?)(?=$|[,.;:!?]|\s+(?:за|на|в|во|к|по|сколько|скок|как|какое|какой|какая|какие|получ\p{L}*|заплат\p{L}*|нетто|документ\p{L}*|движени\p{L}*|операц\p{L}*|плат[её]ж\p{L}*)(?=$|[\s,.;:!?]))/iu,
 /(?:^|[\s,.;:!?])(?:по|у|для|for|by)\s+([\p{L}\d._-]{2,})(?=$|[\s,.;:!?])/iu,
 /(?:документ(?:ам|ы)?|движени(?:ям|я)?|операци(?:ям|и)?|плат[её]ж(?:ам|и)?)\s+([\p{L}\d._-]{2,})(?=$|[\s,.;:!?])/iu
 ];
diff --git a/llm_normalizer/backend/src/services/assistantService.ts b/llm_normalizer/backend/src/services/assistantService.ts
index 5d0fb33..ce3ce19 100644
--- a/llm_normalizer/backend/src/services/assistantService.ts
+++ b/llm_normalizer/backend/src/services/assistantService.ts
@@ -4913,6 +4913,7 @@ export class AssistantService {
 hasLivingChatSignal,
 shouldEmitOrganizationSelectionReply,
 hasAssistantCapabilityQuestionSignal,
+ resolveOrganizationSelectionFromMessage,
 resolveDataScopeProbe: () => resolveAssistantDataScopeProbe(),
 applyScriptGuard: applyLivingChatScriptGuardFromPolicy,
 applyGroundingGuard: applyLivingChatGroundingGuardFromPolicy,
diff --git a/llm_normalizer/backend/src/services/assistantTransitionPolicy.ts b/llm_normalizer/backend/src/services/assistantTransitionPolicy.ts
index 57744d0..58ade0b 100644
--- a/llm_normalizer/backend/src/services/assistantTransitionPolicy.ts
+++ b/llm_normalizer/backend/src/services/assistantTransitionPolicy.ts
@@ -1332,10 +1332,11 @@ export function createAssistantTransitionPolicy(deps) {
 hasSelectedObjectInventorySignalPrimary ||
 hasSelectedObjectInventorySignalAlternate)
 );
+ const explicitIntentForCarryover = debtRoleSwapIntent ? debtRoleSwapIntent : explicitIntent;
 const carryoverTargetIntent = resolveFollowupTargetIntent(
 inventoryPurchaseDateVatBridge,
 selectedObjectRetargetIntent,
- explicitIntent,
+ explicitIntentForCarryover,
 sourceIntent,
 followupSelectionMode,
 deps.toNonEmptyString(inventoryRootFrame?.intent),
diff --git a/llm_normalizer/backend/src/services/assistantTurnRuntimeInputBuilder.ts b/llm_normalizer/backend/src/services/assistantTurnRuntimeInputBuilder.ts
index 427168c..d0720f2 100644
--- a/llm_normalizer/backend/src/services/assistantTurnRuntimeInputBuilder.ts
+++ b/llm_normalizer/backend/src/services/assistantTurnRuntimeInputBuilder.ts
@@ -72,6 +72,8 @@ export interface AssistantTurnRuntimeBuilderDeps {
 AddressAttemptRuntimeInput["shouldEmitOrganizationSelectionReply"];
 hasAssistantCapabilityQuestionSignal:
 AddressAttemptRuntimeInput["hasAssistantCapabilityQuestionSignal"];
+ resolveOrganizationSelectionFromMessage:
+ AddressAttemptRuntimeInput["resolveOrganizationSelectionFromMessage"];
 resolveDataScopeProbe: AddressAttemptRuntimeInput["resolveDataScopeProbe"];
 applyScriptGuard: AddressAttemptRuntimeInput["applyScriptGuard"];
 applyGroundingGuard: AddressAttemptRuntimeInput["applyGroundingGuard"];
@@ -170,6 +172,7 @@ export function buildAssistantAddressAttemptRuntimeInput
 hasLivingChatSignal: deps.hasLivingChatSignal,
 shouldEmitOrganizationSelectionReply: deps.shouldEmitOrganizationSelectionReply,
 hasAssistantCapabilityQuestionSignal: deps.hasAssistantCapabilityQuestionSignal,
+ 
resolveOrganizationSelectionFromMessage: deps.resolveOrganizationSelectionFromMessage, resolveDataScopeProbe: deps.resolveDataScopeProbe, applyScriptGuard: deps.applyScriptGuard, applyGroundingGuard: deps.applyGroundingGuard, diff --git a/llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts b/llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts index 940df30..be0c47e 100644 --- a/llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts +++ b/llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts @@ -584,11 +584,39 @@ describe("address compose stage utf8 headers", () => { expect(reply.text).toContain("по СБЕРБАНК"); expect(reply.text).toContain("входящее поступление от банка в этом срезе не подтверждено"); - expect(reply.text).toContain("клиентскую выручку"); + expect(reply.text).toContain("не подтвержденная клиентская выручка"); + expect(reply.text).toContain("Сводка по направлению"); expect(reply.text).toContain("Основание 1С"); expect(reply.text).toContain("вид операции/назначение платежа/договор"); }); + it("keeps bank operation drilldown compact when many rows are available", () => { + const rows = Array.from({ length: 8 }, (_, index) => ({ + period: `2020-01-${String(index + 1).padStart(2, "0")}T12:00:00Z`, + registrator: + index % 2 === 0 + ? `Поступление на расчетный счет 0000000000${index}` + : `Списание с расчетного счета 0000000000${index}`, + account_dt: "0", + account_kt: "0", + amount: 100 + index, + analytics: ["СБЕРБАНК, ПАО", "0"], + counterparty: "СБЕРБАНК, ПАО", + operation_kind: index % 2 === 0 ? "Прочее поступление" : "Прочее списание", + payment_purpose: index % 2 === 0 ? "Депозит" : "Комиссия банка" + })); + + const reply = composeFactualReply("bank_operations_by_counterparty", rows, { + counterpartyHint: "СБЕРБАНК", + userMessage: "СБЕРБАНК это поставщик или финансовые списания?" 
+ }); + + expect(reply.text).toContain("Сводка по направлению"); + expect(reply.text).toContain("Это не обычный поставщик автоматически"); + expect(reply.text).toContain("Показаны первые 5 из 8"); + expect(reply.text).not.toContain("00000000007"); + }); + it("renders readable russian header for contracts-by-counterparty list", () => { const reply = composeFactualReply("list_contracts_by_counterparty", [ { @@ -2692,6 +2720,13 @@ describe("address intent resolver expansion (M2.3a)", () => { expect(result.intent).toBe("supplier_payouts_profile"); }); + it("resolves explicit supplier payment amount question into supplier payouts profile", () => { + const result = resolveAddressIntent( + "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?" + ); + expect(result.intent).toBe("supplier_payouts_profile"); + }); + it("resolves contract usage and value intent", () => { const result = resolveAddressIntent("договоры по обороту ранкни и дай топ-20"); expect(result.intent).toBe("contract_usage_and_value"); @@ -2989,6 +3024,38 @@ describe("address filter extraction for balance drilldown", () => { expect(extracted.extracted_filters.counterparty).toBeUndefined(); }); + it("drops generic customer/supplier role tail from broad company overview wording", () => { + const extracted = extractAddressFilters( + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "customer_revenue_and_payments" + ); + expect(extracted.extracted_filters.counterparty).toBeUndefined(); + expect(extracted.extracted_filters.period_from).toBe("2020-01-01"); + expect(extracted.extracted_filters.period_to).toBe("2020-12-31"); + expect(extracted.warnings).toContain("counterparty_anchor_dropped_low_quality"); + }); + + it("clears generic customer/supplier role tail from inherited follow-up anchor", () => { + const result = 
runAddressDecomposeStage( + "Покажи документы по нему за 2020", + { + previous_intent: "list_documents_by_counterparty", + previous_anchor_type: "counterparty", + previous_anchor_value: "или поставщик" + } + ); + expect(result?.filters.extracted_filters.counterparty).toBeUndefined(); + expect(result?.filters.extracted_filters.period_from).toBe("2020-01-01"); + expect(result?.filters.extracted_filters.period_to).toBe("2020-12-31"); + expect(result?.filters.warnings).toContain("counterparty_cleared_low_quality_followup_anchor"); + }); + + it("keeps real counterparty after explicit supplier role prefix", () => { + const extracted = extractAddressFilters("Покажи платежи по поставщику Альфа за июль 2020", "supplier_payouts_profile"); + expect(extracted.extracted_filters.counterparty).toBe("Альфа"); + expect(extracted.warnings).not.toContain("counterparty_anchor_dropped_low_quality"); + }); + it("derives VAT forecast quarter-to-date window when plain date phrase is present", () => { const extracted = extractAddressFilters( "мож прикинусь плиз скока ндс надо заплатить на 15 марта 2020 года", @@ -4751,6 +4818,65 @@ describe("address decompose stage follow-up carryover", () => { expect(result?.baseReasons).toContain("address_followup_context_applied"); }); + it("does not inherit stale counterparty for broad receivables mirror question", () => { + const result = runAddressDecomposeStage("а нам кто должен на конец 2020?", { + previous_intent: "payables_confirmed_as_of_date", + previous_filters: { + organization: "ООО Альтернатива Плюс", + counterparty: "Группа СВК", + period_from: "2020-01-01", + period_to: "2020-12-31", + as_of_date: "2020-12-31" + }, + previous_anchor_type: "counterparty", + previous_anchor_value: "Группа СВК" + }); + expect(result).not.toBeNull(); + expect(result?.intent.intent).toBe("receivables_confirmed_as_of_date"); + expect(result?.filters.extracted_filters.organization).toBe("ООО Альтернатива Плюс"); + 
expect(result?.filters.extracted_filters.as_of_date).toBe("2020-12-31"); + expect(result?.filters.extracted_filters.counterparty).toBeUndefined(); + expect(result?.baseReasons).toContain("counterparty_carryover_suppressed_for_broad_debt_polarity_question"); + expect(result?.baseReasons).not.toContain("counterparty_from_followup_context"); + }); + + it("does not inherit stale counterparty for broad payables mirror question", () => { + const result = runAddressDecomposeStage("кому мы должны на конец 2020?", { + previous_intent: "supplier_payouts_profile", + previous_filters: { + organization: "ООО Альтернатива Плюс", + counterparty: "Группа СВК", + period_from: "2020-01-01", + period_to: "2020-12-31" + }, + previous_anchor_type: "counterparty", + previous_anchor_value: "Группа СВК" + }); + expect(result).not.toBeNull(); + expect(result?.intent.intent).toBe("payables_confirmed_as_of_date"); + expect(result?.filters.extracted_filters.organization).toBe("ООО Альтернатива Плюс"); + expect(result?.filters.extracted_filters.counterparty).toBeUndefined(); + expect(result?.baseReasons).toContain("counterparty_carryover_suppressed_for_broad_debt_polarity_question"); + }); + + it("keeps referential counterparty carryover for debt question that explicitly says по нему", () => { + const result = runAddressDecomposeStage("по нему сколько он нам должен на конец 2020?", { + previous_intent: "customer_revenue_and_payments", + previous_filters: { + organization: "ООО Альтернатива Плюс", + counterparty: "Группа СВК", + period_from: "2020-01-01", + period_to: "2020-12-31" + }, + previous_anchor_type: "counterparty", + previous_anchor_value: "Группа СВК" + }); + expect(result).not.toBeNull(); + expect(result?.intent.intent).toBe("receivables_confirmed_as_of_date"); + expect(result?.filters.extracted_filters.counterparty).toBe("Группа СВК"); + expect(result?.baseReasons).toContain("counterparty_from_followup_context"); + }); + it("keeps contract scope when follow-up asks for bank 
operations without explicit anchor", () => { const result = runAddressDecomposeStage("а теперь банковские операции", { previous_intent: "list_documents_by_contract", @@ -5542,6 +5668,13 @@ it("routes old purchase residue questions to aging-by-purchase-date", () => { expect(filters.as_of_date).toBe("2020-03-31"); }); + it("does not treat generic 'по остаткам' as a warehouse anchor", () => { + const result = runAddressDecomposeStage("кайф - что там на складе по остаткам?"); + expect(result).not.toBeNull(); + expect(result?.intent.intent).toBe("inventory_on_hand_as_of_date"); + expect(result?.filters.extracted_filters.warehouse).toBeUndefined(); + }); + it("builds exact balance query for inventory-on-hand snapshot", () => { const selected = selectAddressRecipe("inventory_on_hand_as_of_date", { as_of_date: "2020-03-31" diff --git a/llm_normalizer/backend/tests/assistantAddressFollowupContext.test.ts b/llm_normalizer/backend/tests/assistantAddressFollowupContext.test.ts index 599eea1..2ceb3b1 100644 --- a/llm_normalizer/backend/tests/assistantAddressFollowupContext.test.ts +++ b/llm_normalizer/backend/tests/assistantAddressFollowupContext.test.ts @@ -131,7 +131,7 @@ describe("assistant address follow-up carryover", () => { } as any); expect(second.ok).toBe(true); - expect(["factual", "factual_with_explanation"]).toContain(second.reply_type); + expect(["factual", "factual_with_explanation", "partial_coverage"]).toContain(second.reply_type); expect(second.debug?.detected_mode).toBe("address_query"); expect(second.debug?.detected_intent).toBe("list_documents_by_counterparty"); expect(second.debug?.extracted_filters?.counterparty).toBe("свк"); diff --git a/llm_normalizer/backend/tests/assistantLivingChatAttemptInputBuilder.test.ts b/llm_normalizer/backend/tests/assistantLivingChatAttemptInputBuilder.test.ts index ebadd7f..dbfa5d2 100644 --- a/llm_normalizer/backend/tests/assistantLivingChatAttemptInputBuilder.test.ts +++ 
b/llm_normalizer/backend/tests/assistantLivingChatAttemptInputBuilder.test.ts @@ -28,13 +28,16 @@ function buildInput(overrides: Record = {}) { hasOperationalAdminActionRequestSignal: () => false, hasOrganizationFactLookupSignal: () => false, hasOrganizationFactFollowupSignal: () => false, + hasLivingChatSignal: () => false, shouldEmitOrganizationSelectionReply: () => false, hasAssistantCapabilityQuestionSignal: () => false, + resolveOrganizationSelectionFromMessage: () => null, resolveDataScopeProbe: () => null, applyScriptGuard: (chatText: string) => chatText, applyGroundingGuard: (guardInput: Record) => guardInput, buildAssistantSafetyRefusalReply: () => "safety", buildAssistantDataScopeContractReply: () => "scope", + buildAssistantProactiveOrganizationOfferReply: () => "offer", buildAssistantOrganizationFactBoundaryReply: () => "boundary", buildAssistantDataScopeSelectionReply: () => "selection", buildAssistantOperationalBoundaryReply: () => "operational", diff --git a/llm_normalizer/backend/tests/assistantLivingChatAttemptRuntimeInputBuilder.test.ts b/llm_normalizer/backend/tests/assistantLivingChatAttemptRuntimeInputBuilder.test.ts index 6e81850..70fb337 100644 --- a/llm_normalizer/backend/tests/assistantLivingChatAttemptRuntimeInputBuilder.test.ts +++ b/llm_normalizer/backend/tests/assistantLivingChatAttemptRuntimeInputBuilder.test.ts @@ -46,8 +46,10 @@ describe("assistant living chat attempt runtime input builder", () => { hasOperationalAdminActionRequestSignal: vi.fn(() => false), hasOrganizationFactLookupSignal: vi.fn(() => false), hasOrganizationFactFollowupSignal: vi.fn(() => false), + hasLivingChatSignal: vi.fn(() => false), shouldEmitOrganizationSelectionReply: vi.fn(() => false), hasAssistantCapabilityQuestionSignal: vi.fn(() => false), + resolveOrganizationSelectionFromMessage: vi.fn(() => null), resolveDataScopeProbe: vi.fn(async () => null), executeLlmChat, applyScriptGuard: vi.fn((text: string) => ({ text, applied: false, reason: null })), @@ 
-58,6 +60,7 @@ describe("assistant living chat attempt runtime input builder", () => { })), buildAssistantSafetyRefusalReply: vi.fn(() => "safety"), buildAssistantDataScopeContractReply: vi.fn(() => "scope"), + buildAssistantProactiveOrganizationOfferReply: vi.fn(() => "offer"), buildAssistantOrganizationFactBoundaryReply: vi.fn(() => "boundary"), buildAssistantDataScopeSelectionReply: vi.fn(() => "selection"), buildAssistantOperationalBoundaryReply: vi.fn(() => "operational"), diff --git a/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts b/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts index b38a2d7..1dc3f6b 100644 --- a/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts +++ b/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts @@ -1,5 +1,6 @@ import { describe, expect, it, vi } from "vitest"; import { runAssistantLivingChatRuntime } from "../src/services/assistantLivingChatRuntimeAdapter"; +import { resolveOrganizationSelectionFromMessage } from "../src/services/assistantOrganizationMatcher"; function buildRuntimeInput(overrides: Record = {}) { const executeLlmChat = vi.fn(async () => "llm-text"); @@ -37,6 +38,7 @@ function buildRuntimeInput(overrides: Record = {}) { hasLivingChatSignal: () => true, shouldEmitOrganizationSelectionReply: () => false, hasAssistantCapabilityQuestionSignal: () => false, + resolveOrganizationSelectionFromMessage: () => null, resolveDataScopeProbe, executeLlmChat, applyScriptGuard: (chatText: string) => ({ @@ -87,6 +89,47 @@ describe("assistant living chat runtime adapter", () => { expect(output.debug?.living_chat_data_scope_probe_org_count).toBe(1); }); + it("probes data scope before LLM when a fresh chat turn is a bare organization name", async () => { + const executeLlmChat = vi.fn(async () => "llm-text"); + const resolveDataScopeProbe = vi.fn(async () => ({ + status: "resolved", + channel: "default", + organizations: [ + "\u041e\u041e\u041e 
\u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441", + "\u041e\u041e\u041e \u041b\u0430\u0439\u0441\u0432\u0443\u0434" + ], + error: null + })); + const input = buildRuntimeInput({ + userMessage: "\u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441", + modeDecision: { mode: "chat", reason: "non_domain_query_indexed" }, + hasLivingChatSignal: () => false, + shouldEmitOrganizationSelectionReply: (_message: string, organization: string | null) => Boolean(organization), + resolveOrganizationSelectionFromMessage, + resolveDataScopeProbe, + executeLlmChat, + buildAssistantDataScopeSelectionReply: (organization: string | null) => `selection:${organization ?? "none"}` + }); + + const output = await runAssistantLivingChatRuntime(input); + + expect(output.handled).toBe(true); + expect(output.chatText).toBe( + "selection:\u041e\u041e\u041e \u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441" + ); + expect(output.debug?.living_chat_response_source).toBe("deterministic_data_scope_selection_contract"); + expect(output.debug?.living_chat_bare_scope_probe_attempted).toBe(true); + expect(output.debug?.living_chat_bare_scope_probe_matched_organization).toBe( + "\u041e\u041e\u041e \u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441" + ); + expect(output.debug?.assistant_active_organization).toBe( + "\u041e\u041e\u041e \u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441" + ); + expect(output.debug?.living_chat_data_scope_probe_org_count).toBe(2); + expect(resolveDataScopeProbe).toHaveBeenCalledTimes(1); + expect(executeLlmChat).not.toHaveBeenCalled(); + }); + it("selects safety refusal branch for dangerous capability meta query", async () => { const executeLlmChat = vi.fn(async () => "llm-text"); const input = buildRuntimeInput({ diff --git 
a/llm_normalizer/backend/tests/assistantMcpDiscoveryPilotExecutor.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryPilotExecutor.test.ts index 69169f1..66c0fea 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryPilotExecutor.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryPilotExecutor.test.ts @@ -311,6 +311,104 @@ describe("assistant MCP discovery pilot executor", () => { expect(deps.executeAddressMcpQuery).toHaveBeenCalledTimes(6); }); + it("recovers explicit-year business overview money coverage through monthly value-flow chunks", async () => { + const planner = planAssistantMcpDiscovery({ + dataNeedGraph: { + schema_version: "assistant_data_need_graph_v1", + policy_owner: "assistantMcpDiscoveryDataNeedGraph", + subject_candidates: [], + business_fact_family: "business_overview", + action_family: "broad_evaluation", + aggregation_need: null, + time_scope_need: "explicit_period", + comparison_need: null, + ranking_need: null, + proof_expectation: "bounded_inference", + clarification_gaps: [], + decomposition_candidates: [ + "collect_scoped_movements", + "aggregate_checked_amounts", + "aggregate_ranked_axis_values", + "fetch_supporting_documents", + "probe_coverage", + "explain_evidence_basis" + ], + forbidden_overclaim_flags: ["no_raw_model_claims", "no_profit_or_margin_claim_without_evidence"], + reason_codes: ["data_need_graph_built", "data_need_graph_family_business_overview"] + }, + turnMeaning: { + asked_domain_family: "business_overview", + asked_action_family: "broad_evaluation", + explicit_organization_scope: "ООО Альтернатива Плюс", + explicit_date_scope: "2020" + } + }); + const broadIncomingRows = Array.from({ length: 200 }, (_, index) => ({ + Period: `2020-01-${String((index % 28) + 1).padStart(2, "0")}T00:00:00`, + Amount: 1, + Counterparty: "Клиент из широкого запроса" + })); + const broadOutgoingRows = Array.from({ length: 200 }, (_, index) => ({ + Period: `2020-01-${String((index % 28) + 1).padStart(2, 
"0")}T00:00:00`, + Amount: 1, + Counterparty: "Поставщик из широкого запроса" + })); + const incomingMonthlyResults = Array.from({ length: 12 }, (_, index) => ({ + rows: [ + { + Period: `2020-${String(index + 1).padStart(2, "0")}-05T00:00:00`, + Amount: (index + 1) * 100, + Counterparty: index % 2 === 0 ? "Клиент А" : "Клиент Б" + } + ] + })); + const outgoingMonthlyResults = Array.from({ length: 12 }, (_, index) => ({ + rows: [ + { + Period: `2020-${String(index + 1).padStart(2, "0")}-10T00:00:00`, + Amount: (index + 1) * 50, + Counterparty: index % 2 === 0 ? "Поставщик А" : "Поставщик Б" + } + ] + })); + const deps = buildSequentialDeps([ + { rows: broadIncomingRows }, + ...incomingMonthlyResults, + { rows: broadOutgoingRows }, + ...outgoingMonthlyResults + ]); + + const result = await executeAssistantMcpDiscoveryPilot(planner, deps); + + expect(planner.discovery_plan.execution_budget.max_probe_count).toBe(30); + expect(result.derived_business_overview).toMatchObject({ + organization_scope: "ООО Альтернатива Плюс", + period_scope: "2020", + incoming_customer_revenue: { + total_amount: 7800, + coverage_limited_by_probe_limit: false, + coverage_recovered_by_period_chunking: true, + period_chunking_granularity: "month" + }, + outgoing_supplier_payout: { + total_amount: 3900, + coverage_limited_by_probe_limit: false, + coverage_recovered_by_period_chunking: true, + period_chunking_granularity: "month" + }, + net_amount: 3900, + coverage_limited_by_probe_limit: false + }); + expect(result.evidence.confirmed_facts).toContain( + "Денежное покрытие бизнес-обзора за год восстановлено через помесячные 1С-проверки, а не только через широкий общий запрос." + ); + expect(result.evidence.unknown_facts).not.toContain( + "Полное покрытие бизнес-обзора не подтверждено: хотя бы один денежный запрос достиг верхней границы выборки." 
+ ); + expect(result.reason_codes).toContain("pilot_business_overview_incoming_monthly_period_chunking_recovered_coverage"); + expect(result.reason_codes).toContain("pilot_business_overview_outgoing_monthly_period_chunking_recovered_coverage"); + }); + it("marks bank-like counterparties in business-overview rankings before evidence wording", async () => { const planner = planAssistantMcpDiscovery({ dataNeedGraph: { diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts index 2b80841..f4b7af2 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts @@ -288,6 +288,48 @@ describe("assistant MCP discovery planner", () => { expect(result.reason_codes).toContain("planner_instantiated_catalog_chain_template_business_overview"); }); + it("enables chunked coverage budget for explicit-year business overviews", () => { + const result = planAssistantMcpDiscovery({ + dataNeedGraph: { + schema_version: "assistant_data_need_graph_v1", + policy_owner: "assistantMcpDiscoveryDataNeedGraph", + subject_candidates: [], + business_fact_family: "business_overview", + action_family: "broad_evaluation", + aggregation_need: null, + time_scope_need: "explicit_period", + comparison_need: null, + ranking_need: null, + proof_expectation: "bounded_inference", + clarification_gaps: [], + decomposition_candidates: [ + "collect_scoped_movements", + "aggregate_checked_amounts", + "aggregate_ranked_axis_values", + "fetch_supporting_documents", + "probe_coverage", + "explain_evidence_basis" + ], + forbidden_overclaim_flags: [ + "no_raw_model_claims", + "no_unchecked_business_health_claim", + "no_profit_or_margin_claim_without_evidence" + ], + reason_codes: ["data_need_graph_built", "data_need_graph_family_business_overview"] + }, + turnMeaning: { + asked_domain_family: "business_overview", + asked_action_family: 
"broad_evaluation", + explicit_organization_scope: "ООО Альтернатива Плюс", + explicit_date_scope: "2020" + } + }); + + expect(result.selected_chain_id).toBe("business_overview"); + expect(result.discovery_plan.execution_budget.max_probe_count).toBe(30); + expect(result.reason_codes).toContain("planner_enabled_chunked_coverage_probe_budget"); + }); + it("keeps bidirectional value-flow comparison executable when checked totals are derived without aggregate_by_axis", () => { const result = planAssistantMcpDiscovery({ dataNeedGraph: { diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryResponseCandidate.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryResponseCandidate.test.ts index 200dd11..63e9243 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryResponseCandidate.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryResponseCandidate.test.ts @@ -159,6 +159,60 @@ describe("assistant MCP discovery response candidate", () => { expect(candidate.reply_text).not.toContain("47 628 853"); }); + it("answers profit follow-ups with a direct cash-flow boundary before accounting result detail", () => { + const candidate = buildAssistantMcpDiscoveryResponseCandidate( + entryPoint({ + turn_input: { + adapter_status: "ready", + turn_meaning_ref: { + asked_domain_family: "business_overview", + asked_action_family: "profit_margin_boundary", + unsupported_but_understood_family: "profit_margin_boundary", + explicit_date_scope: "2020" + }, + data_need_graph: { + business_fact_family: "business_overview", + ranking_need: null, + reason_codes: ["data_need_graph_family_business_overview"] + } + }, + bridge: { + bridge_status: "answer_draft_ready", + user_facing_response_allowed: true, + business_fact_answer_allowed: true, + requires_user_clarification: false, + pilot: { + pilot_scope: "business_overview_route_template_v1", + derived_business_overview: { + accounting_financial_result: { + period_scope: "2020", + final_result_direction: "loss", + 
final_result_amount_human_ru: "7 136 815,85 руб.", + net_margin_to_revenue_pct: -59.41 + } + } + }, + answer_draft: { + answer_mode: "confirmed_with_bounded_inference", + headline: "Коротко: по бухгалтерскому маршруту 90/91/99 за 2020 подтвержден учетный убыток.", + confirmed_lines: [], + inference_lines: [], + unknown_lines: [], + limitation_lines: [], + next_step_line: null + } + } + }) + ); + + expect(candidate.reply_text).toContain("нет, денежное операционное нетто не стоит считать чистой прибылью"); + expect(candidate.reply_text).toContain("по закрытию счетов 90/91/99 в 1С за 2020"); + expect(candidate.reply_text).toContain("учетный убыток"); + expect(candidate.reply_text).toContain("маржа к подтвержденной выручке -59.41%"); + expect(candidate.reply_text).not.toContain("бухгалтерскому маршруту"); + expect(candidate.reply_text).not.toContain("маржа к выручке 90.01"); + }); + it("keeps vendor-risk boundary answers direct instead of compacting into a money overview", () => { const candidate = buildAssistantMcpDiscoveryResponseCandidate( entryPoint({ @@ -392,6 +446,8 @@ describe("assistant MCP discovery response candidate", () => { expect(candidate.reply_text).toContain("не полный бухгалтерский рейтинг доходности"); expect(candidate.reply_text).toContain("не как чистую бухгалтерскую прибыль"); expect(candidate.reply_text).toContain("проверка достигла лимита строк"); + expect(candidate.reply_text).toContain("выбрать конкретный год или квартал для дозапроса"); + expect(candidate.reply_text).toContain("без выдачи непроверенного итога"); expect(candidate.reply_text).not.toContain("лимит выборки MCP"); expect(candidate.reply_text).not.toContain("MCP-срез"); expect(candidate.reply_text).not.toContain("Что подтверждено:"); @@ -465,11 +521,65 @@ describe("assistant MCP discovery response candidate", () => { expect(candidate.reply_text).toContain("12 474 036,91 руб"); expect(candidate.reply_text?.split("\n")[0]).toContain("крупнейший источник входящих денег: ГКУ УКРиС"); 
expect(candidate.reply_text?.split("\n")[0]).toContain("крупнейший получатель исходящих денег: ООО Поставщик"); - expect(candidate.reply_text).toContain("денежный операционный показатель"); + expect(candidate.reply_text).toContain("операционный денежный показатель"); expect(candidate.reply_text).not.toContain("Что можно сказать только как вывод:"); expect(candidate.reply_text).not.toContain("Складской срез"); }); + it("labels organization-scoped bidirectional value-flow continuations as company scope", () => { + const candidate = buildAssistantMcpDiscoveryResponseCandidate( + entryPoint({ + turn_input: { + adapter_status: "ready", + turn_meaning_ref: { + explicit_organization_scope: "ООО Альтернатива Плюс", + explicit_date_scope: "2020" + }, + data_need_graph: { + business_fact_family: "value_flow", + comparison_need: "incoming_vs_outgoing", + reason_codes: ["data_need_graph_family_value_flow"] + } + }, + bridge: { + bridge_status: "answer_draft_ready", + user_facing_response_allowed: true, + business_fact_answer_allowed: true, + requires_user_clarification: false, + pilot: { + derived_bidirectional_value_flow: { + counterparty: null, + period_scope: "2020", + incoming_customer_revenue: { + total_amount_human_ru: "47 628 853,03 руб.", + rows_with_amount: 44 + }, + outgoing_supplier_payout: { + total_amount_human_ru: "43 763 351,53 руб.", + rows_with_amount: 299 + }, + net_amount_human_ru: "3 865 501,50 руб.", + net_direction: "net_incoming", + coverage_limited_by_probe_limit: false + } + }, + answer_draft: { + answer_mode: "confirmed_with_bounded_inference", + headline: "Денежный поток подтвержден.", + confirmed_lines: [], + inference_lines: [], + unknown_lines: [], + limitation_lines: [], + next_step_line: null + } + } + }) + ); + + expect(candidate.reply_text?.split("\n")[0]).toContain("по компании ООО Альтернатива Плюс за период 2020"); + expect(candidate.reply_text).not.toContain("по контрагенту запрошенному контрагенту"); + }); + it("does not present bank-like 
incoming leaders as ordinary client revenue", () => { const candidate = buildAssistantMcpDiscoveryResponseCandidate( entryPoint({ diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryResponsePolicy.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryResponsePolicy.test.ts index 90dc2e2..a210490 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryResponsePolicy.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryResponsePolicy.test.ts @@ -606,6 +606,64 @@ describe("assistant MCP discovery response policy", () => { expect(result.reason_codes).not.toContain("mcp_discovery_response_policy_candidate_applied"); }); + it("keeps exact bank operation replies over generic value-flow discovery candidates", () => { + const result = applyAssistantMcpDiscoveryResponsePolicy({ + currentReply: + "Exact bank operation answer: incoming/outgoing rows include operation kind, payment purpose, and contract; do not classify the bank as a regular customer or supplier automatically.", + currentReplySource: "address_query_runtime_v1", + currentReplyType: "factual", + addressRuntimeMeta: { + detected_intent: "bank_operations_by_counterparty", + selected_recipe: "address_bank_operations_by_counterparty_v1", + mcp_call_status: "matched_non_empty", + capability_route_mode: "exact", + answer_grounding_check: { + status: "grounded" + }, + assistant_mcp_discovery_entry_point_v1: entryPoint({ + turn_input: { + adapter_status: "ready", + should_run_discovery: true, + data_need_graph: { + business_fact_family: "value_flow", + subject_candidates: ["SBERBANK"], + reason_codes: ["data_need_graph_built"] + }, + turn_meaning_ref: { + asked_domain_family: "counterparty_value", + asked_action_family: "payout", + explicit_entity_candidates: ["SBERBANK"], + explicit_date_scope: "2020" + } + }, + bridge: { + bridge_status: "answer_draft_ready", + user_facing_response_allowed: true, + business_fact_answer_allowed: true, + requires_user_clarification: false, + 
answer_draft: { + answer_mode: "confirmed_with_bounded_inference", + headline: "Generic value-flow answer.", + confirmed_lines: ["Outgoing payout total only."], + inference_lines: ["Generic supplier payout interpretation."], + unknown_lines: ["Incoming role is unknown."], + limitation_lines: [], + next_step_line: null + } + } + }) + } + }); + + expect(result.applied).toBe(false); + expect(result.decision).toBe("keep_current_reply"); + expect(result.reply_text).toContain("operation kind"); + expect(result.reason_codes).toContain("mcp_discovery_response_policy_keep_exact_bank_operations_address_reply"); + expect(result.reason_codes).not.toContain("mcp_discovery_response_policy_candidate_applied"); + expect(result.reason_codes).not.toContain("mcp_discovery_response_policy_value_flow_action_conflict_allows_candidate_override"); + expect(result.reason_codes).not.toContain("mcp_discovery_response_policy_semantic_conflict_allows_candidate_override"); + }); + it("overrides an exact ranking-shaped address reply when open-scope ranking still needs organization", () => { const result = applyAssistantMcpDiscoveryResponsePolicy({ currentReply: diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryTurnInputAdapter.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryTurnInputAdapter.test.ts index 1d09961..d907012 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryTurnInputAdapter.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryTurnInputAdapter.test.ts @@ -196,6 +196,32 @@ describe("assistant MCP discovery turn input adapter", () => { expect(result.reason_codes).toContain("mcp_discovery_bidirectional_value_flow_signal_detected"); }); + it("keeps multi-token scoped counterparty from net wording when LLM entities are empty", () => { + const result = buildAssistantMcpDiscoveryTurnInput({ + userMessage: "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + followupContext: { + previous_filters: { + 
organization: "ООО Альтернатива Плюс", + period_from: "2020-01-01", + period_to: "2020-12-31" + } + } + }); + + expect(result.adapter_status).toBe("ready"); + expect(result.should_run_discovery).toBe(true); + expect(result.turn_meaning_ref).toMatchObject({ + asked_domain_family: "counterparty_value", + asked_action_family: "net_value_flow", + explicit_entity_candidates: ["Группа СВК"], + explicit_organization_scope: "ООО Альтернатива Плюс", + explicit_date_scope: "2020", + unsupported_but_understood_family: "counterparty_bidirectional_value_flow_or_netting", + stale_replay_forbidden: true + }); + expect(result.reason_codes).toContain("mcp_discovery_counterparty_from_raw_scope"); + }); + it("overrides a supported exact current-turn payout route when the question asks for a payment amount", () => { const result = buildAssistantMcpDiscoveryTurnInput({ userMessage: @@ -1733,6 +1759,36 @@ describe("assistant MCP discovery turn input adapter", () => { expect(result.reason_codes).toContain("mcp_discovery_business_overview_raw_year_overrode_predecompose_as_of_scope"); }); + it("does not turn generic role tails into predecompose counterparties for business overview", () => { + const result = buildAssistantMcpDiscoveryTurnInput({ + userMessage: + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + assistantTurnMeaning: { + asked_domain_family: "business_summary", + asked_action_family: "broad_evaluation", + unsupported_but_understood_family: "broad_business_evaluation", + stale_replay_forbidden: true + }, + predecomposeContract: { + entities: { + counterparty: "или поставщик", + organization: "ООО Альтернатива Плюс" + }, + period: { + period_from: "2020-01-01", + period_to: "2020-12-31", + has_explicit_period: true + } + } + }); + + expect(result.adapter_status).toBe("ready"); + 
expect(result.data_need_graph?.business_fact_family).toBe("business_overview"); + expect(result.turn_meaning_ref?.explicit_entity_candidates).toBeUndefined(); + expect(result.turn_meaning_ref?.explicit_organization_scope).toBe("ООО Альтернатива Плюс"); + expect(result.reason_codes).not.toContain("mcp_discovery_counterparty_from_predecompose"); + }); + it("keeps all-time business overview from reusing a negated VAT period as active scope", () => { const result = buildAssistantMcpDiscoveryTurnInput({ userMessage: diff --git a/llm_normalizer/backend/tests/assistantTransitionPolicy.test.ts b/llm_normalizer/backend/tests/assistantTransitionPolicy.test.ts index 9cf81b7..e60404a 100644 --- a/llm_normalizer/backend/tests/assistantTransitionPolicy.test.ts +++ b/llm_normalizer/backend/tests/assistantTransitionPolicy.test.ts @@ -2013,4 +2013,48 @@ describe("assistantTransitionPolicy", () => { expect(carryover?.followupContext?.previous_discovery_entity_candidates).toEqual(["\u041d\u0414\u0421"]); expect(carryover?.followupContext?.previous_discovery_pilot_scope).toBe("metadata_inspection_v1"); }); + + it("lets short receivables-to-payables mirror override an LLM open-items expansion", () => { + const policy = buildPolicy({ + findLastAddressAssistantItem: () => ({ + text: "\u041a\u043e\u0440\u043e\u0442\u043a\u043e: \u043d\u0430\u043c \u0434\u043e\u043b\u0436\u043d\u044b \u043d\u0430 13.05.2026.", + debug: { + detected_intent: "receivables_confirmed_as_of_date", + selected_recipe: "address_receivables_confirmed_as_of_date_v1", + extracted_filters: { + organization: "\u041e\u041e\u041e \u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441", + as_of_date: "2026-05-13" + } + } + }), + hasAddressFollowupContextSignal: () => true, + resolveDebtRoleSwapFollowupIntent: (message: string, previousIntent: string) => + message === "\u0430 \u043c\u044b \u043a\u043e\u043c\u0443?" && + previousIntent === "receivables_confirmed_as_of_date" + ? 
"payables_confirmed_as_of_date" + : null, + resolveAddressIntent: () => ({ + intent: "open_items_by_counterparty_or_contract" + }) + }); + + const carryover = policy.resolveAddressFollowupCarryoverContext( + "\u0430 \u043c\u044b \u043a\u043e\u043c\u0443?", + [], + "\u043e\u043f\u0440\u0435\u0434\u0435\u043b\u0438\u0442\u044c, \u043a\u043e\u043c\u0443 \u043f\u0440\u0438\u043d\u0430\u0434\u043b\u0435\u0436\u0438\u0442 \u0442\u0435\u043a\u0443\u0449\u0430\u044f \u0437\u0430\u0434\u043e\u043b\u0436\u0435\u043d\u043d\u043e\u0441\u0442\u044c", + { + predecomposeContract: { + intent: "open_items_by_counterparty_or_contract" + } + }, + null + ); + + expect(carryover?.followupContext?.previous_intent).toBe("payables_confirmed_as_of_date"); + expect(carryover?.followupContext?.target_intent).toBe("payables_confirmed_as_of_date"); + expect(carryover?.followupContext?.previous_filters).toMatchObject({ + organization: "\u041e\u041e\u041e \u0410\u043b\u044c\u0442\u0435\u0440\u043d\u0430\u0442\u0438\u0432\u0430 \u041f\u043b\u044e\u0441", + as_of_date: "2026-05-13" + }); + }); }); diff --git a/llm_normalizer/data/autorun_generators/history.json b/llm_normalizer/data/autorun_generators/history.json index d1fec52..e6f4f6b 100644 --- a/llm_normalizer/data/autorun_generators/history.json +++ b/llm_normalizer/data/autorun_generators/history.json @@ -1,4 +1,369 @@ [ + { + "generation_id": "gen-ag05131312-2d0445", + "created_at": "2026-05-13T13:12:37+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 105 mixed schema/primitive closure replay", + "count": 13, + "domain": "address_phase105_mixed_schema_primitive_closure", + "questions": [ + "кайф - что там на складе по остаткам?", + "АЛЬТЕРНАТИВА", + "а исторические остатки на другие даты умеешь?", + "давай на июнь 2017", + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "А 
отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?", + "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + "кому мы должны на конец 2020?", + "а нам кто должен на конец 2020?", + "сколько НДС надо заплатить в налоговую за декабрь 2019?", + "скока денег альтернатива заработала за 20 год?", + "а это чистая прибыль?" + ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513131237_gen-ag05131312-2d0445.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513131237_gen-ag05131312-2d0445.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "phase105 mixed schema primitive closure", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase105_mixed_schema_primitive_closure.json", + "scenario_id": "address_truth_harness_phase105_mixed_schema_primitive_closure", + "semantic_tags": [ + "bank_like_counterparty", + "business_overview", + "business_overview_followup", + "counterparty_net_cash_flow", + "date_followup", + "debt_polarity", + "earnings_wording", + "financial_role_purpose", + "generic_role_tail_anchor_hygiene", + "historical_inventory", + "inventory_capability_meta", + "inventory_root", + "organization_scope", + "payables", + "profit_boundary", + "receivables", + "redundant_scope_selection", + "resolved_organization_scope", + "stale_scope_guard", + "supplier_prefix_canary", + "tax_period", + "value_flow", + 
"vat_continuity", + "warehouse_not_required" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase105_mixed_schema_primitive_closure_live3", + "saved_after_validated_replay": true + } + }, + { + "generation_id": "gen-ag05131226-630ddf", + "created_at": "2026-05-13T12:26:18+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 104 generic role-tail anchor hygiene replay", + "count": 4, + "domain": "address_phase104_generic_role_tail_anchor_hygiene", + "questions": [ + "Альтернатива Плюс", + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?" + ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513122618_gen-ag05131226-630ddf.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513122618_gen-ag05131226-630ddf.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "phase104 generic role-tail anchor hygiene", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json", + "scenario_id": "address_truth_harness_phase104_generic_role_tail_anchor_hygiene", + "semantic_tags": [ + "bank_like_counterparty", + "bare_org_scope", + "business_overview", + 
"counterparty_net_cash_flow", + "financial_role_purpose", + "generic_role_tail_anchor_hygiene", + "phase102_canary", + "post_overview_anchor_integrity", + "stale_scope_guard", + "supplier_prefix_canary" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase104_generic_role_tail_anchor_hygiene_live2", + "saved_after_validated_replay": true + } + }, + { + "generation_id": "gen-ag05131200-0ed59a", + "created_at": "2026-05-13T12:00:47+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 103 financial role and purpose arbitration replay", + "count": 6, + "domain": "address_phase103_financial_role_purpose_arbitration", + "questions": [ + "Альтернатива Плюс", + "По СБЕРБАНКУ за 2020 покажи коротко: сколько денег входило и уходило, и что это по смыслу в 1С — клиентская выручка, поставщик, комиссия, кредит или другой финансовый поток?", + "Если СБЕРБАНК есть во входящих поступлениях, можно ли считать его нашим клиентом и выручкой? Скажи по подтвержденным строкам, без притягивания.", + "А если деньги уходили в СБЕРБАНК, это наш поставщик или финансовые списания? Раздели по смыслу и покажи основание.", + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "А теперь отдельно по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?" 
+ ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513120047_gen-ag05131200-0ed59a.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513120047_gen-ag05131200-0ed59a.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "phase103 financial role and purpose arbitration", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase103_financial_role_purpose_arbitration.json", + "scenario_id": "address_truth_harness_phase103_financial_role_purpose_arbitration", + "semantic_tags": [ + "bank_like_counterparty", + "bank_like_customer_boundary", + "bank_like_supplier_boundary", + "bank_operations_by_counterparty", + "bare_org_scope", + "business_overview", + "canary", + "counterparty_net_cash_flow", + "customer_revenue_and_payments", + "financial_role_purpose", + "phase102_canary", + "stale_scope_guard", + "supplier_payouts_profile" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase103_financial_role_purpose_arbitration_live3", + "saved_after_validated_replay": true + } + }, + { + "generation_id": "gen-ag05131121-8c41ab", + "created_at": "2026-05-13T11:21:52+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 102 debt mirror clean-scope polarity replay", + "count": 6, + "domain": "address_phase102_debt_mirror_clean_scope_polarity", + "questions": [ + "Альтернатива Плюс", + "мы кому реально должны денег на сегодня? 
коротко, чистый долг к оплате, без встречных обеспечений как основного долга", + "а Комитету государственных услуг мы реально должны эти 3,6 млн или это встречное обеспечение/зачет? объясни именно по смыслу долга", + "а нам Комитет государственных услуг тоже должен 3,6 млн? это же та же сумма — скажи честно, это дебиторка или встречная часть?", + "тогда кто нам реально должен денег на сегодня? именно чистая дебиторка", + "а мы кому?" + ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513112152_gen-ag05131121-8c41ab.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513112152_gen-ag05131121-8c41ab.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "phase102 debt mirror clean-scope polarity", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json", + "scenario_id": "address_truth_harness_phase102_debt_mirror_clean_scope_polarity", + "semantic_tags": [ + "company_selected", + "debt_mirror_clean_scope", + "direct_answer_first", + "organization_scope", + "payables_counterparty_check", + "payables_snapshot", + "polarity_honesty", + "receivables_counterparty_check", + "receivables_snapshot", + "settlements_mirror_followup" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase102_debt_mirror_clean_scope_polarity_live3", + "saved_after_validated_replay": true + } + }, + { + "generation_id": "gen-ag05131044-cbe2ff", + "created_at": "2026-05-13T10:44:38+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 
101 inventory root scope without warehouse clarification replay", + "count": 7, + "domain": "address_phase101_inventory_root_scope_no_warehouse_clarification", + "questions": [ + "приветик - че как там дела", + "расскажи что можешь интересного", + "кайф - что там на складе по остаткам?", + "АЛЬТЕРНАТИВА", + "а исторические остатки на другие даты умеешь?", + "давай на июнь 2017", + "март 2016" + ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513104438_gen-ag05131044-cbe2ff.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513104438_gen-ag05131044-cbe2ff.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "Focused semantic replay for the manual assistant-stage1-hyh1A1WR3j signal: after a broad stock-on-hand question and organization clarification, the assistant must resume the root inventory snapshot for the selected company across available warehouses instead of asking the user to name a warehouse, item, category, or material.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json", + "scenario_id": "address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification", + "semantic_tags": [ + "capability_meta", + "clarification_required", + "clarification_resume", + "date_followup", + "historical_inventory", + "human_answer_quality", + "inventory_capability_meta", + "inventory_root", + "smalltalk_entry", + "warehouse_not_required" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": 
"artifacts\\domain_runs\\phase101_inventory_root_scope_no_warehouse_clarification_live1", + "saved_after_validated_replay": true + } + }, + { + "generation_id": "gen-ag05131028-234e5e", + "created_at": "2026-05-13T10:28:45+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 100 large-query continuation UX replay", + "count": 3, + "domain": "address_phase100_large_query_continuation_ux", + "questions": [ + "Дай общий бизнес-обзор ООО Альтернатива Плюс за весь доступный период: входящие, исходящие, нетто, лучший год. Если срез слишком широкий, не выдумывай полный итог, а скажи как безопасно дособрать.", + "Ок, тогда дособери конкретно 2020: входящие, исходящие и расчетное денежное нетто.", + "Это можно считать прибылью за 2020 или нет? Ответь коротко и по делу." + ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513102845_gen-ag05131028-234e5e.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513102845_gen-ag05131028-234e5e.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "Focused semantic replay for all-time or very broad business overview questions: the assistant should answer from checked evidence, disclose row-limit coverage honestly, offer a safe period-narrowing continuation path, and then recover the explicit-year follow-up through chunked evidence without leaking technical mechanics.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase100_large_query_continuation_ux.json", + "scenario_id": "address_truth_harness_phase100_large_query_continuation_ux", + "semantic_tags": [ + "business_language", + 
"business_overview", + "followup_continuation", + "followup_directness", + "large_query_budget", + "large_query_continuation", + "limit_honesty", + "profit_boundary" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase100_large_query_continuation_ux_live2", + "saved_after_validated_replay": true + } + }, + { + "generation_id": "gen-ag05131009-f08174", + "created_at": "2026-05-13T10:09:01+00:00", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 99 large-query budget and continuation policy replay", + "count": 4, + "domain": "address_phase99_large_query_budget_continuation", + "questions": [ + "Дай взрослый бизнес-обзор ООО Альтернатива Плюс за 2020: входящие, исходящие, нетто, кто основные источники денег и где важные ограничения.", + "То есть это можно считать прибылью за 2020 или нет? Коротко.", + "А кто за 2020 принес больше всего денег, и если там банк, не называй его обычным клиентом.", + "По исходящим за 2020 есть зависимость от одного поставщика или это только денежная концентрация?" 
+ ], + "generated_by": "codex_agent", + "saved_case_set_file": "assistant_autogen_saved_user_sessions_20260513100901_gen-ag05131009-f08174.json", + "context": { + "llm_provider": null, + "model": null, + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "autogen_personality_id": null, + "autogen_personality_prompt": null, + "source_session_id": null, + "saved_session_file": "assistant_saved_session_20260513100901_gen-ag05131009-f08174.json", + "saved_case_set_kind": "agent_semantic_scenario", + "agent_run": true, + "agent_focus": "Focused semantic replay for explicit-year large business questions: the assistant should use chunked 1C evidence where available, answer direct-first, avoid fake profit claims, keep bank/counterparty boundaries, and not collapse into row-limit refusal wording when yearly money coverage can be recovered.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase99_large_query_budget_continuation.json", + "scenario_id": "address_truth_harness_phase99_large_query_budget_continuation", + "semantic_tags": [ + "business_language", + "business_overview", + "context_continuity", + "direct_answer_first", + "financial_counterparty_flow_hint", + "followup_directness", + "large_query_budget", + "profit_boundary", + "vendor_risk_procurement_quality" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase99_large_query_budget_continuation_live2", + "saved_after_validated_replay": true + } + }, { "generation_id": "gen-ag05122315-f1e27c", "created_at": "2026-05-12T23:15:48+00:00", diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513100901_gen-ag05131009-f08174.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513100901_gen-ag05131009-f08174.json new file mode 100644 index 0000000..edab59e --- 
/dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513100901_gen-ag05131009-f08174.json @@ -0,0 +1,133 @@ +{ + "saved_at": "2026-05-13T10:09:01+00:00", + "generation_id": "gen-ag05131009-f08174", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 99 large-query budget and continuation policy replay", + "agent_run": true, + "questions": [ + "Дай взрослый бизнес-обзор ООО Альтернатива Плюс за 2020: входящие, исходящие, нетто, кто основные источники денег и где важные ограничения.", + "То есть это можно считать прибылью за 2020 или нет? Коротко.", + "А кто за 2020 принес больше всего денег, и если там банк, не называй его обычным клиентом.", + "По исходящим за 2020 есть зависимость от одного поставщика или это только денежная концентрация?" + ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "Focused semantic replay for explicit-year large business questions: the assistant should use chunked 1C evidence where available, answer direct-first, avoid fake profit claims, keep bank/counterparty boundaries, and not collapse into row-limit refusal wording when yearly money coverage can be recovered.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase99_large_query_budget_continuation.json", + "scenario_id": "address_truth_harness_phase99_large_query_budget_continuation", + "semantic_tags": [ + "business_language", + "business_overview", + "context_continuity", + "direct_answer_first", + "financial_counterparty_flow_hint", + "followup_directness", + "large_query_budget", + "profit_boundary", + "vendor_risk_procurement_quality" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase99_large_query_budget_continuation_live2", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": 
"agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase99_large_query_budget_continuation_live2", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 4, + "steps_passed": 4, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": "agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "Дай взрослый бизнес-обзор ООО Альтернатива Плюс за 2020: входящие, исходящие, нетто, кто основные источники денег и где важные ограничения.", + "created_at": "2026-05-13T10:09:01+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "То есть это можно считать прибылью за 2020 или нет? 
Коротко.", + "created_at": "2026-05-13T10:09:01+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "А кто за 2020 принес больше всего денег, и если там банк, не называй его обычным клиентом.", + "created_at": "2026-05-13T10:09:01+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-004", + "role": "user", + "text": "По исходящим за 2020 есть зависимость от одного поставщика или это только денежная концентрация?", + "created_at": "2026-05-13T10:09:01+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "Focused semantic replay for explicit-year large business questions: the assistant should use chunked 1C evidence where available, answer direct-first, avoid fake profit claims, keep bank/counterparty boundaries, and not collapse into row-limit refusal wording when yearly money coverage can be recovered.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase99_large_query_budget_continuation.json", + "scenario_id": "address_truth_harness_phase99_large_query_budget_continuation", + "semantic_tags": [ + "business_language", + "business_overview", + "context_continuity", + "direct_answer_first", + "financial_counterparty_flow_hint", + "followup_directness", + "large_query_budget", + "profit_boundary", + "vendor_risk_procurement_quality" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase99_large_query_budget_continuation_live2", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": 
"artifacts\\domain_runs\\phase99_large_query_budget_continuation_live2", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 4, + "steps_passed": 4, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513102845_gen-ag05131028-234e5e.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513102845_gen-ag05131028-234e5e.json new file mode 100644 index 0000000..4f0e601 --- /dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513102845_gen-ag05131028-234e5e.json @@ -0,0 +1,121 @@ +{ + "saved_at": "2026-05-13T10:28:45+00:00", + "generation_id": "gen-ag05131028-234e5e", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 100 large-query continuation UX replay", + "agent_run": true, + "questions": [ + "Дай общий бизнес-обзор ООО Альтернатива Плюс за весь доступный период: входящие, исходящие, нетто, лучший год. Если срез слишком широкий, не выдумывай полный итог, а скажи как безопасно дособрать.", + "Ок, тогда дособери конкретно 2020: входящие, исходящие и расчетное денежное нетто.", + "Это можно считать прибылью за 2020 или нет? Ответь коротко и по делу." 
+ ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "Focused semantic replay for all-time or very broad business overview questions: the assistant should answer from checked evidence, disclose row-limit coverage honestly, offer a safe period-narrowing continuation path, and then recover the explicit-year follow-up through chunked evidence without leaking technical mechanics.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase100_large_query_continuation_ux.json", + "scenario_id": "address_truth_harness_phase100_large_query_continuation_ux", + "semantic_tags": [ + "business_language", + "business_overview", + "followup_continuation", + "followup_directness", + "large_query_budget", + "large_query_continuation", + "limit_honesty", + "profit_boundary" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase100_large_query_continuation_ux_live2", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase100_large_query_continuation_ux_live2", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 3, + "steps_passed": 3, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": "agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "Дай общий бизнес-обзор ООО Альтернатива Плюс за весь доступный период: входящие, исходящие, нетто, лучший год. 
Если срез слишком широкий, не выдумывай полный итог, а скажи как безопасно дособрать.", + "created_at": "2026-05-13T10:28:45+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "Ок, тогда дособери конкретно 2020: входящие, исходящие и расчетное денежное нетто.", + "created_at": "2026-05-13T10:28:45+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "Это можно считать прибылью за 2020 или нет? Ответь коротко и по делу.", + "created_at": "2026-05-13T10:28:45+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "Focused semantic replay for all-time or very broad business overview questions: the assistant should answer from checked evidence, disclose row-limit coverage honestly, offer a safe period-narrowing continuation path, and then recover the explicit-year follow-up through chunked evidence without leaking technical mechanics.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase100_large_query_continuation_ux.json", + "scenario_id": "address_truth_harness_phase100_large_query_continuation_ux", + "semantic_tags": [ + "business_language", + "business_overview", + "followup_continuation", + "followup_directness", + "large_query_budget", + "large_query_continuation", + "limit_honesty", + "profit_boundary" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase100_large_query_continuation_ux_live2", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": 
"artifacts\\domain_runs\\phase100_large_query_continuation_ux_live2", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 3, + "steps_passed": 3, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513104438_gen-ag05131044-cbe2ff.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513104438_gen-ag05131044-cbe2ff.json new file mode 100644 index 0000000..d50b9ef --- /dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513104438_gen-ag05131044-cbe2ff.json @@ -0,0 +1,165 @@ +{ + "saved_at": "2026-05-13T10:44:38+00:00", + "generation_id": "gen-ag05131044-cbe2ff", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 101 inventory root scope without warehouse clarification replay", + "agent_run": true, + "questions": [ + "приветик - че как там дела", + "расскажи что можешь интересного", + "кайф - что там на складе по остаткам?", + "АЛЬТЕРНАТИВА", + "а исторические остатки на другие даты умеешь?", + "давай на июнь 2017", + "март 2016" + ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "Focused semantic replay for the manual assistant-stage1-hyh1A1WR3j signal: after a broad stock-on-hand question and organization clarification, the assistant must resume the root inventory snapshot for the selected company across available warehouses instead of asking the user to name a warehouse, item, category, or material.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json", + 
"scenario_id": "address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification", + "semantic_tags": [ + "capability_meta", + "clarification_required", + "clarification_resume", + "date_followup", + "historical_inventory", + "human_answer_quality", + "inventory_capability_meta", + "inventory_root", + "smalltalk_entry", + "warehouse_not_required" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase101_inventory_root_scope_no_warehouse_clarification_live1", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase101_inventory_root_scope_no_warehouse_clarification_live1", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 7, + "steps_passed": 7, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": "agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "приветик - че как там дела", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "расскажи что можешь интересного", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "кайф - что там на складе по остаткам?", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-004", + "role": "user", + "text": "АЛЬТЕРНАТИВА", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": 
null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-005", + "role": "user", + "text": "а исторические остатки на другие даты умеешь?", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-006", + "role": "user", + "text": "давай на июнь 2017", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-007", + "role": "user", + "text": "март 2016", + "created_at": "2026-05-13T10:44:38+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "Focused semantic replay for the manual assistant-stage1-hyh1A1WR3j signal: after a broad stock-on-hand question and organization clarification, the assistant must resume the root inventory snapshot for the selected company across available warehouses instead of asking the user to name a warehouse, item, category, or material.", + "architecture_phase": "turnaround_11", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification.json", + "scenario_id": "address_truth_harness_phase101_inventory_root_scope_no_warehouse_clarification", + "semantic_tags": [ + "capability_meta", + "clarification_required", + "clarification_resume", + "date_followup", + "historical_inventory", + "human_answer_quality", + "inventory_capability_meta", + "inventory_root", + "smalltalk_entry", + "warehouse_not_required" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase101_inventory_root_scope_no_warehouse_clarification_live1", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": 
"accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase101_inventory_root_scope_no_warehouse_clarification_live1", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 7, + "steps_passed": 7, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513112152_gen-ag05131121-8c41ab.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513112152_gen-ag05131121-8c41ab.json new file mode 100644 index 0000000..c073f17 --- /dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513112152_gen-ag05131121-8c41ab.json @@ -0,0 +1,155 @@ +{ + "saved_at": "2026-05-13T11:21:52+00:00", + "generation_id": "gen-ag05131121-8c41ab", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 102 debt mirror clean-scope polarity replay", + "agent_run": true, + "questions": [ + "Альтернатива Плюс", + "мы кому реально должны денег на сегодня? коротко, чистый долг к оплате, без встречных обеспечений как основного долга", + "а Комитету государственных услуг мы реально должны эти 3,6 млн или это встречное обеспечение/зачет? объясни именно по смыслу долга", + "а нам Комитет государственных услуг тоже должен 3,6 млн? это же та же сумма — скажи честно, это дебиторка или встречная часть?", + "тогда кто нам реально должен денег на сегодня? именно чистая дебиторка", + "а мы кому?" 
+ ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase102 debt mirror clean-scope polarity", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json", + "scenario_id": "address_truth_harness_phase102_debt_mirror_clean_scope_polarity", + "semantic_tags": [ + "company_selected", + "debt_mirror_clean_scope", + "direct_answer_first", + "organization_scope", + "payables_counterparty_check", + "payables_snapshot", + "polarity_honesty", + "receivables_counterparty_check", + "receivables_snapshot", + "settlements_mirror_followup" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase102_debt_mirror_clean_scope_polarity_live3", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase102_debt_mirror_clean_scope_polarity_live3", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 6, + "steps_passed": 6, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": "agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "Альтернатива Плюс", + "created_at": "2026-05-13T11:21:52+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "мы кому реально должны денег на сегодня? 
коротко, чистый долг к оплате, без встречных обеспечений как основного долга", + "created_at": "2026-05-13T11:21:52+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "а Комитету государственных услуг мы реально должны эти 3,6 млн или это встречное обеспечение/зачет? объясни именно по смыслу долга", + "created_at": "2026-05-13T11:21:52+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-004", + "role": "user", + "text": "а нам Комитет государственных услуг тоже должен 3,6 млн? это же та же сумма — скажи честно, это дебиторка или встречная часть?", + "created_at": "2026-05-13T11:21:52+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-005", + "role": "user", + "text": "тогда кто нам реально должен денег на сегодня? именно чистая дебиторка", + "created_at": "2026-05-13T11:21:52+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-006", + "role": "user", + "text": "а мы кому?", + "created_at": "2026-05-13T11:21:52+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase102 debt mirror clean-scope polarity", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase102_debt_mirror_clean_scope_polarity.json", + "scenario_id": "address_truth_harness_phase102_debt_mirror_clean_scope_polarity", + "semantic_tags": [ + "company_selected", + "debt_mirror_clean_scope", + "direct_answer_first", + "organization_scope", + "payables_counterparty_check", + "payables_snapshot", + "polarity_honesty", + "receivables_counterparty_check", + "receivables_snapshot", + 
"settlements_mirror_followup" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase102_debt_mirror_clean_scope_polarity_live3", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase102_debt_mirror_clean_scope_polarity_live3", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 6, + "steps_passed": 6, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513120047_gen-ag05131200-0ed59a.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513120047_gen-ag05131200-0ed59a.json new file mode 100644 index 0000000..3f2b77a --- /dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513120047_gen-ag05131200-0ed59a.json @@ -0,0 +1,161 @@ +{ + "saved_at": "2026-05-13T12:00:47+00:00", + "generation_id": "gen-ag05131200-0ed59a", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 103 financial role and purpose arbitration replay", + "agent_run": true, + "questions": [ + "Альтернатива Плюс", + "По СБЕРБАНКУ за 2020 покажи коротко: сколько денег входило и уходило, и что это по смыслу в 1С — клиентская выручка, поставщик, комиссия, кредит или другой финансовый поток?", + "Если СБЕРБАНК есть во входящих поступлениях, можно ли считать его нашим клиентом и выручкой? Скажи по подтвержденным строкам, без притягивания.", + "А если деньги уходили в СБЕРБАНК, это наш поставщик или финансовые списания? 
Раздели по смыслу и покажи основание.", + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "А теперь отдельно по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?" + ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase103 financial role and purpose arbitration", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase103_financial_role_purpose_arbitration.json", + "scenario_id": "address_truth_harness_phase103_financial_role_purpose_arbitration", + "semantic_tags": [ + "bank_like_counterparty", + "bank_like_customer_boundary", + "bank_like_supplier_boundary", + "bank_operations_by_counterparty", + "bare_org_scope", + "business_overview", + "canary", + "counterparty_net_cash_flow", + "customer_revenue_and_payments", + "financial_role_purpose", + "phase102_canary", + "stale_scope_guard", + "supplier_payouts_profile" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase103_financial_role_purpose_arbitration_live3", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase103_financial_role_purpose_arbitration_live3", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 6, + "steps_passed": 6, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": 
"agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "Альтернатива Плюс", + "created_at": "2026-05-13T12:00:47+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "По СБЕРБАНКУ за 2020 покажи коротко: сколько денег входило и уходило, и что это по смыслу в 1С — клиентская выручка, поставщик, комиссия, кредит или другой финансовый поток?", + "created_at": "2026-05-13T12:00:47+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "Если СБЕРБАНК есть во входящих поступлениях, можно ли считать его нашим клиентом и выручкой? Скажи по подтвержденным строкам, без притягивания.", + "created_at": "2026-05-13T12:00:47+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-004", + "role": "user", + "text": "А если деньги уходили в СБЕРБАНК, это наш поставщик или финансовые списания? 
Раздели по смыслу и покажи основание.", + "created_at": "2026-05-13T12:00:47+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-005", + "role": "user", + "text": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "created_at": "2026-05-13T12:00:47+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-006", + "role": "user", + "text": "А теперь отдельно по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + "created_at": "2026-05-13T12:00:47+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase103 financial role and purpose arbitration", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase103_financial_role_purpose_arbitration.json", + "scenario_id": "address_truth_harness_phase103_financial_role_purpose_arbitration", + "semantic_tags": [ + "bank_like_counterparty", + "bank_like_customer_boundary", + "bank_like_supplier_boundary", + "bank_operations_by_counterparty", + "bare_org_scope", + "business_overview", + "canary", + "counterparty_net_cash_flow", + "customer_revenue_and_payments", + "financial_role_purpose", + "phase102_canary", + "stale_scope_guard", + "supplier_payouts_profile" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase103_financial_role_purpose_arbitration_live3", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": 
"artifacts\\domain_runs\\phase103_financial_role_purpose_arbitration_live3", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 6, + "steps_passed": 6, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513122618_gen-ag05131226-630ddf.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513122618_gen-ag05131226-630ddf.json new file mode 100644 index 0000000..7145e12 --- /dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513122618_gen-ag05131226-630ddf.json @@ -0,0 +1,135 @@ +{ + "saved_at": "2026-05-13T12:26:18+00:00", + "generation_id": "gen-ag05131226-630ddf", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 104 generic role-tail anchor hygiene replay", + "agent_run": true, + "questions": [ + "Альтернатива Плюс", + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?" 
+ ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase104 generic role-tail anchor hygiene", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json", + "scenario_id": "address_truth_harness_phase104_generic_role_tail_anchor_hygiene", + "semantic_tags": [ + "bank_like_counterparty", + "bare_org_scope", + "business_overview", + "counterparty_net_cash_flow", + "financial_role_purpose", + "generic_role_tail_anchor_hygiene", + "phase102_canary", + "post_overview_anchor_integrity", + "stale_scope_guard", + "supplier_prefix_canary" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase104_generic_role_tail_anchor_hygiene_live2", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase104_generic_role_tail_anchor_hygiene_live2", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 4, + "steps_passed": 4, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": "agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "Альтернатива Плюс", + "created_at": "2026-05-13T12:26:18+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, 
если по назначению он не обычный клиент или поставщик.", + "created_at": "2026-05-13T12:26:18+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "created_at": "2026-05-13T12:26:18+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-004", + "role": "user", + "text": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?", + "created_at": "2026-05-13T12:26:18+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase104 generic role-tail anchor hygiene", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase104_generic_role_tail_anchor_hygiene.json", + "scenario_id": "address_truth_harness_phase104_generic_role_tail_anchor_hygiene", + "semantic_tags": [ + "bank_like_counterparty", + "bare_org_scope", + "business_overview", + "counterparty_net_cash_flow", + "financial_role_purpose", + "generic_role_tail_anchor_hygiene", + "phase102_canary", + "post_overview_anchor_integrity", + "stale_scope_guard", + "supplier_prefix_canary" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase104_generic_role_tail_anchor_hygiene_live2", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase104_generic_role_tail_anchor_hygiene_live2", + "final_status": "accepted", + "review_overall_status": "pass", + 
"business_overall_status": "pass", + "steps_total": 4, + "steps_passed": 4, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513131237_gen-ag05131312-2d0445.json b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513131237_gen-ag05131312-2d0445.json new file mode 100644 index 0000000..32dc9bb --- /dev/null +++ b/llm_normalizer/data/autorun_generators/saved_sessions/assistant_saved_session_20260513131237_gen-ag05131312-2d0445.json @@ -0,0 +1,253 @@ +{ + "saved_at": "2026-05-13T13:12:37+00:00", + "generation_id": "gen-ag05131312-2d0445", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 105 mixed schema/primitive closure replay", + "agent_run": true, + "questions": [ + "кайф - что там на складе по остаткам?", + "АЛЬТЕРНАТИВА", + "а исторические остатки на другие даты умеешь?", + "давай на июнь 2017", + "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам.", + "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?", + "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + "кому мы должны на конец 2020?", + "а нам кто должен на конец 2020?", + "сколько НДС надо заплатить в налоговую за декабрь 2019?", + "скока денег альтернатива заработала за 20 год?", + "а это чистая прибыль?" 
+ ], + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase105 mixed schema primitive closure", + "architecture_phase": "Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase105_mixed_schema_primitive_closure.json", + "scenario_id": "address_truth_harness_phase105_mixed_schema_primitive_closure", + "semantic_tags": [ + "bank_like_counterparty", + "business_overview", + "business_overview_followup", + "counterparty_net_cash_flow", + "date_followup", + "debt_polarity", + "earnings_wording", + "financial_role_purpose", + "generic_role_tail_anchor_hygiene", + "historical_inventory", + "inventory_capability_meta", + "inventory_root", + "organization_scope", + "payables", + "profit_boundary", + "receivables", + "redundant_scope_selection", + "resolved_organization_scope", + "stale_scope_guard", + "supplier_prefix_canary", + "tax_period", + "value_flow", + "vat_continuity", + "warehouse_not_required" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase105_mixed_schema_primitive_closure_live3", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase105_mixed_schema_primitive_closure_live3", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 13, + "steps_passed": 13, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + }, + "source_session_id": null, + "session": { + "session_id": null, + "mode": "agent_semantic_run", + "items": [ + { + "message_id": "agent-user-001", + "role": "user", + "text": "кайф - что там на складе по 
остаткам?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-002", + "role": "user", + "text": "АЛЬТЕРНАТИВА", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-003", + "role": "user", + "text": "а исторические остатки на другие даты умеешь?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-004", + "role": "user", + "text": "давай на июнь 2017", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-005", + "role": "user", + "text": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик.", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-006", + "role": "user", + "text": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? 
Дай коротко по подтвержденным строкам.", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-007", + "role": "user", + "text": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-008", + "role": "user", + "text": "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-009", + "role": "user", + "text": "кому мы должны на конец 2020?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-010", + "role": "user", + "text": "а нам кто должен на конец 2020?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-011", + "role": "user", + "text": "сколько НДС надо заплатить в налоговую за декабрь 2019?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-012", + "role": "user", + "text": "скока денег альтернатива заработала за 20 год?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + }, + { + "message_id": "agent-user-013", + "role": "user", + "text": "а это чистая прибыль?", + "created_at": "2026-05-13T13:12:37+00:00", + "reply_type": null, + "trace_id": null, + "debug": null + } + ], + "agent_run": true, + "metadata": { + "assistant_prompt_version": null, + "decomposition_prompt_version": null, + "prompt_fingerprint": null, + "agent_focus": "phase105 mixed schema primitive closure", + "architecture_phase": 
"Open-World Schema/Primitive Discovery", + "source_spec_file": "X:\\1C\\NDC_1C\\docs\\orchestration\\address_truth_harness_phase105_mixed_schema_primitive_closure.json", + "scenario_id": "address_truth_harness_phase105_mixed_schema_primitive_closure", + "semantic_tags": [ + "bank_like_counterparty", + "business_overview", + "business_overview_followup", + "counterparty_net_cash_flow", + "date_followup", + "debt_polarity", + "earnings_wording", + "financial_role_purpose", + "generic_role_tail_anchor_hygiene", + "historical_inventory", + "inventory_capability_meta", + "inventory_root", + "organization_scope", + "payables", + "profit_boundary", + "receivables", + "redundant_scope_selection", + "resolved_organization_scope", + "stale_scope_guard", + "supplier_prefix_canary", + "tax_period", + "value_flow", + "vat_continuity", + "warehouse_not_required" + ], + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase105_mixed_schema_primitive_closure_live3", + "saved_after_validated_replay": true, + "save_gate": { + "schema_version": "agent_semantic_save_gate_v1", + "validation_status": "accepted_live_replay", + "validated_run_dir": "artifacts\\domain_runs\\phase105_mixed_schema_primitive_closure_live3", + "final_status": "accepted", + "review_overall_status": "pass", + "business_overall_status": "pass", + "steps_total": 13, + "steps_passed": 13, + "steps_failed": 0, + "steps_with_business_failures": 0, + "steps_with_business_warnings": 0, + "acceptance_gate_passed": true, + "saved_after_validated_replay": true + } + } + } +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513100901_gen-ag05131009-f08174.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513100901_gen-ag05131009-f08174.json new file mode 100644 index 0000000..712af6b --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513100901_gen-ag05131009-f08174.json 
@@ -0,0 +1,37 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131009-f08174", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T10:09:01+00:00", + "generation_id": "gen-ag05131009-f08174", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 99 large-query budget and continuation policy replay", + "domain": "address_phase99_large_query_budget_continuation", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 99 large-query budget and continuation policy replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "Дай взрослый бизнес-обзор ООО Альтернатива Плюс за 2020: входящие, исходящие, нетто, кто основные источники денег и где важные ограничения." + }, + { + "user_message": "То есть это можно считать прибылью за 2020 или нет? Коротко." + }, + { + "user_message": "А кто за 2020 принес больше всего денег, и если там банк, не называй его обычным клиентом." + }, + { + "user_message": "По исходящим за 2020 есть зависимость от одного поставщика или это только денежная концентрация?" 
+ } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513102845_gen-ag05131028-234e5e.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513102845_gen-ag05131028-234e5e.json new file mode 100644 index 0000000..2e720a9 --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513102845_gen-ag05131028-234e5e.json @@ -0,0 +1,34 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131028-234e5e", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T10:28:45+00:00", + "generation_id": "gen-ag05131028-234e5e", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 100 large-query continuation UX replay", + "domain": "address_phase100_large_query_continuation_ux", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 100 large-query continuation UX replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "Дай общий бизнес-обзор ООО Альтернатива Плюс за весь доступный период: входящие, исходящие, нетто, лучший год. Если срез слишком широкий, не выдумывай полный итог, а скажи как безопасно дособрать." + }, + { + "user_message": "Ок, тогда дособери конкретно 2020: входящие, исходящие и расчетное денежное нетто." + }, + { + "user_message": "Это можно считать прибылью за 2020 или нет? Ответь коротко и по делу." 
+ } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513104438_gen-ag05131044-cbe2ff.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513104438_gen-ag05131044-cbe2ff.json new file mode 100644 index 0000000..cae0d9b --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513104438_gen-ag05131044-cbe2ff.json @@ -0,0 +1,46 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131044-cbe2ff", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T10:44:38+00:00", + "generation_id": "gen-ag05131044-cbe2ff", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 101 inventory root scope without warehouse clarification replay", + "domain": "address_phase101_inventory_root_scope_no_warehouse_clarification", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 101 inventory root scope without warehouse clarification replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "приветик - че как там дела" + }, + { + "user_message": "расскажи что можешь интересного" + }, + { + "user_message": "кайф - что там на складе по остаткам?" + }, + { + "user_message": "АЛЬТЕРНАТИВА" + }, + { + "user_message": "а исторические остатки на другие даты умеешь?" 
+ }, + { + "user_message": "давай на июнь 2017" + }, + { + "user_message": "март 2016" + } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513112152_gen-ag05131121-8c41ab.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513112152_gen-ag05131121-8c41ab.json new file mode 100644 index 0000000..0524bf8 --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513112152_gen-ag05131121-8c41ab.json @@ -0,0 +1,43 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131121-8c41ab", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T11:21:52+00:00", + "generation_id": "gen-ag05131121-8c41ab", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 102 debt mirror clean-scope polarity replay", + "domain": "address_phase102_debt_mirror_clean_scope_polarity", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 102 debt mirror clean-scope polarity replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "Альтернатива Плюс" + }, + { + "user_message": "мы кому реально должны денег на сегодня? коротко, чистый долг к оплате, без встречных обеспечений как основного долга" + }, + { + "user_message": "а Комитету государственных услуг мы реально должны эти 3,6 млн или это встречное обеспечение/зачет? объясни именно по смыслу долга" + }, + { + "user_message": "а нам Комитет государственных услуг тоже должен 3,6 млн? это же та же сумма — скажи честно, это дебиторка или встречная часть?" + }, + { + "user_message": "тогда кто нам реально должен денег на сегодня? именно чистая дебиторка" + }, + { + "user_message": "а мы кому?" 
+ } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513120047_gen-ag05131200-0ed59a.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513120047_gen-ag05131200-0ed59a.json new file mode 100644 index 0000000..8628bb9 --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513120047_gen-ag05131200-0ed59a.json @@ -0,0 +1,43 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131200-0ed59a", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T12:00:47+00:00", + "generation_id": "gen-ag05131200-0ed59a", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 103 financial role and purpose arbitration replay", + "domain": "address_phase103_financial_role_purpose_arbitration", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 103 financial role and purpose arbitration replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "Альтернатива Плюс" + }, + { + "user_message": "По СБЕРБАНКУ за 2020 покажи коротко: сколько денег входило и уходило, и что это по смыслу в 1С — клиентская выручка, поставщик, комиссия, кредит или другой финансовый поток?" + }, + { + "user_message": "Если СБЕРБАНК есть во входящих поступлениях, можно ли считать его нашим клиентом и выручкой? Скажи по подтвержденным строкам, без притягивания." + }, + { + "user_message": "А если деньги уходили в СБЕРБАНК, это наш поставщик или финансовые списания? Раздели по смыслу и покажи основание." + }, + { + "user_message": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик." 
+ }, + { + "user_message": "А теперь отдельно по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?" + } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513122618_gen-ag05131226-630ddf.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513122618_gen-ag05131226-630ddf.json new file mode 100644 index 0000000..4a760d2 --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513122618_gen-ag05131226-630ddf.json @@ -0,0 +1,37 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131226-630ddf", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T12:26:18+00:00", + "generation_id": "gen-ag05131226-630ddf", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 104 generic role-tail anchor hygiene replay", + "domain": "address_phase104_generic_role_tail_anchor_hygiene", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 104 generic role-tail anchor hygiene replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "Альтернатива Плюс" + }, + { + "user_message": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик." + }, + { + "user_message": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам." + }, + { + "user_message": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?" 
+ } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513131237_gen-ag05131312-2d0445.json b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513131237_gen-ag05131312-2d0445.json new file mode 100644 index 0000000..e4c2087 --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_autogen_saved_user_sessions_20260513131237_gen-ag05131312-2d0445.json @@ -0,0 +1,64 @@ +{ + "suite_id": "assistant_saved_session_gen-ag05131312-2d0445", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_suite_v0_1", + "generated_at": "2026-05-13T13:12:37+00:00", + "generation_id": "gen-ag05131312-2d0445", + "mode": "saved_user_sessions", + "title": "AGENT | Phase 105 mixed schema/primitive closure replay", + "domain": "address_phase105_mixed_schema_primitive_closure", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "agent_saved_user_sessions", + "title": "AGENT | Phase 105 mixed schema/primitive closure replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "кайф - что там на складе по остаткам?" + }, + { + "user_message": "АЛЬТЕРНАТИВА" + }, + { + "user_message": "а исторические остатки на другие даты умеешь?" + }, + { + "user_message": "давай на июнь 2017" + }, + { + "user_message": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик." + }, + { + "user_message": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам." + }, + { + "user_message": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?" + }, + { + "user_message": "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?" 
+ }, + { + "user_message": "кому мы должны на конец 2020?" + }, + { + "user_message": "а нам кто должен на конец 2020?" + }, + { + "user_message": "сколько НДС надо заплатить в налоговую за декабрь 2019?" + }, + { + "user_message": "скока денег альтернатива заработала за 20 год?" + }, + { + "user_message": "а это чистая прибыль?" + } + ] + } + ] +} diff --git a/llm_normalizer/data/eval_cases/assistant_saved_session_runtime_job-3663W_Yhi1.json b/llm_normalizer/data/eval_cases/assistant_saved_session_runtime_job-3663W_Yhi1.json new file mode 100644 index 0000000..e60f7fd --- /dev/null +++ b/llm_normalizer/data/eval_cases/assistant_saved_session_runtime_job-3663W_Yhi1.json @@ -0,0 +1,60 @@ +{ + "suite_id": "assistant_saved_session_runtime_job-3663W_Yhi1", + "suite_version": "0.1.0", + "schema_version": "assistant_saved_session_runtime_v0_1", + "title": "AGENT | Phase 105 mixed schema/primitive closure replay", + "scenario_count": 1, + "case_ids": [ + "SAVED-001" + ], + "cases": [ + { + "case_id": "SAVED-001", + "scenario_tag": "saved_user_sessions_runtime", + "title": "AGENT | Phase 105 mixed schema/primitive closure replay", + "question_type": "followup", + "broadness_level": "medium", + "turns": [ + { + "user_message": "кайф - что там на складе по остаткам?" + }, + { + "user_message": "АЛЬТЕРНАТИВА" + }, + { + "user_message": "а исторические остатки на другие даты умеешь?" + }, + { + "user_message": "давай на июнь 2017" + }, + { + "user_message": "Теперь дай взрослый обзор за 2020 по компании: входящие, исходящие, нетто, топы, но банк в топах отдельно объясни как финансовый поток, если по назначению он не обычный клиент или поставщик." + }, + { + "user_message": "А отдельно по СБЕРБАНКУ: он для нас клиент, поставщик или финансовый поток? Дай коротко по подтвержденным строкам." + }, + { + "user_message": "А теперь по поставщику Группа СВК за 2020: сколько мы ему заплатили и какой общий денежный смысл?" 
+        },
+        {
+          "user_message": "А теперь по Группа СВК за 2020: сколько денег получили, сколько заплатили и какое нетто?"
+        },
+        {
+          "user_message": "кому мы должны на конец 2020?"
+        },
+        {
+          "user_message": "а нам кто должен на конец 2020?"
+        },
+        {
+          "user_message": "сколько НДС надо заплатить в налоговую за декабрь 2019?"
+        },
+        {
+          "user_message": "скока денег альтернатива заработала за 20 год?"
+        },
+        {
+          "user_message": "а это чистая прибыль?"
+        }
+      ]
+    }
+  ]
+}
\ No newline at end of file
diff --git a/llm_normalizer/data/eval_cases/assistant_saved_session_runtime_job-sRkV0B_p77.json b/llm_normalizer/data/eval_cases/assistant_saved_session_runtime_job-sRkV0B_p77.json
new file mode 100644
index 0000000..f7f35ae
--- /dev/null
+++ b/llm_normalizer/data/eval_cases/assistant_saved_session_runtime_job-sRkV0B_p77.json
@@ -0,0 +1,123 @@
+{
+  "suite_id": "assistant_saved_session_runtime_job-sRkV0B_p77",
+  "suite_version": "0.1.0",
+  "schema_version": "assistant_saved_session_runtime_v0_1",
+  "title": "БОЛЬШОЙ ОБЩИЙ Ручная сессия 16.04.2026, 21:26:06",
+  "scenario_count": 1,
+  "case_ids": [
+    "SAVED-001"
+  ],
+  "cases": [
+    {
+      "case_id": "SAVED-001",
+      "scenario_tag": "saved_user_sessions_runtime",
+      "title": "БОЛЬШОЙ ОБЩИЙ Ручная сессия 16.04.2026, 21:26:06",
+      "question_type": "followup",
+      "broadness_level": "medium",
+      "turns": [
+        {
+          "user_message": "приветик - че как там дела"
+        },
+        {
+          "user_message": "расскажи что можешь интересного"
+        },
+        {
+          "user_message": "кайф - что там на складе по остаткам?"
+        },
+        {
+          "user_message": "АЛЬТЕРНАТИВА"
+        },
+        {
+          "user_message": "а исторические остатки на другие даты умеешь?"
+        },
+        {
+          "user_message": "давай на июль 2017"
+        },
+        {
+          "user_message": "март 2016"
+        },
+        {
+          "user_message": "По выбранному объекту \"Рабочая станция универсального специалиста (индивидуальное изготовление)\": где взяли это?"
+        },
+        {
+          "user_message": "а кому продали?"
+        },
+        {
+          "user_message": "у тебя написано кто контрагент: рабочая станция - это ошибка?"
+        },
+        {
+          "user_message": "ндс можешь прикинуть на дату покупки рабочей станции?"
+        },
+        {
+          "user_message": "а какой ндс мы должны сгрузить на март 2020?"
+        },
+        {
+          "user_message": "прикинь какой ндс нам надо заплатить на февраль 2017"
+        },
+        {
+          "user_message": "кто у нас самый доходный клиент за все время"
+        },
+        {
+          "user_message": "кто нам должен денег на май 2017"
+        },
+        {
+          "user_message": "а какой ндс мы должны примерно заплатить за этот период?"
+        },
+        {
+          "user_message": "мы должны комуто денег на сегодня?"
+        },
+        {
+          "user_message": "а нам?"
+        },
+        {
+          "user_message": "какой у нас самый доходный год"
+        },
+        {
+          "user_message": "а за 2017 мы скок заработали?"
+        },
+        {
+          "user_message": "сколько вообще денег мы заработали за все время?"
+        },
+        {
+          "user_message": "ты умеешь считать дельту по договорам?"
+        },
+        {
+          "user_message": "по чепурнову покажи все доки"
+        },
+        {
+          "user_message": "а по свк"
+        },
+        {
+          "user_message": "а сейчас у нас есть что на складе?"
+        },
+        {
+          "user_message": "что нам отгружал чепурнов? какой товар или услугу?"
+        },
+        {
+          "user_message": "какие остатки на складе на сегодня"
+        },
+        {
+          "user_message": "остатки на март 2016"
+        },
+        {
+          "user_message": "хвосты покажи по счету 60 на август 2022"
+        },
+        {
+          "user_message": "Есть ли остатки товара, которые закупались очень давно"
+        },
+        {
+          "user_message": "Какие конкретно номенклатуры формируют остаток по складу на май 2020"
+        },
+        {
+          "user_message": "а по Альтернативе Плюс сколько лет активности в базе 1С?"
+        },
+        {
+          "user_message": "Как ты оценишь деятельность компании?"
+        },
+        {
+          "user_message": "какое нетто по деньгам с Группа СВК за 2020 год: сколько получили и сколько заплатили?"
+        }
+      ]
+    }
+  ]
+}
\ No newline at end of file