diff --git a/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md b/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md
index a54e230..745cfb1 100644
--- a/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md
+++ b/docs/ARCH/11 - architecture_turnaround/06 - phase_acceptance_matrix.md
@@ -27,7 +27,7 @@ Current reporting baseline:
 - Post-F Semantic Integrity Hardening: `99%`, operationally closed/regression gate.
 - Planner Autonomy Consolidation: `100%` for the declared phase83 slice.
 - Open-World Business Overview implementation breadth: `~99%` through Slice 25.
-- Active next pressure: `Open-World Semantic Control Gate`, accepted module progress `~96%` after the second local control-gate cut.
+- Active next pressure: `Open-World Semantic Control Gate`, accepted module progress `~98%` after the EHMO-derived critical subset accepted live; fat GUI pack review remains before full closure.

 ## Archived Execution Snapshot (2026-04-17)

diff --git a/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md b/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md
index d225371..054f590 100644
--- a/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md
+++ b/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md
@@ -17,8 +17,9 @@ It did not reopen Post-F and it did not prove that the Open-World implementation
 From this point forward:

 - `~99%` for Open-World means implementation breadth through `Business Overview Missing Proof Ledger`;
-- accepted module progress is `~96%` after the second local Semantic Control Gate cut, and remains below closure until the EHMO-derived subset is rerun;
-- the active work is control-gate hardening, not immediate expansion into more proof families.
+- accepted module progress is `~98%` after the EHMO-derived Semantic Control Gate subset accepted live at `21/21`;
+- the active work is finishing the control-gate closure surface, not immediate expansion into more proof families.
+- full `100%` is still held back until the fat manual GUI pack is rerun/reviewed or remaining rough answers are explicitly classified outside the declared contour.

 For the current execution spine, read `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`.

@@ -55,7 +56,7 @@ For the current execution spine, read `23 - current_execution_spine_and_semantic
 - Completed active slice: `Business Overview Missing Proof Ledger`: business overview now records machine-readable hard proof gaps for accounting profit/margin, due-date debt aging, inventory reserve/liquidation quality, and vendor/procurement quality, distinguishing proxy-only evidence from reviewed routes that are not wired yet.
 - Implementation breadth: `~99% (Open-World Bounded Autonomy Breadth through Slice 25)`.
 - Next active slice: `Open-World Semantic Control Gate`, covering garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape.
-- Active module progress: `~96% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`.
+- Active module progress: `~98% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`.

 ## Reporting Rule

@@ -63,15 +64,15 @@ Use these labels when reporting progress:

 - `Прогресс модуля: 99% (Post-F Semantic Integrity Hardening, operationally closed/regression gate)` when discussing the Post-F slice itself.
 - `Прогресс модуля: 100% (Planner Autonomy Consolidation, declared phase83 slice closed)` when discussing the planner-autonomy slice that was just completed.
-- `Прогресс модуля: 96% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)` while discussing current module closure after the second local Semantic Control Gate cut.
-- `Open-World Business Overview implementation breadth: ~99%, semantic acceptance gate still open` when discussing only the already wired Slice 25 breadth.
+- `Прогресс модуля: 98% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)` while discussing current module closure after the EHMO-derived critical subset accepted live.
+- `Open-World Business Overview implementation breadth: ~99%, Semantic Control Gate critical subset accepted, fat GUI pack still pending` when discussing only the already wired Slice 25 breadth.
 - `Прогресс модуля: X% (Open-World Bounded Autonomy Breadth, active slice: )` for later breadth work after the Semantic Control Gate is accepted.

 Do not report Post-F as `78%`, `87%`, or `92%`.

 Do not report Planner Autonomy as still open unless the discussion is about the next broader module, not the declared phase83 closure target.

-Do not report Open-World as simply `99% closed` until the EHMO-derived semantic control gate passes replay review.
+Do not report Open-World as simply `99% closed` until the fat manual GUI pack is rerun/reviewed or remaining residuals are explicitly classified.

 ## What Is Actually Closed

@@ -96,7 +97,7 @@ The project is not yet a universal arbitrary-1C agent.
 Remaining work belongs to the next breadth module:

-- close the `Open-World Semantic Control Gate` opened by `assistant-stage1-EHMOy3lNFt`;
+- finish closure of the `Open-World Semantic Control Gate` opened by `assistant-stage1-EHMOy3lNFt`; the EHMO-derived critical subset is accepted live, but the fat GUI pack and residual answer-shape roughness still need final review;
 - extend `business_overview` beyond money-flow/activity, customer and supplier concentration, document/account-section activity mix, counterparty role split, contract usage, yearly operating-flow dynamics, explicit profit/margin wording boundaries, explicit debt due-date wording boundaries, explicit inventory reserve/liquidation wording boundaries, explicit supplier/procurement-quality wording boundaries, explicit-period VAT/tax, as-of-date debt position, open-settlement concentration, contract-date debt age, debt staleness-risk proxy, as-of-date inventory position, trading-margin proxy, sales-to-stock inventory proxy, warehouse staleness-risk proxy, and the missing-proof ledger into separately proven exact accounting profit/margin, due-date debt aging/overdue, confirmed vendor-risk/procurement-quality analysis, and confirmed reserve/write-off/liquidation inventory evidence families;
 - broader dynamic schema traversal for unfamiliar 1C asks;
 - more primitive descriptors where live evidence proves a real gap;
diff --git a/docs/ARCH/11 - architecture_turnaround/22 - open_world_bounded_autonomy_breadth_2026-05-01.md b/docs/ARCH/11 - architecture_turnaround/22 - open_world_bounded_autonomy_breadth_2026-05-01.md
index 7bba8bb..f5bd975 100644
--- a/docs/ARCH/11 - architecture_turnaround/22 - open_world_bounded_autonomy_breadth_2026-05-01.md
+++ b/docs/ARCH/11 - architecture_turnaround/22 - open_world_bounded_autonomy_breadth_2026-05-01.md
@@ -779,7 +779,7 @@ Suggested first subset:
 Current status:

 - implementation breadth through Slice 25: `~99%`;
-- accepted Open-World module progress after the second local Semantic Control Gate cut: `~96%`;
+- accepted Open-World module progress after the EHMO-derived Semantic Control Gate subset accepted live: `~98%`;
 - exact P&L, real due-date debt aging, reserve/write-off/liquidation evidence, and vendor-risk engines stay queued behind this semantic gate.

 ### Slice 26 local cut 1 - anchor hygiene and overview continuation
@@ -814,8 +814,24 @@ Local validation is accepted for this cut:
 - `npm.cmd run build`: passed.
 - graphify rebuild after Slice 26 local cut 2: `6076 nodes`, `13247 edges`, `138 communities`.

+### Slice 26 local cut 3 - business-audit and executive-summary control
+
+Implemented now:
+
+- broad business-audit wording with noisy capability-meta phrases, such as "what can we say", stays in bounded business overview synthesis instead of falling into capability-list help;
+- final `executive summary` wording over the whole conversation is handled by deterministic memory synthesis with confirmed facts, proxy boundaries, missing evidence, and manual-check sections;
+- recap fact construction filters low-quality pseudo-counterparty anchors such as standalone service prepositions before they can leak into the final answer.
+
+Validation is accepted for this cut:
+
+- `npm.cmd test -- assistantMemoryRecapPolicy.test.ts assistantLivingChatRuntimeAdapter.test.ts assistantRoutePolicy.test.ts assistantLivingModePolicy.test.ts`: passed `54/54`.
+- `npm.cmd test -- assistantLivingRouter.test.ts assistantLivingChatMode.test.ts assistantAgentSemanticRunInventoryRegression.test.ts assistantTurnMeaningPolicy.test.ts`: passed `90/90`.
+- `npm.cmd run build`: passed.
+- `address_truth_harness_phase89_open_world_semantic_control_gate_ehmo_subset_live_fix4_20260505`: accepted `21/21`, `0` warnings, MCP live-readiness `ready`.
+- graphify rebuild after this cut: `6081 nodes`, `13263 edges`, `140 communities`.
+
 Remaining before acceptance:

-- rerun the EHMO-derived semantic subset;
-- continue W5 and any EHMO-revealed W3/W4/W6 gaps for counterparty/organization arbitration, remaining wrong-lane prevention, frame reset, and final-summary answer lane;
-- only then rerun the fat manual GUI pack for acceptance.
+- keep the EHMO-derived semantic subset as a regression gate for nearby edits;
+- review remaining W5/SVK answer-shape roughness for counterparty/organization arbitration after pivots;
+- rerun the fat manual GUI pack for final acceptance or explicitly classify residuals outside the declared contour.
diff --git a/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md b/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md
index 18c1c5b..a9edd5b 100644
--- a/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md
+++ b/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md
@@ -66,12 +66,12 @@ This gate is not a request to tune the assistant for every weird question in tha
 Current status should be reported as:

 - implementation breadth: `~99%` for Open-World Business Overview through Slice 25;
-- accepted module progress: `~96% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`.
+- accepted module progress: `~98% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`.

 This is not a regression from `99%` to `96%`. It is a metric split:

 - `99%` describes wired breadth;
-- `96%` describes closure confidence after the second local control-gate cut; the gate is still not accepted without EHMO-subset replay.
+- `98%` describes closure confidence after the EHMO-derived critical subset passed live replay; the gate is still not full module closure until the fat manual GUI pack and remaining answer-shape residuals are reviewed.

 ## Current Local Cut

@@ -88,7 +88,13 @@ Local cut 2 is implemented:
 - W6 starter: final next-step summary wording after business overview stays in the bounded overview lane and does not invent document/counterparty subjects.
 - Business-overview frame hygiene now suppresses stale follow-up counterparties when the current turn is a broad overview continuation.

-This is local evidence only. It improves the gate but does not close it until the EHMO-derived subset and the fat manual pack are semantically reviewed again.
+Local cut 3 is implemented:
+
+- W2/W3: broad business-audit wording such as "what can we say" now overrides noisy capability-meta cues and stays in the business-overview synthesis lane.
+- W6: final `executive summary` / "confirmed, proxy, missing evidence, manual checks" wording is handled as deterministic conversation memory synthesis instead of generic address lookup.
+- W1/W6 hygiene: low-quality recap counterparty anchors such as standalone service prepositions are suppressed before they can appear as `«для»`-style pseudo-counterparties in the final answer.
+
+The EHMO-derived critical subset is now live-accepted. The remaining gate pressure is the fat manual GUI pack and known residual answer-shape roughness around selected counterparty money/document/movement follow-ups.
 ## Failure Classes To Fix

@@ -127,9 +133,9 @@ Each work unit should add focused local tests and then be validated against the

 ## Acceptance Gate

-The current module can move from `~96%` toward closure only after:
+The current module can move from `~98%` toward closure only after:

-- the EHMO-derived critical subset is rerun and semantically reviewed;
+- the EHMO-derived critical subset remains accepted after future nearby edits;
 - old canaries remain green: Post-F, phase83, inventory selected-object, VAT continuity, SVK document/movement chains;
 - broad business overview still answers from confirmed/proxy/missing evidence rather than unsupported confidence;
 - no stale organization/counterparty/date/selected-object contamination appears in the reviewed subset;
@@ -155,20 +161,27 @@ Manual runtime run reviewed as the gate opener:
 - report: `llm_normalizer/reports/assistant-stage1-EHMOy3lNFt.md`
 - session: `llm_normalizer/data/assistant_sessions/assistant-stage1-EHMOy3lNFt-SAVED-001.json`

+Live EHMO-derived critical subset proof:
+
+- spec: `docs/orchestration/address_truth_harness_phase89_open_world_semantic_control_gate_ehmo_subset.json`
+- run: `artifacts/domain_runs/address_truth_harness_phase89_open_world_semantic_control_gate_ehmo_subset_live_fix4_20260505`
+- result: `accepted`, `21/21`, `0` warnings, MCP live-readiness `ready`
+- key covered repairs: business-audit synthesis no longer falls into capability help; final executive summary uses confirmed/proxy/missing/manual-check sections and filters pseudo-counterparty garbage.
+
 Graphify snapshot at this status cut:

-- `6076 nodes`
-- `13247 edges`
-- `138 communities`
+- `6081 nodes`
+- `13263 edges`
+- `140 communities`

 ## Reporting Rule

-Until the semantic control gate is accepted, use:
+Until the fat manual GUI pack is reviewed or residuals are explicitly classified, use:

-`Прогресс модуля: 96% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`
+`Прогресс модуля: 98% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`

 If discussing only the already wired business-overview breadth, say:

-`Open-World Business Overview implementation breadth: ~99%, semantic acceptance gate still open`
+`Open-World Business Overview implementation breadth: ~99%, Semantic Control Gate critical subset accepted, fat GUI pack still pending`

 Do not collapse those two statements into one number.
diff --git a/docs/ARCH/11 - architecture_turnaround/README.md b/docs/ARCH/11 - architecture_turnaround/README.md
index f5f9bef..3054d70 100644
--- a/docs/ARCH/11 - architecture_turnaround/README.md
+++ b/docs/ARCH/11 - architecture_turnaround/README.md
@@ -144,11 +144,11 @@ Current honest status:
 - bounded-autonomy foundation readiness: `~89%`
 - open-world bounded-autonomy readiness: `~87%`
 - active Open-World Bounded Autonomy Breadth implementation breadth: `~99%`, with business-overview evidence fusion, the reviewed `business_overview` catalog/data-need/planner route-fabric slice, the fresh multi-probe runtime bridge, the explicit-period VAT/tax fact-family bridge, the explicit-period debt-position bridge, the explicit-date inventory-position bridge, the open-settlement quality bridge accepted by live semantic replay, selected-item profitability bridged by local semantic/runtime regression tests, contract-date debt age bridged locally, debt staleness-risk proxy bridged locally, debt due-date boundary arbitration bridged locally, inventory reserve/liquidation boundary arbitration bridged locally,
supplier/procurement-quality boundary arbitration bridged locally, supplier concentration proxy bridged locally, document/account-section activity profile bridged locally, counterparty population/roles and contract usage profiles bridged locally, yearly operating-flow proxy bridged locally, earnings/best-year wording arbitration bridged locally, profit/margin wording boundary arbitration bridged locally, analyst synthesis added to business-overview answer drafting, company-period trading margin proxy bridged locally, inventory sales-to-stock proxy bridged locally, inventory staleness-risk proxy bridged locally, gap-specific answer shaping bridged locally, and missing proof families recorded as runtime evidence ledger; exact accounting profit/margin, true due-date debt aging/overdue, confirmed vendor-risk/procurement-quality analysis, and confirmed reserve/write-off/liquidation inventory evidence are still pending
-- active Open-World Bounded Autonomy Breadth accepted-module progress: `~96%`, because local `Open-World Semantic Control Gate` cuts now cover anchor hygiene, overview continuation, debt-position intent dominance, metadata topic-switch frame reset, and final-summary continuation, but EHMO-subset replay is still pending
+- active Open-World Bounded Autonomy Breadth accepted-module progress: `~98%`, because the EHMO-derived `Open-World Semantic Control Gate` critical subset now accepts live at `21/21`; full closure is still held back for the fat manual GUI pack and remaining answer-shape residual review
 - Post-F semantic integrity module progress: `~99%` operationally closed, with remaining risk now treated as next-slice discovery rather than an open blocker inside the closed slice
 - active inventory-stock breadth slice progress: `100%` for the declared scenario pack, not for arbitrary inventory questions
 - Planner Autonomy Consolidation progress: `100%` for the declared module, with catalog-fabric, value-flow arbitration, lifecycle bounded inference,
broad-evaluation bridge, inventory catalog templates, inventory runtime-boundary honesty, exact inventory recipe bridging, unambiguous metadata-surface lane inference, catalog chain-template scoring, structured chain-match contract exposure, runtime/debug propagation, subject-aware bidirectional comparison arbitration, structured catalog-alignment verdicts, representative alignment regression guard, catalog-alignment reason-code telemetry, explicit `alignment_status` propagation, truth-harness/acceptance-matrix surfacing, soft divergence warning, `catalog_alignment_ok` acceptance invariant, step-level expected catalog-alignment assertions, phase66 and phase32 spec alignment expectations, AGENT source-catalog surfacing, generated phase83 mixed planner-brain replay spec, checked-source user-facing error sanitation, surface-grounded catalog promotion, and guarded live phase83 acceptance validated. Broader unfamiliar 1C asks are now next-module breadth work rather than an open blocker inside this declared slice
-- graph snapshot after latest rebuild: `6076 nodes`, `13247 edges`, `138 communities`
+- graph snapshot after latest rebuild: `6081 nodes`, `13263 edges`, `140 communities`
 - current regression-gate breakpoint:
   - the validated hot paths are no longer structurally broken;
   - flagship continuity collapse is no longer the primary risk;
@@ -177,6 +177,7 @@ Latest live proof now includes:
 - `address_truth_harness_phase82_human_mixed_integrity_status_dialog_post_f_account_injection_guard_clean_scope` accepted `19/19`, with the `Жуковке 51` numeric counterparty suffix kept as counterparty scope instead of leaking as account `51`
 - `address_truth_harness_post_f_cross_stage_canary_agent_20260424_live7` accepted `24/24`, proving a saved cross-stage AGENT canary across VAT metadata, metadata-scoped organization/document pivots, numeric counterparty suffixes, open-organization value-flow clarification, ranked value-flow year switches, and SVK grounded reset; the saved
autorun is `AGENT | Post-F cross-stage semantic integrity canary` (`gen-ag04241406-abe4d8`)
 - `address_truth_harness_post_f_manual_failures_20260424_live3` accepted `11/11`, proving the manual failure slice from `assistant-stage1-9liEOh-7JP`: VAT purchase-date, VAT February 2017, highest-value customer, and Chepurnov item-flow after stale inventory context; the saved autorun is `AGENT | Post-F ручные провалы VAT revenue item-flow live3` (`gen-ag04241710-bdb248`)
+- `address_truth_harness_phase89_open_world_semantic_control_gate_ehmo_subset_live_fix4_20260505` accepted `21/21`, proving the EHMO-derived Semantic Control Gate subset after business-audit lane repair, final executive-summary memory synthesis, and pseudo-counterparty recap filtering
 - `address_truth_harness_phase11_manual_followup_meta_quality_live_rerun_vatfix` accepted `10/10`
 - `address_truth_harness_phase20_continuity_stabilization_live_rerun_vatfix` accepted `6/6`
 - `addressQueryRuntimeM23.test.ts` full semantic/runtime slice accepted `403/403` after Post-F VAT/date-basis, scope-recovery, open value-flow organization clarification, document-vs-bank arbitration, and reply-shape hardening
@@ -203,7 +204,7 @@ Latest live proof now includes:
 - business-overview supplier concentration proxy accepted locally: targeted executor/answer-adapter slice passed `66/66` with `1` skipped; M23 route/runtime regression passed `412/412`; build passed; graphify rebuilt to `6041 nodes`, `13162 edges`, `136 communities`; the proxy ranks confirmed outgoing payment counterparties while vendor risk, procurement quality, and full expense structure remain unclaimed
 - business-overview yearly operating-flow proxy accepted locally: targeted executor/answer-adapter slice passed `66/66` with `1` skipped; M23 route/runtime regression passed `412/412`; build passed; graphify rebuilt to `6047 nodes`, `13177 edges`, `139 communities`; the proxy builds annual incoming/outgoing/net buckets from confirmed money-flow rows while profit,
финрезультат, and full P&L remain unclaimed
 - business-overview missing proof ledger accepted locally: targeted executor/answer-adapter slice passed `66/66` with `1` skipped; M23 route/runtime regression passed `416/416`; build passed; graphify count is recorded in the current graph snapshot; hard remaining proof gaps are now visible as machine-readable `missing_proof_families` rather than only prose warnings
-- semantic control gate local cuts accepted locally: adapter/resolver focused regressions passed `98/98` with `6` skipped; broader address counterparty/M23/adapter/resolver slice passed `519/519` with `6` skipped; build passed; graphify rebuilt to `6076 nodes`, `13247 edges`, `138 communities`; EHMO-derived semantic replay remains the acceptance gate
+- semantic control gate critical subset accepted live: focused W2/W3/W6 regressions passed `54/54`; broader living/router semantic slice passed `90/90`; build passed; EHMO-derived replay `address_truth_harness_phase89_open_world_semantic_control_gate_ehmo_subset_live_fix4_20260505` accepted `21/21` with `0` warnings; graphify rebuilt to `6081 nodes`, `13263 edges`, `140 communities`; fat manual GUI pack remains the closure check
 - business-overview earnings wording arbitration accepted locally: turn-meaning/turn-input slice passed `85/85` with `6` skipped; M23 route/runtime regression passed `412/412`; runtime-entry/pilot/answer slice passed `85/85` with `3` skipped; build passed; graphify rebuilt to `6052 nodes`, `13187 edges`, `138 communities`; organization-level earnings/best-year wording now reaches `business_overview` while explicit customer/counterparty ranking remains in exact customer value routes
 - inventory template lift accepted locally: catalog/data-need/planner/turn-input slice passed `139/139` with `6` skipped; full MCP-discovery slice passed `276/276` with `9` skipped; build passed; graphify stayed at `5912 nodes`, `12833 edges`, `138 communities`
 - inventory runtime-boundary hardening accepted
locally: runtime-bridge/answer-adapter/pilot-executor slice passed `68/68` with `1` skipped; full MCP-discovery slice passed `277/277` with `9` skipped; build passed; graphify rebuilt to `5913 nodes`, `12837 edges`, `138 communities`
diff --git a/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js b/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js
index 067601e..5020dd5 100644
--- a/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js
+++ b/llm_normalizer/backend/dist/services/assistantLivingChatRuntimeAdapter.js
@@ -14,6 +14,14 @@ function hasPriorAssistantTurn(items) {
 function buildDeterministicSmalltalkLeadReply() {
     return "\u041f\u0440\u0438\u0432\u0435\u0442! \u0412\u0441\u0451 \u043d\u043e\u0440\u043c\u0430\u043b\u044c\u043d\u043e.";
 }
+function hasConversationExecutiveSummarySignal(value) {
+    const normalized = String(value ?? "")
+        .toLowerCase()
+        .replace(/\u0451/gu, "\u0435")
+        .replace(/\s+/g, " ")
+        .trim();
+    return /(?:executive\s+summary|финальн\w*\s+собери|итогов\w*\s+(?:резюм|summary|вывод)|по\s+всему\s+диалогу|где\s+ответы\s+были\s+подтвержден|где\s+proxy|где\s+прокси|не\s+хватил\w*\s+доказательств|ручн\w*\s+(?:смотр|провер|контрол))/iu.test(normalized);
+}
 function asRecord(value) {
     return value && typeof value === "object" && !Array.isArray(value) ? value : null;
 }
@@ -169,15 +177,25 @@ async function runAssistantLivingChatRuntime(input) {
         livingChatSource = "deterministic_inventory_history_capability_contract";
     }
     else if (contextualMemoryRecapFollowup) {
-        const scopedOrganization = selectedOrganization ?? activeOrganization ?? null;
-        chatText = (0, assistantMemoryRecapPolicy_1.buildAddressMemoryRecapReply)({
-            organization: scopedOrganization,
-            addressDebug: lastMemoryAddressDebug,
-            sessionItems: input.sessionItems,
-            toNonEmptyString: input.toNonEmptyString
-        });
+        const scopedOrganization = selectedOrganization ?? activeOrganization ?? continuityActiveOrganization ??
null;
+        const executiveSummaryFollowup = hasConversationExecutiveSummarySignal(userMessage);
+        chatText = executiveSummaryFollowup
+            ? (0, assistantMemoryRecapPolicy_1.buildConversationExecutiveSummaryReply)({
+                organization: scopedOrganization,
+                addressDebug: lastMemoryAddressDebug,
+                sessionItems: input.sessionItems,
+                toNonEmptyString: input.toNonEmptyString
+            })
+            : (0, assistantMemoryRecapPolicy_1.buildAddressMemoryRecapReply)({
+                organization: scopedOrganization,
+                addressDebug: lastMemoryAddressDebug,
+                sessionItems: input.sessionItems,
+                toNonEmptyString: input.toNonEmptyString
+            });
         activeOrganization = scopedOrganization ?? activeOrganization;
-        livingChatSource = "deterministic_memory_recap_contract";
+        livingChatSource = executiveSummaryFollowup
+            ? "deterministic_conversation_executive_summary_contract"
+            : "deterministic_memory_recap_contract";
     }
     else if (contextualAnswerInspectionFollowup) {
         chatText = (0, assistantMemoryRecapPolicy_1.buildSelectedObjectAnswerInspectionReply)({
diff --git a/llm_normalizer/backend/dist/services/assistantLivingModePolicy.js b/llm_normalizer/backend/dist/services/assistantLivingModePolicy.js
index e795386..b11b108 100644
--- a/llm_normalizer/backend/dist/services/assistantLivingModePolicy.js
+++ b/llm_normalizer/backend/dist/services/assistantLivingModePolicy.js
@@ -119,6 +119,9 @@ function createAssistantLivingModePolicy(deps) {
         }
         return true;
     }
+    function hasConversationExecutiveSummarySignal(sample) {
+        return /(?:executive\s+summary|финальн\w*\s+собери|итогов\w*\s+(?:резюм|summary|вывод)|по\s+всему\s+диалогу|где\s+ответы\s+были\s+подтвержден|где\s+proxy|где\s+прокси|не\s+хватил\w*\s+доказательств|ручн\w*\s+(?:смотр|провер|контрол))/iu.test(sample);
+    }
     function hasConversationMemoryRecallFollowupSignal(userMessage) {
         const rawText = compactWhitespace(String(userMessage ?? "").toLowerCase());
         const repairedText = compactWhitespace(repairAddressMojibake(String(userMessage ?? "")).toLowerCase());
@@ -130,7 +133,8 @@ function createAssistantLivingModePolicy(deps) {
         }
         const hasMemoryCue = samples.some((sample) => /(?:помни(?:шь|те|м)?|remember|recall)/iu.test(sample));
         const hasDiscussionCue = samples.some((sample) => /(?:обсуждал[аи]?|говорил[аи]?|смотрел[аи]?|разбирал[аи]?|спрашивал[аи]?)/iu.test(sample));
-        const hasExplicitRecapPrompt = samples.some((sample) => /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|напомни\s+что\s+мы|what\s+we\s+already\s+(?:discussed|figured\s+out))/iu.test(sample));
+        const hasExplicitRecapPrompt = samples.some((sample) => /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|напомни\s+что\s+мы|what\s+we\s+already\s+(?:discussed|figured\s+out))/iu.test(sample) ||
+            hasConversationExecutiveSummarySignal(sample));
         if (!(hasExplicitRecapPrompt || (hasMemoryCue && hasDiscussionCue))) {
             return false;
         }
diff --git a/llm_normalizer/backend/dist/services/assistantMemoryRecapPolicy.js b/llm_normalizer/backend/dist/services/assistantMemoryRecapPolicy.js
index eff45a9..9b3bddc 100644
--- a/llm_normalizer/backend/dist/services/assistantMemoryRecapPolicy.js
+++ b/llm_normalizer/backend/dist/services/assistantMemoryRecapPolicy.js
@@ -4,6 +4,7 @@ Object.defineProperty(exports, "__esModule", { value: true });
 exports.buildInventoryHistoryCapabilityFollowupReply = buildInventoryHistoryCapabilityFollowupReply;
 exports.buildAddressMemoryRecapReply = buildAddressMemoryRecapReply;
 exports.buildBroadBusinessEvaluationReply = buildBroadBusinessEvaluationReply;
+exports.buildConversationExecutiveSummaryReply = buildConversationExecutiveSummaryReply;
 exports.buildSelectedObjectAnswerInspectionReply = buildSelectedObjectAnswerInspectionReply;
 exports.resolveAssistantLivingChatMemoryContext = resolveAssistantLivingChatMemoryContext;
 exports.createAssistantMemoryRecapPolicy = createAssistantMemoryRecapPolicy;
@@ -168,7 +169,7 @@ function hasSignalAcrossSamples(samples, detector) {
     return samples.some((sample) =>
detector(sample));
 }
 function hasExplicitRecapPromptSignal(samples) {
-    return samples.some((sample) => /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|что\s+уже\s+поняли|напомни\s+что\s+мы)/iu.test(sample));
+    return samples.some((sample) => /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|что\s+уже\s+поняли|напомни\s+что\s+мы|executive\s+summary|финальн\w*\s+собери|итогов\w*\s+(?:резюм|summary|вывод)|по\s+всему\s+диалогу|где\s+ответы\s+были\s+подтвержден|где\s+proxy|где\s+прокси|не\s+хватил\w*\s+доказательств|ручн\w*\s+(?:смотр|провер|контрол))/iu.test(sample));
 }
 function buildInventoryHistoryCapabilityFollowupReply(input) {
     const contextFacts = (0, assistantContinuityPolicy_1.resolveAddressDebugContextFacts)(input.addressDebug, input.toNonEmptyString);
@@ -196,12 +197,45 @@ function normalizeRecapIdentity(value) {
         .replace(/[«»"'`]/g, "")
         .replace(/\s+/g, " ");
 }
+function isLowQualityRecapCounterparty(value) {
+    const normalized = normalizeRecapIdentity(value);
+    if (!normalized) {
+        return false;
+    }
+    const stopwordOnlyCounterparty = new Set([
+        "без",
+        "в",
+        "во",
+        "для",
+        "до",
+        "за",
+        "из",
+        "к",
+        "ко",
+        "на",
+        "от",
+        "по",
+        "с",
+        "со",
+        "у"
+    ]);
+    if (stopwordOnlyCounterparty.has(normalized)) {
+        return true;
+    }
+    if (/^(?:и\s+)?(?:кто|что|где|какой|какие)\b/iu.test(normalized) ||
+        /(?:главн|основн|крупн|поставщик|клиент|контрагент|покупател|документ|движени|операци)/iu.test(normalized) &&
+        !/(? `- ${ensureSentence(fact)}`), ...(businessEvidence.confirmedLines.length > 0
@@ -533,6 +567,69 @@ function buildBroadBusinessEvaluationReply(input) {
         "Если хочешь, я быстро доберу основу для такой оценки: денежный поток, дебиторка/кредиторка, НДС или ключевые контрагенты."
     ].join(" ");
 }
+function buildConversationExecutiveSummaryReply(input) {
+    const contextFacts = (0, assistantContinuityPolicy_1.resolveAddressDebugContextFacts)(input.addressDebug, input.toNonEmptyString);
+    const organization = input.organization ??
contextFacts.organization;
+    const organizationPart = organization ? ` по компании «${organization}»` : "";
+    const recapFacts = collectRecentRecapFacts({
+        sessionItems: input.sessionItems,
+        item: null,
+        organization,
+        toNonEmptyString: input.toNonEmptyString,
+        limit: 8
+    });
+    const businessEvidence = collectBusinessEvaluationEvidence({
+        sessionItems: input.sessionItems,
+        organization,
+        toNonEmptyString: input.toNonEmptyString
+    });
+    const confirmedLines = [];
+    for (const fact of recapFacts) {
+        pushBusinessLine(confirmedLines, ensureSentence(fact));
+    }
+    for (const fact of businessEvidence.confirmedLines) {
+        pushBusinessLine(confirmedLines, ensureSentence(fact));
+    }
+    const proxyLines = [];
+    for (const line of businessEvidence.interpretationLines) {
+        pushBusinessLine(proxyLines, ensureSentence(line));
+    }
+    if (businessEvidence.hasNet) {
+        pushBusinessLine(proxyLines, "Нетто по деньгам можно использовать как cash-flow proxy, но это не бухгалтерская прибыль и не маржа.");
+    }
+    if (businessEvidence.hasRanking) {
+        pushBusinessLine(proxyLines, "Крупнейшие клиенты/контрагенты видны как операционный сигнал концентрации, но это не полноценный CRM-аудит.");
+    }
+    const missingLines = [
+        "Чистая прибыль, финрезультат и маржа не доказаны без отдельной проверки себестоимости, расходов и закрытия периода.",
+        "Просрочка, качество долга и due-date aging не доказаны без сроков оплаты и отдельного долгового контура.",
+        "Ликвидность склада, резервы, списания и устаревание нельзя считать подтвержденными без специальных складских доказательств.",
+        "Vendor-risk, качество закупок и юридическая надежность контрагентов остаются вне подтвержденного контура."
+ ]; + const manualLines = [ + "Сверить ОСВ/финрезультат и управленческие расходы за ключевые годы.", + "Отдельно проверить старые открытые расчеты, крупные долги и документы закрытия.", + "Посмотреть концентрацию клиентов и поставщиков не только по сумме, но и по доле в периоде.", + "Сверить НДС, склад и договоры там, где ответ опирался на proxy, а не на прямой учетный факт." + ]; + const confirmedSection = confirmedLines.length > 0 + ? confirmedLines.slice(0, 10).map((line) => `- ${line}`) + : ["- Есть grounded-контекст диалога, но для строгого executive summary не хватает подтвержденных метрик в текущем окне."]; + const proxySection = proxyLines.length > 0 + ? proxyLines.slice(0, 6).map((line) => `- ${line}`) + : ["- Proxy-выводы пока слабые: можно говорить только о направлении проверки, не о зрелой оценке бизнеса."]; + return [ + `Executive summary${organizationPart}: по диалогу уже можно собрать рабочую карту подтвержденного, proxy и ручного контроля, но не стоит выдавать это за полный аудит компании.`, + "Подтверждено по данным/ответам 1С:", + ...confirmedSection, + "Proxy и осторожная аналитика:", + ...proxySection, + "Где не хватило доказательств:", + ...missingLines.map((line) => `- ${line}`), + "Что директору смотреть руками в первую очередь:", + ...manualLines.map((line) => `- ${line}`) + ].join("\n"); +} function buildSelectedObjectAnswerInspectionReply(input) { const contextFacts = (0, assistantContinuityPolicy_1.resolveAddressDebugContextFacts)(input.addressDebug, input.toNonEmptyString); const itemLabel = contextFacts.item ?? 
"эта позиция"; diff --git a/llm_normalizer/backend/dist/services/assistantRoutePolicy.js b/llm_normalizer/backend/dist/services/assistantRoutePolicy.js index a9b1115..cf9bbcc 100644 --- a/llm_normalizer/backend/dist/services/assistantRoutePolicy.js +++ b/llm_normalizer/backend/dist/services/assistantRoutePolicy.js @@ -578,6 +578,13 @@ function createAssistantRoutePolicy(deps) { hasOrganizationFactFollowupSignal(repairedRawUserMessage, sessionItems ?? []) || hasOrganizationFactFollowupSignal(effectiveAddressUserMessage, sessionItems ?? []) || hasOrganizationFactFollowupSignal(repairedEffectiveAddressUserMessage, sessionItems ?? []); + const broadBusinessMeaningBoundary = Boolean(assistantTurnMeaning?.unsupported_but_understood_family === "broad_business_evaluation" && + assistantTurnMeaning?.stale_replay_forbidden === true && + !turnMeaningIntentCandidate && + !dataScopeMetaQuery && + !dangerOrCoercionSignal && + !groundedValueFlowFollowupContextDetected && + !organizationClarificationContinuationDetected); const hardMetaMode = resolveHardMetaMode({ dataScopeMetaQuery, capabilityMetaQuery, @@ -613,6 +620,41 @@ function createAssistantRoutePolicy(deps) { }; } if (hardMetaMode === "capability") { + if (broadBusinessMeaningBoundary) { + return { + runAddressLane: false, + toolGateDecision: "skip_address_lane", + toolGateReason: "unsupported_current_turn_meaning_boundary", + livingMode: "chat", + livingReason: "unsupported_current_turn_meaning_boundary", + orchestrationContract: { + schema_version: "assistant_orchestration_contract_v1", + hard_meta_mode: null, + provider_execution: providerExecution, + assistant_turn_meaning: assistantTurnMeaning, + address_mode: resolvedModeDetection.mode, + address_mode_confidence: resolvedModeDetection.confidence, + address_intent: resolvedIntentResolution.intent, + address_intent_confidence: resolvedIntentResolution.confidence, + strong_data_signal_detected: strongDataSignal, + data_retrieval_signal_detected: dataRetrievalSignal, + 
followup_context_detected: Boolean(followupContext), + unsupported_current_turn_meaning_boundary: true, + unsupported_current_turn_family: assistantTurnMeaning.unsupported_but_understood_family, + unsupported_address_intent_fallback_to_deep: false, + final_decision: { + run_address_lane: false, + tool_gate_decision: "skip_address_lane", + tool_gate_reason: "unsupported_current_turn_meaning_boundary", + living_mode: "chat", + living_reason: "unsupported_current_turn_meaning_boundary" + }, + reason_codes: [ + "business_overview_meaning_overrides_capability_meta_noise" + ] + } + }; + } if (contextualHistoricalCapabilityFollowupDetected) { return { runAddressLane: false, @@ -699,7 +741,7 @@ function createAssistantRoutePolicy(deps) { } }; } - const unsupportedCurrentTurnMeaningBoundary = Boolean(assistantTurnMeaning?.unsupported_but_understood_family && + const unsupportedCurrentTurnMeaningBoundary = Boolean((broadBusinessMeaningBoundary || assistantTurnMeaning?.unsupported_but_understood_family) && assistantTurnMeaning?.stale_replay_forbidden === true && !turnMeaningIntentCandidate && !aggregateBusinessAnalyticsSignal && diff --git a/llm_normalizer/backend/src/services/assistantLivingChatRuntimeAdapter.ts b/llm_normalizer/backend/src/services/assistantLivingChatRuntimeAdapter.ts index 3df6244..22cc251 100644 --- a/llm_normalizer/backend/src/services/assistantLivingChatRuntimeAdapter.ts +++ b/llm_normalizer/backend/src/services/assistantLivingChatRuntimeAdapter.ts @@ -1,6 +1,7 @@ import { buildAddressMemoryRecapReply as buildAddressMemoryRecapReplyFromPolicy, buildBroadBusinessEvaluationReply as buildBroadBusinessEvaluationReplyFromPolicy, + buildConversationExecutiveSummaryReply as buildConversationExecutiveSummaryReplyFromPolicy, buildSelectedObjectAnswerInspectionReply as buildSelectedObjectAnswerInspectionReplyFromPolicy, buildInventoryHistoryCapabilityFollowupReply as buildInventoryHistoryCapabilityFollowupReplyFromPolicy, resolveAssistantLivingChatMemoryContext 
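Review note on the recap and executive-summary regexes added above (`hasExplicitRecapPromptSignal`, `isLowQualityRecapCounterparty`): in JavaScript, `\w` and `\b` stay ASCII-based even with the `iu` flags, so alternatives such as `финальн\w*\s+собери` or a trailing `\b` after a Cyrillic word may not behave as the pattern suggests. The test phrase in this patch still matches because it also contains `executive summary` and `по всему диалогу`. The sketch below is a minimal illustration, not the patch's code: `matches` is a hypothetical helper, and the `[\p{L}\p{N}_]` replacements are one assumed way to get Unicode-aware behaviour.

```typescript
// Sketch only: `matches` is a hypothetical helper, not part of the patch.
// The RegExp constructor is used instead of a literal so the example does not
// depend on the compiler target accepting the `u` flag in regex literals.
function matches(source: string, text: string): boolean {
  return new RegExp(source, "iu").test(text);
}

// `\w` is ASCII-only under `iu`, so it cannot consume the Cyrillic ending "о"
// and the following `\s+` then fails to find whitespace.
const asciiWordClass = matches("финальн\\w*\\s+собери", "финально собери");

// Assumed alternative: a Unicode-aware class built from property escapes.
const unicodeWordClass = matches("финальн[\\p{L}\\p{N}_]*\\s+собери", "финально собери");

// `\b` is defined through `\w`, so there is no boundary between Cyrillic "о" and a space.
const asciiBoundary = matches("^(?:кто|что)\\b", "кто главный");

// Assumed alternative: a negative lookahead meaning "no letter, digit, or underscore follows".
const unicodeBoundary = matches("^(?:кто|что)(?![\\p{L}\\p{N}_])", "кто главный");
const unicodeBoundaryInsideWord = matches("^(?:кто|что)(?![\\p{L}\\p{N}_])", "ктото");

console.log({
  asciiWordClass,
  unicodeWordClass,
  asciiBoundary,
  unicodeBoundary,
  unicodeBoundaryInsideWord
});
```

If the Unicode-aware form is adopted, it would be worth keeping the pattern in one shared helper, since the same signal regex currently appears in the recap policy, the living-mode policy, and the runtime adapter.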
@@ -81,6 +82,17 @@ function buildDeterministicSmalltalkLeadReply(): string { return "\u041f\u0440\u0438\u0432\u0435\u0442! \u0412\u0441\u0451 \u043d\u043e\u0440\u043c\u0430\u043b\u044c\u043d\u043e."; } +function hasConversationExecutiveSummarySignal(value: unknown): boolean { + const normalized = String(value ?? "") + .toLowerCase() + .replace(/\u0451/gu, "\u0435") + .replace(/\s+/g, " ") + .trim(); + return /(?:executive\s+summary|финальн\w*\s+собери|итогов\w*\s+(?:резюм|summary|вывод)|по\s+всему\s+диалогу|где\s+ответы\s+были\s+подтвержден|где\s+proxy|где\s+прокси|не\s+хватил\w*\s+доказательств|ручн\w*\s+(?:смотр|провер|контрол))/iu.test( + normalized + ); +} + function asRecord(value: unknown): Record | null { return value && typeof value === "object" && !Array.isArray(value) ? (value as Record) : null; } @@ -246,15 +258,25 @@ export async function runAssistantLivingChatRuntime( activeOrganization = scopedOrganization ?? activeOrganization; livingChatSource = "deterministic_inventory_history_capability_contract"; } else if (contextualMemoryRecapFollowup) { - const scopedOrganization = selectedOrganization ?? activeOrganization ?? null; - chatText = buildAddressMemoryRecapReplyFromPolicy({ - organization: scopedOrganization, - addressDebug: lastMemoryAddressDebug, - sessionItems: input.sessionItems, - toNonEmptyString: input.toNonEmptyString - }); + const scopedOrganization = selectedOrganization ?? activeOrganization ?? continuityActiveOrganization ?? null; + const executiveSummaryFollowup = hasConversationExecutiveSummarySignal(userMessage); + chatText = executiveSummaryFollowup + ? 
buildConversationExecutiveSummaryReplyFromPolicy({ + organization: scopedOrganization, + addressDebug: lastMemoryAddressDebug, + sessionItems: input.sessionItems, + toNonEmptyString: input.toNonEmptyString + }) + : buildAddressMemoryRecapReplyFromPolicy({ + organization: scopedOrganization, + addressDebug: lastMemoryAddressDebug, + sessionItems: input.sessionItems, + toNonEmptyString: input.toNonEmptyString + }); activeOrganization = scopedOrganization ?? activeOrganization; - livingChatSource = "deterministic_memory_recap_contract"; + livingChatSource = executiveSummaryFollowup + ? "deterministic_conversation_executive_summary_contract" + : "deterministic_memory_recap_contract"; } else if (contextualAnswerInspectionFollowup) { chatText = buildSelectedObjectAnswerInspectionReplyFromPolicy({ addressDebug: lastAnswerInspectionAddressDebug, diff --git a/llm_normalizer/backend/src/services/assistantLivingModePolicy.ts b/llm_normalizer/backend/src/services/assistantLivingModePolicy.ts index 898cffb..c9ce125 100644 --- a/llm_normalizer/backend/src/services/assistantLivingModePolicy.ts +++ b/llm_normalizer/backend/src/services/assistantLivingModePolicy.ts @@ -185,6 +185,10 @@ export function createAssistantLivingModePolicy(deps: AssistantLivingModePolicyD return true; } + function hasConversationExecutiveSummarySignal(sample) { + return /(?:executive\s+summary|финальн\w*\s+собери|итогов\w*\s+(?:резюм|summary|вывод)|по\s+всему\s+диалогу|где\s+ответы\s+были\s+подтвержден|где\s+proxy|где\s+прокси|не\s+хватил\w*\s+доказательств|ручн\w*\s+(?:смотр|провер|контрол))/iu.test(sample); + } + function hasConversationMemoryRecallFollowupSignal(userMessage) { const rawText = compactWhitespace(String(userMessage ?? "").toLowerCase()); const repairedText = compactWhitespace(repairAddressMojibake(String(userMessage ?? 
"")).toLowerCase()); @@ -196,7 +200,8 @@ export function createAssistantLivingModePolicy(deps: AssistantLivingModePolicyD } const hasMemoryCue = samples.some((sample) => /(?:помни(?:шь|те|м)?|remember|recall)/iu.test(sample)); const hasDiscussionCue = samples.some((sample) => /(?:обсуждал[аи]?|говорил[аи]?|смотрел[аи]?|разбирал[аи]?|спрашивал[аи]?)/iu.test(sample)); - const hasExplicitRecapPrompt = samples.some((sample) => /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|напомни\s+что\s+мы|what\s+we\s+already\s+(?:discussed|figured\s+out))/iu.test(sample)); + const hasExplicitRecapPrompt = samples.some((sample) => /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|напомни\s+что\s+мы|what\s+we\s+already\s+(?:discussed|figured\s+out))/iu.test(sample) || + hasConversationExecutiveSummarySignal(sample)); if (!(hasExplicitRecapPrompt || (hasMemoryCue && hasDiscussionCue))) { return false; } diff --git a/llm_normalizer/backend/src/services/assistantMemoryRecapPolicy.ts b/llm_normalizer/backend/src/services/assistantMemoryRecapPolicy.ts index 53c594d..cf243ce 100644 --- a/llm_normalizer/backend/src/services/assistantMemoryRecapPolicy.ts +++ b/llm_normalizer/backend/src/services/assistantMemoryRecapPolicy.ts @@ -230,7 +230,7 @@ function hasSignalAcrossSamples( function hasExplicitRecapPromptSignal(samples: string[]): boolean { return samples.some((sample) => - /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|что\s+уже\s+поняли|напомни\s+что\s+мы)/iu.test( + /(?:что\s+мы\s+.*(?:обсуждали|выяснили)|что\s+уже\s+выяснили|что\s+уже\s+поняли|напомни\s+что\s+мы|executive\s+summary|финальн\w*\s+собери|итогов\w*\s+(?:резюм|summary|вывод)|по\s+всему\s+диалогу|где\s+ответы\s+были\s+подтвержден|где\s+proxy|где\s+прокси|не\s+хватил\w*\s+доказательств|ручн\w*\s+(?:смотр|провер|контрол))/iu.test( sample ) ); @@ -268,6 +268,41 @@ function normalizeRecapIdentity(value: unknown): string { .replace(/\s+/g, " "); } +function 
isLowQualityRecapCounterparty(value: string | null): boolean { + const normalized = normalizeRecapIdentity(value); + if (!normalized) { + return false; + } + const stopwordOnlyCounterparty = new Set([ + "без", + "в", + "во", + "для", + "до", + "за", + "из", + "к", + "ко", + "на", + "от", + "по", + "с", + "со", + "у" + ]); + if (stopwordOnlyCounterparty.has(normalized)) { + return true; + } + if ( + /^(?:и\s+)?(?:кто|что|где|какой|какие)\b/iu.test(normalized) || + /(?:главн|основн|крупн|поставщик|клиент|контрагент|покупател|документ|движени|операци)/iu.test(normalized) && + !/(? | null; item: string | null; @@ -276,9 +311,10 @@ function buildRecapFactLine(input: { }): string | null { const detectedIntent = String(input.debug?.detected_intent ?? ""); const scopedDate = resolveAddressDebugContextFacts(input.debug).scopedDate; + const counterparty = isLowQualityRecapCounterparty(input.counterparty) ? null : input.counterparty; const discoveryFact = buildDiscoveryRecapFactLine({ debug: input.debug, - counterparty: input.counterparty, + counterparty, organization: input.organization, scopedDate }); @@ -286,7 +322,7 @@ function buildRecapFactLine(input: { return discoveryFact; } const itemPart = input.item ? `по позиции «${input.item}»` : null; - const counterpartyPart = input.counterparty ? `по контрагенту «${input.counterparty}»` : null; + const counterpartyPart = counterparty ? `по контрагенту «${counterparty}»` : null; const organizationPart = input.organization ? `по компании «${input.organization}»` : null; const datePart = scopedDate ? ` на ${scopedDate}` : ""; if (detectedIntent === "inventory_on_hand_as_of_date") { @@ -673,7 +709,7 @@ export function buildBroadBusinessEvaluationReply(input: { : "- Прибыль, маржа и качество операционки пока не доказаны: нужны расходы, себестоимость и задолженность." 
]; return [ - `Коротко: по уже подтвержденным срезам 1С${organizationPart} компания выглядит операционно живой; это предварительная оценка бизнеса, а для взрослого вывода еще нужны прибыль, маржа и долги.`, + `Коротко: по уже подтвержденным срезам 1С${organizationPart} бизнес выглядит операционно живым; это предварительная оценка бизнеса, а для взрослого вывода еще нужны прибыль, маржа и долги.`, "Что уже видно:", ...recapFacts.map((fact) => `- ${ensureSentence(fact)}`), ...(businessEvidence.confirmedLines.length > 0 @@ -692,6 +728,84 @@ export function buildBroadBusinessEvaluationReply(input: { ].join(" "); } +export function buildConversationExecutiveSummaryReply(input: { + organization: string | null; + addressDebug: Record | null; + sessionItems?: unknown[]; + toNonEmptyString: (value: unknown) => string | null; +}): string { + const contextFacts = resolveAddressDebugContextFacts(input.addressDebug, input.toNonEmptyString); + const organization = input.organization ?? contextFacts.organization; + const organizationPart = organization ? 
` по компании «${organization}»` : ""; + const recapFacts = collectRecentRecapFacts({ + sessionItems: input.sessionItems, + item: null, + organization, + toNonEmptyString: input.toNonEmptyString, + limit: 8 + }); + const businessEvidence = collectBusinessEvaluationEvidence({ + sessionItems: input.sessionItems, + organization, + toNonEmptyString: input.toNonEmptyString + }); + const confirmedLines: string[] = []; + for (const fact of recapFacts) { + pushBusinessLine(confirmedLines, ensureSentence(fact)); + } + for (const fact of businessEvidence.confirmedLines) { + pushBusinessLine(confirmedLines, ensureSentence(fact)); + } + const proxyLines: string[] = []; + for (const line of businessEvidence.interpretationLines) { + pushBusinessLine(proxyLines, ensureSentence(line)); + } + if (businessEvidence.hasNet) { + pushBusinessLine( + proxyLines, + "Нетто по деньгам можно использовать как cash-flow proxy, но это не бухгалтерская прибыль и не маржа." + ); + } + if (businessEvidence.hasRanking) { + pushBusinessLine( + proxyLines, + "Крупнейшие клиенты/контрагенты видны как операционный сигнал концентрации, но это не полноценный CRM-аудит." + ); + } + const missingLines = [ + "Чистая прибыль, финрезультат и маржа не доказаны без отдельной проверки себестоимости, расходов и закрытия периода.", + "Просрочка, качество долга и due-date aging не доказаны без сроков оплаты и отдельного долгового контура.", + "Ликвидность склада, резервы, списания и устаревание нельзя считать подтвержденными без специальных складских доказательств.", + "Vendor-risk, качество закупок и юридическая надежность контрагентов остаются вне подтвержденного контура." 
+ ]; + const manualLines = [ + "Сверить ОСВ/финрезультат и управленческие расходы за ключевые годы.", + "Отдельно проверить старые открытые расчеты, крупные долги и документы закрытия.", + "Посмотреть концентрацию клиентов и поставщиков не только по сумме, но и по доле в периоде.", + "Сверить НДС, склад и договоры там, где ответ опирался на proxy, а не на прямой учетный факт." + ]; + const confirmedSection = + confirmedLines.length > 0 + ? confirmedLines.slice(0, 10).map((line) => `- ${line}`) + : ["- Есть grounded-контекст диалога, но для строгого executive summary не хватает подтвержденных метрик в текущем окне."]; + const proxySection = + proxyLines.length > 0 + ? proxyLines.slice(0, 6).map((line) => `- ${line}`) + : ["- Proxy-выводы пока слабые: можно говорить только о направлении проверки, не о зрелой оценке бизнеса."]; + + return [ + `Executive summary${organizationPart}: по диалогу уже можно собрать рабочую карту подтвержденного, proxy и ручного контроля, но не стоит выдавать это за полный аудит компании.`, + "Подтверждено по данным/ответам 1С:", + ...confirmedSection, + "Proxy и осторожная аналитика:", + ...proxySection, + "Где не хватило доказательств:", + ...missingLines.map((line) => `- ${line}`), + "Что директору смотреть руками в первую очередь:", + ...manualLines.map((line) => `- ${line}`) + ].join("\n"); +} + export function buildSelectedObjectAnswerInspectionReply(input: { addressDebug: Record | null; toNonEmptyString: (value: unknown) => string | null; diff --git a/llm_normalizer/backend/src/services/assistantRoutePolicy.ts b/llm_normalizer/backend/src/services/assistantRoutePolicy.ts index 5902e8e..7366d47 100644 --- a/llm_normalizer/backend/src/services/assistantRoutePolicy.ts +++ b/llm_normalizer/backend/src/services/assistantRoutePolicy.ts @@ -662,6 +662,14 @@ export function createAssistantRoutePolicy(deps) { hasOrganizationFactFollowupSignal(repairedRawUserMessage, sessionItems ?? 
[]) || hasOrganizationFactFollowupSignal(effectiveAddressUserMessage, sessionItems ?? []) || hasOrganizationFactFollowupSignal(repairedEffectiveAddressUserMessage, sessionItems ?? []); + const broadBusinessMeaningBoundary = Boolean( + assistantTurnMeaning?.unsupported_but_understood_family === "broad_business_evaluation" && + assistantTurnMeaning?.stale_replay_forbidden === true && + !turnMeaningIntentCandidate && + !dataScopeMetaQuery && + !dangerOrCoercionSignal && + !groundedValueFlowFollowupContextDetected && + !organizationClarificationContinuationDetected); const hardMetaMode = resolveHardMetaMode({ dataScopeMetaQuery, capabilityMetaQuery, @@ -697,6 +705,41 @@ export function createAssistantRoutePolicy(deps) { }; } if (hardMetaMode === "capability") { + if (broadBusinessMeaningBoundary) { + return { + runAddressLane: false, + toolGateDecision: "skip_address_lane", + toolGateReason: "unsupported_current_turn_meaning_boundary", + livingMode: "chat", + livingReason: "unsupported_current_turn_meaning_boundary", + orchestrationContract: { + schema_version: "assistant_orchestration_contract_v1", + hard_meta_mode: null, + provider_execution: providerExecution, + assistant_turn_meaning: assistantTurnMeaning, + address_mode: resolvedModeDetection.mode, + address_mode_confidence: resolvedModeDetection.confidence, + address_intent: resolvedIntentResolution.intent, + address_intent_confidence: resolvedIntentResolution.confidence, + strong_data_signal_detected: strongDataSignal, + data_retrieval_signal_detected: dataRetrievalSignal, + followup_context_detected: Boolean(followupContext), + unsupported_current_turn_meaning_boundary: true, + unsupported_current_turn_family: assistantTurnMeaning.unsupported_but_understood_family, + unsupported_address_intent_fallback_to_deep: false, + final_decision: { + run_address_lane: false, + tool_gate_decision: "skip_address_lane", + tool_gate_reason: "unsupported_current_turn_meaning_boundary", + living_mode: "chat", + living_reason: 
"unsupported_current_turn_meaning_boundary" + }, + reason_codes: [ + "business_overview_meaning_overrides_capability_meta_noise" + ] + } + }; + } if (contextualHistoricalCapabilityFollowupDetected) { return { runAddressLane: false, @@ -784,7 +827,7 @@ export function createAssistantRoutePolicy(deps) { }; } const unsupportedCurrentTurnMeaningBoundary = Boolean( - assistantTurnMeaning?.unsupported_but_understood_family && + (broadBusinessMeaningBoundary || assistantTurnMeaning?.unsupported_but_understood_family) && assistantTurnMeaning?.stale_replay_forbidden === true && !turnMeaningIntentCandidate && !aggregateBusinessAnalyticsSignal && diff --git a/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts b/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts index d12cb9e..b7c3b55 100644 --- a/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts +++ b/llm_normalizer/backend/tests/assistantLivingChatRuntimeAdapter.test.ts @@ -426,6 +426,20 @@ describe("assistant living chat runtime adapter", () => { userMessage: "а ты помнишь мы зеркало обсуждали?", modeDecision: { mode: "chat", reason: "memory_recap_followup_detected" }, sessionItems: [ + { + role: "assistant", + debug: { + execution_lane: "address_query", + answer_grounding_check: { + status: "grounded" + }, + detected_intent: "list_documents_by_counterparty", + extracted_filters: { + organization: "ООО Альтернатива Плюс", + counterparty: "для" + } + } + }, { role: "assistant", debug: { @@ -504,6 +518,81 @@ describe("assistant living chat runtime adapter", () => { expect(executeLlmChat).not.toHaveBeenCalled(); }); + it("builds executive summary from memory instead of running generic address lookup", async () => { + const executeLlmChat = vi.fn(async () => "raw-llm"); + const input = buildRuntimeInput({ + userMessage: + "Финально собери executive summary по всему диалогу: где подтверждено, где proxy и что смотреть руками.", + modeDecision: { mode: "chat", reason: 
"memory_recap_followup_detected" }, + sessionItems: [ + { + role: "assistant", + debug: { + execution_lane: "address_query", + answer_grounding_check: { + status: "grounded" + }, + detected_intent: "list_documents_by_counterparty", + extracted_filters: { + organization: "ООО Альтернатива Плюс", + counterparty: "для" + } + } + }, + { + role: "assistant", + debug: { + execution_lane: "living_chat", + mcp_discovery_response_applied: true, + assistant_mcp_discovery_entry_point_v1: { + schema_version: "assistant_mcp_discovery_runtime_entry_point_v1", + entry_status: "bridge_executed", + turn_input: { + turn_meaning_ref: { + explicit_organization_scope: "ООО Альтернатива Плюс", + explicit_date_scope: "2020" + } + }, + bridge: { + bridge_status: "answer_draft_ready", + business_fact_answer_allowed: true, + answer_draft: { + answer_mode: "confirmed_with_bounded_inference" + }, + pilot: { + pilot_scope: "counterparty_bidirectional_value_flow_query_movements_v1", + derived_bidirectional_value_flow: { + period_scope: "2020", + net_amount_human_ru: "3 865 501,50 руб.", + incoming_customer_revenue: { + total_amount_human_ru: "47 628 853,03 руб." + }, + outgoing_supplier_payout: { + total_amount_human_ru: "43 763 351,53 руб." 
+ } + } + } + } + } + } + } + ], + executeLlmChat + }); + + const output = await runAssistantLivingChatRuntime(input); + + expect(output.handled).toBe(true); + expect(output.chatText).toContain("Executive summary"); + expect(output.chatText).toContain("Подтверждено"); + expect(output.chatText).toContain("Proxy"); + expect(output.chatText).toContain("3 865 501,50"); + expect(output.chatText).toContain("директору смотреть руками"); + expect(output.chatText).not.toContain("«для»"); + expect(output.debug?.living_chat_response_source).toBe("deterministic_conversation_executive_summary_contract"); + expect(executeLlmChat).not.toHaveBeenCalled(); + }); + it("uses continuity-backed active organization for organization-fact boundary even when session scope is empty", async () => { const executeLlmChat = vi.fn(async () => "raw-llm"); const input = buildRuntimeInput({ diff --git a/llm_normalizer/backend/tests/assistantLivingModePolicy.test.ts b/llm_normalizer/backend/tests/assistantLivingModePolicy.test.ts index 4e9c7c7..f9e798a 100644 --- a/llm_normalizer/backend/tests/assistantLivingModePolicy.test.ts +++ b/llm_normalizer/backend/tests/assistantLivingModePolicy.test.ts @@ -50,6 +50,16 @@ describe("assistantLivingModePolicy", () => { expect(policy.hasConversationMemoryRecallFollowupSignal("а что мы уже выяснили по этой позиции?")).toBe(true); }); + it("detects final executive summary wording as memory signal", () => { + const policy = buildPolicy(); + + expect( + policy.hasConversationMemoryRecallFollowupSignal( + "Финально собери executive summary по всему диалогу: где подтверждено, где proxy и что смотреть руками" + ) + ).toBe(true); + }); + it("detects answer inspection wording for previous answer correction", () => { const policy = buildPolicy(); diff --git a/llm_normalizer/backend/tests/assistantMemoryRecapPolicy.test.ts b/llm_normalizer/backend/tests/assistantMemoryRecapPolicy.test.ts index 676b36a..fb76061 100644 --- 
a/llm_normalizer/backend/tests/assistantMemoryRecapPolicy.test.ts +++ b/llm_normalizer/backend/tests/assistantMemoryRecapPolicy.test.ts @@ -104,6 +104,42 @@ describe("assistantMemoryRecapPolicy", () => { expect(signals.contextualMemoryRecapFollowupDetected).toBe(true); }); + it("keeps final executive summary in memory lane even with strong data wording", () => { + const executivePolicy = createAssistantMemoryRecapPolicy({ + hasHistoricalCapabilityFollowupSignal: () => false, + hasConversationMemoryRecallFollowupSignal: (text: unknown) => /executive summary/i.test(String(text ?? "")), + isGroundedInventoryContextDebug: () => false + }); + + const signals = executivePolicy.resolveRouteMemorySignals({ + rawUserMessage: + "Финально собери executive summary по всему диалогу: где подтверждено, где proxy и где не хватило доказательств.", + repairedRawUserMessage: "", + effectiveAddressUserMessage: "", + repairedEffectiveAddressUserMessage: "", + dataScopeMetaQuery: false, + capabilityMetaQuery: false, + dataRetrievalSignal: false, + strongDataSignal: true, + aggregateBusinessAnalyticsSignal: false, + sessionItems: [ + { + role: "assistant", + debug: { + execution_lane: "address_query", + answer_grounding_check: { status: "grounded" }, + detected_intent: "receivables_confirmed_as_of_date", + extracted_filters: { + organization: "ООО Альтернатива Плюс" + } + } + } + ] + }); + + expect(signals.contextualMemoryRecapFollowupDetected).toBe(true); + }); + it("does not trigger recap from ungrounded address history", () => { const signals = policy.resolveRouteMemorySignals({ rawUserMessage: "а ты помнишь что мы обсуждали?", diff --git a/llm_normalizer/backend/tests/assistantRoutePolicy.test.ts b/llm_normalizer/backend/tests/assistantRoutePolicy.test.ts index 02c5153..99d1bea 100644 --- a/llm_normalizer/backend/tests/assistantRoutePolicy.test.ts +++ b/llm_normalizer/backend/tests/assistantRoutePolicy.test.ts @@ -683,6 +683,45 @@ describe("assistantRoutePolicy", () => { 
expect(decision.orchestrationContract?.unsupported_current_turn_family).toBe("broad_business_evaluation"); }); + it("lets broad business-audit meaning override noisy capability wording", () => { + const policy = buildPolicy({ + resolveMetaSignalSet: () => ({ + dataScopeMetaQuery: false, + capabilityMetaQuery: true, + metaAnswerFollowupSignal: false, + answerInspectionFollowupSignal: false + }), + resolveAssistantTurnMeaning: () => ({ + schema_version: "assistant_turn_meaning_v1", + asked_domain_family: "business_summary", + asked_action_family: "broad_evaluation", + explicit_intent_candidate: null, + unsupported_but_understood_family: "broad_business_evaluation", + stale_replay_forbidden: true, + reason_codes: ["broad_business_evaluation_current_turn_signal"] + }) + }); + + const decision = policy.resolveAssistantOrchestrationDecision({ + rawUserMessage: + "Собери это как нормальный бизнес-аудит: что уже можно сказать уверенно, что proxy и что директору проверить руками.", + effectiveAddressUserMessage: + "Собери это как нормальный бизнес-аудит: что уже можно сказать уверенно, что proxy и что директору проверить руками.", + followupContext: null, + llmPreDecomposeMeta: null, + useMock: false + }); + + expect(decision.runAddressLane).toBe(false); + expect(decision.toolGateReason).toBe("unsupported_current_turn_meaning_boundary"); + expect(decision.livingReason).toBe("unsupported_current_turn_meaning_boundary"); + expect(decision.orchestrationContract?.unsupported_current_turn_family).toBe("broad_business_evaluation"); + expect(decision.orchestrationContract?.hard_meta_mode).toBeNull(); + expect((decision.orchestrationContract as Record)?.reason_codes).toContain( + "business_overview_meaning_overrides_capability_meta_noise" + ); + }); + it("recovers an address route from current-turn meaning when L0 resolver is noisy", () => { const policy = buildPolicy({ resolveAddressToolGateDecision: undefined,