11 - Continuity Stabilization Plan (2026-04-17)
Purpose
This note defines the recovery plan for the current pre-expansion breakpoint.
The goal is not to patch individual failing prompts.
The goal is to finish the missing runtime authority that should govern mixed live sessions after the turnaround 11 owner extractions.
Current Reading
The strongest current evidence is:
- narrow and company-selected scenarios can pass end-to-end;
- mixed saved-session runtime still fails on root inventory, selected-object continuity, same-date restore, and cross-domain same-date pivot;
- therefore the architecture is not missing only routes;
- it is missing one governing continuity authority.
In one sentence:
- decision ownership became distributed faster than continuity ownership became explicit.
What This Plan Stabilizes
This plan is specifically about one system object:
assistant_session_continuity_v1
That object should become the shared authority for:
- active root frame
- active selected object
- active organization scope
- active date scope
- active clarification state
- active answer object / reusable bundle
- recap source of truth
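One way to picture this object is as a single typed record. The sketch below is an assumption, not the project's actual schema: every field name, the DateScope and ClarificationState shapes, and the emptyContinuity helper are hypothetical and only mirror the list above.

```typescript
// Hypothetical shape for assistant_session_continuity_v1.
// All field names are illustrative assumptions that mirror the list above.
interface DateScope {
  asOfDate?: string; // ISO date such as "2020-03-31"
  periodFrom?: string;
  periodTo?: string;
}

interface ClarificationState {
  question: string; // what the assistant asked
  resumable: boolean; // whether the pending question can be resumed later
}

interface SessionContinuity {
  schema: "assistant_session_continuity_v1";
  activeRootFrame: string | null;
  activeSelectedObject: string | null;
  activeOrganization: string | null;
  activeDateScope: DateScope | null;
  activeClarification: ClarificationState | null;
  activeAnswerObject: string | null; // id of the reusable answer bundle
  recapFacts: string[]; // verified facts only: the recap source of truth
}

// A fresh session starts with nothing grounded.
function emptyContinuity(): SessionContinuity {
  return {
    schema: "assistant_session_continuity_v1",
    activeRootFrame: null,
    activeSelectedObject: null,
    activeOrganization: null,
    activeDateScope: null,
    activeClarification: null,
    activeAnswerObject: null,
    recapFacts: [],
  };
}
```

Keeping every scope nullable makes "not grounded yet" an explicit state instead of something each owner has to infer.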
Target Runtime Rule
Before any of the following decisions are made:
- route arbitration
- company clarification
- selected-object follow-up routing
- same-date restore
- recap answer generation
the runtime must first resolve one continuity snapshot for the active session.
Those downstream owners may interpret the snapshot differently, but they must not reconstruct competing versions of the session state independently.
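The rule can be sketched as "resolve once, then share". This is a minimal illustration under stated assumptions: TurnSnapshot, DecisionOwner, and decideTurn are hypothetical names, and real owners would return richer decisions than a string label.

```typescript
// Hypothetical names throughout; this only illustrates "resolve once, share everywhere".
interface TurnSnapshot {
  readonly organization: string | null;
  readonly asOfDate: string | null;
  readonly selectedObject: string | null;
}

// A downstream owner reads the snapshot and returns its own decision label.
type DecisionOwner = (snapshot: TurnSnapshot) => string;

function decideTurn(
  resolveSnapshot: () => TurnSnapshot,
  owners: Record<string, DecisionOwner>,
): Record<string, string> {
  // Resolve exactly once per turn, then freeze so no owner can mutate shared state.
  const snapshot = Object.freeze(resolveSnapshot());
  const decisions: Record<string, string> = {};
  for (const name of Object.keys(owners)) {
    decisions[name] = owners[name](snapshot);
  }
  return decisions;
}
```

Each owner still interprets the snapshot in its own way, but none of them gets a chance to rebuild session state from a private history scan.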
Immediate Passes
Pass A. Install shared continuity snapshot
Scope:
- create one shared continuity resolver for session items and grounded address context;
- centralize extraction of active item, organization, date, root frame, and recap-eligible grounded context;
- stop allowing recap and adjacent meta logic to build over ungrounded or clarification-only state.
Exit condition:
- recap can only trigger over verified grounded address context;
- selected-object memory cannot be reconstructed from failed clarification turns;
- route and memory layers consume the same continuity snapshot.
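The exit condition for Pass A can be illustrated with a small resolver that only lets grounded answers contribute state. The item kinds and context fields below are assumptions made for the sketch; only the filtering rule comes from this plan.

```typescript
// Hypothetical item and context shapes; only the filtering rule comes from the plan.
type ItemKind = "grounded_answer" | "clarification" | "failed_clarification";

interface SessionItem {
  kind: ItemKind;
  organization?: string;
  selectedObject?: string;
  asOfDate?: string;
}

interface GroundedContext {
  organization: string | null;
  selectedObject: string | null;
  asOfDate: string | null;
  recapEligible: boolean;
}

function resolveGroundedContext(items: SessionItem[]): GroundedContext {
  const context: GroundedContext = {
    organization: null,
    selectedObject: null,
    asOfDate: null,
    recapEligible: false,
  };
  for (const item of items) {
    // Clarification-only and failed turns never contribute state.
    if (item.kind !== "grounded_answer") continue;
    context.recapEligible = true;
    // Later grounded items override earlier ones; missing fields keep the old value.
    context.organization = item.organization ?? context.organization;
    context.selectedObject = item.selectedObject ?? context.selectedObject;
    context.asOfDate = item.asOfDate ?? context.asOfDate;
  }
  return context;
}
```

With this shape, recap eligibility is a property of the resolved context, so recap logic does not need its own definition of "grounded".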
Pass B. Reduce clarification priority conflicts
Scope:
- move clarification behind restored continuity when the business frame is already sufficient;
- stop repeated company clarification from interrupting same-family continuation;
- make clarification state explicit and resumable instead of re-discovered ad hoc.
Exit condition:
- repeated clarification no longer appears after adjacent grounded business answers in the same thread;
- selected-object and same-date follow-ups stop falling into generic company templates.
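A minimal sketch of the intended precedence, assuming a hypothetical BusinessFrame and three possible next steps. What counts as a "sufficient" frame here (a known organization, or a selected object for a selected-object follow-up) is an illustrative assumption, not the runtime's actual definition.

```typescript
// Hypothetical frame and step names used only for this sketch.
interface BusinessFrame {
  organization: string | null;
  selectedObject: string | null;
}

type NextStep = "continue_route" | "resume_clarification" | "ask_company";

function nextStep(
  frame: BusinessFrame,
  targetsSelectedObject: boolean,
  hasPendingClarification: boolean,
): NextStep {
  // Restored continuity is checked first: a sufficient frame wins over any clarification.
  const frameIsSufficient =
    frame.organization !== null ||
    (targetsSelectedObject && frame.selectedObject !== null);
  if (frameIsSufficient) return "continue_route";
  // Otherwise resume the explicit pending clarification instead of rediscovering it.
  return hasPendingClarification ? "resume_clarification" : "ask_company";
}
```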
Pass C. Re-ground recap and answer packaging
Scope:
- recap must summarize verified session facts only;
- answer packaging must not sound more certain than truth assembly;
- technical scaffolding must not leak into the top block of user-facing answers.
Exit condition:
- recap cannot claim supplier/date/document facts that were never grounded;
- meta boundary replies no longer expose MCP, read-only, route ids, capability ids, or debug labels;
- user-facing top blocks remain business-first.
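The leakage part of this exit condition lends itself to an automated check. The sketch below is hypothetical: the rule list, the snake_case heuristic for internal ids, and the "top block ends at the first blank line" convention are all assumptions, since this note does not give the real deny-list.

```typescript
// Hypothetical leak rules; the real deny-list is not given in this note.
const LEAK_RULES: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "mcp", pattern: /\bMCP\b/ },
  { label: "read_only", pattern: /\bread-only\b/i },
  // snake_case tokens with at least two underscores look like route or capability ids
  { label: "internal_id", pattern: /\b[a-z][a-z0-9]*(?:_[a-z0-9]+){2,}\b/ },
];

// Assumption: the top block is everything before the first blank line.
function topBlock(reply: string): string {
  return reply.split(/\n\s*\n/)[0] ?? "";
}

// Returns the labels of every rule that matches the top block of a reply.
function findTechnicalLeaks(reply: string): string[] {
  const head = topBlock(reply);
  return LEAK_RULES.filter((rule) => rule.pattern.test(head)).map((rule) => rule.label);
}
```

A check like this only inspects the top block, so detailed evidence further down the reply can still carry technical identifiers when that is intended.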
Pass D. Lock mixed runtime as the primary gate
Scope:
- promote mixed saved-session runtime to the main architecture gate before domain expansion;
- keep narrow harnesses and seam tests, but do not let them overrule mixed replay;
- evaluate critical user paths rather than isolated route green status.
Exit condition:
- the core mixed replay is green on direct answer, selected-object continuity, same-date carryover, recap truthfulness, and technical cleanliness;
- no unresolved P0 remains on the primary user path.
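As a rough illustration of "mixed replay is the primary gate", the sketch below decides the gate from mixed saved-session reports only. The ReplayReport shape and the choice to ignore narrow packs entirely in the verdict are assumptions made for the example.

```typescript
// Hypothetical report shape; field names are assumptions for this sketch.
interface ReplayReport {
  pack: string;
  kind: "mixed_saved_session" | "narrow";
  greenSteps: number;
  totalSteps: number;
  openP0: number;
}

function architectureGate(reports: ReplayReport[]): "open" | "blocked" {
  const mixed = reports.filter((r) => r.kind === "mixed_saved_session");
  // No mixed evidence means no verdict in favor of expansion.
  if (mixed.length === 0) return "blocked";
  // Narrow packs stay useful as signals, but they cannot overrule mixed replay.
  const allMixedGreen = mixed.every(
    (r) => r.greenSteps === r.totalSteps && r.openP0 === 0,
  );
  return allMixedGreen ? "open" : "blocked";
}
```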
Anti-Goals
This stabilization pass is not:
- a rollback to the old monolith
- a case-by-case regex patch sweep
- a prompt-only wording cleanup
- a UI-only improvement pass
Practical Sequence
- Finish the continuity snapshot and wire it into the hot route / recap path.
- Rework clarification precedence so it becomes a last meaningful step.
- Harden recap and boundary presentation against ungrounded and technical output.
- Rerun the mixed AGENT replay until the critical continuity edges are green.
- Only then continue deeper intent extraction and wider domain expansion.
Current Pass Status
Completed in the current working pass:
- shared continuity snapshot is already wired into recap and adjacent route memory logic;
- grounded address history can now restore active organization scope instead of depending only on explicit company-selection metadata;
- early organization clarification no longer outranks item-focused inventory follow-up paths when the session already carries a strong object frame;
- meta boundary replies were already cleaned from technical MCP/read-only leakage;
- early non_domain arbitration no longer suppresses a positive L0 address-lane decision for colloquial but supported exact routes;
- foreign-accounting pivots over inventory drilldown now preserve root-scoped carryover instead of dropping continuity before root-frame sanitation;
- the wide assistantAddressFollowupContext regression pack is green again, including month-only VAT follow-up and inventory -> VAT pivot sanitation;
- counterparty document root wording is now recovered through unicode-safe exact signals instead of depending on mojibake-sensitive legacy phrases;
- declined Russian account wording like "по счёту 60" now restores account scope inside polarity/runtime guards instead of collapsing into other_numeric;
- exact address intents can now stay in the address lane even if the semantic guard overflags deep investigation without an actual investigative user request;
- selected-object inventory follow-ups can now override a stale stock root intent when the semantic contract already marks selected_object_scope_detected, including exact user wording like "по выбранному объекту ... где взяли это";
- explicit capability-meta wording for "дельта по договорам" now keeps the asked capability in the user-facing answer instead of collapsing into the generic "что ты умеешь" catalog reply;
- the transition hot path now starts consuming the shared continuity snapshot as fallback authority for active item / active organization / grounded inventory root frame instead of rebuilding those values only from local ad hoc history scans;
- live replay address_truth_harness_phase7_meta_domain_mix_live_20260417_post_arch_fix_rerun2 is accepted end-to-end with 14/14 steps green, including the previously broken step_01_counterparty_documents and step_04_open_items_account_60.
Still open after this pass:
- mixed continuity is now strong enough for the current phase7 gate, but it still needs broader saved-session proof before domain expansion can be treated as low-risk;
- the next architecture pass should move from one repaired mixed replay to a wider saved-session set and multi-domain acceptance pack;
- remaining work should focus on keeping the unified continuity authority stable under new real user paths, not on wording-only polish or isolated route greens.
- company authority is still not proactive enough at root inventory entry in multi-company sessions without an already grounded active organization;
- the next stabilization slice should prefer system-level company authority handling over repeated local clarification templates when the session has enough business context.
Completed in the current follow-up pass:
- direct company activity-age wording like "а по Альтернативе Плюс сколько лет активности в базе 1С?" is now protected by a unicode-safe exact signal instead of depending on mojibake-sensitive legacy lifecycle phrases;
- capability meta answers now explain supported business groups through human examples instead of leaking internal operation ids like vat_period_snapshot, inventory_on_hand_as_of_date, explain_boundary, or suggest_safe_next_step;
- the next proof target after unit/build checks is the live phase5 replay, because it exercises both the restored activity-age path and the capability-meta interrupt in one shared session.
Latest live replay evidence after that proof run:
- the capability meta interrupt is now business-first and no longer leaks internal operation ids in the top block;
- the same replay exposed a stricter continuity defect that the top-level review initially missed: organization identity can drift in session state as a damaged live label like ООО \\Альтернати"а Плюс\\;
- when that happens, the runtime keeps both organization and a stale counterparty anchor, does not emit counterparty_cleared_for_selected_organization_activity, and falls into counterparty_anchor_not_matched_in_materialized_rows;
- this is a system-level organization-identity robustness gap between data-scope probing, continuity memory, and exact-route truth gating, not a wording-only prompt defect;
- the current stabilization slice therefore includes hardening organization identity matching itself and rerunning the same live pack until step-level human answers and review verdicts align.
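To make the identity-hardening idea concrete, here is one possible approach: normalize labels, then accept a known organization only when it is the single closest match within a small edit distance. This is a swapped-in illustration, not the project's actual fix; normalizeLabel, editDistance, matchOrganization, and the threshold of 2 are all hypothetical.

```typescript
// Lowercase, drop quote/backslash/punctuation debris, collapse whitespace.
function normalizeLabel(raw: string): string {
  return raw
    .toLowerCase()
    .replace(/[\\"'«»“”.,;:()]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Classic Levenshtein distance using a single rolling row.
function editDistance(a: string, b: string): number {
  const row: number[] = [];
  for (let j = 0; j <= b.length; j++) row.push(j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Returns the unique known organization within maxDistance, or null.
function matchOrganization(
  rawLabel: string,
  known: string[],
  maxDistance = 2,
): string | null {
  const target = normalizeLabel(rawLabel);
  let best: string | null = null;
  let bestDistance = maxDistance + 1;
  let tie = false;
  for (const candidate of known) {
    const distance = editDistance(target, normalizeLabel(candidate));
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
      tie = false;
    } else if (distance === bestDistance && distance <= maxDistance) {
      tie = true;
    }
  }
  // Refuse to guess when nothing is close enough or two names are equally close.
  return best !== null && !tie ? best : null;
}
```

Returning null on a tie or a distant match keeps the matcher from silently picking the wrong company, which would be worse for truth gating than asking again.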
Latest phase8 runtime authority evidence after the manual mixed replay hardening:
- live replay address_truth_harness_phase8_manual_runtime_authority_mix_live_20260417_rerun1 proved that the activity-age route was restored, but also exposed a hidden false-green: step_11_inventory_same_date_after_receivables silently reused stale inventory-root date 2021-03-31 instead of the freshest receivables date 2020-03-31;
- the first fix in assistantService was not sufficient on its own, because decomposeStage still rebuilt inventory_root follow-up context by overwriting previous_filters from root_filters wholesale;
- the architectural correction was to preserve root authority for organization / warehouse while preserving the freshest temporal scope (as_of_date, period_from, period_to) from the immediately previous grounded step;
- this was locked by direct regressions in assistantTransitionPolicy.test.ts and addressInventoryRootFrameRegression.test.ts, plus a live rerun against the same manual replay spec;
- live replay address_truth_harness_phase8_manual_runtime_authority_mix_live_20260417_rerun4 is now accepted end-to-end with 14/14 steps green, including:
  - step_07_capability_meta with business-first human wording;
  - step_11_inventory_same_date_after_receivables on the correct date 31.03.2020;
  - step_14_company_activity_age with restored factual lifecycle answer;
  - cleaned user-facing company labels in the data-scope meta reply (ООО Альтернатива Плюс, ООО Лайсвуд, РАЙМ) instead of damaged raw probe labels.
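The correction described above (root authority for organization / warehouse, freshest temporal scope from the previous grounded step) can be sketched as a small merge function. The filter keys follow the names used in this note; the Filters type, the function name, and the warehouse value in the usage example are assumptions, and the real decomposeStage logic is not shown here.

```typescript
// Filter keys follow the names used in this note; the type itself is an assumption.
interface Filters {
  organization?: string;
  warehouse?: string;
  as_of_date?: string;
  period_from?: string;
  period_to?: string;
}

const TEMPORAL_KEYS = ["as_of_date", "period_from", "period_to"] as const;

function mergeRootFollowupFilters(rootFilters: Filters, previousFilters: Filters): Filters {
  // Start from the root frame: it keeps authority over organization and warehouse.
  const merged: Filters = { ...rootFilters };
  const previousHasTemporal = TEMPORAL_KEYS.some(
    (key) => previousFilters[key] !== undefined,
  );
  if (!previousHasTemporal) return merged;
  // The previous grounded step owns the temporal scope as a whole, so stale
  // root dates are dropped instead of being mixed with fresher ones.
  for (const key of TEMPORAL_KEYS) {
    delete merged[key];
    const value = previousFilters[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}
```

For example, merging a root frame dated 2021-03-31 with a previous receivables step dated 2020-03-31 keeps the root organization and warehouse but carries 2020-03-31 forward, which matches the behavior the rerun4 replay accepted.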
Still open after the accepted phase8 replay:
- proactive organization authority at the very beginning of a new multi-company bookkeeping session is still weaker than the target product feel; the current system now clarifies honestly and cleanly, but it does not yet always pre-offer company selection early in the conversational flow;
- some user-facing inventory/counterparty labels inside business answers still deserve final presentation cleanup, but these are now post-stabilization quality refinements rather than continuity-authority blockers.
Latest phase9 proactive-authority evidence after the fresh multi-company replay:
- a new live replay address_truth_harness_phase9_proactive_scope_offer_live_20260418_rerun3 is accepted end-to-end with 5/5 steps green;
- on the very first smalltalk turn, the assistant now stays in normal living-chat mode but appends a business-first proactive organization offer instead of waiting for a later forced clarification;
- explicit company choice in the next turn is now fixed deterministically into session authority before the first accounting route, so later business turns inherit one stable active organization;
- the restored activity-age route for ООО Альтернатива Плюс is now proven again inside the same shared session, not only in isolated route checks;
- the previously broken same-date inventory pivot after receivables is now routed as inventory_on_hand_as_of_date with the carried date 31.03.2020 and the carried organization ООО Альтернатива Плюс, without falling back into repeated company clarification;
- this phase therefore closes the remaining gap called out at the end of phase8: proactive company authority is no longer purely reactive in fresh multi-company bookkeeping sessions.
Still open after the accepted phase9 replay:
- business answers are now semantically correct on this path, but some inventory list formatting still feels heavier and more mechanical than the target human style;
- the next architecture slice should keep expanding saved-session proof across additional real user chains, while separately tightening answer presentation so exact routes do not feel template-driven even when the truth path is already correct.
Latest phase10 bridge-and-aggregate evidence after the manual replay recovery:
- live replay address_truth_harness_phase10_manual_bridge_and_aggregate_mix_live_20260418_rerun8 is accepted end-to-end with 9/9 steps green;
- the previously broken bridge selected item purchase provenance -> VAT on purchase date is now explicit instead of implicit:
  - the continuity layer derives the purchase month from the grounded provenance evidence;
  - the same session keeps selected object continuity instead of collapsing into generic root-only VAT arbitration;
  - the runtime now routes this follow-up as vat_liability_confirmed_for_tax_period, not as forecast, unknown, or generic clarification;
- the same replay also proves that the neighboring aggregate fixes are live on the real assistant path:
  - top-customer-all-time now returns a direct business answer first;
  - top-year aggregate now returns a direct business answer first;
  - very-old-stock now prefers inventory_aging_by_purchase_date over a generic inventory snapshot;
- this matters architecturally because the seam that used to exist only as ambient monolith behavior is now protected as an explicit carryover contract plus replay-backed acceptance path.
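The "derive the purchase month from grounded provenance evidence" step can be illustrated with a small date helper. This is a sketch under assumptions: the function name, the MonthScope shape, and the ISO input format are hypothetical, and the purchase date used in any example is invented for illustration.

```typescript
// Hypothetical result shape, reusing the period_from / period_to names from this note.
interface MonthScope {
  period_from: string;
  period_to: string;
}

// Turns a grounded ISO purchase date into the calendar month that contains it.
function monthScopeFromPurchaseDate(isoDate: string): MonthScope | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (match === null) return null; // refuse to guess from malformed evidence
  const year = Number(match[1]);
  const month = Number(match[2]); // 1-based month number
  if (month < 1 || month > 12) return null;
  // Day 0 of the following month is the last day of this month.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  return {
    period_from: `${match[1]}-${match[2]}-01`,
    period_to: `${match[1]}-${match[2]}-${lastDay}`,
  };
}
```

Returning null for anything that is not a clean ISO date keeps the bridge explicit: without grounded evidence there is no derived period, and the follow-up cannot quietly fall back to a guessed one.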
Still open after the accepted phase10 replay:
- the user-facing VAT explanation block is now correct and grounded, but some long exact answers still feel heavier than the target human product tone;
- the next architecture slice should keep moving from repaired bridge authority into answer-shaping cleanup and broader saved-session replay coverage, not back into isolated wording tweaks.
Latest phase11 manual follow-up/meta-quality evidence after the current hardening loop:
- live replay address_truth_harness_phase11_manual_followup_meta_quality_live_20260418_rerun6 is accepted end-to-end with 10/10 steps green;
- the previously broken "ты умеешь считать дельту по договорам?" branch is now protected by an explicit authority rule:
  - raw capability-meta intent outranks canonical predecompose rewrites that look like address retrieval;
  - stale VAT follow-up continuity no longer wins over a fresh capability/meta question in the same session;
- the previously broken short counterparty retarget "а по свк" is now clean on the real assistant path:
  - the display label uses the most specific confirmed counterparty name instead of a generic group fallback or a stale carryover anchor;
  - short uppercase Cyrillic acronyms like СВК no longer get stripped by the user-facing sanitizer as false mojibake;
  - the replay acceptance rule now targets the real regression ("Контрагент: Группа Найдено ...") instead of incorrectly rejecting valid names like "Контрагент: Группа СВК.";
- this phase matters architecturally because it closes two different seam classes at once:
  - meta authority vs stale follow-up authority;
  - resolved business label vs boundary sanitization noise.
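The first seam class (meta authority vs stale follow-up authority) can be pictured as an ordered arbitration. Only the first rule is stated in this note; the lane names, the TurnSignals shape, and the ordering of the remaining rules are assumptions added to make the sketch complete.

```typescript
// Hypothetical lane names and signal shape for this sketch.
type Lane = "capability_meta" | "address_retrieval" | "vat_followup" | "unknown";

interface TurnSignals {
  rawIntent: Lane; // intent read from the user's raw wording
  predecomposeRewrite: Lane | null; // canonical rewrite, may disagree with the raw intent
  staleFollowup: Lane | null; // follow-up continuity carried from earlier turns
}

function arbitrateLane(signals: TurnSignals): Lane {
  // Rule from this note: raw capability-meta intent outranks rewrites and stale continuity.
  if (signals.rawIntent === "capability_meta") return "capability_meta";
  // Assumed ordering: a rewrite for the current turn beats carried-over follow-up state.
  if (signals.predecomposeRewrite !== null) return signals.predecomposeRewrite;
  if (signals.staleFollowup !== null) return signals.staleFollowup;
  return signals.rawIntent;
}
```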
Still open after the accepted phase11 replay:
- the current phase11 path is now semantically clean, but broader manual/user session packs still need to be replayed before expansion can be called low-risk across new domains;
- answer shaping on some long exact list answers is still heavier than the target human product feel, even though the truth path and routing are now correct;
- the next architecture slice should move to wider saved-session acceptance coverage and humanized exact-answer presentation, not back to isolated prompt-level repairs.
Latest continuity-authority convergence evidence after the current route pass:
- the route hot path now consumes the shared continuity snapshot directly instead of relying only on local findLastGrounded... helpers:
  - grounded address context can now survive into route arbitration even when the legacy local helper returns nothing for the current turn shape;
  - active organization continuity is now allowed to participate in organization-selection arbitration, instead of forcing route policy to reconstruct that context only from immediate clarification payloads;
  - a bare organization-selection turn after grounded bookkeeping continuity is no longer automatically classified as non_domain_query_indexed noise when the session still carries valid grounded business context;
  - session organization recovery inside the data-scope layer now has a final fallback to the same continuity snapshot, reducing one more duplicate path that used to rescan assistant history independently;
- the living-chat runtime now also consumes continuity-backed organization authority:
  - deterministic organization-fact boundary replies can now trigger from grounded continuity even when sessionScope.selectedOrganization and sessionScope.activeOrganization are both empty at runtime entry;
  - the chat layer now records whether it entered with grounded continuity and which organization came from that continuity snapshot, making future saved-session review less blind;
  - proactive organization offer logic is now explicitly blocked when grounded address continuity already exists, so the chat layer does not re-offer company selection on top of an already grounded business session;
- the first human-answer-shaping cleanup pass is now applied to heavy profile/aggregate exact answers:
  - period_coverage_profile and document_type_and_account_section_profile now start with a direct business-first lead ("Коротко: ...") instead of service-flavored intros like "профиль собран" / "строк агрегата";
  - the top block now states the business conclusion first and leaves ranked detail blocks below, which reduces the catalog-like feel without hiding the actual data;
- the next human-answer-shaping cleanup pass is now applied to VAT exact replies:
  - vat_payable_forecast and vat_liability_confirmed_for_tax_period now open with a business-first "Коротко: ..." lead, while the detailed calculation stays in the secondary block;
  - service-flavored top lines like "Собран прогноз...", "Режим результата...", and "Строк агрегата..." are removed from the first screen of the reply, which makes VAT answers read like user-facing guidance instead of an engine report;
  - VAT reply tests now explicitly protect this top-block shape, so future changes cannot silently reintroduce the same mechanical preamble;
- the next human-answer-shaping cleanup pass is now applied to counterparty ranking/profile replies:
  - counterparty_activity_lifecycle, contract_usage_overview, customer_revenue_and_payments, supplier_payouts_profile, and contract_usage_and_value now open with business-first wording instead of service-flavored "профиль собран" / "строк агрегата" / "строк источника";
  - ranking and contract replies now preserve user wording better in the visible heading layer, including "минимальный бюджет" phrasing for low-turnover active contracts;
  - targeted ranking/profile tests now protect the new top-block shape, so these families are less likely to regress back into report-like wording during later route/domain work;
- the next human-answer-shaping cleanup pass is now applied to plain list replies in the exact lane:
  - list_contracts_by_counterparty, list_documents_by_contract, bank_operations_by_counterparty, bank_operations_by_contract, and the generic factual-list fallback no longer leak live address lane / catalog address lane wording into the user-facing answer;
  - these list replies now start with direct business-first leads and keep the selected rows below, which preserves factual usefulness without exposing internal routing labels;
  - targeted utf8 header tests now explicitly protect against lane leakage in these list families;
- this is still not the end of shaping work: some long evidence-heavy replies and residual catalog-style blocks still need the same cleanup;
- this pass does not yet finish full single-owner continuity, but it narrows one of the remaining seams where route arbitration and scope memory could disagree about whether the session was still grounded.
Next Execution Slice (2026-04-18)
The project is now moving from:
breakpoint recovery
to:
danger-zone exit under explicit gates
This next slice should be executed in the following order:
- Finish continuity authority convergence in the hot runtime path.
- Widen saved-session replay coverage beyond the already repaired flagship chains.
- Tighten human answer shaping on long exact answers without reintroducing template drift.
- Only after that, begin controlled domain-by-domain expansion toward the multi-domain stage.
Current explicit goals for this slice:
- fewer owners independently reconstruct active context;
- more replay breadth before any large expansion claim;
- cleaner user-facing business answers on already-correct truth paths;
- lower risk that new domains multiply orchestration chaos faster than capability growth.
Ready Signal
The project can leave the current breakpoint when:
- mixed live sessions no longer depend on distributed guesswork about active context;
- clarification does not outrank valid restored business continuity;
- recap is grounded and business-useful;
- technical scaffolding is removed from user-facing meta answers;
- the primary mixed replay is green for the real user path, not only for narrow packs.