Post-F: synchronize the architecture map with the code

dctouch 2026-04-24 21:02:57 +03:00
parent 837e1fe141
commit 87c440d6fb
4 changed files with 241 additions and 10 deletions


@@ -264,3 +264,23 @@ That document formalizes the next pressure point:
- not only growing autonomy breadth,
- but protecting semantic correctness inside the autonomy surface that already exists.
## Cross-Check Update - 2026-04-24
Post-F hardening did not invalidate this D/E/F plan. It confirmed the original boundary:
- D/E/F are now baseline substrate, not the active rescue task;
- the old broad manual replay is a regression amplifier for this substrate, not a request to return to pre-D/E/F behavior;
- manual failures are accepted as useful only when the fix strengthens bounded evidence planning, explicit-subject arbitration, or materialization truth.
The manual failure slice from `assistant-stage1-9liEOh-7JP` was repaired without rolling back D/E/F:
- VAT purchase-date and February 2017 turns now route through confirmed tax-period VAT or materialize forecast periods inside the requested window;
- highest-value customer wording now lands on customer revenue/payment ranking, not lifecycle activity;
- Chepurnov item-flow after stale inventory context now keeps the counterparty/document contour instead of reusing inventory focus.
The current hand-off is therefore:
- keep D/E/F as the bounded autonomy baseline;
- keep Post-F semantic integrity invariants as regression gates;
- continue the next slice by expanding open-world breadth only behind the same evidence and replay discipline.


@@ -193,6 +193,29 @@ Replay-backed anchors for the current layer include:
- `address_truth_harness_post_f_cross_stage_canary_agent_20260424_live7`, accepted `24/24`, and saved into autoruns as `AGENT | Post-F cross-stage semantic integrity canary` (`gen-ag04241406-abe4d8`)
- `address_truth_harness_post_f_manual_failures_20260424_live3`, accepted `11/11`, and saved into autoruns as `AGENT | Post-F ручные провалы VAT revenue item-flow live3` (`gen-ag04241710-bdb248`)
## Operational Closure - 2026-04-24
Post-F is now treated as an operationally closed hardening slice.
The closure is not based on unit tests alone. It is based on code-level wiring plus replay-backed semantic evidence:
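- code fix commit: `739e8b8 Post-F: закрыть ручные провалы НДС, выручки и item-flow` ("close the manual VAT, revenue, and item-flow failures");
- runtime artifact tail commit: `837e1fe Post-F: сохранить хвосты ручных runtime-прогонов` ("keep the tails of the manual runtime runs");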
- cross-stage canary: `address_truth_harness_post_f_cross_stage_canary_agent_20260424_live7`, accepted `24/24`;
- manual failure replay: `address_truth_harness_post_f_manual_failures_20260424_live3`, accepted `11/11`;
- focused runtime slice: `addressQueryRuntimeM23.test.ts` accepted `403/403` before the final manual-failure reinforcement;
- graph snapshot after rebuild: `5892 nodes`, `12772 edges`, `137 communities`.
The code-derived map for this closure is documented in:
- [18 - post_f_code_documentation_sync_2026-04-24.md](./18%20-%20post_f_code_documentation_sync_2026-04-24.md)
The important architectural reading is:
- fixing old manual-run failures did not roll back the fresh bounded-autonomy concept;
- the old broad run acted as a semantic regression oracle;
- each accepted fix strengthened current Post-F invariants: explicit current-turn meaning beats stale scope, exact VAT materialization cannot self-filter away, and counterparty item-flow cannot be hijacked by stale inventory focus.
## Honest Remaining Risk
This phase should not be overclaimed.


@@ -0,0 +1,180 @@
# 18 - Post-F Code Documentation Sync (2026-04-24)
## Purpose
This note is the code-derived map after the operational closure of `Post-F Semantic Integrity Hardening`.
It answers a narrow question:
- do the current architecture docs still match the code and replay evidence?
Short answer:
- yes for the active architecture mainline;
- no if old planning notes are read as current status rather than historical context.
So this document becomes the current map layer over the older maps.
## Source-Of-Truth Order
Use the documentation in this order:
1. `README.md` for the current package snapshot and readiness percentages.
2. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md` for the D/E/F bounded-autonomy baseline.
3. `17 - post_f_semantic_integrity_hardening_2026-04-23.md` for the semantic-integrity invariants and closure evidence.
4. This document for the code-level responsibility map after Post-F.
5. Documents `01` through `15` as historical architecture and stabilization trail, not as the current final status.
## Current Code Map
The current Post-F behavior is not owned by one monolithic resolver. It is spread across a few deliberately bounded seams.
### Intent Signal Layer
Primary files:
- `llm_normalizer/backend/src/services/addressCounterpartyIntentSignals.ts`
- `llm_normalizer/backend/src/services/addressIntentResolver.ts`
Responsibilities:
- detect account-60/62/76 open-item tails before they collapse into generic account snapshots;
- detect counterparty item-flow wording such as "what did this counterparty ship" before stale inventory focus can win;
- detect highest-value customer/revenue-ranking wording before it falls into lifecycle activity;
- keep bidirectional value-flow comparison asks away from one-sided exact recipes when the user needs both incoming and outgoing money.
Important Post-F reading:
- new manual-run fixes are not "old behavior restored";
- they are explicit current-turn meaning guards added before stale context can arbitrate the turn incorrectly.
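
The ordering described above can be sketched as follows. This is a minimal illustration, not the code in `addressIntentResolver.ts`: the type names, intent labels, English regexes, and `resolveIntentSketch` are all hypothetical and exist only to show explicit current-turn signals being evaluated before stale focus.

```typescript
// Hypothetical sketch: none of these names are taken from the real resolver.
type IntentSketch =
  | "counterparty_item_flow"
  | "customer_revenue_ranking"
  | "inventory_followup"
  | "unknown";

interface TurnSketch {
  text: string;             // wording of the current user turn
  staleFocus?: "inventory"; // focus object left over from earlier turns, if any
}

// Illustrative English patterns only; the real detectors are not shown in the source.
const ITEM_FLOW_PATTERN_SKETCH = /what did .+ (ship|deliver|supply)/i;
const REVENUE_RANKING_PATTERN_SKETCH = /(highest[- ]value|most valuable) customer/i;

function resolveIntentSketch(turn: TurnSketch): IntentSketch {
  // 1. Explicit current-turn meaning is evaluated first.
  if (ITEM_FLOW_PATTERN_SKETCH.test(turn.text)) return "counterparty_item_flow";
  if (REVENUE_RANKING_PATTERN_SKETCH.test(turn.text)) return "customer_revenue_ranking";
  // 2. Stale focus may arbitrate only when no explicit signal fired.
  if (turn.staleFocus === "inventory") return "inventory_followup";
  return "unknown";
}
```

With this ordering, a turn such as "What did Chepurnov ship last year?" resolves to the counterparty item-flow intent even when inventory focus is still present from earlier turns.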
### Follow-Up Retarget Layer
Primary file:
- `llm_normalizer/backend/src/services/address_runtime/decomposeStage.ts`
Responsibilities:
- preserve previous period filters when the user says "за этот период" ("for this period");
- retarget selected-inventory purchase-date VAT questions to confirmed tax-period VAT when the business meaning is a tax-period liability;
- prevent non-inventory counterparty/document follow-ups from being pulled back into inventory purchase-document routing;
- protect selected-object inventory follow-ups only when inventory lineage is actually active.
Important Post-F reading:
- temporal carryover is allowed only when it preserves the active business object;
- it must not revive an old focus object after the user has explicitly changed the subject.
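
A minimal sketch of that carryover rule, assuming a simplified scope shape. `ScopeSketch`, `FollowUpTurnSketch`, and `retargetFollowUpSketch` are hypothetical names, not the contents of `decomposeStage.ts`. The point is only that the period may be inherited while an explicitly named subject always replaces the previous focus object.

```typescript
// Hypothetical sketch of temporal carryover without focus revival.
interface PeriodSketch {
  from: string; // inclusive ISO date, YYYY-MM-DD
  to: string;   // inclusive ISO date, YYYY-MM-DD
}

interface ScopeSketch {
  subject: string;       // active business object, e.g. an inventory item or a counterparty
  period?: PeriodSketch; // period filter attached to the scope, if any
}

interface FollowUpTurnSketch {
  explicitSubject?: string;    // subject named in the current turn, if any
  refersToSamePeriod: boolean; // true when the user points back at the previous period
}

function retargetFollowUpSketch(previous: ScopeSketch, turn: FollowUpTurnSketch): ScopeSketch {
  // An explicit subject in the current turn always wins over the previous focus object.
  const subject = turn.explicitSubject ?? previous.subject;
  // The period is inherited only when the user actually refers back to it.
  if (turn.refersToSamePeriod && previous.period) {
    return { subject, period: previous.period };
  }
  return { subject };
}
```

In this sketch the period and the subject are carried independently, so "for this period" can never drag the old inventory item back in once the user has named a counterparty.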
### Recipe And Materialization Layer
Primary file:
- `llm_normalizer/backend/src/services/addressRecipeCatalog.ts`
Responsibilities:
- own the exact recipe definitions for VAT, revenue ranking, documents, contracts, bank operations, and inventory;
- materialize VAT forecast aggregates with a period inside the requested period window via `__PERIOD_EXPR__`;
- keep confirmed VAT tax-period execution separate from raw month extraction;
- expose `address_vat_liability_confirmed_tax_period_v1` and `address_customer_revenue_and_payments_v1` as first-class exact recipes.
Important Post-F reading:
- exact route success is not enough if the rows disappear in a later post-filter;
- materialization period is part of semantic truth, not just a query formatting detail.
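
The materialization point can be illustrated with a small sketch. Only the `__PERIOD_EXPR__` placeholder name comes from this document; the template text, the choice to pin the period to the window start, and both function names are assumptions made for illustration, not the behavior of `addressRecipeCatalog.ts`.

```typescript
// Hypothetical sketch: how a materialized period can be kept inside the requested window.
interface WindowSketch {
  from: string; // inclusive ISO date, YYYY-MM-DD
  to: string;   // inclusive ISO date, YYYY-MM-DD
}

// Illustrative template; the real recipe text is not shown in the source.
const FORECAST_TEMPLATE_SKETCH = "period=__PERIOD_EXPR__; aggregate=vat_forecast";

function materializePeriodSketch(template: string, requested: WindowSketch): string {
  // Assumption: pin the materialized period to the start of the requested window
  // so that a later window post-filter keeps the aggregate row.
  return template.split("__PERIOD_EXPR__").join(requested.from);
}

function survivesPostFilterSketch(rowPeriod: string, requested: WindowSketch): boolean {
  // ISO dates in YYYY-MM-DD form compare correctly as plain strings.
  return rowPeriod >= requested.from && rowPeriod <= requested.to;
}
```

If the aggregate row were stamped with a period outside the window, `survivesPostFilterSketch` would drop it and the route would succeed while the answer came back empty, which is the failure mode the reading above describes.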
### Filter Extraction Layer
Primary file:
- `llm_normalizer/backend/src/services/addressFilterExtractor.ts`
Responsibilities:
- extract explicit counterparty anchors from current-turn wording;
- support instrumental counterparty forms used in passive item-flow questions;
- avoid letting old organization scope masquerade as the newly asked counterparty.
Important Post-F reading:
- current-turn explicit counterparty is stronger than stale organization scope unless the user explicitly asks for an organization-level question.
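
One way to picture this precedence, as a sketch under assumptions: `FilterStateSketch` and `pickSubjectFilterSketch` are hypothetical, and the real extractor in `addressFilterExtractor.ts` may carry more state than shown here.

```typescript
// Hypothetical sketch: explicit counterparty versus stale organization scope.
interface FilterStateSketch {
  counterparty?: string;
  organization?: string;
}

function pickSubjectFilterSketch(
  currentTurn: FilterStateSketch,
  staleScope: FilterStateSketch,
  asksOrganizationLevel: boolean,
): FilterStateSketch {
  if (asksOrganizationLevel) {
    // Organization-level questions may reuse the stored organization scope.
    const organization = currentTurn.organization ?? staleScope.organization;
    return organization ? { organization } : {};
  }
  // Otherwise an explicit current-turn counterparty is the subject,
  // and the stale organization is never promoted to a counterparty.
  if (currentTurn.counterparty) return { counterparty: currentTurn.counterparty };
  return staleScope.organization ? { organization: staleScope.organization } : {};
}
```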
### MCP Discovery And Response Layer
Primary files:
- `llm_normalizer/backend/src/services/assistantMcpDiscoveryTurnInputAdapter.ts`
- `llm_normalizer/backend/src/services/assistantMcpDiscoveryPilotExecutor.ts`
- `llm_normalizer/backend/src/services/assistantMcpDiscoveryResponsePolicy.ts`
Responsibilities:
- keep raw entity search clean after ranking/value-flow conversations;
- separate metadata scope from data-answer subject;
- keep planner-selected MCP answers from being overwritten by stale exact-lane replies;
- prevent raw internal evidence lines from leaking into user-facing ranked value-flow answers.
Important Post-F reading:
- MCP discovery is now a bounded autonomy substrate;
- Post-F does not replace it, it protects it from stale-scope contamination.
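
The last responsibility, keeping raw internal evidence out of user-facing answers, could look roughly like the sketch below. The `[evidence]` prefix and the function name are assumptions for illustration; the source does not say how `assistantMcpDiscoveryResponsePolicy.ts` marks internal lines.

```typescript
// Hypothetical marker: assume internal evidence lines start with "[evidence]".
const INTERNAL_EVIDENCE_PREFIX_SKETCH = "[evidence]";

function toUserFacingAnswerSketch(rawLines: string[]): string {
  return rawLines
    // Drop every line that carries the internal marker, ignoring leading whitespace.
    .filter((line) => line.trim().indexOf(INTERNAL_EVIDENCE_PREFIX_SKETCH) !== 0)
    .join("\n");
}
```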
## Replay Evidence Map
The following replay anchors define the current accepted map:
- `address_truth_harness_post_f_cross_stage_canary_agent_20260424_live7`: accepted `24/24`.
- `address_truth_harness_post_f_manual_failures_20260424_live3`: accepted `11/11`.
- `address_truth_harness_phase82_human_mixed_integrity_status_dialog_post_f_account_injection_guard_clean_scope`: accepted `19/19`.
- `address_truth_harness_phase82_human_mixed_integrity_status_dialog_post_m23_rerun_documents_scope_bidirectional`: accepted `19/19`.
- `address_truth_harness_phase67_svk_grounded_counterparty_integrity_live_rerun_vatfix`: accepted.
- `address_truth_harness_phase11_manual_followup_meta_quality_live_rerun_vatfix`: accepted `10/10`.
- `address_truth_harness_phase20_continuity_stabilization_live_rerun_vatfix`: accepted `6/6`.
The saved autorun map includes:
- `AGENT | Post-F cross-stage semantic integrity canary live7`;
- `AGENT | Post-F ручные провалы VAT revenue item-flow live3`.
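
Treating these anchors as regression gates can be sketched as a simple check that every replay is fully accepted. The anchor names and counts below are the two Post-F replays listed above; `ReplayAnchorSketch`, `failingAnchorsSketch`, and the rule that only full acceptance passes are assumptions of this sketch, not a description of the real harness.

```typescript
// Hypothetical sketch of a replay gate over accepted/total counts.
interface ReplayAnchorSketch {
  anchor: string;   // replay anchor name
  accepted: number; // accepted steps
  total: number;    // total steps in the replay
}

function failingAnchorsSketch(results: ReplayAnchorSketch[]): string[] {
  return results
    // An empty replay or any rejected step fails the gate.
    .filter((r) => r.total === 0 || r.accepted !== r.total)
    .map((r) => r.anchor);
}

const postFGateSketch: ReplayAnchorSketch[] = [
  { anchor: "address_truth_harness_post_f_cross_stage_canary_agent_20260424_live7", accepted: 24, total: 24 },
  { anchor: "address_truth_harness_post_f_manual_failures_20260424_live3", accepted: 11, total: 11 },
];
```

A breadth change would then be acceptable only while `failingAnchorsSketch(postFGateSketch)` stays empty after a rerun.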
## Manual Run Interpretation
Old broad manual runs are not obsolete.
They are useful when treated as semantic regression oracles:
- if a failure exposes stale scope, wrong subject carryover, wrong lane arbitration, or materialization loss, it should be fixed as Post-F integrity work;
- if a failure asks for unsupported breadth outside the current primitive/search surface, it should become next-slice bounded enablement work;
- if a failure depends on old accidental behavior that violates the new evidence model, the old expectation must be rejected rather than restored.
The `assistant-stage1-9liEOh-7JP` failures were in the first category:
- VAT purchase-date and February 2017 were supported, but the route/materialization arbitration was wrong;
- highest-value customer was supported, but fell into the wrong lifecycle contour;
- Chepurnov item-flow was supported, but stale inventory focus could hijack the question.
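
The three-way interpretation above amounts to a small triage function. The cause and outcome labels here are hypothetical names chosen to mirror the wording of this section.

```typescript
// Hypothetical sketch of the manual-failure triage rule.
type FailureCauseSketch =
  | "stale_scope"
  | "wrong_subject_carryover"
  | "wrong_lane_arbitration"
  | "materialization_loss"
  | "unsupported_breadth"
  | "legacy_accidental_behavior";

type TriageOutcomeSketch =
  | "post_f_integrity_fix"
  | "next_slice_enablement"
  | "reject_old_expectation";

function triageManualFailureSketch(cause: FailureCauseSketch): TriageOutcomeSketch {
  // Breadth outside the current primitive/search surface is next-slice work.
  if (cause === "unsupported_breadth") return "next_slice_enablement";
  // Expectations that rely on old accidental behavior are rejected, not restored.
  if (cause === "legacy_accidental_behavior") return "reject_old_expectation";
  // Stale scope, carryover, arbitration, and materialization defects are integrity work.
  return "post_f_integrity_fix";
}
```

Under this sketch, all three `assistant-stage1-9liEOh-7JP` failures map to `post_f_integrity_fix`: wrong arbitration for the VAT turns and the highest-value customer, stale scope for the Chepurnov item-flow.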
## Current Closure Reading
Post-F is operationally closed as a module because:
- the identified stale-scope and materialization seams have code-level guards;
- the repaired manual failure slice has live replay evidence;
- the cross-stage canary proves older bounded-autonomy contours still survive the new guards;
- the working tree was clean after the closure commits.
Post-F is not a claim that:
- arbitrary unfamiliar 1C asks are now solved;
- `resolveAddressIntent()` no longer carries central pressure;
- every future stale-memory seam is impossible.
The next module should therefore start from "grow open-world bounded autonomy breadth", not from "finish the first Post-F rescue".
Every new breadth step must keep the Post-F invariants as regression gates.
## Remaining Risks To Watch
- central intent pressure is still visible around `addressIntentResolver.ts` and the counterparty signal bridge;
- documents `01` through `15` are historical, not current status documents, and can be misread without this sync layer;
- a fully green broad test suite was not the primary proof for Post-F closure; semantic replay remains the stronger evidence;
- future runtime-job artifacts can still accumulate outside the curated autorun flow and should be committed only when intentionally useful.


@@ -35,6 +35,7 @@ This package answers the next question:
15. [15 - mcp_bounded_autonomy_reset_plan_2026-04-21.md](./15%20-%20mcp_bounded_autonomy_reset_plan_2026-04-21.md)
16. [16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md](./16%20-%20data_need_graph_and_open_world_mcp_plan_2026-04-22.md)
17. [17 - post_f_semantic_integrity_hardening_2026-04-23.md](./17%20-%20post_f_semantic_integrity_hardening_2026-04-23.md)
18. [18 - post_f_code_documentation_sync_2026-04-24.md](./18%20-%20post_f_code_documentation_sync_2026-04-24.md)
## Current Status Snapshot (2026-04-24)
@@ -57,14 +58,19 @@ It now documents a turnaround that is already operational in code, already mater
- protect exact and planner-selected pivots from metadata/discovery drift
- keep temporal continuity and repeated lane switches semantically stable
- recover already-supported questions that still look broken to a human user
- the Post-F module is now operationally closed as a hardening slice:
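- code fix commit: `739e8b8 Post-F: закрыть ручные провалы НДС, выручки и item-flow` ("close the manual VAT, revenue, and item-flow failures")
- runtime artifact tail commit: `837e1fe Post-F: сохранить хвосты ручных runtime-прогонов` ("keep the tails of the manual runtime runs")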
- live map sync: [18 - post_f_code_documentation_sync_2026-04-24.md](./18%20-%20post_f_code_documentation_sync_2026-04-24.md)
Current honest status:
- turnaround implementation progress: `~98%`
- exit-from-danger-zone readiness: `~95%`
- pre-multidomain readiness: `~88%`
- bounded-autonomy foundation readiness: `~86%`
- open-world bounded-autonomy readiness: `~71%`
- turnaround implementation progress: `~99%`
- exit-from-danger-zone readiness: `~97%`
- pre-multidomain readiness: `~90%`
- bounded-autonomy foundation readiness: `~89%`
- open-world bounded-autonomy readiness: `~75%`
- Post-F semantic integrity module progress: `~99%`, operationally closed; remaining risk is now treated as next-slice discovery rather than as an open blocker inside the closed slice
- graph snapshot after latest rebuild: `5892 nodes`, `12772 edges`, `137 communities`
- current breakpoint:
- the validated hot paths are no longer structurally broken;
@@ -93,6 +99,7 @@ Latest live proof now includes:
- `address_truth_harness_phase82_human_mixed_integrity_status_dialog_post_m23_rerun_documents_scope_bidirectional` accepted `19/19`
- `address_truth_harness_phase82_human_mixed_integrity_status_dialog_post_f_account_injection_guard_clean_scope` accepted `19/19`, with the `Жуковке 51` numeric counterparty suffix kept as counterparty scope instead of leaking as account `51`
- `address_truth_harness_post_f_cross_stage_canary_agent_20260424_live7` accepted `24/24`, proving a saved cross-stage AGENT canary across VAT metadata, metadata-scoped organization/document pivots, numeric counterparty suffixes, open-organization value-flow clarification, ranked value-flow year switches, and SVK grounded reset; the saved autorun is `AGENT | Post-F cross-stage semantic integrity canary` (`gen-ag04241406-abe4d8`)
- `address_truth_harness_post_f_manual_failures_20260424_live3` accepted `11/11`, proving the manual failure slice from `assistant-stage1-9liEOh-7JP`: VAT purchase-date, VAT February 2017, highest-value customer, and Chepurnov item-flow after stale inventory context; the saved autorun is `AGENT | Post-F ручные провалы VAT revenue item-flow live3` (`gen-ag04241710-bdb248`)
- `address_truth_harness_phase11_manual_followup_meta_quality_live_rerun_vatfix` accepted `10/10`
- `address_truth_harness_phase20_continuity_stabilization_live_rerun_vatfix` accepted `6/6`
- `addressQueryRuntimeM23.test.ts` full semantic/runtime slice accepted `403/403` after Post-F VAT/date-basis, scope-recovery, open value-flow organization clarification, document-vs-bank arbitration, and reply-shape hardening
@@ -102,7 +109,7 @@ Current architectural reading:
- the system is already materially past the dangerous regression breakpoint;
- it is now safe for continued architecture hardening and controlled domain-by-domain enablement under replay gates;
- it is materially closer to pre-multidomain stability, but still not safe to declare broad low-risk expansion over arbitrary unfamiliar 1C questions.
- the practical next target is no longer only `90%+ pre-multidomain readiness`, but trustworthy semantic integrity inside already-enabled contours plus broader open-world bounded autonomy over 1C evidence.
- the practical next target is no longer Post-F rescue itself; it is broader open-world bounded autonomy over 1C evidence while preserving the Post-F semantic-integrity invariants as regression gates.
- from this point onward, readiness must be judged not only by route truth and replay pass rate, but also by whether already-supported questions stay semantically correct through stale memory, pivots, clarifications, and mixed scope resets.
For the detailed audit, current percentages, and remaining debt, read:
@@ -152,6 +159,7 @@ Read in this order:
16. `15 - mcp_bounded_autonomy_reset_plan_2026-04-21.md`
17. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`
18. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
19. `18 - post_f_code_documentation_sync_2026-04-24.md`
## Planning Rules
@@ -171,16 +179,16 @@ and start being described as:
- "a stateful exact-data assistant with explicit transition contracts and isolated truth gating."
As of `2026-04-23`, the project is already materially closer to the target description and is no longer in the same acute collapse state. The remaining blocker is no longer the original continuity failure itself, and no longer only the A/B/C or D/E/F build-out. The active blocker is now the combination of:
As of `2026-04-24`, the project is already materially closer to the target description and is no longer in the same acute collapse state. The remaining blocker is no longer the original continuity failure itself, no longer the A/B/C or D/E/F build-out, and no longer the first Post-F rescue slice. The active blocker is now the combination of:
- unfinished convergence from reviewed bounded MCP chains toward broader open-world autonomy;
- semantic integrity hardening on already-enabled contours, especially where stale scope, repeated pivots, or post-pivot arbitration can still produce a business-wrong answer.
- continued use of Post-F semantic integrity invariants as regression gates while that breadth grows.
The biggest remaining blockers are:
- broader open-world primitive search is still narrower than the future arbitrary 1C blast radius;
- dynamic schema traversal is still not broad enough for many unfamiliar 1C asks outside the repaired families;
- semantic integrity hardening is still needed on stale scope contamination, repeated pivots, and already-supported but semantically fragile follow-up chains;
- new stale-scope or post-pivot seams may still appear in future breadth work and must be treated as regression-gated semantic defects, not as wording polish;
- residual `assistantService` overload;
- central intent pressure in `resolveAddressIntent()`;
- semantic robustness gaps where already-supported questions can still look broken to a human user because of typo sensitivity, short follow-up retarget loss, or human-answer mismatch.
- semantic robustness gaps may still appear where already-supported questions meet new wording, typo pressure, short follow-up retargets, or human-answer mismatch.