diff --git a/docs/ARCH/11 - architecture_turnaround/20 - planner_autonomy_consolidation_2026-05-01.md b/docs/ARCH/11 - architecture_turnaround/20 - planner_autonomy_consolidation_2026-05-01.md index 3b2a99d..0c2c3de 100644 --- a/docs/ARCH/11 - architecture_turnaround/20 - planner_autonomy_consolidation_2026-05-01.md +++ b/docs/ARCH/11 - architecture_turnaround/20 - planner_autonomy_consolidation_2026-05-01.md @@ -120,6 +120,7 @@ The following consolidation step added catalog-level chain-template scoring: - the ranked chain-template matches are now propagated into runtime loop state and debug attachment fields, so replay analysis can inspect catalog-fabric intent without parsing reason-code strings. - `catalog_chain_template_alignment` now records whether the selected chain is the top catalog match, its rank, and whether it appeared in the catalog search results; runtime loop state and debug summary expose the same verdict. - planner reason codes now emit stable catalog-alignment telemetry for evaluated top-match, selected-equals-top, selected-lower-rank, selected-outside-match-set, and unscored selected-chain states. +- `catalog_chain_template_alignment.alignment_status` now carries the same verdict as one enum-like field, and debug summary exposes it as `mcp_discovery_catalog_chain_alignment_status`. ## Why This Matters @@ -243,9 +244,16 @@ Latest validation after catalog-alignment reason-code telemetry: - `npm.cmd run build`: passed - graphify rebuild: `5943 nodes`, `12915 edges`, `136 communities` +Latest validation after explicit catalog-alignment status propagation: + +- targeted planner/runtime/debug tests: passed, `55 passed` +- full MCP-discovery suite: passed, `283 passed`, `9 skipped` +- `npm.cmd run build`: passed +- graphify rebuild: `5943 nodes`, `12915 edges`, `136 communities` + ## Next Step -The next safe step is still to re-run live replay once the 1C side is actively polling the proxy. 
In parallel, local-only consolidation can continue by using the alignment verdict and reason-code telemetry to find remaining manual branches where selected chains diverge from reviewed catalog-fabric intent. +The next safe step is still to re-run live replay once the 1C side is actively polling the proxy. In parallel, local-only consolidation can continue by using `alignment_status`, alignment reason-code telemetry, and the representative guard to find remaining manual branches where selected chains diverge from reviewed catalog-fabric intent. Recommended order: diff --git a/docs/ARCH/11 - architecture_turnaround/README.md b/docs/ARCH/11 - architecture_turnaround/README.md index 4230504..9cb1055 100644 --- a/docs/ARCH/11 - architecture_turnaround/README.md +++ b/docs/ARCH/11 - architecture_turnaround/README.md @@ -83,6 +83,7 @@ It now documents a turnaround that is already operational in code, already mater - catalog index now scores reviewed chain templates directly from fact/action/axis/comparison/ranking needs, and planner/runtime/debug surfaces expose ranked catalog chain matches through the structured `catalog_chain_template_matches` contract path instead of relying only on reason-code strings; - planner/runtime/debug surfaces now expose `catalog_chain_template_alignment`, so semantic replay can see whether selected chains match the catalog top match, fall back to a lower-ranked template, or bypass catalog search; - planner reason codes now also emit stable catalog-alignment telemetry, so automated replay review can filter top-match, lower-rank, outside-match, and unscored selected-chain states without hand-parsing debug JSON; + - catalog-alignment now carries a single `alignment_status` verdict through planner/runtime/debug, making replay divergence detection explicit instead of reconstructing it from booleans; - explicit-counterparty incoming-vs-outgoing data-need graphs now select the reviewed `value_flow_comparison` chain instead of falling back to 
generic `value_flow`; - live map sync: [20 - planner_autonomy_consolidation_2026-05-01.md](./20%20-%20planner_autonomy_consolidation_2026-05-01.md) @@ -95,7 +96,7 @@ Current honest status: - open-world bounded-autonomy readiness: `~85%` - Post-F semantic integrity module progress: `~99%` operationally closed, with remaining risk now treated as next-slice discovery rather than an open blocker inside the closed slice - active inventory-stock breadth slice progress: `100%` for the declared scenario pack, not for arbitrary inventory questions -- Planner Autonomy Consolidation progress: `~86%` for the declared module, with catalog-fabric, value-flow arbitration, lifecycle bounded inference, broad-evaluation bridge, inventory catalog templates, inventory runtime-boundary honesty, exact inventory recipe bridging, unambiguous metadata-surface lane inference, catalog chain-template scoring, structured chain-match contract exposure, runtime/debug propagation, subject-aware bidirectional comparison arbitration, structured catalog-alignment verdicts, representative alignment regression guard, and catalog-alignment reason-code telemetry validated locally, but live replay for the new bridge is currently blocked by missing active 1C polling and broader unfamiliar 1C asks still need replay-backed growth +- Planner Autonomy Consolidation progress: `~87%` for the declared module, with catalog-fabric, value-flow arbitration, lifecycle bounded inference, broad-evaluation bridge, inventory catalog templates, inventory runtime-boundary honesty, exact inventory recipe bridging, unambiguous metadata-surface lane inference, catalog chain-template scoring, structured chain-match contract exposure, runtime/debug propagation, subject-aware bidirectional comparison arbitration, structured catalog-alignment verdicts, representative alignment regression guard, catalog-alignment reason-code telemetry, and explicit `alignment_status` propagation validated locally, but live replay for the new 
bridge is currently blocked by missing active 1C polling and broader unfamiliar 1C asks still need replay-backed growth - graph snapshot after latest rebuild: `5943 nodes`, `12915 edges`, `136 communities` - current breakpoint: - the validated hot paths are no longer structurally broken; @@ -148,6 +149,7 @@ Latest live proof now includes: - structured catalog-alignment verdict accepted locally: planner/runtime/debug slice passed `54/54`; full MCP-discovery slice passed `282/282` with `9` skipped; build passed; graphify rebuilt to `5941 nodes`, `12911 edges`, `136 communities` - representative catalog-alignment regression guard accepted locally: planner slice passed `37/37`; full MCP-discovery slice passed `283/283` with `9` skipped; build passed; graphify rebuilt to `5942 nodes`, `12912 edges`, `140 communities` - catalog-alignment reason-code telemetry accepted locally: planner/runtime slice passed `53/53`; full MCP-discovery suite passed `283/283` with `9` skipped; build passed; graphify rebuilt to `5943 nodes`, `12915 edges`, `136 communities` +- catalog-alignment status verdict accepted locally: planner/runtime/debug slice passed `55/55`; full MCP-discovery suite passed `283/283` with `9` skipped; build passed; graphify rebuilt to `5943 nodes`, `12915 edges`, `136 communities` Current architectural reading: diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryDebugAttachment.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryDebugAttachment.js index a30f397..ebe70e6 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryDebugAttachment.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryDebugAttachment.js @@ -56,6 +56,7 @@ function buildAssistantMcpDiscoveryDebugAttachmentFields(input) { mcp_discovery_bridge_status: toNonEmptyString(bridge?.bridge_status), mcp_discovery_selected_chain_id: toNonEmptyString(planner?.selected_chain_id), mcp_discovery_catalog_chain_template_matches: 
toStringArray(planner?.catalog_chain_template_matches), + mcp_discovery_catalog_chain_alignment_status: toNonEmptyString(chainAlignment?.alignment_status), mcp_discovery_catalog_chain_top_match: toNonEmptyString(chainAlignment?.top_chain_template_match), mcp_discovery_catalog_chain_selected_matches_top: chainAlignment?.selected_chain_matches_top === true, mcp_discovery_answer_mode: toNonEmptyString(answerDraft?.answer_mode), diff --git a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js index 2fd39eb..24ebdc5 100644 --- a/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js +++ b/llm_normalizer/backend/dist/services/assistantMcpDiscoveryPlanner.js @@ -41,20 +41,22 @@ function pushAllUnique(target, values) { } } function pushCatalogChainTemplateAlignmentReasons(target, alignment) { - if (alignment.top_chain_template_match) { + if (alignment.alignment_status === "selected_matches_top") { pushReason(target, "planner_catalog_chain_template_alignment_evaluated"); - if (alignment.selected_chain_matches_top) { - pushReason(target, "planner_selected_chain_matches_catalog_top"); - } - else if (alignment.selected_chain_in_catalog_matches) { - pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match"); - } - else { - pushReason(target, "planner_selected_chain_outside_catalog_match_set"); - } + pushReason(target, "planner_selected_chain_matches_catalog_top"); return; } - if (alignment.selected_chain_is_catalog_template) { + if (alignment.alignment_status === "selected_lower_rank") { + pushReason(target, "planner_catalog_chain_template_alignment_evaluated"); + pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match"); + return; + } + if (alignment.alignment_status === "selected_outside_match_set") { + pushReason(target, "planner_catalog_chain_template_alignment_evaluated"); + pushReason(target, 
"planner_selected_chain_outside_catalog_match_set"); + return; + } + if (alignment.alignment_status === "selected_unscored") { pushReason(target, "planner_catalog_chain_template_alignment_unscored"); } } @@ -380,7 +382,17 @@ function catalogChainTemplateMatchesForContract(input, recipe) { function catalogChainTemplateAlignmentForContract(recipe, matches) { const selectedChainIsCatalogTemplate = recipe.chainId !== "metadata_lane_clarification"; const selectedIndex = matches.indexOf(recipe.chainId); + const alignmentStatus = !selectedChainIsCatalogTemplate + ? "not_catalog_template" + : matches.length <= 0 + ? "selected_unscored" + : selectedIndex === 0 + ? "selected_matches_top" + : selectedIndex > 0 + ? "selected_lower_rank" + : "selected_outside_match_set"; return { + alignment_status: alignmentStatus, top_chain_template_match: matches[0] ?? null, selected_chain_template_rank: selectedIndex >= 0 ? selectedIndex + 1 : null, selected_chain_is_catalog_template: selectedChainIsCatalogTemplate, diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryDebugAttachment.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryDebugAttachment.ts index 27b5fc6..43c15a2 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryDebugAttachment.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryDebugAttachment.ts @@ -8,6 +8,7 @@ export interface AssistantMcpDiscoveryDebugAttachmentFields { mcp_discovery_bridge_status: string | null; mcp_discovery_selected_chain_id: string | null; mcp_discovery_catalog_chain_template_matches: string[]; + mcp_discovery_catalog_chain_alignment_status: string | null; mcp_discovery_catalog_chain_top_match: string | null; mcp_discovery_catalog_chain_selected_matches_top: boolean; mcp_discovery_answer_mode: string | null; @@ -86,6 +87,7 @@ export function buildAssistantMcpDiscoveryDebugAttachmentFields( mcp_discovery_bridge_status: toNonEmptyString(bridge?.bridge_status), mcp_discovery_selected_chain_id: 
toNonEmptyString(planner?.selected_chain_id), mcp_discovery_catalog_chain_template_matches: toStringArray(planner?.catalog_chain_template_matches), + mcp_discovery_catalog_chain_alignment_status: toNonEmptyString(chainAlignment?.alignment_status), mcp_discovery_catalog_chain_top_match: toNonEmptyString(chainAlignment?.top_chain_template_match), mcp_discovery_catalog_chain_selected_matches_top: chainAlignment?.selected_chain_matches_top === true, mcp_discovery_answer_mode: toNonEmptyString(answerDraft?.answer_mode), diff --git a/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts b/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts index 2f955a2..fa1c8fd 100644 --- a/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts +++ b/llm_normalizer/backend/src/services/assistantMcpDiscoveryPlanner.ts @@ -39,6 +39,7 @@ export interface AssistantMcpDiscoveryMetadataSurfaceRef { } export interface AssistantMcpDiscoveryCatalogChainTemplateAlignment { + alignment_status: AssistantMcpDiscoveryCatalogChainTemplateAlignmentStatus; top_chain_template_match: AssistantMcpCatalogChainTemplateId | null; selected_chain_template_rank: number | null; selected_chain_is_catalog_template: boolean; @@ -46,6 +47,13 @@ export interface AssistantMcpDiscoveryCatalogChainTemplateAlignment { selected_chain_matches_top: boolean; } +export type AssistantMcpDiscoveryCatalogChainTemplateAlignmentStatus = + | "selected_matches_top" + | "selected_lower_rank" + | "selected_outside_match_set" + | "selected_unscored" + | "not_catalog_template"; + export type AssistantMcpDiscoveryChainId = | "metadata_inspection" | "catalog_drilldown" @@ -146,19 +154,22 @@ function pushCatalogChainTemplateAlignmentReasons( target: string[], alignment: AssistantMcpDiscoveryCatalogChainTemplateAlignment ): void { - if (alignment.top_chain_template_match) { + if (alignment.alignment_status === "selected_matches_top") { pushReason(target, 
"planner_catalog_chain_template_alignment_evaluated"); - if (alignment.selected_chain_matches_top) { - pushReason(target, "planner_selected_chain_matches_catalog_top"); - } else if (alignment.selected_chain_in_catalog_matches) { - pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match"); - } else { - pushReason(target, "planner_selected_chain_outside_catalog_match_set"); - } + pushReason(target, "planner_selected_chain_matches_catalog_top"); return; } - - if (alignment.selected_chain_is_catalog_template) { + if (alignment.alignment_status === "selected_lower_rank") { + pushReason(target, "planner_catalog_chain_template_alignment_evaluated"); + pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match"); + return; + } + if (alignment.alignment_status === "selected_outside_match_set") { + pushReason(target, "planner_catalog_chain_template_alignment_evaluated"); + pushReason(target, "planner_selected_chain_outside_catalog_match_set"); + return; + } + if (alignment.alignment_status === "selected_unscored") { pushReason(target, "planner_catalog_chain_template_alignment_unscored"); } } @@ -591,7 +602,17 @@ function catalogChainTemplateAlignmentForContract( ): AssistantMcpDiscoveryCatalogChainTemplateAlignment { const selectedChainIsCatalogTemplate = recipe.chainId !== "metadata_lane_clarification"; const selectedIndex = matches.indexOf(recipe.chainId as AssistantMcpCatalogChainTemplateId); + const alignmentStatus: AssistantMcpDiscoveryCatalogChainTemplateAlignmentStatus = !selectedChainIsCatalogTemplate + ? "not_catalog_template" + : matches.length <= 0 + ? "selected_unscored" + : selectedIndex === 0 + ? "selected_matches_top" + : selectedIndex > 0 + ? "selected_lower_rank" + : "selected_outside_match_set"; return { + alignment_status: alignmentStatus, top_chain_template_match: matches[0] ?? null, selected_chain_template_rank: selectedIndex >= 0 ? 
selectedIndex + 1 : null, selected_chain_is_catalog_template: selectedChainIsCatalogTemplate, diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryDebugAttachment.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryDebugAttachment.test.ts index 5e62f8d..d3241a1 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryDebugAttachment.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryDebugAttachment.test.ts @@ -18,6 +18,7 @@ function entryPointContract(overrides: Record = {}) { selected_chain_id: "value_flow_ranking", catalog_chain_template_matches: ["value_flow_ranking", "value_flow"], catalog_chain_template_alignment: { + alignment_status: "selected_matches_top", top_chain_template_match: "value_flow_ranking", selected_chain_template_rank: 1, selected_chain_is_catalog_template: true, @@ -50,6 +51,7 @@ describe("assistant MCP discovery debug attachment", () => { expect(debug.mcp_discovery_bridge_status).toBe("answer_draft_ready"); expect(debug.mcp_discovery_selected_chain_id).toBe("value_flow_ranking"); expect(debug.mcp_discovery_catalog_chain_template_matches).toEqual(["value_flow_ranking", "value_flow"]); + expect(debug.mcp_discovery_catalog_chain_alignment_status).toBe("selected_matches_top"); expect(debug.mcp_discovery_catalog_chain_top_match).toBe("value_flow_ranking"); expect(debug.mcp_discovery_catalog_chain_selected_matches_top).toBe(true); expect(debug.mcp_discovery_answer_mode).toBe("confirmed_with_bounded_inference"); @@ -71,6 +73,7 @@ describe("assistant MCP discovery debug attachment", () => { expect(debug.mcp_discovery_bridge_status).toBeNull(); expect(debug.mcp_discovery_selected_chain_id).toBeNull(); expect(debug.mcp_discovery_catalog_chain_template_matches).toEqual([]); + expect(debug.mcp_discovery_catalog_chain_alignment_status).toBeNull(); expect(debug.mcp_discovery_catalog_chain_top_match).toBeNull(); expect(debug.mcp_discovery_catalog_chain_selected_matches_top).toBe(false); 
expect(debug.mcp_discovery_answer_mode).toBeNull(); diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts b/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts index 35a0040..ea09638 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryPlanner.test.ts @@ -49,6 +49,7 @@ describe("assistant MCP discovery planner", () => { expect(result.data_need_graph?.business_fact_family).toBe("value_flow"); expect(result.catalog_chain_template_matches[0]).toBe("value_flow"); expect(result.catalog_chain_template_alignment).toMatchObject({ + alignment_status: "selected_matches_top", top_chain_template_match: "value_flow", selected_chain_template_rank: 1, selected_chain_is_catalog_template: true, @@ -200,6 +201,7 @@ describe("assistant MCP discovery planner", () => { for (const item of cases) { const result = planAssistantMcpDiscovery(item.input); expect(result.selected_chain_id, item.name).toBe(item.expected); + expect(result.catalog_chain_template_alignment.alignment_status, item.name).toBe("selected_matches_top"); expect(result.catalog_chain_template_alignment.top_chain_template_match, item.name).toBe(item.expected); expect(result.catalog_chain_template_alignment.selected_chain_template_rank, item.name).toBe(1); expect(result.catalog_chain_template_alignment.selected_chain_matches_top, item.name).toBe(true); @@ -1098,6 +1100,7 @@ describe("assistant MCP discovery planner", () => { expect(result.discovery_plan.plan_status).toBe("needs_clarification"); expect(result.catalog_chain_template_matches).toEqual([]); expect(result.catalog_chain_template_alignment).toMatchObject({ + alignment_status: "selected_unscored", top_chain_template_match: null, selected_chain_template_rank: null, selected_chain_in_catalog_matches: false, diff --git a/llm_normalizer/backend/tests/assistantMcpDiscoveryRuntimeBridge.test.ts 
b/llm_normalizer/backend/tests/assistantMcpDiscoveryRuntimeBridge.test.ts index 6e6a6c8..a861d7d 100644 --- a/llm_normalizer/backend/tests/assistantMcpDiscoveryRuntimeBridge.test.ts +++ b/llm_normalizer/backend/tests/assistantMcpDiscoveryRuntimeBridge.test.ts @@ -148,6 +148,7 @@ describe("assistant MCP discovery runtime bridge", () => { expect(result.loop_state.pending_axes).toContain("organization"); expect(result.loop_state.provided_axes).toContain("aggregate_axis"); expect(result.loop_state.catalog_chain_template_matches[0]).toBe("value_flow_ranking"); + expect(result.loop_state.catalog_chain_template_alignment.alignment_status).toBe("selected_matches_top"); expect(result.loop_state.catalog_chain_template_alignment.selected_chain_matches_top).toBe(true); expect(result.reason_codes).toContain("planner_selected_chain_matches_catalog_top"); expect(result.reason_codes).toContain("runtime_bridge_loop_state_awaiting_clarification");
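
Reviewer note: the nested ternary added to `catalogChainTemplateAlignmentForContract` and the branches in `pushCatalogChainTemplateAlignmentReasons` can be read together as a single mapping from selected chain and ranked matches to a status, and from that status to reason codes. The sketch below is a standalone illustration of that mapping, not the project's module. The helper names `deriveAlignmentStatus` and `reasonCodesFor` are hypothetical, and chain ids are treated as plain strings instead of the project's id types.

```typescript
// Illustrative sketch only. The status and reason-code strings are copied from the diff;
// the function names and the plain-string chain ids are assumptions made for this example.
type AlignmentStatus =
  | "selected_matches_top"
  | "selected_lower_rank"
  | "selected_outside_match_set"
  | "selected_unscored"
  | "not_catalog_template";

// Hypothetical helper: same precedence as the nested ternary in the diff.
function deriveAlignmentStatus(
  selectedChainId: string,
  matches: readonly string[]
): AlignmentStatus {
  // The diff treats "metadata_lane_clarification" as the only non-catalog chain.
  if (selectedChainId === "metadata_lane_clarification") {
    return "not_catalog_template";
  }
  // No ranked matches at all: the selected chain could not be scored.
  if (matches.length === 0) {
    return "selected_unscored";
  }
  const selectedIndex = matches.indexOf(selectedChainId);
  if (selectedIndex === 0) {
    return "selected_matches_top";
  }
  return selectedIndex > 0 ? "selected_lower_rank" : "selected_outside_match_set";
}

// Hypothetical helper: the reason codes each status emits in the diff.
function reasonCodesFor(status: AlignmentStatus): string[] {
  const evaluated = "planner_catalog_chain_template_alignment_evaluated";
  switch (status) {
    case "selected_matches_top":
      return [evaluated, "planner_selected_chain_matches_catalog_top"];
    case "selected_lower_rank":
      return [evaluated, "planner_selected_chain_uses_lower_rank_catalog_match"];
    case "selected_outside_match_set":
      return [evaluated, "planner_selected_chain_outside_catalog_match_set"];
    case "selected_unscored":
      return ["planner_catalog_chain_template_alignment_unscored"];
    case "not_catalog_template":
      // The diff pushes no alignment reason code for this status.
      return [];
  }
}
```

Read this way, `not_catalog_template` is the only status that emits no alignment reason code, so replay filters keyed on reason codes alone would not see it, while `mcp_discovery_catalog_chain_alignment_status` would.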