Planner Autonomy: add catalog-alignment status
parent 755e2521e8
commit 8b5104a2c6

@@ -120,6 +120,7 @@ The following consolidation step added catalog-level chain-template scoring:
 - the ranked chain-template matches are now propagated into runtime loop state and debug attachment fields, so replay analysis can inspect catalog-fabric intent without parsing reason-code strings.
 - `catalog_chain_template_alignment` now records whether the selected chain is the top catalog match, its rank, and whether it appeared in the catalog search results; runtime loop state and debug summary expose the same verdict.
 - planner reason codes now emit stable catalog-alignment telemetry for evaluated top-match, selected-equals-top, selected-lower-rank, selected-outside-match-set, and unscored selected-chain states.
+- `catalog_chain_template_alignment.alignment_status` now carries the same verdict as one enum-like field, and debug summary exposes it as `mcp_discovery_catalog_chain_alignment_status`.
 
 ## Why This Matters
 
@@ -243,9 +244,16 @@ Latest validation after catalog-alignment reason-code telemetry:
 - `npm.cmd run build`: passed
 - graphify rebuild: `5943 nodes`, `12915 edges`, `136 communities`
 
+Latest validation after explicit catalog-alignment status propagation:
+
+- targeted planner/runtime/debug tests: passed, `55 passed`
+- full MCP-discovery suite: passed, `283 passed`, `9 skipped`
+- `npm.cmd run build`: passed
+- graphify rebuild: `5943 nodes`, `12915 edges`, `136 communities`
+
 ## Next Step
 
-The next safe step is still to re-run live replay once the 1C side is actively polling the proxy. In parallel, local-only consolidation can continue by using the alignment verdict and reason-code telemetry to find remaining manual branches where selected chains diverge from reviewed catalog-fabric intent.
+The next safe step is still to re-run live replay once the 1C side is actively polling the proxy. In parallel, local-only consolidation can continue by using `alignment_status`, alignment reason-code telemetry, and the representative guard to find remaining manual branches where selected chains diverge from reviewed catalog-fabric intent.
 
 Recommended order:
 

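The "Next Step" paragraph in the hunk above proposes using `alignment_status` to find selections that diverge from catalog intent. A minimal sketch of such a filter follows, assuming replay output is available as an array of debug records. The three `mcp_discovery_*` field names come from this commit's debug attachment contract; the `ReplayDebugRecord` type, the `findDivergentSelections` helper, and the sample data are hypothetical and only illustrate the idea.

```typescript
// Hypothetical replay record. Only the three field names are taken from the
// debug attachment contract in this commit; everything else is an assumption.
interface ReplayDebugRecord {
  mcp_discovery_selected_chain_id: string | null;
  mcp_discovery_catalog_chain_alignment_status: string | null;
  mcp_discovery_catalog_chain_top_match: string | null;
}

// Statuses meaning the planner selected something other than the catalog top match.
const DIVERGENT_STATUSES: ReadonlySet<string> = new Set([
  "selected_lower_rank",
  "selected_outside_match_set",
]);

// Keep only records whose selected chain diverges from the top catalog match.
function findDivergentSelections(
  records: readonly ReplayDebugRecord[]
): ReplayDebugRecord[] {
  return records.filter((record) => {
    const status = record.mcp_discovery_catalog_chain_alignment_status;
    return status !== null && DIVERGENT_STATUSES.has(status);
  });
}

// Illustrative data, not taken from a real replay run.
const sampleRecords: ReplayDebugRecord[] = [
  {
    mcp_discovery_selected_chain_id: "value_flow_ranking",
    mcp_discovery_catalog_chain_alignment_status: "selected_matches_top",
    mcp_discovery_catalog_chain_top_match: "value_flow_ranking",
  },
  {
    mcp_discovery_selected_chain_id: "value_flow",
    mcp_discovery_catalog_chain_alignment_status: "selected_lower_rank",
    mcp_discovery_catalog_chain_top_match: "value_flow_comparison",
  },
  {
    mcp_discovery_selected_chain_id: null,
    mcp_discovery_catalog_chain_alignment_status: null,
    mcp_discovery_catalog_chain_top_match: null,
  },
];

const divergent = findDivergentSelections(sampleRecords);
console.log(divergent.map((record) => record.mcp_discovery_selected_chain_id));
```

Filtering on one string field avoids reconstructing the verdict from the separate boolean fields, which is the motivation the commit text gives for adding the status.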
@@ -83,6 +83,7 @@ It now documents a turnaround that is already operational in code, already mater
 - catalog index now scores reviewed chain templates directly from fact/action/axis/comparison/ranking needs, and planner/runtime/debug surfaces expose ranked catalog chain matches through the structured `catalog_chain_template_matches` contract path instead of relying only on reason-code strings;
 - planner/runtime/debug surfaces now expose `catalog_chain_template_alignment`, so semantic replay can see whether selected chains match the catalog top match, fall back to a lower-ranked template, or bypass catalog search;
 - planner reason codes now also emit stable catalog-alignment telemetry, so automated replay review can filter top-match, lower-rank, outside-match, and unscored selected-chain states without hand-parsing debug JSON;
+- catalog-alignment now carries a single `alignment_status` verdict through planner/runtime/debug, making replay divergence detection explicit instead of reconstructing it from booleans;
 - explicit-counterparty incoming-vs-outgoing data-need graphs now select the reviewed `value_flow_comparison` chain instead of falling back to generic `value_flow`;
 - live map sync: [20 - planner_autonomy_consolidation_2026-05-01.md](./20%20-%20planner_autonomy_consolidation_2026-05-01.md)
 
@@ -95,7 +96,7 @@ Current honest status:
 - open-world bounded-autonomy readiness: `~85%`
 - Post-F semantic integrity module progress: `~99%` operationally closed, with remaining risk now treated as next-slice discovery rather than an open blocker inside the closed slice
 - active inventory-stock breadth slice progress: `100%` for the declared scenario pack, not for arbitrary inventory questions
-- Planner Autonomy Consolidation progress: `~86%` for the declared module, with catalog-fabric, value-flow arbitration, lifecycle bounded inference, broad-evaluation bridge, inventory catalog templates, inventory runtime-boundary honesty, exact inventory recipe bridging, unambiguous metadata-surface lane inference, catalog chain-template scoring, structured chain-match contract exposure, runtime/debug propagation, subject-aware bidirectional comparison arbitration, structured catalog-alignment verdicts, representative alignment regression guard, and catalog-alignment reason-code telemetry validated locally, but live replay for the new bridge is currently blocked by missing active 1C polling and broader unfamiliar 1C asks still need replay-backed growth
+- Planner Autonomy Consolidation progress: `~87%` for the declared module, with catalog-fabric, value-flow arbitration, lifecycle bounded inference, broad-evaluation bridge, inventory catalog templates, inventory runtime-boundary honesty, exact inventory recipe bridging, unambiguous metadata-surface lane inference, catalog chain-template scoring, structured chain-match contract exposure, runtime/debug propagation, subject-aware bidirectional comparison arbitration, structured catalog-alignment verdicts, representative alignment regression guard, catalog-alignment reason-code telemetry, and explicit `alignment_status` propagation validated locally, but live replay for the new bridge is currently blocked by missing active 1C polling and broader unfamiliar 1C asks still need replay-backed growth
 - graph snapshot after latest rebuild: `5943 nodes`, `12915 edges`, `136 communities`
 - current breakpoint:
 - the validated hot paths are no longer structurally broken;
@@ -148,6 +149,7 @@ Latest live proof now includes:
 - structured catalog-alignment verdict accepted locally: planner/runtime/debug slice passed `54/54`; full MCP-discovery slice passed `282/282` with `9` skipped; build passed; graphify rebuilt to `5941 nodes`, `12911 edges`, `136 communities`
 - representative catalog-alignment regression guard accepted locally: planner slice passed `37/37`; full MCP-discovery slice passed `283/283` with `9` skipped; build passed; graphify rebuilt to `5942 nodes`, `12912 edges`, `140 communities`
 - catalog-alignment reason-code telemetry accepted locally: planner/runtime slice passed `53/53`; full MCP-discovery suite passed `283/283` with `9` skipped; build passed; graphify rebuilt to `5943 nodes`, `12915 edges`, `136 communities`
+- catalog-alignment status verdict accepted locally: planner/runtime/debug slice passed `55/55`; full MCP-discovery suite passed `283/283` with `9` skipped; build passed; graphify rebuilt to `5943 nodes`, `12915 edges`, `136 communities`
 
 Current architectural reading:
 

@@ -56,6 +56,7 @@ function buildAssistantMcpDiscoveryDebugAttachmentFields(input) {
         mcp_discovery_bridge_status: toNonEmptyString(bridge?.bridge_status),
         mcp_discovery_selected_chain_id: toNonEmptyString(planner?.selected_chain_id),
         mcp_discovery_catalog_chain_template_matches: toStringArray(planner?.catalog_chain_template_matches),
+        mcp_discovery_catalog_chain_alignment_status: toNonEmptyString(chainAlignment?.alignment_status),
         mcp_discovery_catalog_chain_top_match: toNonEmptyString(chainAlignment?.top_chain_template_match),
         mcp_discovery_catalog_chain_selected_matches_top: chainAlignment?.selected_chain_matches_top === true,
         mcp_discovery_answer_mode: toNonEmptyString(answerDraft?.answer_mode),

@@ -41,20 +41,22 @@ function pushAllUnique(target, values) {
     }
 }
 function pushCatalogChainTemplateAlignmentReasons(target, alignment) {
-    if (alignment.top_chain_template_match) {
+    if (alignment.alignment_status === "selected_matches_top") {
         pushReason(target, "planner_catalog_chain_template_alignment_evaluated");
-        if (alignment.selected_chain_matches_top) {
-            pushReason(target, "planner_selected_chain_matches_catalog_top");
-        }
-        else if (alignment.selected_chain_in_catalog_matches) {
-            pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match");
-        }
-        else {
-            pushReason(target, "planner_selected_chain_outside_catalog_match_set");
-        }
+        pushReason(target, "planner_selected_chain_matches_catalog_top");
         return;
     }
-    if (alignment.selected_chain_is_catalog_template) {
+    if (alignment.alignment_status === "selected_lower_rank") {
+        pushReason(target, "planner_catalog_chain_template_alignment_evaluated");
+        pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match");
+        return;
+    }
+    if (alignment.alignment_status === "selected_outside_match_set") {
+        pushReason(target, "planner_catalog_chain_template_alignment_evaluated");
+        pushReason(target, "planner_selected_chain_outside_catalog_match_set");
+        return;
+    }
+    if (alignment.alignment_status === "selected_unscored") {
         pushReason(target, "planner_catalog_chain_template_alignment_unscored");
     }
 }
@@ -380,7 +382,17 @@ function catalogChainTemplateMatchesForContract(input, recipe) {
 function catalogChainTemplateAlignmentForContract(recipe, matches) {
     const selectedChainIsCatalogTemplate = recipe.chainId !== "metadata_lane_clarification";
     const selectedIndex = matches.indexOf(recipe.chainId);
+    const alignmentStatus = !selectedChainIsCatalogTemplate
+        ? "not_catalog_template"
+        : matches.length <= 0
+            ? "selected_unscored"
+            : selectedIndex === 0
+                ? "selected_matches_top"
+                : selectedIndex > 0
+                    ? "selected_lower_rank"
+                    : "selected_outside_match_set";
     return {
+        alignment_status: alignmentStatus,
         top_chain_template_match: matches[0] ?? null,
         selected_chain_template_rank: selectedIndex >= 0 ? selectedIndex + 1 : null,
         selected_chain_is_catalog_template: selectedChainIsCatalogTemplate,

@@ -8,6 +8,7 @@ export interface AssistantMcpDiscoveryDebugAttachmentFields {
   mcp_discovery_bridge_status: string | null;
   mcp_discovery_selected_chain_id: string | null;
   mcp_discovery_catalog_chain_template_matches: string[];
+  mcp_discovery_catalog_chain_alignment_status: string | null;
   mcp_discovery_catalog_chain_top_match: string | null;
   mcp_discovery_catalog_chain_selected_matches_top: boolean;
   mcp_discovery_answer_mode: string | null;
@@ -86,6 +87,7 @@ export function buildAssistantMcpDiscoveryDebugAttachmentFields(
     mcp_discovery_bridge_status: toNonEmptyString(bridge?.bridge_status),
     mcp_discovery_selected_chain_id: toNonEmptyString(planner?.selected_chain_id),
     mcp_discovery_catalog_chain_template_matches: toStringArray(planner?.catalog_chain_template_matches),
+    mcp_discovery_catalog_chain_alignment_status: toNonEmptyString(chainAlignment?.alignment_status),
     mcp_discovery_catalog_chain_top_match: toNonEmptyString(chainAlignment?.top_chain_template_match),
     mcp_discovery_catalog_chain_selected_matches_top: chainAlignment?.selected_chain_matches_top === true,
     mcp_discovery_answer_mode: toNonEmptyString(answerDraft?.answer_mode),

@@ -39,6 +39,7 @@ export interface AssistantMcpDiscoveryMetadataSurfaceRef {
 }
 
 export interface AssistantMcpDiscoveryCatalogChainTemplateAlignment {
+  alignment_status: AssistantMcpDiscoveryCatalogChainTemplateAlignmentStatus;
   top_chain_template_match: AssistantMcpCatalogChainTemplateId | null;
   selected_chain_template_rank: number | null;
   selected_chain_is_catalog_template: boolean;
@@ -46,6 +47,13 @@ export interface AssistantMcpDiscoveryCatalogChainTemplateAlignment {
   selected_chain_matches_top: boolean;
 }
 
+export type AssistantMcpDiscoveryCatalogChainTemplateAlignmentStatus =
+  | "selected_matches_top"
+  | "selected_lower_rank"
+  | "selected_outside_match_set"
+  | "selected_unscored"
+  | "not_catalog_template";
+
 export type AssistantMcpDiscoveryChainId =
   | "metadata_inspection"
   | "catalog_drilldown"
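The status union above exists only at compile time, while the debug attachment field `mcp_discovery_catalog_chain_alignment_status` is typed as `string | null`. A consumer that reads debug JSON back in may therefore want a runtime check. The following is a minimal sketch of one, not part of this commit: `ALIGNMENT_STATUSES`, `AlignmentStatus`, and `isAlignmentStatus` are hypothetical names, and only the five string literals are taken from the diff.

```typescript
// Hypothetical runtime mirror of the status union; the literals come from the
// diff, the constant and function names are assumptions for illustration.
const ALIGNMENT_STATUSES = [
  "selected_matches_top",
  "selected_lower_rank",
  "selected_outside_match_set",
  "selected_unscored",
  "not_catalog_template",
] as const;

type AlignmentStatus = (typeof ALIGNMENT_STATUSES)[number];

// Type guard: narrows an untyped value (for example, parsed debug JSON)
// to the status union, rejecting null and unknown strings.
function isAlignmentStatus(value: unknown): value is AlignmentStatus {
  return (
    typeof value === "string" &&
    (ALIGNMENT_STATUSES as readonly string[]).indexOf(value) !== -1
  );
}

const parsedStatus: unknown = JSON.parse('"selected_lower_rank"');
if (isAlignmentStatus(parsedStatus)) {
  // Inside this branch parsedStatus has type AlignmentStatus.
  console.log(`known status: ${parsedStatus}`);
}
```

Deriving the type from the constant array keeps the runtime list and the compile-time union in one place, at the cost of diverging from the plain union declaration the commit uses.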
@@ -146,19 +154,22 @@ function pushCatalogChainTemplateAlignmentReasons(
   target: string[],
   alignment: AssistantMcpDiscoveryCatalogChainTemplateAlignment
 ): void {
-  if (alignment.top_chain_template_match) {
+  if (alignment.alignment_status === "selected_matches_top") {
     pushReason(target, "planner_catalog_chain_template_alignment_evaluated");
-    if (alignment.selected_chain_matches_top) {
-      pushReason(target, "planner_selected_chain_matches_catalog_top");
-    } else if (alignment.selected_chain_in_catalog_matches) {
-      pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match");
-    } else {
-      pushReason(target, "planner_selected_chain_outside_catalog_match_set");
-    }
+    pushReason(target, "planner_selected_chain_matches_catalog_top");
     return;
   }
-
-  if (alignment.selected_chain_is_catalog_template) {
+  if (alignment.alignment_status === "selected_lower_rank") {
+    pushReason(target, "planner_catalog_chain_template_alignment_evaluated");
+    pushReason(target, "planner_selected_chain_uses_lower_rank_catalog_match");
+    return;
+  }
+  if (alignment.alignment_status === "selected_outside_match_set") {
+    pushReason(target, "planner_catalog_chain_template_alignment_evaluated");
+    pushReason(target, "planner_selected_chain_outside_catalog_match_set");
+    return;
+  }
+  if (alignment.alignment_status === "selected_unscored") {
     pushReason(target, "planner_catalog_chain_template_alignment_unscored");
   }
 }
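The rewritten function above maps each status to a fixed set of reason codes through a chain of `if` statements. One alternative shape, shown here only as a sketch and not as what the commit does, is a `switch` with a `never` exhaustiveness check, so that adding a sixth status fails compilation until its reason codes are decided. The reason-code strings and status literals are copied from the diff; `AlignmentStatus` and `alignmentReasonCodes` are hypothetical names.

```typescript
// Local copy of the status union from the diff (hypothetical alias name).
type AlignmentStatus =
  | "selected_matches_top"
  | "selected_lower_rank"
  | "selected_outside_match_set"
  | "selected_unscored"
  | "not_catalog_template";

// Hypothetical pure variant of the mapping: returns the reason codes
// instead of pushing them into a target array.
function alignmentReasonCodes(status: AlignmentStatus): string[] {
  const evaluated = "planner_catalog_chain_template_alignment_evaluated";
  switch (status) {
    case "selected_matches_top":
      return [evaluated, "planner_selected_chain_matches_catalog_top"];
    case "selected_lower_rank":
      return [evaluated, "planner_selected_chain_uses_lower_rank_catalog_match"];
    case "selected_outside_match_set":
      return [evaluated, "planner_selected_chain_outside_catalog_match_set"];
    case "selected_unscored":
      return ["planner_catalog_chain_template_alignment_unscored"];
    case "not_catalog_template":
      // The function in the diff emits nothing for this status.
      return [];
    default: {
      // Compile-time guard: this assignment stops type-checking
      // if a new status is added without a matching case.
      const unreachable: never = status;
      return unreachable;
    }
  }
}

console.log(alignmentReasonCodes("selected_lower_rank"));
```

The `if` chain in the commit and this `switch` emit the same codes for the five current statuses; the difference is only in how a future status would be caught.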
@@ -591,7 +602,17 @@ function catalogChainTemplateAlignmentForContract(
 ): AssistantMcpDiscoveryCatalogChainTemplateAlignment {
   const selectedChainIsCatalogTemplate = recipe.chainId !== "metadata_lane_clarification";
   const selectedIndex = matches.indexOf(recipe.chainId as AssistantMcpCatalogChainTemplateId);
+  const alignmentStatus: AssistantMcpDiscoveryCatalogChainTemplateAlignmentStatus = !selectedChainIsCatalogTemplate
+    ? "not_catalog_template"
+    : matches.length <= 0
+      ? "selected_unscored"
+      : selectedIndex === 0
+        ? "selected_matches_top"
+        : selectedIndex > 0
+          ? "selected_lower_rank"
+          : "selected_outside_match_set";
   return {
+    alignment_status: alignmentStatus,
     top_chain_template_match: matches[0] ?? null,
     selected_chain_template_rank: selectedIndex >= 0 ? selectedIndex + 1 : null,
     selected_chain_is_catalog_template: selectedChainIsCatalogTemplate,

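The nested conditional in the hunk above checks its conditions in a fixed order: non-catalog chain first, then an empty match list, then the position of the selected chain. A minimal standalone sketch of that order follows, assuming a hypothetical helper `classifyAlignment` that takes the chain id and ranked matches directly. The status literals and the `metadata_lane_clarification` check mirror the diff; the chain id `inventory_stock` is illustrative only.

```typescript
// Local copy of the status union from the diff (hypothetical alias name).
type AlignmentStatus =
  | "selected_matches_top"
  | "selected_lower_rank"
  | "selected_outside_match_set"
  | "selected_unscored"
  | "not_catalog_template";

// Hypothetical standalone mirror of the status derivation in the hunk above.
function classifyAlignment(
  selectedChainId: string,
  matches: readonly string[]
): AlignmentStatus {
  // The diff treats only this chain id as a non-catalog selection.
  if (selectedChainId === "metadata_lane_clarification") {
    return "not_catalog_template";
  }
  // No ranked matches at all: the selection could not be scored.
  if (matches.length === 0) {
    return "selected_unscored";
  }
  const selectedIndex = matches.indexOf(selectedChainId);
  if (selectedIndex === 0) {
    return "selected_matches_top";
  }
  // Index above zero means a lower-ranked match; -1 means not in the list.
  return selectedIndex > 0 ? "selected_lower_rank" : "selected_outside_match_set";
}

const ranked = ["value_flow_ranking", "value_flow"];
const outcomes = {
  top: classifyAlignment("value_flow_ranking", ranked),
  lower: classifyAlignment("value_flow", ranked),
  outside: classifyAlignment("inventory_stock", ranked),
  unscored: classifyAlignment("value_flow", []),
  clarification: classifyAlignment("metadata_lane_clarification", ranked),
};
console.log(outcomes);
```

Because the non-catalog check comes first, `metadata_lane_clarification` is reported as `not_catalog_template` even when ranked matches exist.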
@@ -18,6 +18,7 @@ function entryPointContract(overrides: Record<string, unknown> = {}) {
       selected_chain_id: "value_flow_ranking",
       catalog_chain_template_matches: ["value_flow_ranking", "value_flow"],
       catalog_chain_template_alignment: {
+        alignment_status: "selected_matches_top",
         top_chain_template_match: "value_flow_ranking",
         selected_chain_template_rank: 1,
         selected_chain_is_catalog_template: true,
@@ -50,6 +51,7 @@ describe("assistant MCP discovery debug attachment", () => {
     expect(debug.mcp_discovery_bridge_status).toBe("answer_draft_ready");
     expect(debug.mcp_discovery_selected_chain_id).toBe("value_flow_ranking");
     expect(debug.mcp_discovery_catalog_chain_template_matches).toEqual(["value_flow_ranking", "value_flow"]);
+    expect(debug.mcp_discovery_catalog_chain_alignment_status).toBe("selected_matches_top");
     expect(debug.mcp_discovery_catalog_chain_top_match).toBe("value_flow_ranking");
     expect(debug.mcp_discovery_catalog_chain_selected_matches_top).toBe(true);
     expect(debug.mcp_discovery_answer_mode).toBe("confirmed_with_bounded_inference");
@@ -71,6 +73,7 @@ describe("assistant MCP discovery debug attachment", () => {
     expect(debug.mcp_discovery_bridge_status).toBeNull();
     expect(debug.mcp_discovery_selected_chain_id).toBeNull();
     expect(debug.mcp_discovery_catalog_chain_template_matches).toEqual([]);
+    expect(debug.mcp_discovery_catalog_chain_alignment_status).toBeNull();
     expect(debug.mcp_discovery_catalog_chain_top_match).toBeNull();
     expect(debug.mcp_discovery_catalog_chain_selected_matches_top).toBe(false);
     expect(debug.mcp_discovery_answer_mode).toBeNull();

@@ -49,6 +49,7 @@ describe("assistant MCP discovery planner", () => {
     expect(result.data_need_graph?.business_fact_family).toBe("value_flow");
     expect(result.catalog_chain_template_matches[0]).toBe("value_flow");
     expect(result.catalog_chain_template_alignment).toMatchObject({
+      alignment_status: "selected_matches_top",
       top_chain_template_match: "value_flow",
       selected_chain_template_rank: 1,
       selected_chain_is_catalog_template: true,
@@ -200,6 +201,7 @@ describe("assistant MCP discovery planner", () => {
     for (const item of cases) {
       const result = planAssistantMcpDiscovery(item.input);
       expect(result.selected_chain_id, item.name).toBe(item.expected);
+      expect(result.catalog_chain_template_alignment.alignment_status, item.name).toBe("selected_matches_top");
       expect(result.catalog_chain_template_alignment.top_chain_template_match, item.name).toBe(item.expected);
       expect(result.catalog_chain_template_alignment.selected_chain_template_rank, item.name).toBe(1);
       expect(result.catalog_chain_template_alignment.selected_chain_matches_top, item.name).toBe(true);
@@ -1098,6 +1100,7 @@ describe("assistant MCP discovery planner", () => {
     expect(result.discovery_plan.plan_status).toBe("needs_clarification");
     expect(result.catalog_chain_template_matches).toEqual([]);
     expect(result.catalog_chain_template_alignment).toMatchObject({
+      alignment_status: "selected_unscored",
       top_chain_template_match: null,
       selected_chain_template_rank: null,
       selected_chain_in_catalog_matches: false,

@@ -148,6 +148,7 @@ describe("assistant MCP discovery runtime bridge", () => {
     expect(result.loop_state.pending_axes).toContain("organization");
     expect(result.loop_state.provided_axes).toContain("aggregate_axis");
     expect(result.loop_state.catalog_chain_template_matches[0]).toBe("value_flow_ranking");
+    expect(result.loop_state.catalog_chain_template_alignment.alignment_status).toBe("selected_matches_top");
     expect(result.loop_state.catalog_chain_template_alignment.selected_chain_matches_top).toBe(true);
     expect(result.reason_codes).toContain("planner_selected_chain_matches_catalog_top");
     expect(result.reason_codes).toContain("runtime_bridge_loop_state_awaiting_clarification");