From b625f9af5b1c729f3c592734ec2db377bb6b5661 Mon Sep 17 00:00:00 2001 From: dctouch Date: Sun, 10 May 2026 08:45:37 +0300 Subject: [PATCH] =?UTF-8?q?=D0=90=D0=BA=D1=82=D1=83=D0=B0=D0=BB=D0=B8?= =?UTF-8?q?=D0=B7=D0=B8=D1=80=D0=BE=D0=B2=D0=B0=D1=82=D1=8C=20=D0=B4=D0=BE?= =?UTF-8?q?=D0=BA=D1=83=D0=BC=D0=B5=D0=BD=D1=82=D1=8B=20=D0=BF=D0=BE=20?= =?UTF-8?q?=D0=B0=D0=B3=D0=B5=D0=BD=D1=82=D0=BD=D0=BE=D0=BC=D1=83=20=D1=86?= =?UTF-8?q?=D0=B8=D0=BA=D0=BB=D1=83?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../21 - current_status_canon_2026-05-01.md | 43 +++++- ...ne_and_semantic_control_gate_2026-05-05.md | 16 ++ ...ent_loop_and_autorun_hygiene_2026-05-10.md | 138 ++++++++++++++++++ .../11 - architecture_turnaround/README.md | 10 +- docs/TECH/ui_markup_system.md | 10 +- 5 files changed, 206 insertions(+), 11 deletions(-) create mode 100644 docs/ARCH/11 - architecture_turnaround/24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md diff --git a/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md b/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md index 8267a95..4163cc7 100644 --- a/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md +++ b/docs/ARCH/11 - architecture_turnaround/21 - current_status_canon_2026-05-01.md @@ -23,6 +23,27 @@ From this point forward: For the current execution spine, read `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`. +## 2026-05-10 Overlay - Agentic Loop And Autorun Hygiene + +The next active operating layer is now the repo-native `Agentic Semantic Development Loop`, not another isolated route patch. 
+ +Current interpretation: + +- the Open-World Semantic Control Gate remains the semantic pressure surface; +- the stage-loop is the development operating system around that surface: generate/review/replay/audit/repair/rerun, then save accepted AGENT autoruns only after reviewed acceptance; +- Lead Codex remains the repair brain, while the loop produces strong business-audit artifacts and lead-coder handoff instead of relying on a weak autonomous coder; +- the first dogfood loop artifact is accepted at `artifacts/domain_runs/stage_agent_loops/agentic_semantic_development_loop/domain_loops/asl/final_status.md`; +- manual GUI confirmation remains required after accepted replay artifacts; +- autorun/runtime Cyrillic hygiene is now part of the acceptance surface, because broken saved-session text can invalidate the GUI review even when the backend route is correct. + +Fresh validation cut: + +- commit `3be06b5 Починить восстановление кириллицы в автопрогонах`; +- targeted mojibake/autorun/runtime tests passed `20/20`; +- targeted organization-clarification carryover tests passed `2/2`; +- `npm.cmd run build` passed; +- graphify rebuilt to `6371` nodes, `14048` edges, `141` communities. + ## Current Module Map - `Post-F Semantic Integrity Hardening`: `99%`, operationally closed as a hardening slice and now used as a regression gate. @@ -55,9 +76,11 @@ For the current execution spine, read `23 - current_execution_spine_and_semantic - Completed active slice: `Business Overview Counterparty/Contract Profile Bridge`: business overview now executes reviewed `counterparty_population_and_roles` and `contract_usage_overview` recipes, surfacing active counterparty role split and contract usage without claiming CRM quality, counterparty due diligence, legal completeness, or contract-risk. 
- Completed active slice: `Business Overview Missing Proof Ledger`: business overview now records machine-readable hard proof gaps for accounting profit/margin, due-date debt aging, inventory reserve/liquidation quality, and vendor/procurement quality, distinguishing proxy-only evidence from reviewed routes that are not wired yet. - Completed semantic-control slice: `W5/W7 Counterparty Value-Flow And Money-Breakdown Integrity`: bank-document/value-flow recipes now materialize explicit counterparty predicates, zero-row supplier-payment checks answer as checked negative evidence, compound money-breakdown wording stays in `business_overview`, and MCP discovery receives active organization scope only when the current turn has no explicit organization. +- Completed operating-system slice: `Agentic Semantic Development Loop Dogfood Gate`: stage manifest, stage pack, stage-loop wrapper, review/status/continue safety, lead-coder handoff, and save-after-acceptance gating are wired and accepted by the `asl` dogfood loop artifact. +- Completed hygiene slice: `Autorun Cyrillic C1 Repair`: old autorun cards/questions/runtime materialization now repair C1-control mojibake before UI or assistant-lane use, including the historical `БОЛЬШОЙ ОБЩИЙ` / `АЛЬТЕРНАТИВА` failure class. - Implementation breadth: `~99% (Open-World Bounded Autonomy Breadth through Slice 25)`. -- Next active slice: `Open-World Semantic Control Gate`, covering garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape. -- Active module progress: `~99% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`. +- Next active slice: continue `Agentic Semantic Development Loop` dogfood over real Open-World/Semantic-Control packs and confirm the latest autorun hygiene fix in the GUI. 
+- Active module progress: `~99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)`. ## Reporting Rule @@ -66,6 +89,7 @@ Use these labels when reporting progress: - `Прогресс модуля: 99% (Post-F Semantic Integrity Hardening, operationally closed/regression gate)` when discussing the Post-F slice itself. - `Прогресс модуля: 100% (Planner Autonomy Consolidation, declared phase83 slice closed)` when discussing the planner-autonomy slice that was just completed. - `Прогресс модуля: 99% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)` while discussing current module closure after the EHMO-derived critical subset accepted live again with W5/W7 hardening. +- `Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)` when discussing the current development-loop operating layer. - `Open-World Business Overview implementation breadth: ~99%, Semantic Control Gate critical subset accepted, fat GUI pack still pending` when discussing only the already wired Slice 25 breadth. - `Прогресс модуля: X% (Open-World Bounded Autonomy Breadth, active slice: )` for later breadth work after the Semantic Control Gate is accepted. @@ -98,6 +122,8 @@ The project is not yet a universal arbitrary-1C agent. 
Remaining work belongs to the next breadth module: +- confirm the latest autorun Cyrillic hygiene cut in the GUI after backend refresh and inspect frontend/API payloads if old replacement characters remain visible; +- continue dogfooding the `Agentic Semantic Development Loop` on real stage packs, especially generated-question quality, semantic business audit, repair handoff, and rerun acceptance; - finish closure of the `Open-World Semantic Control Gate` opened by `assistant-stage1-EHMOy3lNFt`; the EHMO-derived critical subset is accepted live after W5/W7 hardening, but the fat GUI pack and residual answer-shape roughness still need final review; - extend `business_overview` beyond money-flow/activity, customer and supplier concentration, document/account-section activity mix, counterparty role split, contract usage, yearly operating-flow dynamics, explicit profit/margin wording boundaries, explicit debt due-date wording boundaries, explicit inventory reserve/liquidation wording boundaries, explicit supplier/procurement-quality wording boundaries, explicit-period VAT/tax, as-of-date debt position, open-settlement concentration, contract-date debt age, debt staleness-risk proxy, as-of-date inventory position, trading-margin proxy, sales-to-stock inventory proxy, warehouse staleness-risk proxy, and the missing-proof ledger into separately proven exact accounting profit/margin, due-date debt aging/overdue, confirmed vendor-risk/procurement-quality analysis, and confirmed reserve/write-off/liquidation inventory evidence families; - broader dynamic schema traversal for unfamiliar 1C asks; @@ -120,11 +146,12 @@ For current planning, read: 1. `README.md` 2. this document -3. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md` -4. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md` -5. `20 - planner_autonomy_consolidation_2026-05-01.md` -6. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md` -7. 
`17 - post_f_semantic_integrity_hardening_2026-04-23.md` -8. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md` +3. `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md` +4. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md` +5. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md` +6. `20 - planner_autonomy_consolidation_2026-05-01.md` +7. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md` +8. `17 - post_f_semantic_integrity_hardening_2026-04-23.md` +9. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md` Documents `01` through `15` remain valuable, but mostly as the historical architecture trail. diff --git a/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md b/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md index 4f2af98..da38f77 100644 --- a/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md +++ b/docs/ARCH/11 - architecture_turnaround/23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md @@ -73,6 +73,22 @@ This is not a regression from `99%` to `96%`. It is a metric split: - `99%` describes wired breadth; - `99%` describes closure confidence after the EHMO-derived critical subset passed live replay again with W5/W7 hardening; the gate is still not full module closure until the fat manual GUI pack and remaining answer-shape residuals are reviewed. +## 2026-05-10 Status Overlay + +This document remains the Semantic Control Gate spine, but it is no longer the latest operating overlay. + +Read `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md` after this file. 
+ +Newer status: + +- the current development operating layer is `Agentic Semantic Development Loop`; +- the first dogfood loop artifact for `agentic_semantic_development_loop` is accepted under `artifacts/domain_runs/stage_agent_loops/.../asl/final_status.md`; +- the loop uses Lead Codex handoff for repairs, not a weak autonomous coder as the primary repair actor; +- accepted AGENT autoruns should be saved only after reviewed replay/loop acceptance; +- latest autorun/runtime hygiene cut repairs old C1-control Cyrillic mojibake before GUI cards, runtime questions, or assistant-lane continuation. + +The Semantic Control Gate work units below remain valid regression classes, but current execution should go through the stage-loop machinery when a substantial pack is being validated. + ## Current Local Cut Local cut 1 is implemented: diff --git a/docs/ARCH/11 - architecture_turnaround/24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md b/docs/ARCH/11 - architecture_turnaround/24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md new file mode 100644 index 0000000..e53493c --- /dev/null +++ b/docs/ARCH/11 - architecture_turnaround/24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md @@ -0,0 +1,138 @@ +# 24 - Agentic Semantic Development Loop And Autorun Hygiene (2026-05-10) + +## Purpose + +This note is the current status overlay after the Open-World Semantic Control Gate work moved from direct manual Codex operation into a repo-native agentic development loop. 
+ +It exists to keep the project spine clear: + +- the target product is still a bounded MCP-first 1C analyst assistant; +- Post-F, planner autonomy, inventory breadth, and business overview remain regression gates; +- the active operational work is now the development loop that generates, reviews, replays, audits, repairs, reruns, and only then saves AGENT autoruns; +- autorun/runtime hygiene is part of that loop, because broken saved-session text can invalidate the human GUI review even when the backend route is correct. + +## What Changed Since Document 23 + +The previous execution-spine document stopped at the EHMO-derived Semantic Control Gate. + +Since then, the project added a repo-native stage-loop layer: + +- `scripts/stage_agent_loop.py` is the stage-level wrapper for pack-loop execution, review, status, safe continuation, and optional save-to-autoruns after acceptance. +- `docs/orchestration/stage_agent_loop_agentic_semantic_development_loop.json` is the active stage manifest. +- `docs/orchestration/agentic_semantic_development_loop_stage_pack.json` is the dogfood pack for business overview, VAT, stale scope, counterparty pivots, legacy route canaries, and answer-shape quality. +- `docs/orchestration/schemas/stage_agent_loop_manifest.schema.json` defines the manifest contract. +- `scripts/save_agent_semantic_run.py` refuses to save AGENT autoruns before reviewed live replay/loop acceptance unless explicitly forced as a draft. +- `docs/orchestration/agent_semantic_source_catalog.*` remains the reusable source catalog for mixed AGENT pack construction. 
+ +The accepted dogfood status is recorded at: + +- `artifacts/domain_runs/stage_agent_loops/agentic_semantic_development_loop/domain_loops/asl/final_status.md` + +Current recorded result: + +- status: `accepted` +- loop id: `asl` +- repair mode: `lead-handoff` +- target score: `88` +- iterations ran: `1` +- stop reason: analyst accepted plus deterministic gate passed at `iteration_00` +- manual GUI confirmation is still required after acceptance + +## Design Decision + +The project should not rely on a weak autonomous coder as the primary repair actor. + +The chosen model is: + +- Lead Codex remains the responsible repair brain. +- A strong independent semantic/business audit layer reviews the replay from the user's business meaning first. +- The stage-loop produces machine-readable artifacts and a lead-coder handoff instead of silently patching production code. +- The loop can continue safely, but real code repair requires explicit execution mode and must be validated by rerun artifacts. +- Human GUI confirmation remains the final high-signal reality check for accepted AGENT packs. + +This preserves the user's desired automation pattern without delegating high-risk architecture repair to a low-context worker. + +## Autorun Cyrillic Hygiene Cut + +The GUI exposed an old saved-session failure where Cyrillic in autorun cards/questions displayed as replacement-character text. + +Control examples after repair: + +- `БОЛЬШОЙ ОБЩИЙ` +- `АЛЬТЕРНАТИВА` + +Root cause: + +- old autorun history/runtime payloads contained double-decoded Cyrillic plus C1 controls such as `U+0098`; +- the previous repair path encoded those controls as normal Windows-1251 text and lost the raw byte needed to reconstruct UTF-8; +- the UI then displayed the remaining replacement character honestly. 
+ +Current fix: + +- `addressTextRepair.ts` preserves C1 controls as raw bytes during UTF-8 reconstruction; +- `autoRuns.ts` repairs autorun titles/questions before exposing cards or runtime materialization; +- `eval.ts` and `assistantService.ts` now receive repaired scenario/runtime question text; +- known already-lossy fragments with `U+FFFD` inside the organization name are repaired before they can poison organization clarification; +- tests now cover the historical `БОЛЬШОЙ ОБЩИЙ` and `АЛЬТЕРНАТИВА` autorun cases. + +Committed cut: + +- `3be06b5 Починить восстановление кириллицы в автопрогонах` + +Validation recorded during this cut: + +- `npm.cmd test -- assistantOrganizationMatcher.test.ts addressTextRepair.test.ts autoRunsQuestionSplit.test.ts evalRuntimeQuestionSplit.test.ts` passed `20/20`; +- `npm.cmd test -- assistantAddressFollowupContext.test.ts -t "continues the original inventory query after"` passed `2/2`; +- `npm.cmd run build` passed; +- graphify rebuilt to `6371` nodes, `14048` edges, `141` communities. + +## Current Status + +The current large module should be described as: + +`Agentic Semantic Development Loop / Open-World Semantic Control operating system` + +Status: + +- implementation state: operational dogfood loop exists and has an accepted first loop artifact; +- semantic status: accepted loop artifact is useful, but manual GUI confirmation remains required; +- hygiene status: saved autorun/runtime Cyrillic repair is now covered by code and tests; +- risk: medium, because the loop is now infrastructure for future acceptance decisions, not just a local route fix. + +Recommended reporting line: + +`Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)` + +## What Is Not Closed + +Do not treat the stage-loop as a replacement for business-answer review. 
+ +Still open: + +- frontend/browser cache can still show old broken autorun text until the backend is restarted and the UI is refreshed; +- the first accepted dogfood loop proves the mechanism, not all future stage packs; +- generated question quality still needs pressure from real GUI runs and user feedback; +- broad arbitrary 1C autonomy is still bounded by reviewed routes, truth gates, and replay evidence; +- manual GUI confirmation remains required before declaring a fat AGENT pack fully accepted. + +## Next Work + +Next operational pass: + +1. Restart/reload the backend and reopen the affected autorun card from the GUI. +2. Confirm old saved-session text now displays as normal Cyrillic. +3. If the GUI still shows replacement characters, inspect frontend state/cache and API payload side by side. +4. Continue dogfooding the stage-loop on the next real Open-World/agentic pack. +5. Keep Post-F, phase83, inventory, business-overview, and mojibake autorun cases as regression canaries. + +## Canonical Reading Order Update + +For current planning, read: + +1. `README.md` +2. `21 - current_status_canon_2026-05-01.md` +3. this document +4. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md` +5. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md` +6. `20 - planner_autonomy_consolidation_2026-05-01.md` +7. `17 - post_f_semantic_integrity_hardening_2026-04-23.md` diff --git a/docs/ARCH/11 - architecture_turnaround/README.md b/docs/ARCH/11 - architecture_turnaround/README.md index b862d6a..2edfd76 100644 --- a/docs/ARCH/11 - architecture_turnaround/README.md +++ b/docs/ARCH/11 - architecture_turnaround/README.md @@ -41,13 +41,18 @@ This package answers the next question: 21. [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md) 22. [22 - open_world_bounded_autonomy_breadth_2026-05-01.md](./22%20-%20open_world_bounded_autonomy_breadth_2026-05-01.md) 23. 
[23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md) +24. [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md) -## Current Status Snapshot (2026-05-05) +## Current Status Snapshot (2026-05-10) This package is no longer planning-only. Status canon for planning: +- The current operational overlay is now [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md). +- The active engineering surface is no longer only individual route hardening; it is the repo-native AGENT/stage-loop operating system that should generate/review/replay/audit/repair/rerun current-stage packs before saving accepted autoruns. +- The first dogfood stage loop for `agentic_semantic_development_loop` is accepted in artifacts, but manual GUI confirmation remains required before treating a fat AGENT pack as fully human-accepted. +- Autorun/runtime Cyrillic hygiene is now a current regression gate: old saved-session mojibake with C1 controls must be repaired before cards, questions, and runtime jobs reach the GUI or assistant lane. - Post-F Semantic Integrity Hardening is operationally closed at `99%` and should now be used as a regression gate, not as the active module denominator. - Planner Autonomy Consolidation is closed at `100%` for the declared phase83 planner-brain slice. - The active next module is now `Open-World Bounded Autonomy Breadth` over unfamiliar 1C asks, with Post-F and phase83 retained as semantic canaries. @@ -78,8 +83,11 @@ Status canon for planning: - The `assistant-stage1-EHMOy3lNFt` manual GUI replay opened the next acceptance gate: `Open-World Semantic Control Gate`. 
- The `~99%` Open-World number now means implementation breadth through Slice 25, not accepted semantic closure under broad human dialogue pressure. - The active breadth slice is semantic control rather than new proof-family expansion: garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape. +- The current accepted dogfood infrastructure slice is `Agentic Semantic Development Loop`: stage manifest, stage pack, loop wrapper, status/continue safety, strong business-audit handoff, and save-after-acceptance gating are wired and validated by the `asl` accepted loop artifact. +- The latest hygiene slice is `Autorun Cyrillic C1 Repair`: `addressTextRepair`, `autoRuns`, `eval`, and `assistantService` now preserve C1 bytes while repairing old saved-session Russian text, preventing replacement-character autorun cards or runtime turns from leaking into the user path after backend refresh. - The short source of truth for status wording is [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md). - The current execution spine after EHMO is [23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md). +- The current stage-loop/hygiene overlay after the AGENT dogfood cut is [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md). 
It now documents a turnaround that is already operational in code, already materially past the acute regression breakpoint, and already moved through bounded MCP autonomy, Post-F hardening, inventory breadth proof, and the declared Planner Autonomy slice: diff --git a/docs/TECH/ui_markup_system.md b/docs/TECH/ui_markup_system.md index 5ba4ce5..3606024 100644 --- a/docs/TECH/ui_markup_system.md +++ b/docs/TECH/ui_markup_system.md @@ -3,7 +3,7 @@ Документ описывает практический контур, который используется оператором в интерфейсе `История автопрогонов`: генерация вопросов, запуск прогонов, разметка ответов, закрытие кейсов и пост-анализ. -Дата актуализации: `2026-04-09` +Дата актуализации: `2026-05-10` --- @@ -213,6 +213,13 @@ Queue mapping: 6. Фильтр "скрыть выполненные" корректно исключает `resolved=true`. 7. Пост-анализ показывает очереди и кандидатов. 8. Текст в интерфейсе читается без mojibake. +9. Старые сохраненные автопрогоны с C1-control mojibake в `history.json` и runtime job payload должны отдаваться через backend уже в восстановленной кириллице; контрольные примеры: `БОЛЬШОЙ ОБЩИЙ`, `АЛЬТЕРНАТИВА`. + +Важно после правок encoding/autorun: + +- перезапустить backend, чтобы UI получил новый repair-слой; +- обновить список автопрогонов в браузере; +- если replacement-character `U+FFFD` остается видимым, сравнить API payload `GET /api/autoruns/autogen/history` с состоянием frontend/browser cache. --- @@ -221,4 +228,3 @@ Queue mapping: 1. Async run ограничен `assistant_stage1`. 2. Качество live-данных зависит от заполнения session-файлов на стороне рантайма. 3. Пост-анализ основан на фактической ручной разметке; без нее очереди пустые. -
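Reviewer sketch (not part of the patch): the C1-control repair described in document 24 and in checklist item 9 above can be illustrated with a minimal round trip. The Cyrillic letter `И` (U+0418) is `D0 98` in UTF-8, and byte `0x98` is unassigned in Windows-1251, so a wrong Windows-1251 decode tends to leave the C1 control `U+0098` in the text. Both control examples contain `И`. The sketch below assumes that failure mode and keeps C1 controls as raw bytes while rebuilding the UTF-8 sequence. All names (`CP1251_80_BF`, `byteToChar`, `charToByte`, `simulateMojibake`, `repairMojibake`) are hypothetical and are not taken from `addressTextRepair.ts`, whose actual implementation is not shown in this patch.

```typescript
// Hypothetical illustration only; not the code from addressTextRepair.ts.

// Unicode code points for Windows-1251 bytes 0x80..0xBF.
// Byte 0x98 is unassigned in Windows-1251; it is represented here by the
// C1 control U+0098, which is what lenient decoders typically emit.
const CP1251_80_BF: number[] = [
  0x0402, 0x0403, 0x201a, 0x0453, 0x201e, 0x2026, 0x2020, 0x2021,
  0x20ac, 0x2030, 0x0409, 0x2039, 0x040a, 0x040c, 0x040b, 0x040f,
  0x0452, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x0098, 0x2122, 0x0459, 0x203a, 0x045a, 0x045c, 0x045b, 0x045f,
  0x00a0, 0x040e, 0x045e, 0x0408, 0x00a4, 0x0490, 0x00a6, 0x00a7,
  0x0401, 0x00a9, 0x0404, 0x00ab, 0x00ac, 0x00ad, 0x00ae, 0x0407,
  0x00b0, 0x00b1, 0x0406, 0x0456, 0x0491, 0x00b5, 0x00b6, 0x00b7,
  0x0451, 0x2116, 0x0454, 0x00bb, 0x0458, 0x0405, 0x0455, 0x0457,
];

// Decode one byte the way a lenient Windows-1251 decoder would.
function byteToChar(byte: number): string {
  if (byte < 0x80) return String.fromCharCode(byte);
  if (byte >= 0xc0) return String.fromCharCode(0x0410 + (byte - 0xc0));
  return String.fromCharCode(CP1251_80_BF[byte - 0x80]);
}

// Map one character back to its Windows-1251 byte, or null if impossible.
function charToByte(ch: string): number | null {
  const code = ch.charCodeAt(0);
  if (code < 0x80) return code;
  if (code >= 0x0410 && code <= 0x044f) return code - 0x0410 + 0xc0;
  // Key step: a C1 control (U+0080..U+009F) is kept as its raw byte value
  // instead of being dropped or replaced, so no UTF-8 byte is lost.
  if (code >= 0x80 && code <= 0x9f) return code;
  const index = CP1251_80_BF.indexOf(code);
  return index >= 0 ? index + 0x80 : null;
}

// Reproduce the damage: UTF-8 bytes of `text` read as Windows-1251.
function simulateMojibake(text: string): string {
  const encoded = encodeURIComponent(text); // percent-encoded UTF-8 bytes
  let out = "";
  for (let i = 0; i < encoded.length; i++) {
    if (encoded.charAt(i) === "%") {
      out += byteToChar(parseInt(encoded.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      out += encoded.charAt(i);
    }
  }
  return out;
}

// Undo the damage: rebuild the byte sequence and decode it as UTF-8.
// Returns the input unchanged when the bytes are not valid UTF-8.
function repairMojibake(text: string): string {
  let escaped = "";
  for (let i = 0; i < text.length; i++) {
    const byte = charToByte(text.charAt(i));
    if (byte === null) return text;
    escaped += "%" + (byte < 16 ? "0" : "") + byte.toString(16);
  }
  try {
    return decodeURIComponent(escaped);
  } catch (_err) {
    return text;
  }
}
```

Under these assumptions, `repairMojibake(simulateMojibake("БОЛЬШОЙ ОБЩИЙ"))` and the same round trip for `АЛЬТЕРНАТИВА` return the original strings, while text that is already correct Cyrillic is returned unchanged because its Windows-1251 bytes do not form valid UTF-8. A repair path that drops or re-encodes `U+0098` loses the second byte of `И`, which is consistent with the replacement-character symptom the patch describes.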