Update the agentic loop documents

2026-05-10 08:45:37 +03:00 · b625f9af5b
parent 3be06b5f93
commit b625f9af5b
5 changed files with 206 additions and 11 deletions
--- a/current_status_canon_2026-05-01.md
+++ b/current_status_canon_2026-05-01.md
@@ -23,6 +23,27 @@ From this point forward:

 For the current execution spine, read `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`.

+## 2026-05-10 Overlay - Agentic Loop And Autorun Hygiene
+
+The next active operating layer is now the repo-native `Agentic Semantic Development Loop`, not another isolated route patch.
+
+Current interpretation:
+
+- the Open-World Semantic Control Gate remains the semantic pressure surface;
+- the stage-loop is the development operating system around that surface: generate/review/replay/audit/repair/rerun, then save accepted AGENT autoruns only after reviewed acceptance;
+- Lead Codex remains the repair brain, while the loop produces strong business-audit artifacts and lead-coder handoff instead of relying on a weak autonomous coder;
+- the first dogfood loop artifact is accepted at `artifacts/domain_runs/stage_agent_loops/agentic_semantic_development_loop/domain_loops/asl/final_status.md`;
+- manual GUI confirmation remains required after accepted replay artifacts;
+- autorun/runtime Cyrillic hygiene is now part of the acceptance surface, because broken saved-session text can invalidate the GUI review even when the backend route is correct.
+
+Fresh validation cut:
+
+- commit `3be06b5 Починить восстановление кириллицы в автопрогонах` ("Fix Cyrillic recovery in autoruns");
+- targeted mojibake/autorun/runtime tests passed `20/20`;
+- targeted organization-clarification carryover tests passed `2/2`;
+- `npm.cmd run build` passed;
+- graphify rebuilt to `6371` nodes, `14048` edges, `141` communities.
+
 ## Current Module Map

 - `Post-F Semantic Integrity Hardening`: `99%`, operationally closed as a hardening slice and now used as a regression gate.
@@ -55,9 +76,11 @@ For the current execution spine, read `23 - current_execution_spine_and_semantic
 - Completed active slice: `Business Overview Counterparty/Contract Profile Bridge`: business overview now executes reviewed `counterparty_population_and_roles` and `contract_usage_overview` recipes, surfacing active counterparty role split and contract usage without claiming CRM quality, counterparty due diligence, legal completeness, or contract-risk.
 - Completed active slice: `Business Overview Missing Proof Ledger`: business overview now records machine-readable hard proof gaps for accounting profit/margin, due-date debt aging, inventory reserve/liquidation quality, and vendor/procurement quality, distinguishing proxy-only evidence from reviewed routes that are not wired yet.
 - Completed semantic-control slice: `W5/W7 Counterparty Value-Flow And Money-Breakdown Integrity`: bank-document/value-flow recipes now materialize explicit counterparty predicates, zero-row supplier-payment checks answer as checked negative evidence, compound money-breakdown wording stays in `business_overview`, and MCP discovery receives active organization scope only when the current turn has no explicit organization.
+- Completed operating-system slice: `Agentic Semantic Development Loop Dogfood Gate`: stage manifest, stage pack, stage-loop wrapper, review/status/continue safety, lead-coder handoff, and save-after-acceptance gating are wired and accepted by the `asl` dogfood loop artifact.
+- Completed hygiene slice: `Autorun Cyrillic C1 Repair`: old autorun cards/questions/runtime materialization now repair C1-control mojibake before UI or assistant-lane use, including the historical `БОЛЬШОЙ ОБЩИЙ` / `АЛЬТЕРНАТИВА` failure class.
 - Implementation breadth: `~99% (Open-World Bounded Autonomy Breadth through Slice 25)`.
-- Next active slice: `Open-World Semantic Control Gate`, covering garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape.
-- Active module progress: `~99% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`.
+- Next active slice: continue `Agentic Semantic Development Loop` dogfood over real Open-World/Semantic-Control packs and confirm the latest autorun hygiene fix in the GUI.
+- Active module progress: `~99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)`.

 ## Reporting Rule

@@ -66,6 +89,7 @@ Use these labels when reporting progress:
 - `Прогресс модуля: 99% (Post-F Semantic Integrity Hardening, operationally closed/regression gate)` when discussing the Post-F slice itself.
 - `Прогресс модуля: 100% (Planner Autonomy Consolidation, declared phase83 slice closed)` when discussing the planner-autonomy slice that was just completed.
 - `Прогресс модуля: 99% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)` while discussing current module closure after the EHMO-derived critical subset accepted live again with W5/W7 hardening.
+- `Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)` when discussing the current development-loop operating layer.
 - `Open-World Business Overview implementation breadth: ~99%, Semantic Control Gate critical subset accepted, fat GUI pack still pending` when discussing only the already wired Slice 25 breadth.
 - `Прогресс модуля: X% (Open-World Bounded Autonomy Breadth, active slice: <name>)` for later breadth work after the Semantic Control Gate is accepted.

@@ -98,6 +122,8 @@ The project is not yet a universal arbitrary-1C agent.

 Remaining work belongs to the next breadth module:

+- confirm the latest autorun Cyrillic hygiene cut in the GUI after backend refresh and inspect frontend/API payloads if old replacement characters remain visible;
+- continue dogfooding the `Agentic Semantic Development Loop` on real stage packs, especially generated-question quality, semantic business audit, repair handoff, and rerun acceptance;
 - finish closure of the `Open-World Semantic Control Gate` opened by `assistant-stage1-EHMOy3lNFt`; the EHMO-derived critical subset is accepted live after W5/W7 hardening, but the fat GUI pack and residual answer-shape roughness still need final review;
 - extend `business_overview` beyond money-flow/activity, customer and supplier concentration, document/account-section activity mix, counterparty role split, contract usage, yearly operating-flow dynamics, explicit profit/margin wording boundaries, explicit debt due-date wording boundaries, explicit inventory reserve/liquidation wording boundaries, explicit supplier/procurement-quality wording boundaries, explicit-period VAT/tax, as-of-date debt position, open-settlement concentration, contract-date debt age, debt staleness-risk proxy, as-of-date inventory position, trading-margin proxy, sales-to-stock inventory proxy, warehouse staleness-risk proxy, and the missing-proof ledger into separately proven exact accounting profit/margin, due-date debt aging/overdue, confirmed vendor-risk/procurement-quality analysis, and confirmed reserve/write-off/liquidation inventory evidence families;
 - broader dynamic schema traversal for unfamiliar 1C asks;
@@ -120,11 +146,12 @@ For current planning, read:

 1. `README.md`
 2. this document
-3. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
-4. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
-5. `20 - planner_autonomy_consolidation_2026-05-01.md`
-6. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md`
-7. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
-8. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`
+3. `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md`
+4. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
+5. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
+6. `20 - planner_autonomy_consolidation_2026-05-01.md`
+7. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md`
+8. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
+9. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`

 Documents `01` through `15` remain valuable, but mostly as the historical architecture trail.
--- a/current_execution_spine_and_semantic_control_gate_2026-05-05.md
+++ b/current_execution_spine_and_semantic_control_gate_2026-05-05.md
@@ -73,6 +73,22 @@ This is not a regression from `99%` to `96%`. It is a metric split:
 - `99%` describes wired breadth;
 - `99%` describes closure confidence after the EHMO-derived critical subset passed live replay again with W5/W7 hardening; the gate is still not full module closure until the fat manual GUI pack and remaining answer-shape residuals are reviewed.

+## 2026-05-10 Status Overlay
+
+This document remains the Semantic Control Gate spine, but it is no longer the latest operating overlay.
+
+Read `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md` after this file.
+
+Newer status:
+
+- the current development operating layer is `Agentic Semantic Development Loop`;
+- the first dogfood loop artifact for `agentic_semantic_development_loop` is accepted under `artifacts/domain_runs/stage_agent_loops/.../asl/final_status.md`;
+- the loop uses Lead Codex handoff for repairs, not a weak autonomous coder as the primary repair actor;
+- accepted AGENT autoruns should be saved only after reviewed replay/loop acceptance;
+- latest autorun/runtime hygiene cut repairs old C1-control Cyrillic mojibake before GUI cards, runtime questions, or assistant-lane continuation.
+
+The Semantic Control Gate work units below remain valid regression classes, but current execution should go through the stage-loop machinery when a substantial pack is being validated.
+
 ## Current Local Cut

 Local cut 1 is implemented:
--- a/agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md
+++ b/agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md
@@ -0,0 +1,138 @@
+# 24 - Agentic Semantic Development Loop And Autorun Hygiene (2026-05-10)
+
+## Purpose
+
+This note is the current status overlay after the Open-World Semantic Control Gate work moved from direct manual Codex operation into a repo-native agentic development loop.
+
+It exists to keep the project spine clear:
+
+- the target product is still a bounded MCP-first 1C analyst assistant;
+- Post-F, planner autonomy, inventory breadth, and business overview remain regression gates;
+- the active operational work is now the development loop that generates, reviews, replays, audits, repairs, reruns, and only then saves AGENT autoruns;
+- autorun/runtime hygiene is part of that loop, because broken saved-session text can invalidate the human GUI review even when the backend route is correct.
+
+## What Changed Since Document 23
+
+The previous execution-spine document stopped at the EHMO-derived Semantic Control Gate.
+
+Since then, the project added a repo-native stage-loop layer:
+
+- `scripts/stage_agent_loop.py` is the stage-level wrapper for pack-loop execution, review, status, safe continuation, and optional save-to-autoruns after acceptance.
+- `docs/orchestration/stage_agent_loop_agentic_semantic_development_loop.json` is the active stage manifest.
+- `docs/orchestration/agentic_semantic_development_loop_stage_pack.json` is the dogfood pack for business overview, VAT, stale scope, counterparty pivots, legacy route canaries, and answer-shape quality.
+- `docs/orchestration/schemas/stage_agent_loop_manifest.schema.json` defines the manifest contract.
+- `scripts/save_agent_semantic_run.py` refuses to save AGENT autoruns before reviewed live replay/loop acceptance unless explicitly forced as a draft.
+- `docs/orchestration/agent_semantic_source_catalog.*` remains the reusable source catalog for mixed AGENT pack construction.
+
+The accepted dogfood status is recorded at:
+
+- `artifacts/domain_runs/stage_agent_loops/agentic_semantic_development_loop/domain_loops/asl/final_status.md`
+
+Current recorded result:
+
+- status: `accepted`
+- loop id: `asl`
+- repair mode: `lead-handoff`
+- target score: `88`
+- iterations ran: `1`
+- stop reason: analyst accepted plus deterministic gate passed at `iteration_00`
+- manual GUI confirmation is still required after acceptance
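The acceptance rule recorded above, together with the save-after-acceptance gate, can be sketched roughly as follows. This is a minimal illustration, not the actual interface of `scripts/stage_agent_loop.py` or `scripts/save_agent_semantic_run.py`: `LoopStatus`, `loop_accepted`, and `may_save_autorun` are hypothetical names, and the sketch assumes that acceptance means analyst acceptance plus a passed deterministic gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopStatus:
    """Hypothetical summary of one stage-loop iteration."""
    analyst_accepted: bool
    deterministic_gate_passed: bool


def loop_accepted(status: LoopStatus) -> bool:
    # Assumption: both signals are required before the loop may stop as accepted.
    return status.analyst_accepted and status.deterministic_gate_passed


def may_save_autorun(status: LoopStatus, force_draft: bool = False) -> str:
    """Return the save decision for an AGENT autorun.

    Accepted loops are saved normally; anything else is refused unless
    the caller explicitly forces a draft save.
    """
    if loop_accepted(status):
        return "save"
    if force_draft:
        return "save-as-draft"
    return "refuse"
```

Returning a decision string rather than a boolean keeps the draft path visible to the caller, so a forced draft cannot be mistaken for an accepted autorun.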
+
+## Design Decision
+
+The project should not rely on a weak autonomous coder as the primary repair actor.
+
+The chosen model is:
+
+- Lead Codex remains the responsible repair brain.
+- A strong independent semantic/business audit layer reviews the replay from the user's business meaning first.
+- The stage-loop produces machine-readable artifacts and a lead-coder handoff instead of silently patching production code.
+- The loop can continue safely, but real code repair requires explicit execution mode and must be validated by rerun artifacts.
+- Human GUI confirmation remains the final high-signal reality check for accepted AGENT packs.
+
+This preserves the user's desired automation pattern without delegating high-risk architecture repair to a low-context worker.
+
+## Autorun Cyrillic Hygiene Cut
+
+The GUI exposed an old saved-session failure where Cyrillic in autorun cards/questions displayed as replacement-character text.
+
+Control examples after repair:
+
+- `БОЛЬШОЙ ОБЩИЙ`
+- `АЛЬТЕРНАТИВА`
+
+Root cause:
+
+- old autorun history/runtime payloads contained double-decoded Cyrillic plus C1 controls such as `U+0098`;
+- the previous repair path encoded those controls as normal Windows-1251 text and lost the raw byte needed to reconstruct UTF-8;
+- the UI then faithfully rendered the resulting replacement character (`U+FFFD`); the defect was in the repaired payload, not in the rendering.
+
+Current fix:
+
+- `addressTextRepair.ts` preserves C1 controls as raw bytes during UTF-8 reconstruction;
+- `autoRuns.ts` repairs autorun titles/questions before exposing cards or runtime materialization;
+- `eval.ts` and `assistantService.ts` now receive repaired scenario/runtime question text;
+- known already-lossy fragments with `U+FFFD` inside the organization name are repaired before they can poison organization clarification;
+- tests now cover the historical `БОЛЬШОЙ ОБЩИЙ` and `АЛЬТЕРНАТИВА` autorun cases.
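The root cause and the fix can be illustrated with a small Python sketch. It is a simplified illustration, not the code in `addressTextRepair.ts`: the function names are hypothetical, and it models a single mis-decoding round (UTF-8 bytes read as Windows-1251) rather than the full history of the old payloads. Byte `0x98` is unassigned in Windows-1251, so a lenient decoder surfaces it as the C1 control `U+0098`; the letter `И` is `D0 98` in UTF-8, which is why both control examples hit this case.

```python
def mojibake(text: str) -> str:
    """Simulate the corruption: UTF-8 bytes misread as Windows-1251.

    Python's cp1251 codec rejects byte 0x98, so it is mapped by hand to
    U+0098, mimicking a lenient decoder.
    """
    out = []
    for b in text.encode("utf-8"):
        out.append("\u0098" if b == 0x98 else bytes([b]).decode("cp1251"))
    return "".join(out)


def naive_repair(broken: str) -> str:
    """Lossy path: treat every character as ordinary Windows-1251 text."""
    raw = broken.encode("cp1251", errors="replace")  # U+0098 becomes b"?"
    return raw.decode("utf-8", errors="replace")     # leaves U+FFFD behind


def repair_preserving_c1(broken: str) -> str:
    """Keep C1 controls (U+0080..U+009F) as raw bytes, then decode UTF-8."""
    raw = bytearray()
    for ch in broken:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            raw.append(code)                 # the control *is* the lost byte
        else:
            raw.extend(ch.encode("cp1251"))  # reverse the wrong decoding
    return raw.decode("utf-8")
```

In this sketch `repair_preserving_c1(mojibake("БОЛЬШОЙ ОБЩИЙ"))` round-trips to the original text, while `naive_repair` loses the `0x98` byte and produces a string containing `U+FFFD`.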
+
+Committed cut:
+
+- `3be06b5 Починить восстановление кириллицы в автопрогонах` ("Fix Cyrillic recovery in autoruns")
+
+Validation recorded during this cut:
+
+- `npm.cmd test -- assistantOrganizationMatcher.test.ts addressTextRepair.test.ts autoRunsQuestionSplit.test.ts evalRuntimeQuestionSplit.test.ts` passed `20/20`;
+- `npm.cmd test -- assistantAddressFollowupContext.test.ts -t "continues the original inventory query after"` passed `2/2`;
+- `npm.cmd run build` passed;
+- graphify rebuilt to `6371` nodes, `14048` edges, `141` communities.
+
+## Current Status
+
+The current large module should be described as:
+
+`Agentic Semantic Development Loop / Open-World Semantic Control operating system`
+
+Status:
+
+- implementation state: operational dogfood loop exists and has an accepted first loop artifact;
+- semantic status: accepted loop artifact is useful, but manual GUI confirmation remains required;
+- hygiene status: saved autorun/runtime Cyrillic repair is now covered by code and tests;
+- risk: medium, because the loop is now infrastructure for future acceptance decisions, not just a local route fix.
+
+Recommended reporting line:
+
+`Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)`
+
+## What Is Not Closed
+
+Do not treat the stage-loop as a replacement for business-answer review.
+
+Still open:
+
+- frontend/browser cache can still show old broken autorun text until the backend is restarted and the UI is refreshed;
+- the first accepted dogfood loop proves the mechanism, not all future stage packs;
+- generated question quality still needs pressure from real GUI runs and user feedback;
+- broad arbitrary 1C autonomy is still bounded by reviewed routes, truth gates, and replay evidence;
+- manual GUI confirmation remains required before declaring a fat AGENT pack fully accepted.
+
+## Next Work
+
+Next operational pass:
+
+1. Restart/reload the backend and reopen the affected autorun card from the GUI.
+2. Confirm old saved-session text now displays as normal Cyrillic.
+3. If the GUI still shows replacement characters, inspect frontend state/cache and API payload side by side.
+4. Continue dogfooding the stage-loop on the next real Open-World/agentic pack.
+5. Keep Post-F, phase83, inventory, business-overview, and mojibake autorun cases as regression canaries.
+
+## Canonical Reading Order Update
+
+For current planning, read:
+
+1. `README.md`
+2. `21 - current_status_canon_2026-05-01.md`
+3. this document
+4. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
+5. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
+6. `20 - planner_autonomy_consolidation_2026-05-01.md`
+7. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
--- a/architecture_turnaround/README.md
+++ b/architecture_turnaround/README.md
@@ -41,13 +41,18 @@ This package answers the next question:
 21. [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md)
 22. [22 - open_world_bounded_autonomy_breadth_2026-05-01.md](./22%20-%20open_world_bounded_autonomy_breadth_2026-05-01.md)
 23. [23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md)
+24. [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md)

-## Current Status Snapshot (2026-05-05)
+## Current Status Snapshot (2026-05-10)

 This package is no longer planning-only.

 Status canon for planning:

+- The current operational overlay is now [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md).
+- The active engineering surface is no longer only individual route hardening; it is the repo-native AGENT/stage-loop operating system that should generate/review/replay/audit/repair/rerun current-stage packs before saving accepted autoruns.
+- The first dogfood stage loop for `agentic_semantic_development_loop` is accepted in artifacts, but manual GUI confirmation remains required before treating a fat AGENT pack as fully human-accepted.
+- Autorun/runtime Cyrillic hygiene is now a current regression gate: old saved-session mojibake with C1 controls must be repaired before cards, questions, and runtime jobs reach the GUI or assistant lane.
 - Post-F Semantic Integrity Hardening is operationally closed at `99%` and should now be used as a regression gate, not as the active module denominator.
 - Planner Autonomy Consolidation is closed at `100%` for the declared phase83 planner-brain slice.
 - The active next module is now `Open-World Bounded Autonomy Breadth` over unfamiliar 1C asks, with Post-F and phase83 retained as semantic canaries.
@@ -78,8 +83,11 @@ Status canon for planning:
 - The `assistant-stage1-EHMOy3lNFt` manual GUI replay opened the next acceptance gate: `Open-World Semantic Control Gate`.
 - The `~99%` Open-World number now means implementation breadth through Slice 25, not accepted semantic closure under broad human dialogue pressure.
 - The active breadth slice is semantic control rather than new proof-family expansion: garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape.
+- The current accepted dogfood infrastructure slice is `Agentic Semantic Development Loop`: stage manifest, stage pack, loop wrapper, status/continue safety, strong business-audit handoff, and save-after-acceptance gating are wired and validated by the `asl` accepted loop artifact.
+- The latest hygiene slice is `Autorun Cyrillic C1 Repair`: `addressTextRepair`, `autoRuns`, `eval`, and `assistantService` now preserve C1 bytes while repairing old saved-session Russian text, preventing replacement-character autorun cards or runtime turns from leaking into the user path after backend refresh.
 - The short source of truth for status wording is [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md).
 - The current execution spine after EHMO is [23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md).
+- The current stage-loop/hygiene overlay after the AGENT dogfood cut is [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md).

 It now documents a turnaround that is already operational in code, already materially past the acute regression breakpoint, and already moved through bounded MCP autonomy, Post-F hardening, inventory breadth proof, and the declared Planner Autonomy slice:

--- a/docs/TECH/ui_markup_system.md
+++ b/docs/TECH/ui_markup_system.md
@@ -3,7 +3,7 @@
 This document describes the practical workflow that the operator uses in the
 `История автопрогонов` (autorun history) interface: question generation, run launches, answer labeling, case closure, and post-analysis.

-Last updated: `2026-04-09`
+Last updated: `2026-05-10`

 ---

@@ -213,6 +213,13 @@ Queue mapping:
 6. The "hide completed" filter correctly excludes `resolved=true`.
 7. Post-analysis shows queues and candidates.
 8. Interface text is readable without mojibake.
+9. Old saved autoruns with C1-control mojibake in `history.json` and in the runtime job payload must be served by the backend as already-repaired Cyrillic; control examples: `БОЛЬШОЙ ОБЩИЙ`, `АЛЬТЕРНАТИВА`.
+
+Important after encoding/autorun changes:
+
+- restart the backend so that the UI picks up the new repair layer;
+- refresh the autorun list in the browser;
+- if the replacement character `U+FFFD` is still visible, compare the `GET /api/autoruns/autogen/history` API payload with the frontend state and browser cache.
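One way to make the payload comparison concrete is to scan the parsed API response for strings that still carry `U+FFFD` or C1 controls. The sketch below is an assumption-laden illustration: `suspicious` and `find_broken_strings` are hypothetical helpers, and the payload shape (`runs`, `title`) is invented for the example rather than taken from the real `GET /api/autoruns/autogen/history` response.

```python
from typing import Any, Iterator, Tuple


def suspicious(text: str) -> bool:
    """True if the text contains U+FFFD or a C1 control (U+0080..U+009F)."""
    return any(ch == "\ufffd" or "\u0080" <= ch <= "\u009f" for ch in text)


def find_broken_strings(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Walk parsed JSON and yield (path, value) for every suspicious string."""
    if isinstance(node, str):
        if suspicious(node):
            yield path, node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_broken_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_broken_strings(value, f"{path}[{index}]")


# Hypothetical payload shape, for illustration only.
payload = {"runs": [{"title": "ОБЩ\ufffd?Й"}, {"title": "БОЛЬШОЙ ОБЩИЙ"}]}
broken = list(find_broken_strings(payload))
```

If the scan of the API payload comes back empty while the GUI still shows broken text, the stale data is more likely in frontend state or the browser cache than in the backend repair layer.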

 ---

@@ -221,4 +228,3 @@ Queue mapping:
 1. Async run is limited to `assistant_stage1`.
 2. Live-data quality depends on how completely the runtime populates the session files.
 3. Post-analysis relies on actual manual labeling; without it the queues are empty.
-