Update the agentic loop documents
parent 3be06b5f93
commit b625f9af5b
@@ -23,6 +23,27 @@ From this point forward:
|
|||
|
||||
For the current execution spine, read `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`.
|
||||
|
||||
## 2026-05-10 Overlay - Agentic Loop And Autorun Hygiene
|
||||
|
||||
The next active operating layer is now the repo-native `Agentic Semantic Development Loop`, not another isolated route patch.
|
||||
|
||||
Current interpretation:
|
||||
|
||||
- the Open-World Semantic Control Gate remains the semantic pressure surface;
|
||||
- the stage-loop is the development operating system around that surface: generate/review/replay/audit/repair/rerun, then save AGENT autoruns only after reviewed acceptance;
|
||||
- Lead Codex remains the repair brain, while the loop produces strong business-audit artifacts and lead-coder handoff instead of relying on a weak autonomous coder;
|
||||
- the first dogfood loop artifact is accepted at `artifacts/domain_runs/stage_agent_loops/agentic_semantic_development_loop/domain_loops/asl/final_status.md`;
|
||||
- manual GUI confirmation remains required after accepted replay artifacts;
|
||||
- autorun/runtime Cyrillic hygiene is now part of the acceptance surface, because broken saved-session text can invalidate the GUI review even when the backend route is correct.
|
||||
|
||||
Fresh validation cut:
|
||||
|
||||
- commit `3be06b5 Починить восстановление кириллицы в автопрогонах` ("Fix Cyrillic recovery in autoruns");
|
||||
- targeted mojibake/autorun/runtime tests passed `20/20`;
|
||||
- targeted organization-clarification carryover tests passed `2/2`;
|
||||
- `npm.cmd run build` passed;
|
||||
- graphify rebuilt to `6371` nodes, `14048` edges, `141` communities.
|
||||
|
||||
## Current Module Map
|
||||
|
||||
- `Post-F Semantic Integrity Hardening`: `99%`, operationally closed as a hardening slice and now used as a regression gate.
|
||||
|
|
@@ -55,9 +76,11 @@ For the current execution spine, read `23 - current_execution_spine_and_semantic
|
|||
- Completed active slice: `Business Overview Counterparty/Contract Profile Bridge`: business overview now executes reviewed `counterparty_population_and_roles` and `contract_usage_overview` recipes, surfacing active counterparty role split and contract usage without claiming CRM quality, counterparty due diligence, legal completeness, or contract-risk.
|
||||
- Completed active slice: `Business Overview Missing Proof Ledger`: business overview now records machine-readable hard proof gaps for accounting profit/margin, due-date debt aging, inventory reserve/liquidation quality, and vendor/procurement quality, distinguishing proxy-only evidence from reviewed routes that are not wired yet.
|
||||
- Completed semantic-control slice: `W5/W7 Counterparty Value-Flow And Money-Breakdown Integrity`: bank-document/value-flow recipes now materialize explicit counterparty predicates, zero-row supplier-payment checks answer as checked negative evidence, compound money-breakdown wording stays in `business_overview`, and MCP discovery receives active organization scope only when the current turn has no explicit organization.
|
||||
- Completed operating-system slice: `Agentic Semantic Development Loop Dogfood Gate`: stage manifest, stage pack, stage-loop wrapper, review/status/continue safety, lead-coder handoff, and save-after-acceptance gating are wired and accepted by the `asl` dogfood loop artifact.
|
||||
- Completed hygiene slice: `Autorun Cyrillic C1 Repair`: old autorun cards/questions/runtime materialization now repair C1-control mojibake before UI or assistant-lane use, including the historical `БОЛЬШОЙ ОБЩИЙ` / `АЛЬТЕРНАТИВА` failure class.
|
||||
- Implementation breadth: `~99% (Open-World Bounded Autonomy Breadth through Slice 25)`.
|
||||
- Next active slice: `Open-World Semantic Control Gate`, covering garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape.
|
||||
- Active module progress: `~99% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)`.
|
||||
- Next active slice: continue `Agentic Semantic Development Loop` dogfood over real Open-World/Semantic-Control packs and confirm the latest autorun hygiene fix in the GUI.
|
||||
- Active module progress: `~99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)`.
|
||||
|
||||
## Reporting Rule
|
||||
|
||||
|
|
@@ -66,6 +89,7 @@ Use these labels when reporting progress:
|
|||
- `Прогресс модуля: 99% (Post-F Semantic Integrity Hardening, operationally closed/regression gate)` when discussing the Post-F slice itself.
|
||||
- `Прогресс модуля: 100% (Planner Autonomy Consolidation, declared phase83 slice closed)` when discussing the planner-autonomy slice that was just completed.
|
||||
- `Прогресс модуля: 99% (Open-World Bounded Autonomy Breadth, active slice: Semantic Control Gate)` while discussing current module closure after the EHMO-derived critical subset accepted live again with W5/W7 hardening.
|
||||
- `Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)` when discussing the current development-loop operating layer.
|
||||
- `Open-World Business Overview implementation breadth: ~99%, Semantic Control Gate critical subset accepted, fat GUI pack still pending` when discussing only the already wired Slice 25 breadth.
|
||||
- `Прогресс модуля: X% (Open-World Bounded Autonomy Breadth, active slice: <name>)` for later breadth work after the Semantic Control Gate is accepted.
|
||||
|
||||
|
|
@@ -98,6 +122,8 @@ The project is not yet a universal arbitrary-1C agent.
|
|||
|
||||
Remaining work belongs to the next breadth module:
|
||||
|
||||
- confirm the latest autorun Cyrillic hygiene cut in the GUI after backend refresh and inspect frontend/API payloads if old replacement characters remain visible;
|
||||
- continue dogfooding the `Agentic Semantic Development Loop` on real stage packs, especially generated-question quality, semantic business audit, repair handoff, and rerun acceptance;
|
||||
- finish closure of the `Open-World Semantic Control Gate` opened by `assistant-stage1-EHMOy3lNFt`; the EHMO-derived critical subset is accepted live after W5/W7 hardening, but the fat GUI pack and residual answer-shape roughness still need final review;
|
||||
- extend `business_overview` beyond money-flow/activity, customer and supplier concentration, document/account-section activity mix, counterparty role split, contract usage, yearly operating-flow dynamics, explicit profit/margin wording boundaries, explicit debt due-date wording boundaries, explicit inventory reserve/liquidation wording boundaries, explicit supplier/procurement-quality wording boundaries, explicit-period VAT/tax, as-of-date debt position, open-settlement concentration, contract-date debt age, debt staleness-risk proxy, as-of-date inventory position, trading-margin proxy, sales-to-stock inventory proxy, warehouse staleness-risk proxy, and the missing-proof ledger into separately proven exact accounting profit/margin, due-date debt aging/overdue, confirmed vendor-risk/procurement-quality analysis, and confirmed reserve/write-off/liquidation inventory evidence families;
|
||||
- broader dynamic schema traversal for unfamiliar 1C asks;
|
||||
|
|
@@ -120,11 +146,12 @@ For current planning, read:
|
|||
|
||||
1. `README.md`
|
||||
2. this document
|
||||
3. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
|
||||
4. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
|
||||
5. `20 - planner_autonomy_consolidation_2026-05-01.md`
|
||||
6. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md`
|
||||
7. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
|
||||
8. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`
|
||||
3. `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md`
|
||||
4. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
|
||||
5. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
|
||||
6. `20 - planner_autonomy_consolidation_2026-05-01.md`
|
||||
7. `19 - inventory_stock_open_world_breadth_proof_2026-05-01.md`
|
||||
8. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
|
||||
9. `16 - data_need_graph_and_open_world_mcp_plan_2026-04-22.md`
|
||||
|
||||
Documents `01` through `15` remain valuable, but mostly as the historical architecture trail.
|
||||
|
|
|
|||
|
|
@@ -73,6 +73,22 @@ This is not a regression from `99%` to `96%`. It is a metric split:
|
|||
- `99%` describes wired breadth;
|
||||
- `99%` describes closure confidence after the EHMO-derived critical subset passed live replay again with W5/W7 hardening; the gate is still not full module closure until the fat manual GUI pack and remaining answer-shape residuals are reviewed.
|
||||
|
||||
## 2026-05-10 Status Overlay
|
||||
|
||||
This document remains the Semantic Control Gate spine, but it is no longer the latest operating overlay.
|
||||
|
||||
Read `24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md` after this file.
|
||||
|
||||
Newer status:
|
||||
|
||||
- the current development operating layer is `Agentic Semantic Development Loop`;
|
||||
- the first dogfood loop artifact for `agentic_semantic_development_loop` is accepted under `artifacts/domain_runs/stage_agent_loops/.../asl/final_status.md`;
|
||||
- the loop uses Lead Codex handoff for repairs, not a weak autonomous coder as the primary repair actor;
|
||||
- accepted AGENT autoruns should be saved only after reviewed replay/loop acceptance;
|
||||
- latest autorun/runtime hygiene cut repairs old C1-control Cyrillic mojibake before GUI cards, runtime questions, or assistant-lane continuation.
|
||||
|
||||
The Semantic Control Gate work units below remain valid regression classes, but current execution should go through the stage-loop machinery when a substantial pack is being validated.
|
||||
|
||||
## Current Local Cut
|
||||
|
||||
Local cut 1 is implemented:
|
||||
|
|
|
|||
|
|
@@ -0,0 +1,138 @@
|
|||
# 24 - Agentic Semantic Development Loop And Autorun Hygiene (2026-05-10)
|
||||
|
||||
## Purpose
|
||||
|
||||
This note is the current status overlay after the Open-World Semantic Control Gate work moved from direct manual Codex operation into a repo-native agentic development loop.
|
||||
|
||||
It exists to keep the project spine clear:
|
||||
|
||||
- the target product is still a bounded MCP-first 1C analyst assistant;
|
||||
- Post-F, planner autonomy, inventory breadth, and business overview remain regression gates;
|
||||
- the active operational work is now the development loop that generates, reviews, replays, audits, repairs, reruns, and only then saves AGENT autoruns;
|
||||
- autorun/runtime hygiene is part of that loop, because broken saved-session text can invalidate the human GUI review even when the backend route is correct.
|
||||
|
||||
## What Changed Since Document 23
|
||||
|
||||
The previous execution-spine document stopped at the EHMO-derived Semantic Control Gate.
|
||||
|
||||
Since then, the project added a repo-native stage-loop layer:
|
||||
|
||||
- `scripts/stage_agent_loop.py` is the stage-level wrapper for pack-loop execution, review, status, safe continuation, and optional save-to-autoruns after acceptance.
|
||||
- `docs/orchestration/stage_agent_loop_agentic_semantic_development_loop.json` is the active stage manifest.
|
||||
- `docs/orchestration/agentic_semantic_development_loop_stage_pack.json` is the dogfood pack for business overview, VAT, stale scope, counterparty pivots, legacy route canaries, and answer-shape quality.
|
||||
- `docs/orchestration/schemas/stage_agent_loop_manifest.schema.json` defines the manifest contract.
|
||||
- `scripts/save_agent_semantic_run.py` refuses to save AGENT autoruns before reviewed live replay/loop acceptance unless explicitly forced as a draft.
|
||||
- `docs/orchestration/agent_semantic_source_catalog.*` remains the reusable source catalog for mixed AGENT pack construction.
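The save-after-acceptance rule described for `scripts/save_agent_semantic_run.py` can be pictured with a minimal sketch. The `LoopStatus` fields and the `may_save_autorun` helper are hypothetical names chosen for illustration; the real script's interface and artifact schema are not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopStatus:
    """Hypothetical summary of a loop artifact (not the real schema)."""
    status: str            # e.g. "accepted" or "needs_repair"
    replay_reviewed: bool  # True once the live replay has been reviewed


def may_save_autorun(loop: LoopStatus, force_draft: bool = False) -> tuple[bool, str]:
    """Return (allowed, label) for saving an AGENT autorun."""
    if loop.status == "accepted" and loop.replay_reviewed:
        return True, "accepted"
    if force_draft:
        # Explicit override: the run is saved, but only labeled as a draft.
        return True, "draft"
    return False, "refused"
```

Under these assumptions, an unreviewed or unaccepted loop can only produce a draft, and only when the caller asks for one explicitly.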
|
||||
|
||||
The accepted dogfood status is recorded at:
|
||||
|
||||
- `artifacts/domain_runs/stage_agent_loops/agentic_semantic_development_loop/domain_loops/asl/final_status.md`
|
||||
|
||||
Current recorded result:
|
||||
|
||||
- status: `accepted`
|
||||
- loop id: `asl`
|
||||
- repair mode: `lead-handoff`
|
||||
- target score: `88`
|
||||
- iterations ran: `1`
|
||||
- stop reason: analyst accepted plus deterministic gate passed at `iteration_00`
|
||||
- manual GUI confirmation is still required after acceptance
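The recorded stop reason combines two signals: analyst acceptance and a passing deterministic gate. A rough sketch of such a stop decision follows. The function name, its parameters, and the iteration budget are assumptions for illustration. How the target score (`88`) feeds into acceptance is not described here, so the sketch leaves it out.

```python
def should_stop(analyst_accepted: bool, gate_passed: bool,
                iteration: int, max_iterations: int) -> tuple[bool, str]:
    """Decide whether the loop stops after the given zero-based iteration."""
    if analyst_accepted and gate_passed:
        # Both signals are required; either one alone is not acceptance.
        return True, f"accepted at iteration_{iteration:02d}"
    if iteration + 1 >= max_iterations:
        # Assumed fallback: stop and hand the case to the lead coder.
        return True, "iteration budget exhausted; lead-coder handoff"
    return False, "continue"
```

With `should_stop(True, True, 0, 3)` the sketch stops at `iteration_00`, which mirrors the single-iteration result recorded above.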
|
||||
|
||||
## Design Decision
|
||||
|
||||
The project should not rely on a weak autonomous coder as the primary repair actor.
|
||||
|
||||
The chosen model is:
|
||||
|
||||
- Lead Codex remains the responsible repair brain.
|
||||
- A strong independent semantic/business audit layer reviews the replay from the user's business meaning first.
|
||||
- The stage-loop produces machine-readable artifacts and a lead-coder handoff instead of silently patching production code.
|
||||
- The loop can continue safely, but real code repair requires explicit execution mode and must be validated by rerun artifacts.
|
||||
- Human GUI confirmation remains the final high-signal reality check for accepted AGENT packs.
|
||||
|
||||
This preserves the user's desired automation pattern without delegating high-risk architecture repair to a low-context worker.
|
||||
|
||||
## Autorun Cyrillic Hygiene Cut
|
||||
|
||||
The GUI exposed an old saved-session failure where Cyrillic in autorun cards/questions displayed as replacement-character text.
|
||||
|
||||
Control examples after repair:
|
||||
|
||||
- `БОЛЬШОЙ ОБЩИЙ`
|
||||
- `АЛЬТЕРНАТИВА`
|
||||
|
||||
Root cause:
|
||||
|
||||
- old autorun history/runtime payloads contained double-decoded Cyrillic plus C1 controls such as `U+0098`;
|
||||
- the previous repair path encoded those controls as normal Windows-1251 text and lost the raw byte needed to reconstruct UTF-8;
|
||||
- the UI then faithfully displayed the replacement character that remained.
|
||||
|
||||
Current fix:
|
||||
|
||||
- `addressTextRepair.ts` preserves C1 controls as raw bytes during UTF-8 reconstruction;
|
||||
- `autoRuns.ts` repairs autorun titles/questions before exposing cards or runtime materialization;
|
||||
- `eval.ts` and `assistantService.ts` now receive repaired scenario/runtime question text;
|
||||
- known already-lossy fragments with `U+FFFD` inside the organization name are repaired before they can poison organization clarification;
|
||||
- tests now cover the historical `БОЛЬШОЙ ОБЩИЙ` and `АЛЬТЕРНАТИВА` autorun cases.
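The failure class can be reproduced in a few lines. The sketch below is Python and only mirrors the idea; it is not the code in `addressTextRepair.ts`, and all function names are hypothetical. It assumes the damage came from UTF-8 bytes being misread as Windows-1251, where byte `0x98` has no character and surfaces as the C1 control `U+0098`. The letter `И` is `D0 98` in UTF-8, which is why both control examples are affected.

```python
def to_mojibake(text: str) -> str:
    """Simulate the damage: UTF-8 bytes misread as Windows-1251."""
    out = []
    for byte in text.encode("utf-8"):
        # 0x98 is unassigned in Windows-1251; model it as the C1 control U+0098.
        out.append(chr(byte) if byte == 0x98 else bytes([byte]).decode("cp1251"))
    return "".join(out)


def lossy_repair(text: str) -> str:
    """Simplified stand-in for the old path: C1 controls are encoded as text."""
    raw = text.encode("cp1251", errors="replace")  # U+0098 becomes b"?"
    return raw.decode("utf-8", errors="replace")   # broken pair becomes U+FFFD


def repair_mojibake(text: str) -> str:
    """Rebuild the UTF-8 bytes, keeping C1 controls as raw bytes."""
    raw = bytearray()
    for ch in text:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            raw.append(code)  # C1 control: preserve the original byte value
            continue
        try:
            raw.extend(ch.encode("cp1251"))
        except UnicodeEncodeError:
            return text  # not Windows-1251 mojibake; leave untouched
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return text  # bytes are not valid UTF-8, so the text was likely clean
```

Round-tripping `БОЛЬШОЙ ОБЩИЙ` through `to_mojibake` and `repair_mojibake` restores the original, while `lossy_repair` leaves a `U+FFFD` where `И` used to be. Returning the input unchanged on any failure keeps already-clean Cyrillic safe in this sketch.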
|
||||
|
||||
Committed cut:
|
||||
|
||||
- `3be06b5 Починить восстановление кириллицы в автопрогонах` ("Fix Cyrillic recovery in autoruns")
|
||||
|
||||
Validation recorded during this cut:
|
||||
|
||||
- `npm.cmd test -- assistantOrganizationMatcher.test.ts addressTextRepair.test.ts autoRunsQuestionSplit.test.ts evalRuntimeQuestionSplit.test.ts` passed `20/20`;
|
||||
- `npm.cmd test -- assistantAddressFollowupContext.test.ts -t "continues the original inventory query after"` passed `2/2`;
|
||||
- `npm.cmd run build` passed;
|
||||
- graphify rebuilt to `6371` nodes, `14048` edges, `141` communities.
|
||||
|
||||
## Current Status
|
||||
|
||||
The current large module should be described as:
|
||||
|
||||
`Agentic Semantic Development Loop / Open-World Semantic Control operating system`
|
||||
|
||||
Status:
|
||||
|
||||
- implementation state: operational dogfood loop exists and has an accepted first loop artifact;
|
||||
- semantic status: accepted loop artifact is useful, but manual GUI confirmation remains required;
|
||||
- hygiene status: saved autorun/runtime Cyrillic repair is now covered by code and tests;
|
||||
- risk: medium, because the loop is now infrastructure for future acceptance decisions, not just a local route fix.
|
||||
|
||||
Recommended reporting line:
|
||||
|
||||
`Прогресс модуля: 99% (Agentic Semantic Development Loop, accepted dogfood loop + autorun hygiene; manual GUI confirmation still required)`
|
||||
|
||||
## What Is Not Closed
|
||||
|
||||
Do not treat the stage-loop as a replacement for business-answer review.
|
||||
|
||||
Still open:
|
||||
|
||||
- frontend/browser cache can still show old broken autorun text until the backend is restarted and the UI is refreshed;
|
||||
- the first accepted dogfood loop proves the mechanism, not all future stage packs;
|
||||
- generated question quality still needs pressure from real GUI runs and user feedback;
|
||||
- broad arbitrary 1C autonomy is still bounded by reviewed routes, truth gates, and replay evidence;
|
||||
- manual GUI confirmation remains required before declaring a fat AGENT pack fully accepted.
|
||||
|
||||
## Next Work
|
||||
|
||||
Next operational pass:
|
||||
|
||||
1. Restart/reload the backend and reopen the affected autorun card from the GUI.
|
||||
2. Confirm old saved-session text now displays as normal Cyrillic.
|
||||
3. If the GUI still shows replacement characters, inspect frontend state/cache and API payload side by side.
|
||||
4. Continue dogfooding the stage-loop on the next real Open-World/agentic pack.
|
||||
5. Keep Post-F, phase83, inventory, business-overview, and mojibake autorun cases as regression canaries.
|
||||
|
||||
## Canonical Reading Order Update
|
||||
|
||||
For current planning, read:
|
||||
|
||||
1. `README.md`
|
||||
2. `21 - current_status_canon_2026-05-01.md`
|
||||
3. this document
|
||||
4. `23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md`
|
||||
5. `22 - open_world_bounded_autonomy_breadth_2026-05-01.md`
|
||||
6. `20 - planner_autonomy_consolidation_2026-05-01.md`
|
||||
7. `17 - post_f_semantic_integrity_hardening_2026-04-23.md`
|
||||
|
|
@@ -41,13 +41,18 @@ This package answers the next question:
|
|||
21. [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md)
|
||||
22. [22 - open_world_bounded_autonomy_breadth_2026-05-01.md](./22%20-%20open_world_bounded_autonomy_breadth_2026-05-01.md)
|
||||
23. [23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md)
|
||||
24. [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md)
|
||||
|
||||
## Current Status Snapshot (2026-05-05)
|
||||
## Current Status Snapshot (2026-05-10)
|
||||
|
||||
This package is no longer planning-only.
|
||||
|
||||
Status canon for planning:
|
||||
|
||||
- The current operational overlay is now [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md).
|
||||
- The active engineering surface is no longer only individual route hardening; it is the repo-native AGENT/stage-loop operating system that should generate/review/replay/audit/repair/rerun current-stage packs before saving accepted autoruns.
|
||||
- The first dogfood stage loop for `agentic_semantic_development_loop` is accepted in artifacts, but manual GUI confirmation remains required before treating a fat AGENT pack as fully human-accepted.
|
||||
- Autorun/runtime Cyrillic hygiene is now a current regression gate: old saved-session mojibake with C1 controls must be repaired before cards, questions, and runtime jobs reach the GUI or assistant lane.
|
||||
- Post-F Semantic Integrity Hardening is operationally closed at `99%` and should now be used as a regression gate, not as the active module denominator.
|
||||
- Planner Autonomy Consolidation is closed at `100%` for the declared phase83 planner-brain slice.
|
||||
- The active next module is now `Open-World Bounded Autonomy Breadth` over unfamiliar 1C asks, with Post-F and phase83 retained as semantic canaries.
|
||||
|
|
@@ -78,8 +83,11 @@ Status canon for planning:
|
|||
- The `assistant-stage1-EHMOy3lNFt` manual GUI replay opened the next acceptance gate: `Open-World Semantic Control Gate`.
|
||||
- The `~99%` Open-World number now means implementation breadth through Slice 25, not accepted semantic closure under broad human dialogue pressure.
|
||||
- The active breadth slice is semantic control rather than new proof-family expansion: garbage-anchor protection, business-overview continuation, intent dominance, frame hygiene, counterparty/organization arbitration, and final-summary answer shape.
|
||||
- The current accepted dogfood infrastructure slice is `Agentic Semantic Development Loop`: stage manifest, stage pack, loop wrapper, status/continue safety, strong business-audit handoff, and save-after-acceptance gating are wired and validated by the `asl` accepted loop artifact.
|
||||
- The latest hygiene slice is `Autorun Cyrillic C1 Repair`: `addressTextRepair`, `autoRuns`, `eval`, and `assistantService` now preserve C1 bytes while repairing old saved-session Russian text, preventing replacement-character autorun cards or runtime turns from leaking into the user path after backend refresh.
|
||||
- The short source of truth for status wording is [21 - current_status_canon_2026-05-01.md](./21%20-%20current_status_canon_2026-05-01.md).
|
||||
- The current execution spine after EHMO is [23 - current_execution_spine_and_semantic_control_gate_2026-05-05.md](./23%20-%20current_execution_spine_and_semantic_control_gate_2026-05-05.md).
|
||||
- The current stage-loop/hygiene overlay after the AGENT dogfood cut is [24 - agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md](./24%20-%20agentic_semantic_development_loop_and_autorun_hygiene_2026-05-10.md).
|
||||
|
||||
It now documents a turnaround that is already operational in code, already materially past the acute regression breakpoint, and already moved through bounded MCP autonomy, Post-F hardening, inventory breadth proof, and the declared Planner Autonomy slice:
|
||||
|
||||
|
|
|
|||
|
|
@@ -3,7 +3,7 @@
|
|||
This document describes the practical workflow the operator uses in the
`История автопрогонов` (Autorun History) interface: question generation, run launches, answer labeling, case closure, and post-analysis.
|
||||
|
||||
Last updated: `2026-04-09`
|
||||
Last updated: `2026-05-10`
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -213,6 +213,13 @@ Queue mapping:
|
|||
6. The "скрыть выполненные" (hide completed) filter correctly excludes `resolved=true`.
|
||||
7. Post-analysis shows the queues and candidates.
|
||||
8. Interface text is readable, with no mojibake.
|
||||
9. Old saved autoruns with C1-control mojibake in `history.json` and the runtime job payload must be served by the backend with Cyrillic already restored; control examples: `БОЛЬШОЙ ОБЩИЙ`, `АЛЬТЕРНАТИВА`.
|
||||
|
||||
Important after encoding/autorun changes:
|
||||
|
||||
- restart the backend so the UI picks up the new repair layer;
|
||||
- refresh the autorun list in the browser;
|
||||
- if the replacement character `U+FFFD` is still visible, compare the API payload from `GET /api/autoruns/autogen/history` with the frontend state and browser cache.
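When comparing the API payload with what the browser shows, it may help to scan the parsed JSON for damaged strings first. The helper below is a hypothetical sketch, not part of the repo: it assumes the payload is already parsed into Python dicts and lists, and it flags any string that contains `U+FFFD` or a C1 control character.

```python
from typing import Any, Iterator


def find_suspect_text(payload: Any, path: str = "$") -> Iterator[tuple[str, str]]:
    """Yield (json_path, text) for strings containing U+FFFD or C1 controls."""
    if isinstance(payload, str):
        if any(ch == "\ufffd" or "\u0080" <= ch <= "\u009f" for ch in payload):
            yield path, payload
    elif isinstance(payload, dict):
        for key, value in payload.items():
            yield from find_suspect_text(value, f"{path}.{key}")
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            yield from find_suspect_text(item, f"{path}[{index}]")
```

If the scan of the backend response comes back empty while the GUI still shows broken text, the stale copy is more likely in frontend state or the browser cache than in the backend.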
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -221,4 +228,3 @@ Queue mapping:
|
|||
1. Async run is limited to `assistant_stage1`.
|
||||
2. Live-data quality depends on how fully the runtime populates the session files.
|
||||
3. Post-analysis is based on actual manual labeling; without it the queues are empty.
|
||||
|
||||
|
|
|
|||