NODEDC_1C/AGENTS.md at 62f9bad750c369960d8e5aacc4c44a05bc1c18c5

encoding_rule

All source/code/config/docs files must be saved and edited in UTF-8 without BOM; never write mojibake placeholders or replacement characters.
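The encoding rule above can be checked mechanically for two of its three conditions. The following is a minimal sketch, not part of the repository: the function names `check_encoding` and `check_file` are hypothetical. It detects a UTF-8 BOM, invalid UTF-8, and the U+FFFD replacement character; it does not try to detect mojibake that happens to be valid UTF-8, which still needs human review.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"
REPLACEMENT_CHAR = "\ufffd"


def check_encoding(data: bytes) -> list[str]:
    """Return a list of encoding problems found in the raw bytes of one file."""
    problems = []
    if data.startswith(UTF8_BOM):
        problems.append("starts with a UTF-8 BOM")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Invalid UTF-8: report the offset and stop, the text cannot be inspected.
        problems.append(f"not valid UTF-8 at byte {exc.start}")
        return problems
    if REPLACEMENT_CHAR in text:
        problems.append("contains U+FFFD replacement character")
    return problems


def check_file(path: Path) -> list[str]:
    """Hypothetical convenience wrapper: check one file on disk."""
    return check_encoding(path.read_bytes())
```

An empty list means the file passed these checks.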

commit_message_rule

After applying fixes, always provide the user with a ready commit title in Russian.

change_risk_rule

After applying fixes, always provide the line "Степень опасности правки: X/10" (change risk level) immediately above the ready commit title.
The score must use an integer scale from 1 to 10, where:
- 1 = low-risk local change with narrow blast radius;
- 10 = high-risk architecture/runtime change with broad blast radius and mandatory close validation.
The score must reflect real project risk, not optimism, and should help the user decide how much manual attention and replay validation the change deserves.

closeout_risk_reporting_rule

After applying fixes, always provide the line "Потенциал регресса на текущем этапе: X%" (regression potential at the current stage).
After applying fixes, always provide the line "Необходимость жирного ручного прогона: X%" (need for a heavy manual run).
These two lines must be emitted together with the change-risk score and the ready commit title in every close-out.
Both percentages must use an integer scale from 0% to 100%.
Потенциал регресса на текущем этапе must reflect the real probability that nearby or not-yet-covered contours can regress at the current stabilization stage.
Необходимость жирного ручного прогона must reflect how strongly the current change still needs a broad manual reality-check beyond unit tests, narrow replay, and build verification.
The percentages must be honest, architecture-aware, and useful for deciding whether the current pass is safe enough to trust without additional human validation.
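The close-out rules above fix both the value ranges and the relative position of the lines. A minimal sketch of a formatter that enforces them follows. The function name `format_closeout` and its parameter names are hypothetical. Putting the two percentage lines first is an assumption; the rules only require that they appear together with the score and that the score sits immediately above the commit title.

```python
def _require_int_in_range(name: str, value: int, low: int, high: int) -> None:
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an integer")
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}")


def format_closeout(risk: int, regress_pct: int, manual_run_pct: int,
                    commit_title: str) -> str:
    """Build the close-out block: two percentages, risk score, commit title."""
    _require_int_in_range("risk", risk, 1, 10)
    _require_int_in_range("regress_pct", regress_pct, 0, 100)
    _require_int_in_range("manual_run_pct", manual_run_pct, 0, 100)
    if not commit_title.strip():
        raise ValueError("commit_title must not be empty")
    lines = [
        f"Потенциал регресса на текущем этапе: {regress_pct}%",
        f"Необходимость жирного ручного прогона: {manual_run_pct}%",
        # The risk score goes immediately above the commit title.
        f"Степень опасности правки: {risk}/10",
        commit_title,
    ]
    return "\n".join(lines)
```

The sketch only validates form. Whether the numbers are honest remains a judgment the rules leave to the agent.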

graphify

This project has a graphify knowledge graph at graphify-out/.

Rules:

- Before answering architecture or codebase questions, read graphify-out/GRAPH_REPORT.md for god nodes and community structure.
- If graphify-out/wiki/index.md exists, navigate it instead of reading raw files.
- After modifying code files in this session, run python -c "from graphify.watch import _rebuild_code; from pathlib import Path; _rebuild_code(Path('.'))" to keep the graph current.
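The first two rules above amount to a small lookup order. A minimal sketch, assuming the project root is passed in; the helper name `graph_reading_list` is hypothetical and is not part of graphify.

```python
from pathlib import Path


def graph_reading_list(project_root: Path) -> list[Path]:
    """Return the graphify files to consult before answering, in order."""
    out_dir = project_root / "graphify-out"
    # The report is always read first for god nodes and community structure.
    reading_list = [out_dir / "GRAPH_REPORT.md"]
    wiki_index = out_dir / "wiki" / "index.md"
    if wiki_index.exists():
        # When the wiki exists, it is navigated instead of raw source files.
        reading_list.append(wiki_index)
    return reading_list
```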

codex_domain_loop

Project-scoped Codex orchestration lives under .codex/.
Use .codex/skills/domain-case-loop for repeatable domain hardening loops on one concrete case.
Prefer docs/orchestration/active_domain_contract.json as the single mutable source of truth for the current domain/scenario pack; keep the agent canon stable and swap only this file when the active domain changes.
The same skill/launcher also supports multi-step domain scenarios with shared assistant session state under artifacts/domain_runs/<scenario_id>/steps/.
For full domain question pools, use pack mode and aggregate artifacts under artifacts/domain_runs/<pack_id>/scenarios/.
Preserve current architecture: domain loop may automate capture, review, rerun, and artifact storage, but must not rewrite runtime foundations.
Prefer machine-readable case artifacts in artifacts/domain_runs/<case_id>/, especially baseline_turn.json / rerun_turn.json, over ad hoc prose-only summaries.
For cascading user questions in one domain, prefer scenario artifacts (scenario_manifest.json, scenario_state.json, per-step turn.json) over separate unlinked case folders.
For follow-up-heavy domains, treat acceptance as scenario-tree coverage: root node, critical child nodes, critical edges, and the primary user path must be validated explicitly.
Do not accept a domain when only the root snapshot works but selected-object or drilldown follow-up edges still fail.
For critical branches, validate at least canonical wording, colloquial wording, and UI-generated selected-object wording when that UX exists.
Treat temporal carryover, selected-object carryover, answer-shape match, and ordering semantics as first-class acceptance invariants rather than optional polish.
Treat direct-answer-first behavior, business usefulness, selected-object memory, and field truthfulness as first-class analyst criteria rather than optional presentation polish.
Treat stable focus_object, reusable bundles such as provenance_bundle, and pronoun-style follow-up resolution ("по ней", "по этой позиции", that is, "for it", "for this position") as first-class analyst criteria in follow-up-heavy domains.
Treat action-first selected-object follow-ups, layered answer shape, stable answer_object, and temporal honesty about out-of-window evidence as first-class analyst criteria rather than optional polish.
If a case falls outside the current routed contour because the route/intent/capability is not wired yet, treat it as domain enablement work for this project, not as automatic out-of-scope rejection.
For new unmarked domains, needs_exact_capability means "bootstrap or extend the contour" rather than "close the case as unsupported".
A case can be marked accepted only when analyst verdict is at least 80/100, no unresolved P0 remains, and the rerun does not mask heuristic output as confirmed.
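The acceptance condition above is a conjunction of three checks and can be written as a single predicate. A minimal sketch follows. The `CaseReview` type and its field names are hypothetical and do not reflect the actual schema of the case artifacts.

```python
from dataclasses import dataclass


@dataclass
class CaseReview:
    analyst_verdict: int                  # analyst score on a 0..100 scale
    unresolved_p0: int                    # number of P0 issues still open
    heuristic_masked_as_confirmed: bool   # rerun presents heuristic output as confirmed


def is_accepted(review: CaseReview) -> bool:
    """Apply the three acceptance conditions; all of them must hold."""
    return (
        review.analyst_verdict >= 80
        and review.unresolved_p0 == 0
        and not review.heuristic_masked_as_confirmed
    )
```

A high verdict alone is not enough: one open P0 or one masked heuristic answer blocks acceptance.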

agent_semantic_runs

АГЕНТНЫЙ ПРОГОН ("agent run", also written as AGENT run in this section) is a targeted full semantic replay for the current architecture fix, not a generic smoke test.
Use it to validate human user questions, human model answers, technical chats, business logic, and system routing together.
Build question lists around the active fix: mix direct domain questions with contextual chains, meta interruptions, cross-domain pivots, and follow-up edges that specifically hit the architecture change under validation.
Do not run or save an АГЕНТНЫЙ ПРОГОН on every turn by default.
Run it when the user explicitly asks for it, or when a substantial architecture/domain fix needs critical semantic proof beyond unit tests and narrow synthetic checks.
АГЕНТНЫЙ ПРОГОН has a mandatory execution order. The correct order is:
1. prepare or update the replay spec;
2. run the replay live against the real assistant runtime;
3. inspect machine artifacts and judge business/logic/technical quality;
4. patch architecture/domain code if needed;
5. rerun the same replay until the scenario is semantically clean;
6. only after that, save the question pack into autoruns as legacy.
Do not treat "questions were saved into autoruns" as "the AGENT run was executed". Saving questions is not the run. It is only a post-run persistence step.
Preferred repo-native system tools for АГЕНТНЫЙ ПРОГОН are:
- build/update a mixed pack from reusable sources: python scripts/agent_semantic_pack_builder.py build-pack --recipe <recipe> --output-spec docs/orchestration/<spec>.json
- bootstrap a spec from a technical export: python scripts/domain_truth_harness.py bootstrap --export <export.md> --output docs/orchestration/<spec>.json --scenario-id <scenario_id> --domain <domain>
- execute the real replay: python scripts/domain_truth_harness.py run-live --spec docs/orchestration/<spec>.json --output-dir artifacts/domain_runs/<run_id>
- save the already-validated replay into autoruns: python scripts/save_agent_semantic_run.py --spec docs/orchestration/<spec>.json
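The ordering constraint (run and review first, save last) can be made explicit when the commands are assembled programmatically. A minimal sketch follows: the script paths and flags are copied from the list above, while the function names and the `replay_reviewed_clean` flag are hypothetical. The functions only build argument lists and execute nothing.

```python
def run_live_command(spec_path: str, output_dir: str) -> list[str]:
    """Build the argv for the live replay step."""
    return [
        "python", "scripts/domain_truth_harness.py", "run-live",
        "--spec", spec_path,
        "--output-dir", output_dir,
    ]


def save_command(spec_path: str, replay_reviewed_clean: bool) -> list[str]:
    """Build the argv for saving into autoruns, only after a clean reviewed replay."""
    if not replay_reviewed_clean:
        # Saving questions is a post-run persistence step, not the run itself.
        raise RuntimeError("run and review the live replay before saving to autoruns")
    return ["python", "scripts/save_agent_semantic_run.py", "--spec", spec_path]
```

The resulting lists can be passed to `subprocess.run` from the repository root.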
The default artifact-reading order after run-live is:
- artifacts/domain_runs/<run_id>/final_status.md
- artifacts/domain_runs/<run_id>/truth_review.md
- artifacts/domain_runs/<run_id>/pack_state.json
- artifacts/domain_runs/<run_id>/steps/<step_id>/turn.json
- artifacts/domain_runs/<run_id>/steps/<step_id>/output.md
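The reading order above can be expanded into concrete paths for a given run. A minimal sketch follows; the helper name `artifact_reading_order` is hypothetical. Reading `turn.json` and `output.md` step by step, rather than all `turn.json` files first, is an assumption: the list above does not say which is intended.

```python
from pathlib import Path


def artifact_reading_order(run_dir: Path, step_ids: list[str]) -> list[Path]:
    """Return artifact paths in the default review order for one run."""
    paths = [
        run_dir / "final_status.md",
        run_dir / "truth_review.md",
        run_dir / "pack_state.json",
    ]
    for step_id in step_ids:
        step_dir = run_dir / "steps" / step_id
        # Machine-readable turn first, then the rendered output for the same step.
        paths.append(step_dir / "turn.json")
        paths.append(step_dir / "output.md")
    return paths
```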
When reviewing a replay, do not trust only the top-level accepted/pass flag. A run can still hide a semantic bug if the step-level answer is business-wrong, logically wrong, context-leaking, or routed through the wrong lane.
Do not mislabel a valid clarification as a bug. If the assistant correctly asks the user to choose an organization/company because the active contour is ambiguous, that is normal behavior, not a regression.
For multi-company contours, the AGENT run must continue the same session after the clarification and explicitly choose the company needed for the scenario. Do not stop the analysis at "уточните организацию" ("specify the organization"); extend the replay with the natural next user turn that selects the company and then continue hardening the real business path.
If the replay reveals business-answer defects, logic defects, stale carryover, answer-shape mismatch, or technical routing bugs, fix the architecture/domain code first and rerun the same spec before saving anything to autoruns.
If the replay reveals a capability gap rather than a regression, do not frame it as "the system is buggy". Frame it as unfinished contour/domain enablement work and keep iterating until the missing path is either implemented or honestly bounded.
A blocked answer inside the replay is not the end of the analysis. The agent must ask why the system could not answer, inspect reachable MCP/1C evidence, and decide whether the missing business answer can be recovered by a new route, a new capability, or an evidence-based derived answer.
When the direct fact is unavailable in the current contour but recoverable from 1C activity evidence, prefer domain enablement work: fetch the supporting evidence via MCP/1C, derive the business-useful answer carefully, and state the derivation basis honestly. Example: if legal registration age is unavailable, the system may answer with age/activity duration inferred from the first and latest confirmed 1C activity, explicitly marked as an inference rather than a legal registration fact.
When a fact cannot be proven exactly, the user-facing answer must say what is confirmed, what is inferred, and what remains unknown. Do not present an inferred business estimate as a legal (юридический) or formally confirmed fact.
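The confirmed / inferred / unknown split, applied to the registration-age example, might look like the sketch below. The `EvidenceAnswer` type, the `activity_duration_answer` helper, and the English section labels are all hypothetical; the real assistant's answer shape and wording are not shown in this document.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class EvidenceAnswer:
    confirmed: list[str] = field(default_factory=list)
    inferred: list[str] = field(default_factory=list)
    unknown: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render non-empty sections so the three kinds of claims stay separate."""
        lines = []
        for title, items in (("Confirmed", self.confirmed),
                             ("Inferred", self.inferred),
                             ("Unknown", self.unknown)):
            if items:
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


def activity_duration_answer(first_activity: date, latest_activity: date) -> EvidenceAnswer:
    """Derive an activity duration from 1C evidence without claiming a legal fact."""
    days = (latest_activity - first_activity).days
    return EvidenceAnswer(
        confirmed=[
            f"First confirmed 1C activity: {first_activity.isoformat()}",
            f"Latest confirmed 1C activity: {latest_activity.isoformat()}",
        ],
        inferred=[f"Activity duration of about {days} days, derived from the two dates above"],
        unknown=["Legal registration date"],
    )
```

Keeping the three lists separate in the data structure makes it harder for an inference to reach the rendered text labeled as a confirmed fact.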
Save agent-built question packs into autoruns under Пользовательские сессии (User sessions) with the title prefix AGENT | ... only after the live replay has been executed and reviewed.
Agent semantic runs saved into autoruns must remain runnable by the user from the UI like any other saved user session.
If a pack was saved too early by mistake, treat it as an invalid intermediate artifact: remove its files from llm_normalizer/data/autorun_generators/saved_sessions/, llm_normalizer/data/eval_cases/, and its record from llm_normalizer/data/autorun_generators/history.json, then regenerate it only after the successful replay.
The goal of an AGENT run is not only to confirm routes but to actively improve the assistant until the problematic questions are handled acceptably. Run, inspect, fix, rerun, and repeat until the critical business questions in the scenario are no longer broken, misleading, or underpowered.
Evaluate the replay primarily through the user-facing business answer. Internal labels, raw route ids, capability ids, debug enums, snapshot_items, bank_operations_by_*, answer_object, and other service metadata are for diagnosis only; they must not leak into the user-facing answer and must not dominate the analyst verdict.
Treat "technical garbage in the final answer" as a real quality defect even when the underlying route is correct. The hardened assistant should surface business meaning first and keep internal mechanics out of the user's head unless the user explicitly asks for technical detail.
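A coarse automated check can flag the most obvious leaks before the analyst reads the answer. A minimal sketch follows: the marker list covers only the identifiers named above and would need extending for real route and capability ids, and the function name `find_internal_leaks` is hypothetical.

```python
import re

# Patterns for service metadata named in the rules above; illustrative, not exhaustive.
INTERNAL_MARKERS = [
    r"\bsnapshot_items\b",
    r"\bbank_operations_by_\w*",
    r"\banswer_object\b",
]


def find_internal_leaks(answer_text: str) -> list[str]:
    """Return internal identifiers that appear in a user-facing answer."""
    leaks = []
    for pattern in INTERNAL_MARKERS:
        leaks.extend(match.group(0) for match in re.finditer(pattern, answer_text))
    return leaks
```

An empty result does not prove the answer is clean; it only means none of the listed markers appeared.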
