Repo runtime map
Existing runtime pieces
- Assistant backend: llm_normalizer/backend
- Technical export formatter: llm_normalizer/frontend/src/utils/conversationExport.ts
- Async single-case runner: POST /api/eval/run-async/start
- Async polling: GET /api/eval/run-async/:job_id
- Session store path: llm_normalizer/data/assistant_sessions/<session_id>.json
- Session API: GET /api/assistant/session/:session_id
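
The async runner and polling endpoint listed above could be driven by a loop like the following. This is a minimal sketch, not the repo's actual client: the `status` field and its terminal values `done` and `failed` are assumptions about the response body, and `http_json` and `poll_job` are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://127.0.0.1:8787"  # default backend URL from this document


def http_json(method: str, path: str, payload: Optional[dict] = None) -> dict:
    """Send a JSON request to the backend and decode the JSON response."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_job(
    job_id: str,
    fetch: Callable[..., dict] = http_json,
    interval: float = 1.0,
    max_polls: int = 60,
    terminal: tuple = ("done", "failed"),  # ASSUMPTION: terminal status values
) -> dict:
    """Poll GET /api/eval/run-async/:job_id until a terminal status appears."""
    for _ in range(max_polls):
        body = fetch("GET", f"/api/eval/run-async/{job_id}")
        # ASSUMPTION: the job state is reported in a "status" field.
        if body.get("status") in terminal:
            return body
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing `fetch` as a parameter keeps the loop testable without a running backend; a job would be started first with `http_json("POST", "/api/eval/run-async/start", payload)`, where the payload shape is not specified in this document.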
Capture strategy for this repo
Current mutable domain source of truth: docs/orchestration/active_domain_contract.json
- Prefer automated capture with: python scripts/domain_case_loop.py run-case ...
- For linked multi-step scenarios, capture with: python scripts/domain_case_loop.py run-scenario --manifest ...
- For full domain pools grouped into several scenarios, capture with: python scripts/domain_case_loop.py run-pack --manifest ...
- For autonomous analyst/coder improvement over a full pack, run: python scripts/domain_case_loop.py run-pack-loop --manifest ...
- If a baseline already exists as a copied markdown export, import it with: python scripts/domain_case_loop.py import-export ...
- Use baseline_turn.json / rerun_turn.json as canonical analyst input for case mode.
- Use scenario_state.json plus per-step turn.json as canonical analyst input for scenario mode.
- Use pack_state.json plus per-scenario scenario_state.json as canonical analyst input for pack mode.
- Use loop_state.json plus per-iteration analyst_verdict.json / coder_result.json as canonical analyst input for autonomous pack-loop mode.
- Use baseline_output.md / rerun_output.md or per-step output.md as human-readable paired artifacts.
- For follow-up-heavy domains, use scenario_acceptance_matrix.md as the canonical coverage view for scenario-tree nodes, edges, and paraphrase families.
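
The mapping from capture mode to canonical analyst input can be written down as a small lookup. The sketch below is illustrative only: the file names come from the list above, but the mode keys, the `find_analyst_inputs` helper, and the recursive search (used because this document does not specify the directory layout) are assumptions.

```python
from pathlib import Path

# File names are taken from this document; the mode keys are hypothetical labels.
CANONICAL_ANALYST_INPUTS = {
    "case": ["baseline_turn.json", "rerun_turn.json"],
    "scenario": ["scenario_state.json", "turn.json"],  # turn.json is per step
    "pack": ["pack_state.json", "scenario_state.json"],  # one state file per scenario
    "pack-loop": ["loop_state.json", "analyst_verdict.json", "coder_result.json"],
}


def find_analyst_inputs(run_dir: Path, mode: str) -> list:
    """Collect the canonical analyst input files for a mode under run_dir.

    ASSUMPTION: per-step and per-iteration files live somewhere below run_dir,
    so the search is recursive rather than tied to a fixed layout.
    """
    found = []
    for name in CANONICAL_ANALYST_INPUTS[mode]:
        found.extend(sorted(run_dir.rglob(name)))
    return found
```

For example, `find_analyst_inputs(Path("artifacts/domain_runs/AUTO-001"), "case")` would return whichever of the two case-mode JSON files exist under that directory.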
Default run assumptions
- backend URL: http://127.0.0.1:8787
- eval target: assistant_stage1
- single-case async run uses generated case id AUTO-001
- artifact root: artifacts/domain_runs/<case_id>/
- scenario capture uses POST /api/assistant/message and GET /api/assistant/session/:session_id
- live runners perform backend preflight via GET /api/health
- run-pack-loop defaults to gpt-5.4 for analyst and gpt-5.4-mini for coder
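
These defaults can be gathered into one configuration object, with the preflight expressed as a single health probe. This is a sketch under stated assumptions rather than the runner's real code: `RunDefaults` and `backend_is_healthy` are hypothetical names, and treating HTTP 200 from GET /api/health as "healthy" is an assumption about the endpoint's contract.

```python
import urllib.request
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RunDefaults:
    """Default run assumptions listed in this document."""

    backend_url: str = "http://127.0.0.1:8787"
    eval_target: str = "assistant_stage1"
    case_id: str = "AUTO-001"
    artifact_root: Path = Path("artifacts/domain_runs")
    analyst_model: str = "gpt-5.4"
    coder_model: str = "gpt-5.4-mini"

    def artifact_dir(self) -> Path:
        """Return artifacts/domain_runs/<case_id>/ for this run."""
        return self.artifact_root / self.case_id


def backend_is_healthy(defaults: RunDefaults, timeout: float = 5.0) -> bool:
    """Preflight probe against GET /api/health.

    ASSUMPTION: any HTTP 200 response means the backend is ready.
    """
    url = defaults.backend_url + "/api/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False
```

A live runner would call `backend_is_healthy(RunDefaults())` before capture and stop early when it returns False.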
Important constraints
- Reuse current assistant runtime; do not build a parallel execution lane.
- Preserve UTF-8 without BOM for every generated artifact.
- Do not overwrite existing AGENTS rules; extend them.
- Do not treat a root node success as domain acceptance when selected-object or drilldown edges on the primary user path are still broken.
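
The UTF-8-without-BOM constraint above is easy to violate by accident, for instance when text read from a BOM-prefixed file is written back unchanged. A minimal sketch of a writer and a check follows; `write_artifact` and `has_bom` are hypothetical helper names, not functions from this repo.

```python
import codecs
from pathlib import Path


def write_artifact(path: Path, text: str) -> None:
    """Write text as UTF-8 without a byte order mark."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Drop a leading U+FEFF carried over from a BOM-prefixed source, then
    # encode with "utf-8", which never emits a BOM (unlike "utf-8-sig").
    path.write_bytes(text.lstrip("\ufeff").encode("utf-8"))


def has_bom(path: Path) -> bool:
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    return path.read_bytes().startswith(codecs.BOM_UTF8)
```

Running `has_bom` over every file under the artifact root after a run is one cheap way to verify the constraint.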