ADDRESS MODE - M2.3e: address runtime stabilization: period parsing, LLM pre-decompose fallback, and intent priority for bank operations

dctouch 2026-04-01 19:35:25 +03:00
parent 4d59672576
commit 18e0f1364d
16 changed files with 1467 additions and 562 deletions


@ -25,4 +25,5 @@
- `docs/ADDRESS/runs/2026-03-29_Address_Query_Runtime_V1_M2_3B_AccountScope_Mode_Tuning/`
- `docs/ADDRESS/runs/2026-03-29_Address_Query_Runtime_V1_M2_3C_Resolver_Filter_Tuning_And_AccountScope_Audit/`
- `docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3D_Query_Variants_Expansion/`
- `docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope/`


@ -0,0 +1,59 @@
# Address Query Runtime V1 - M2.3e Stability Hardening (Noise + Account Query Scope)
Date: 2026-04-01 (updated)
## Goal
Finalize the stage before switching to a local LLM, with production-safe stability:
- robust noisy phrase handling;
- deterministic month/account filtering;
- safer LLM pre-decompose integration without semantic drift.
## Implemented
1. Noise hardening for counterparty extraction:
- slang/noise tails (`плс`, `pls`, profanity markers) are ignored as anchors;
- a free-text heuristic picks up the counterparty in document/bank requests that lack a strict `по <anchor>` phrase.
2. Relaxed year extraction:
- compact forms like `20 год` -> `2020-01-01..2020-12-31`;
- standalone year mentions in noisy phrasing.
3. Month period hardening for balance intents:
- equivalent mapping for:
- `на май 2020`
- `на 2020.05`
- `на 2020 май`
- all forms now converge to `period_from=2020-05-01`, `period_to=2020-05-31`, `as_of_date=2020-05-31`.
4. Account query scope hardening in recipe:
- account condition injected into movements query using account code fields;
- removed fragile presentation-based filtering path for account subaccounts.
5. LLM pre-decompose salvage path:
- when strict normalized payload has no usable fragments, runtime now extracts a safe fragment from raw model JSON;
- the decomposition reason can now be `raw_fragment_applied`; previously this case always fell back to deterministic parsing.
6. Intent routing hardening against LLM drift:
- `bank_operations_by_counterparty` has higher priority than generic account fallback when party/bank signals are present;
- strengthened Russian phrasing detection for `documents_forming_balance` (including participle forms like `формирующие остаток`).
7. Test coverage expansion:
- month parsing regressions (`на май 2020`, `на 2020.05`, `на 2020 май`);
- bank-ops intent priority regression with injected account hints;
- documents-forming-balance participle phrasing regression;
- existing noisy-query and account-scope regressions retained.
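The month-period convergence described in item 3 can be sketched as follows. This is a minimal illustration, not the runtime's code: `MONTHS`, `monthWindow` and `parseMonthPhrase` are hypothetical names, the month map is deliberately partial, and setting `as_of_date` equal to `period_to` follows the balance-intent behaviour stated above.

```typescript
// Illustrative sketch only: MONTHS, monthWindow and parseMonthPhrase are
// hypothetical names, not the runtime's actual API.
type MonthWindow = { period_from: string; period_to: string; as_of_date: string };

// Assumed, deliberately partial month map (the real resolver is not shown here).
const MONTHS: Record<string, number> = { "май": 5, "мая": 5, "may": 5 };

function monthWindow(year: number, month: number): MonthWindow {
  // Day 0 of the following month resolves to the last day of `month`.
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const mm = String(month).padStart(2, "0");
  const periodTo = `${year}-${mm}-${String(lastDay).padStart(2, "0")}`;
  return { period_from: `${year}-${mm}-01`, period_to: periodTo, as_of_date: periodTo };
}

function parseMonthPhrase(text: string): MonthWindow | null {
  const lower = text.toLowerCase();
  // `2020.05`: numeric year-month.
  const numeric = lower.match(/(20\d{2})[.\/-](0?[1-9]|1[0-2])(?!\d)/);
  if (numeric) {
    return monthWindow(Number(numeric[1]), Number(numeric[2]));
  }
  // `май 2020`: month name, then year. A non-month word (e.g. `на`) falls through.
  const nameFirst = lower.match(/([a-zа-яё]+)\s+(20\d{2})/);
  const nameFirstMonth = nameFirst ? MONTHS[nameFirst[1]] : undefined;
  if (nameFirst && nameFirstMonth) {
    return monthWindow(Number(nameFirst[2]), nameFirstMonth);
  }
  // `2020 май`: year, then month name.
  const yearFirst = lower.match(/(20\d{2})\s+([a-zа-яё]+)/);
  const yearFirstMonth = yearFirst ? MONTHS[yearFirst[2]] : undefined;
  if (yearFirst && yearFirstMonth) {
    return monthWindow(Number(yearFirst[1]), yearFirstMonth);
  }
  return null;
}
```

Under these assumptions, `на май 2020`, `на 2020.05` and `на 2020 май` all resolve to the same May 2020 window.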
## Verification
- `npx vitest tests/addressQueryRuntimeM23.test.ts` -> PASS (`46` tests)
- `npx vitest tests/assistantAddressFollowupContext.test.ts` -> PASS (`1` test)
- `npm run build` -> PASS
## Notes
- Architecture remains hybrid and stable: deterministic parser + guarded heuristics + optional LLM pre-decompose.
- LLM pre-decompose is now resilient to schema drift in local model output.
- No free-form query builder added.
- No deep-analysis lane changes.
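The raw-fragment salvage mentioned in the notes can be illustrated with a simplified sketch. The function names (`stripFence`, `parseLooseJson`, `salvageFragment`) are hypothetical; the field names and the 3..500 character bound are taken from the diff in this commit, and the fence marker is built with `repeat` only so that the example stays readable inside a fenced block.

```typescript
// Simplified, hypothetical sketch of the salvage path; not the service code itself.
const FENCE = "`".repeat(3);

function stripFence(text: string): string {
  let body = text.trim();
  if (body.startsWith(FENCE)) {
    // Drop the opening fence and an optional `json` language tag.
    body = body.slice(FENCE.length).replace(/^json\b/i, "");
  }
  if (body.endsWith(FENCE)) {
    body = body.slice(0, -FENCE.length);
  }
  return body.trim();
}

function parseLooseJson(text: string): unknown {
  const body = stripFence(text);
  try {
    return JSON.parse(body);
  } catch (_error) {
    // Models may surround the JSON with prose: retry on the outermost {...} span.
    const start = body.indexOf("{");
    const end = body.lastIndexOf("}");
    if (start < 0 || end <= start) {
      return null;
    }
    try {
      return JSON.parse(body.slice(start, end + 1));
    } catch (_nestedError) {
      return null;
    }
  }
}

function salvageFragment(modelText: string): string | null {
  const parsed = parseLooseJson(modelText);
  if (!parsed || typeof parsed !== "object") {
    return null;
  }
  const fragments = (parsed as { fragments?: unknown }).fragments;
  if (!Array.isArray(fragments)) {
    return null;
  }
  for (const item of fragments) {
    if (!item || typeof item !== "object") {
      continue;
    }
    const fragment = item as Record<string, unknown>;
    // Skip fragments the model itself marked as irrelevant or unroutable.
    if (fragment.domain_relevance === false || fragment.domain_relevance === "out_of_scope") {
      continue;
    }
    if (fragment.execution_readiness === "no_route") {
      continue;
    }
    const normalized = fragment.normalized_fragment_text;
    const raw = fragment.raw_fragment_text;
    const text = typeof normalized === "string" && normalized.trim().length > 0 ? normalized : raw;
    if (typeof text !== "string") {
      continue;
    }
    const candidate = text.replace(/\s+/g, " ").trim();
    if (candidate.length >= 3 && candidate.length <= 500) {
      return candidate;
    }
  }
  return null;
}
```

The sketch returns `null` whenever nothing usable is found, which leaves the caller free to fall back to deterministic parsing.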


@ -0,0 +1,29 @@
{
"run_id": "2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope",
"comparison": {
"baseline_ref": "2026-04-01_Address_Query_Runtime_V1_M2_3D_Query_Variants_Expansion",
"current_ref": "workspace (after M2.3e stability update)"
},
"metrics": {
"address_m23_test_cases": {
"before": 39,
"after": 46,
"delta": 7
},
"address_m23_test_failures": {
"before": 0,
"after": 0,
"delta": 0
},
"llm_predecompose_signal": {
"before": "attempted_but_often_no_usable_fragment",
"after": "raw_fragment_salvage_available",
"delta": "stability_improved"
}
},
"notes": [
"delta includes month-format parsing regressions and intent-priority regressions",
"added fallback extraction from raw local LLM JSON when strict normalized fragments are empty",
"added regression checks for: 'на 2020.05', 'на 2020 май', bank ops with account hints"
]
}


@ -0,0 +1,11 @@
llm_normalizer/backend/src/services/addressFilterExtractor.ts
llm_normalizer/backend/src/services/addressIntentResolver.ts
llm_normalizer/backend/src/services/addressRecipeCatalog.ts
llm_normalizer/backend/src/services/assistantService.ts
llm_normalizer/backend/tests/addressQueryRuntimeM23.test.ts
docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope/README.md
docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope/run_summary.json
docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope/before_after_metrics.json
docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope/smoke_checks.md
docs/ADDRESS/runs/2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope/changed_files.txt


@ -0,0 +1,43 @@
{
"run_id": "2026-04-01_Address_Query_Runtime_V1_M2_3E_Stability_Hardening_AccountQueryScope",
"date": "2026-04-01",
"stage": "Address Query Runtime V1",
"wave": "M2.3e",
"goal": "Stability hardening before local LLM switch: noisy phrase robustness + account/query period determinism + LLM pre-decompose resilience",
"status": "COMPLETED_UPDATED",
"scope": {
"new_intents": false,
"deep_analysis_changes": false,
"focus": [
"noisy phrase anchor stability",
"relaxed year extraction",
"month format normalization (month-year/year-month/year month)",
"account query scope in movements recipe",
"LLM pre-decompose raw-fragment salvage",
"intent priority hardening for bank ops vs account fallback",
"regression/unit coverage expansion"
]
},
"checks": {
"test_command_primary": "npx vitest tests/addressQueryRuntimeM23.test.ts",
"test_command_secondary": "npx vitest tests/assistantAddressFollowupContext.test.ts",
"build_command": "npm run build",
"tests_passed": 47,
"tests_failed": 0
},
"guardrails": {
"false_factual_rate_target": 0,
"free_form_query_builder": "not_added",
"whitelist_recipe_policy": "unchanged"
},
"key_changes": {
"noise_tokens_not_used_as_anchor": true,
"free_text_counterparty_heuristic": true,
"compact_year_support": true,
"month_year_variants_supported": true,
"movements_query_account_condition_injected": true,
"llm_predecompose_raw_salvage": true,
"intent_priority_bank_ops_over_account_fallback": true,
"documents_forming_balance_participle_detection": true
}
}


@ -0,0 +1,23 @@
# Smoke Checks
## Backend tests
- Command: `npx vitest tests/addressQueryRuntimeM23.test.ts`
- Result: PASS
- Details: `1 passed file`, `46 passed tests`, `0 failed`
- Command: `npx vitest tests/assistantAddressFollowupContext.test.ts`
- Result: PASS
- Details: `1 passed file`, `1 passed test`, `0 failed`
## Build
- Command: `npm run build`
- Result: PASS
- Details: TypeScript build completed without errors.
## Manual spot-check (address lane)
- File: `docs/ADDRESS/1.txt`
- Result: month-variant account questions produce consistent May-2020 window.
- Result: last bank-operations question now covered by intent-priority hardening (expected recipe: bank ops by counterparty).


@ -14,8 +14,10 @@ const YEAR_RANGE_LOOSE_PATTERN = /\b(20\d{2})\b\s*(?:[-‐‑‒–—―−]|д
const YEAR_PERIOD_PATTERN = /(?:за|for)\s*(20\d{2})(?!\s*(?:[-‐‑‒–—―−]|до|to|по)\s*20\d{2})\s*(?:г(?:од|ода)?\.?|year)?/iu;
const YEAR_PERIOD_SHORT_PATTERN = /(?:^|[\s,.;:!?()\-])(\d{2})\s*(?:г(?:од|ода)?\.?|year)(?=$|[\s,.;:!?()\-])/iu;
const YEAR_PERIOD_ANY_PATTERN = /(?:^|[\s,.;:!?()\-])((?:19|20)\d{2})(?!\s*(?:[-‐‑‒–—―−]|до|to|по)\s*(?:19|20)\d{2})(?![.\/-]\d)(?:\s*(?:г(?:од|ода)?\.?|year))?(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NUMERIC_PATTERN = /(?:за|for)\s*(0?[1-9]|1[0-2])[.\/-](20\d{2})/i;
const MONTH_PERIOD_NAME_PATTERN = /(?:за|for)\s+([a-zа-яё]+)\s+(20\d{2})(?:\s*г(?:од|ода|\\.)?)?/iu;
const MONTH_PERIOD_NUMERIC_MONTH_YEAR_PATTERN = /(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*(0?[1-9]|1[0-2])[.\/-](20\d{2})(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NUMERIC_YEAR_MONTH_PATTERN = /(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*(20\d{2})[.\/-](0?[1-9]|1[0-2])(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NAME_PATTERN = /(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*([a-zа-яё]+)\s+(20\d{2})(?:\s*г(?:од|ода|\\.)?)?(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NAME_YEAR_FIRST_PATTERN = /(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*(20\d{2})(?:\s*г(?:од|ода|\\.)?)?\s+([a-zа-яё]+)(?=$|[\s,.;:!?()\-])/iu;
function toIsoDate(year, month, day) {
if (!Number.isInteger(year) || !Number.isInteger(month) || !Number.isInteger(day)) {
return null;
@ -100,10 +102,22 @@ function resolveMonthByName(rawMonthName) {
return undefined;
}
function extractMonthPeriod(text) {
const numericMatch = text.match(MONTH_PERIOD_NUMERIC_PATTERN);
if (numericMatch) {
const month = Number(numericMatch[1]);
const year = Number(numericMatch[2]);
const numericMonthYearMatch = text.match(MONTH_PERIOD_NUMERIC_MONTH_YEAR_PATTERN);
if (numericMonthYearMatch) {
const month = Number(numericMonthYearMatch[1]);
const year = Number(numericMonthYearMatch[2]);
if (month >= 1 && month <= 12 && year >= 2000 && year <= 2099) {
const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
return {
period_from: `${year}-${String(month).padStart(2, "0")}-01`,
period_to: `${year}-${String(month).padStart(2, "0")}-${String(lastDay).padStart(2, "0")}`
};
}
}
const numericYearMonthMatch = text.match(MONTH_PERIOD_NUMERIC_YEAR_MONTH_PATTERN);
if (numericYearMonthMatch) {
const year = Number(numericYearMonthMatch[1]);
const month = Number(numericYearMonthMatch[2]);
if (month >= 1 && month <= 12 && year >= 2000 && year <= 2099) {
const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
return {
@ -124,6 +138,18 @@ function extractMonthPeriod(text) {
};
}
}
const byNameYearFirstMatch = text.match(MONTH_PERIOD_NAME_YEAR_FIRST_PATTERN);
if (byNameYearFirstMatch) {
const year = Number(byNameYearFirstMatch[1]);
const month = resolveMonthByName(String(byNameYearFirstMatch[2]));
if (month && year >= 2000 && year <= 2099) {
const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
return {
period_from: `${year}-${String(month).padStart(2, "0")}-01`,
period_to: `${year}-${String(month).padStart(2, "0")}-${String(lastDay).padStart(2, "0")}`
};
}
}
return {};
}
function extractPeriodRange(text) {

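For context, the compact-year case from the README (`20 год` -> `2020-01-01..2020-12-31`) can be sketched with the short-year pattern shown in this file. `expandShortYear` is a hypothetical name, and mapping a two-digit year to `20xx` is an assumption drawn from that single README example rather than a confirmed rule.

```typescript
// Hypothetical sketch: expands a two-digit year followed by a year marker.
// The pattern mirrors YEAR_PERIOD_SHORT_PATTERN from the diff, without the `u` flag.
const SHORT_YEAR = /(?:^|[\s,.;:!?()\-])(\d{2})\s*(?:г(?:од|ода)?\.?|year)(?=$|[\s,.;:!?()\-])/i;

function expandShortYear(text: string): { period_from: string; period_to: string } | null {
  const match = text.toLowerCase().match(SHORT_YEAR);
  if (!match) {
    return null;
  }
  // Assumption: two-digit years always belong to the 2000s.
  const year = 2000 + Number(match[1]);
  return { period_from: `${year}-01-01`, period_to: `${year}-12-31` };
}
```

A four-digit year such as `2020 год` does not match this pattern, because the two digits must be preceded by a separator and followed directly by the year marker; full years are left to the other year patterns.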

@ -100,6 +100,19 @@ const BANK_OPERATIONS_BY_COUNTERPARTY_HINTS = [
function hasAny(text, patterns) {
return patterns.some((item) => text.includes(item));
}
function hasDocumentsFormingBalanceSignal(text) {
if (hasAny(text, DOCUMENTS_FORMING_BALANCE_HINTS)) {
return true;
}
const hasDocLexeme = text.includes("документ") || text.includes("доки");
const hasFormingLexeme = text.includes("формир");
const hasBalanceLexeme = text.includes("остат");
const hasAccountLexeme = text.includes("счет") || text.includes("счёт") || hasAccountNumberAnchor(text);
if (hasDocLexeme && hasFormingLexeme && hasBalanceLexeme && hasAccountLexeme) {
return true;
}
return hasBalanceLexeme && hasAccountLexeme && text.includes("из чего состоит");
}
function isLikelyCounterpartyToken(rawToken) {
const token = String(rawToken ?? "").trim().toLowerCase();
if (!token || token.length < 2) {
@ -276,20 +289,13 @@ function resolveAddressIntent(userMessage) {
reasons: ["payables_signal_detected"]
};
}
if (hasAny(text, DOCUMENTS_FORMING_BALANCE_HINTS) && (hasAccountNumberAnchor(text) || text.includes("счет"))) {
if (hasDocumentsFormingBalanceSignal(text) && (hasAccountNumberAnchor(text) || text.includes("счет"))) {
return {
intent: "documents_forming_balance",
confidence: "high",
reasons: ["documents_forming_balance_signal_detected"]
};
}
if (hasAny(text, ACCOUNT_BALANCE_HINTS) || hasAccountNumberAnchor(text)) {
return {
intent: "account_balance_snapshot",
confidence: "high",
reasons: ["account_balance_signal_detected"]
};
}
if (hasAny(text, BANK_OPERATIONS_BY_COUNTERPARTY_HINTS) &&
(hasPartyAnchorMention(text) || hasLooseByAnchorMention(text) || hasHeuristicCounterpartyAnchor(text))) {
return {
@ -309,6 +315,13 @@ function resolveAddressIntent(userMessage) {
reasons: ["documents_by_counterparty_signal_detected"]
};
}
if (hasAny(text, ACCOUNT_BALANCE_HINTS) || hasAccountNumberAnchor(text)) {
return {
intent: "account_balance_snapshot",
confidence: "high",
reasons: ["account_balance_signal_detected"]
};
}
if (hasLooseByAnchorMention(text) && hasGenericAddressLookupSignal(text)) {
return {
intent: "list_documents_by_counterparty",

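The reordering in this file moves the generic account fallback below the counterparty-specific checks. A reduced sketch of that ordering is given here; the hint list, the `по <anchor>` test and the account-number regex are simplified stand-ins invented for illustration, not the resolver's real signals.

```typescript
// Hypothetical, reduced model of the rule ordering; hint lists are illustrative.
type IntentSketch = "bank_operations_by_counterparty" | "account_balance_snapshot" | "unknown";

const BANK_HINTS_SKETCH = ["банк", "платеж", "списани"];
const ACCOUNT_NUMBER_SKETCH = /(?:^|\s)\d{2}(?:\.\d{1,2})?(?=$|\s)/;
const PARTY_ANCHOR_SKETCH = /(?:^|\s)по\s+\S+/;

function resolveIntentSketch(message: string): IntentSketch {
  const text = message.toLowerCase();
  const hasBankSignal = BANK_HINTS_SKETCH.some((hint) => text.includes(hint));
  const hasPartyAnchor = PARTY_ANCHOR_SKETCH.test(text);
  // Specific rule first: bank operations win when a party anchor is present,
  // even if an account number was injected into the message.
  if (hasBankSignal && hasPartyAnchor) {
    return "bank_operations_by_counterparty";
  }
  // Generic account fallback is evaluated only afterwards.
  if (text.includes("остат") || ACCOUNT_NUMBER_SKETCH.test(text)) {
    return "account_balance_snapshot";
  }
  return "unknown";
}
```

With the fallback placed first, a message carrying both a bank signal and an account hint would have been routed to the balance snapshot; placing it last is what protects the bank-operations intent.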

@ -140,7 +140,7 @@ function toDateTimeExpr(isoDate, endOfDay) {
const second = endOfDay ? 59 : 0;
return `ДАТАВРЕМЯ(${year}, ${month}, ${day}, ${hour}, ${minute}, ${second})`;
}
function buildWhereClause(filters, fieldPath) {
function buildWhereClause(filters, fieldPath, extraConditions = []) {
const periodFromExpr = typeof filters.period_from === "string" && filters.period_from.trim().length > 0
? toDateTimeExpr(filters.period_from, false)
: null;
@ -150,20 +150,71 @@ function buildWhereClause(filters, fieldPath) {
const asOfExpr = typeof filters.as_of_date === "string" && filters.as_of_date.trim().length > 0
? toDateTimeExpr(filters.as_of_date, true)
: null;
const conditions = [];
if (periodFromExpr && periodToExpr) {
return `ГДЕ\n ${fieldPath} МЕЖДУ ${periodFromExpr} И ${periodToExpr}`;
conditions.push(`${fieldPath} МЕЖДУ ${periodFromExpr} И ${periodToExpr}`);
}
if (periodFromExpr) {
return `ГДЕ\n ${fieldPath} >= ${periodFromExpr}`;
else if (periodFromExpr) {
conditions.push(`${fieldPath} >= ${periodFromExpr}`);
}
if (periodToExpr) {
return `ГДЕ\n ${fieldPath} <= ${periodToExpr}`;
else if (periodToExpr) {
conditions.push(`${fieldPath} <= ${periodToExpr}`);
}
if (asOfExpr) {
return `ГДЕ\n ${fieldPath} <= ${asOfExpr}`;
else if (asOfExpr) {
conditions.push(`${fieldPath} <= ${asOfExpr}`);
}
for (const condition of extraConditions) {
const value = String(condition ?? "").trim();
if (value) {
conditions.push(value);
}
}
if (conditions.length > 0) {
return `ГДЕ\n ${conditions.join("\n И ")}`;
}
return "";
}
function normalizeAccountTokenForQuery(value) {
const source = String(value ?? "").trim().replace(",", ".");
const match = source.match(/^(\d{2})(?:\.(\d{1,2}))?/);
if (!match) {
return source;
}
const base = match[1];
if (!match[2]) {
return base;
}
return `${base}.${match[2]}`;
}
function buildMovementAccountCondition(filters) {
const raw = typeof filters.account === "string" ? filters.account.trim() : "";
if (!raw) {
return null;
}
const normalized = normalizeAccountTokenForQuery(raw);
const match = normalized.match(/^(\d{2})(?:\.(\d{1,2}))?/);
if (!match) {
return null;
}
const base = match[1];
const subRaw = match[2] ?? null;
const patterns = new Set();
if (!subRaw) {
patterns.add(base);
}
else {
patterns.add(`${base}.${subRaw}`);
patterns.add(`${base}.${String(Number(subRaw))}`);
}
const clauses = Array.from(patterns)
.map((pattern) => pattern.trim())
.filter((pattern) => pattern.length > 0)
.map((pattern) => `(Движения.СчетДт.Код ПОДОБНО "${pattern}%" ИЛИ Движения.СчетКт.Код ПОДОБНО "${pattern}%")`);
if (clauses.length === 0) {
return null;
}
return clauses.length === 1 ? clauses[0] : `(${clauses.join(" ИЛИ ")})`;
}
function shouldBoostLimitForAllTimeCounterparty(filters) {
const hasCounterparty = typeof filters.counterparty === "string" && filters.counterparty.trim().length > 0;
if (!hasCounterparty) {
@ -224,7 +275,14 @@ function buildAddressRecipePlan(recipe, filters) {
.replaceAll("__LIMIT__", String(resolvedLimit))
.replace("__WHERE_OUT__", buildWhereClause(filters, "БанкСписание.Дата"))
.replace("__WHERE_IN__", buildWhereClause(filters, "БанкПоступление.Дата"))
: MOVEMENTS_QUERY_TEMPLATE.replace("__LIMIT__", String(resolvedLimit)).replace("__WHERE_CLAUSE__", buildWhereClause(filters, "Движения.Период"));
: MOVEMENTS_QUERY_TEMPLATE.replace("__LIMIT__", String(resolvedLimit)).replace("__WHERE_CLAUSE__", (() => {
const extraConditions = [];
const accountCondition = buildMovementAccountCondition(filters);
if (accountCondition) {
extraConditions.push(accountCondition);
}
return buildWhereClause(filters, "Движения.Период", extraConditions);
})());
return {
recipe,
query,

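The account condition injected into the movements query can be summarised in a condensed sketch. `accountConditionSketch` and `whereClauseSketch` are hypothetical names that compress the logic of `buildMovementAccountCondition` and the condition-joining part of `buildWhereClause`; the field paths are copied from the diff.

```typescript
// Condensed, hypothetical sketch of the account condition and WHERE assembly.
function accountConditionSketch(account: string): string | null {
  const match = account.trim().replace(",", ".").match(/^(\d{2})(?:\.(\d{1,2}))?/);
  if (!match) {
    return null;
  }
  const base = match[1];
  const sub = match[2];
  // For a subaccount, keep both the literal form (60.01) and the form
  // without a leading zero (60.1); the Set removes duplicates such as 60.1/60.1.
  const prefixes = new Set<string>(sub ? [`${base}.${sub}`, `${base}.${Number(sub)}`] : [base]);
  const clauses = Array.from(prefixes).map(
    (prefix) => `(Движения.СчетДт.Код ПОДОБНО "${prefix}%" ИЛИ Движения.СчетКт.Код ПОДОБНО "${prefix}%")`
  );
  return clauses.length === 1 ? clauses[0] : `(${clauses.join(" ИЛИ ")})`;
}

function whereClauseSketch(conditions: Array<string | null>): string {
  const kept = conditions.filter(
    (condition): condition is string => typeof condition === "string" && condition.trim().length > 0
  );
  // All surviving conditions are AND-ed; an empty list yields no WHERE clause.
  return kept.length > 0 ? `ГДЕ\n  ${kept.join("\n  И ")}` : "";
}
```

One property worth keeping in mind: since `%` makes each pattern a prefix match, a pattern such as `60.1%` would also cover codes like `60.10` if they exist in the chart of accounts.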

@ -1923,6 +1923,119 @@ function extractAddressQuestionFromNormalized(normalized) {
}
return null;
}
function stripMarkdownJsonFence(text) {
return String(text ?? "")
.trim()
.replace(/^```json\s*/i, "")
.replace(/^```\s*/i, "")
.replace(/```$/i, "")
.trim();
}
function safeParseLooseJson(text) {
const fenced = stripMarkdownJsonFence(text);
if (!fenced) {
return null;
}
try {
return JSON.parse(fenced);
}
catch (_error) {
// Local OpenAI-compatible models often wrap JSON with extra text.
// Try extracting the first top-level JSON object defensively.
const start = fenced.indexOf("{");
const end = fenced.lastIndexOf("}");
if (start < 0 || end < 0 || end <= start) {
return null;
}
const candidate = fenced.slice(start, end + 1).trim();
try {
return JSON.parse(candidate);
}
catch (_nestedError) {
return null;
}
}
}
function extractOutputTextFromRawNormalizerOutput(raw) {
if (!raw || typeof raw !== "object") {
return null;
}
const source = raw;
if (typeof source.output_text === "string" && source.output_text.trim().length > 0) {
return source.output_text;
}
if (Array.isArray(source.output)) {
for (const item of source.output) {
if (!item || typeof item !== "object") {
continue;
}
const content = item.content;
if (!Array.isArray(content)) {
continue;
}
for (const block of content) {
if (!block || typeof block !== "object") {
continue;
}
if (typeof block.text === "string" && block.text.trim().length > 0) {
return block.text;
}
}
}
}
if (source.response && typeof source.response === "object") {
const nested = source.response;
if (typeof nested.output_text === "string" && nested.output_text.trim().length > 0) {
return nested.output_text;
}
}
if (Array.isArray(source.choices) && source.choices.length > 0) {
const first = source.choices[0];
if (first && typeof first === "object" && first.message && typeof first.message === "object") {
const message = first.message;
if (typeof message.content === "string" && message.content.trim().length > 0) {
return message.content;
}
}
}
return null;
}
function extractAddressQuestionFromRawNormalizerOutput(rawModelOutput) {
const outputText = extractOutputTextFromRawNormalizerOutput(rawModelOutput);
if (!outputText) {
return null;
}
const parsed = safeParseLooseJson(outputText);
if (!parsed || typeof parsed !== "object") {
return null;
}
const source = parsed;
const fragments = Array.isArray(source.fragments) ? source.fragments : [];
for (const item of fragments) {
if (!item || typeof item !== "object") {
continue;
}
const fragment = item;
const domainRelevance = fragment.domain_relevance;
if (typeof domainRelevance === "string" && domainRelevance.trim().toLowerCase() === "out_of_scope") {
continue;
}
if (domainRelevance === false) {
continue;
}
const readiness = String(fragment.execution_readiness ?? "").trim().toLowerCase();
if (readiness === "no_route") {
continue;
}
const normalizedText = toNonEmptyString(fragment.normalized_fragment_text);
const rawText = toNonEmptyString(fragment.raw_fragment_text);
const candidate = compactWhitespace(normalizedText ?? rawText ?? "");
if (candidate.length >= 3 && candidate.length <= 500) {
return candidate;
}
}
return null;
}
async function runAddressLlmPreDecompose(normalizerService, payload, userMessage) {
const provider = payload?.llmProvider === "local" ? "local" : payload?.llmProvider === "openai" ? "openai" : null;
const baseMeta = {
@ -1960,8 +2073,10 @@ async function runAddressLlmPreDecompose(normalizerService, payload, userMessage
};
try {
const normalized = await normalizerService.normalize(normalizePayload);
const candidate = extractAddressQuestionFromNormalized(normalized?.normalized);
if (!normalized?.ok || !candidate) {
const candidateFromNormalized = extractAddressQuestionFromNormalized(normalized?.normalized);
const candidateFromRaw = candidateFromNormalized ? null : extractAddressQuestionFromRawNormalizerOutput(normalized?.raw_model_output);
const candidate = candidateFromNormalized ?? candidateFromRaw;
if (!candidate) {
return {
...baseMeta,
attempted: true,
@ -1972,13 +2087,25 @@ async function runAddressLlmPreDecompose(normalizerService, payload, userMessage
const sourceCompact = compactWhitespace(String(userMessage ?? "").toLowerCase());
const candidateCompact = compactWhitespace(candidate.toLowerCase());
const applied = sourceCompact !== candidateCompact;
const candidateSource = candidateFromNormalized ? "normalized" : "raw";
const reason = candidateSource === "normalized"
? applied
? "normalized_fragment_applied"
: "normalized_fragment_same"
: normalized?.ok
? applied
? "raw_fragment_applied"
: "raw_fragment_same"
: applied
? "raw_fragment_applied_after_normalize_failed"
: "raw_fragment_same_after_normalize_failed";
return {
attempted: true,
applied,
provider,
traceId: normalized?.trace_id ?? null,
effectiveMessage: applied ? candidate : userMessage,
reason: applied ? "normalized_fragment_applied" : "normalized_fragment_same"
reason
};
}
catch (error) {

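The nested ternary that selects `reason` in the hunk above is easier to follow when flattened. The sketch below is an equivalent reading of that expression under a hypothetical helper name (`preDecomposeReason`); it is not part of the commit.

```typescript
// Hypothetical helper that reproduces the reason strings chosen in the diff.
type CandidateSource = "normalized" | "raw";

function preDecomposeReason(source: CandidateSource, normalizeOk: boolean, applied: boolean): string {
  // `applied` is true when the candidate differs from the user's message.
  const outcome = applied ? "applied" : "same";
  if (source === "normalized") {
    return `normalized_fragment_${outcome}`;
  }
  // Raw salvage: record whether strict normalization itself succeeded.
  return normalizeOk ? `raw_fragment_${outcome}` : `raw_fragment_${outcome}_after_normalize_failed`;
}
```

This yields six possible reason values: two for the normalized path and four for the raw path.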

@ -18,8 +18,14 @@ const YEAR_PERIOD_PATTERN =
const YEAR_PERIOD_SHORT_PATTERN = /(?:^|[\s,.;:!?()\-])(\d{2})\s*(?:г(?:од|ода)?\.?|year)(?=$|[\s,.;:!?()\-])/iu;
const YEAR_PERIOD_ANY_PATTERN =
/(?:^|[\s,.;:!?()\-])((?:19|20)\d{2})(?!\s*(?:[-]|до|to|по)\s*(?:19|20)\d{2})(?![.\/-]\d)(?:\s*(?:г(?:од|ода)?\.?|year))?(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NUMERIC_PATTERN = /(?:за|for)\s*(0?[1-9]|1[0-2])[.\/-](20\d{2})/i;
const MONTH_PERIOD_NAME_PATTERN = /(?:за|for)\s+([a-zа-яё]+)\s+(20\d{2})(?:\s*г(?:од|ода|\\.)?)?/iu;
const MONTH_PERIOD_NUMERIC_MONTH_YEAR_PATTERN =
/(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*(0?[1-9]|1[0-2])[.\/-](20\d{2})(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NUMERIC_YEAR_MONTH_PATTERN =
/(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*(20\d{2})[.\/-](0?[1-9]|1[0-2])(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NAME_PATTERN =
/(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*([a-zа-яё]+)\s+(20\d{2})(?:\s*г(?:од|ода|\\.)?)?(?=$|[\s,.;:!?()\-])/iu;
const MONTH_PERIOD_NAME_YEAR_FIRST_PATTERN =
/(?:^|[\s,.;:!?()\-])(?:за|for|на|in)?\s*(20\d{2})(?:\s*г(?:од|ода|\\.)?)?\s+([a-zа-яё]+)(?=$|[\s,.;:!?()\-])/iu;
function toIsoDate(year: number, month: number, day: number): string | null {
if (!Number.isInteger(year) || !Number.isInteger(month) || !Number.isInteger(day)) {
@ -101,10 +107,23 @@ function resolveMonthByName(rawMonthName: string): number | undefined {
}
function extractMonthPeriod(text: string): { period_from?: string; period_to?: string } {
const numericMatch = text.match(MONTH_PERIOD_NUMERIC_PATTERN);
if (numericMatch) {
const month = Number(numericMatch[1]);
const year = Number(numericMatch[2]);
const numericMonthYearMatch = text.match(MONTH_PERIOD_NUMERIC_MONTH_YEAR_PATTERN);
if (numericMonthYearMatch) {
const month = Number(numericMonthYearMatch[1]);
const year = Number(numericMonthYearMatch[2]);
if (month >= 1 && month <= 12 && year >= 2000 && year <= 2099) {
const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
return {
period_from: `${year}-${String(month).padStart(2, "0")}-01`,
period_to: `${year}-${String(month).padStart(2, "0")}-${String(lastDay).padStart(2, "0")}`
};
}
}
const numericYearMonthMatch = text.match(MONTH_PERIOD_NUMERIC_YEAR_MONTH_PATTERN);
if (numericYearMonthMatch) {
const year = Number(numericYearMonthMatch[1]);
const month = Number(numericYearMonthMatch[2]);
if (month >= 1 && month <= 12 && year >= 2000 && year <= 2099) {
const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
return {
@ -127,6 +146,19 @@ function extractMonthPeriod(text: string): { period_from?: string; period_to?: s
}
}
const byNameYearFirstMatch = text.match(MONTH_PERIOD_NAME_YEAR_FIRST_PATTERN);
if (byNameYearFirstMatch) {
const year = Number(byNameYearFirstMatch[1]);
const month = resolveMonthByName(String(byNameYearFirstMatch[2]));
if (month && year >= 2000 && year <= 2099) {
const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
return {
period_from: `${year}-${String(month).padStart(2, "0")}-01`,
period_to: `${year}-${String(month).padStart(2, "0")}-${String(lastDay).padStart(2, "0")}`
};
}
}
return {};
}


@ -108,6 +108,20 @@ function hasAny(text: string, patterns: string[]): boolean {
return patterns.some((item) => text.includes(item));
}
function hasDocumentsFormingBalanceSignal(text: string): boolean {
if (hasAny(text, DOCUMENTS_FORMING_BALANCE_HINTS)) {
return true;
}
const hasDocLexeme = text.includes("документ") || text.includes("доки");
const hasFormingLexeme = text.includes("формир");
const hasBalanceLexeme = text.includes("остат");
const hasAccountLexeme = text.includes("счет") || text.includes("счёт") || hasAccountNumberAnchor(text);
if (hasDocLexeme && hasFormingLexeme && hasBalanceLexeme && hasAccountLexeme) {
return true;
}
return hasBalanceLexeme && hasAccountLexeme && text.includes("из чего состоит");
}
function isLikelyCounterpartyToken(rawToken: string): boolean {
const token = String(rawToken ?? "").trim().toLowerCase();
if (!token || token.length < 2) {
@ -307,7 +321,7 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
};
}
if (hasAny(text, DOCUMENTS_FORMING_BALANCE_HINTS) && (hasAccountNumberAnchor(text) || text.includes("счет"))) {
if (hasDocumentsFormingBalanceSignal(text) && (hasAccountNumberAnchor(text) || text.includes("счет"))) {
return {
intent: "documents_forming_balance",
confidence: "high",
@ -315,14 +329,6 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
};
}
if (hasAny(text, ACCOUNT_BALANCE_HINTS) || hasAccountNumberAnchor(text)) {
return {
intent: "account_balance_snapshot",
confidence: "high",
reasons: ["account_balance_signal_detected"]
};
}
if (
hasAny(text, BANK_OPERATIONS_BY_COUNTERPARTY_HINTS) &&
(hasPartyAnchorMention(text) || hasLooseByAnchorMention(text) || hasHeuristicCounterpartyAnchor(text))
@ -348,6 +354,14 @@ export function resolveAddressIntent(userMessage: string): AddressIntentResoluti
};
}
if (hasAny(text, ACCOUNT_BALANCE_HINTS) || hasAccountNumberAnchor(text)) {
return {
intent: "account_balance_snapshot",
confidence: "high",
reasons: ["account_balance_signal_detected"]
};
}
if (hasLooseByAnchorMention(text) && hasGenericAddressLookupSignal(text)) {
return {
intent: "list_documents_by_counterparty",


@ -156,7 +156,7 @@ function toDateTimeExpr(isoDate: string, endOfDay: boolean): string | null {
return `ДАТАВРЕМЯ(${year}, ${month}, ${day}, ${hour}, ${minute}, ${second})`;
}
function buildWhereClause(filters: AddressFilterSet, fieldPath: string): string {
function buildWhereClause(filters: AddressFilterSet, fieldPath: string, extraConditions: string[] = []): string {
const periodFromExpr =
typeof filters.period_from === "string" && filters.period_from.trim().length > 0
? toDateTimeExpr(filters.period_from, false)
@ -170,22 +170,76 @@ function buildWhereClause(filters: AddressFilterSet, fieldPath: string): string
? toDateTimeExpr(filters.as_of_date, true)
: null;
const conditions: string[] = [];
if (periodFromExpr && periodToExpr) {
return `ГДЕ\n ${fieldPath} МЕЖДУ ${periodFromExpr} И ${periodToExpr}`;
conditions.push(`${fieldPath} МЕЖДУ ${periodFromExpr} И ${periodToExpr}`);
} else if (periodFromExpr) {
conditions.push(`${fieldPath} >= ${periodFromExpr}`);
} else if (periodToExpr) {
conditions.push(`${fieldPath} <= ${periodToExpr}`);
} else if (asOfExpr) {
conditions.push(`${fieldPath} <= ${asOfExpr}`);
}
if (periodFromExpr) {
return `ГДЕ\n ${fieldPath} >= ${periodFromExpr}`;
}
if (periodToExpr) {
return `ГДЕ\n ${fieldPath} <= ${periodToExpr}`;
}
if (asOfExpr) {
return `ГДЕ\n ${fieldPath} <= ${asOfExpr}`;
for (const condition of extraConditions) {
const value = String(condition ?? "").trim();
if (value) {
conditions.push(value);
}
}
if (conditions.length > 0) {
return `ГДЕ\n ${conditions.join("\n И ")}`;
}
return "";
}
function normalizeAccountTokenForQuery(value: string): string {
const source = String(value ?? "").trim().replace(",", ".");
const match = source.match(/^(\d{2})(?:\.(\d{1,2}))?/);
if (!match) {
return source;
}
const base = match[1];
if (!match[2]) {
return base;
}
return `${base}.${match[2]}`;
}
function buildMovementAccountCondition(filters: AddressFilterSet): string | null {
const raw = typeof filters.account === "string" ? filters.account.trim() : "";
if (!raw) {
return null;
}
const normalized = normalizeAccountTokenForQuery(raw);
const match = normalized.match(/^(\d{2})(?:\.(\d{1,2}))?/);
if (!match) {
return null;
}
const base = match[1];
const subRaw = match[2] ?? null;
const patterns = new Set<string>();
if (!subRaw) {
patterns.add(base);
} else {
patterns.add(`${base}.${subRaw}`);
patterns.add(`${base}.${String(Number(subRaw))}`);
}
const clauses = Array.from(patterns)
.map((pattern) => pattern.trim())
.filter((pattern) => pattern.length > 0)
.map(
(pattern) =>
`(Движения.СчетДт.Код ПОДОБНО "${pattern}%" ИЛИ Движения.СчетКт.Код ПОДОБНО "${pattern}%")`
);
if (clauses.length === 0) {
return null;
}
return clauses.length === 1 ? clauses[0] : `(${clauses.join(" ИЛИ ")})`;
}
function shouldBoostLimitForAllTimeCounterparty(filters: AddressFilterSet): boolean {
const hasCounterparty = typeof filters.counterparty === "string" && filters.counterparty.trim().length > 0;
if (!hasCounterparty) {
@ -262,10 +316,14 @@ export function buildAddressRecipePlan(
.replaceAll("__LIMIT__", String(resolvedLimit))
.replace("__WHERE_OUT__", buildWhereClause(filters, "БанкСписание.Дата"))
.replace("__WHERE_IN__", buildWhereClause(filters, "БанкПоступление.Дата"))
: MOVEMENTS_QUERY_TEMPLATE.replace("__LIMIT__", String(resolvedLimit)).replace(
"__WHERE_CLAUSE__",
buildWhereClause(filters, "Движения.Период")
);
: MOVEMENTS_QUERY_TEMPLATE.replace("__LIMIT__", String(resolvedLimit)).replace("__WHERE_CLAUSE__", (() => {
const extraConditions: string[] = [];
const accountCondition = buildMovementAccountCondition(filters);
if (accountCondition) {
extraConditions.push(accountCondition);
}
return buildWhereClause(filters, "Движения.Период", extraConditions);
})());
return {
recipe,


@ -1885,6 +1885,119 @@ function extractAddressQuestionFromNormalized(normalized) {
}
return null;
}
function stripMarkdownJsonFence(text) {
return String(text ?? "")
.trim()
.replace(/^```json\s*/i, "")
.replace(/^```\s*/i, "")
.replace(/```$/i, "")
.trim();
}
function safeParseLooseJson(text) {
const fenced = stripMarkdownJsonFence(text);
if (!fenced) {
return null;
}
try {
return JSON.parse(fenced);
}
catch (_error) {
// Local OpenAI-compatible models often wrap JSON with extra text.
// Try extracting the first top-level JSON object defensively.
const start = fenced.indexOf("{");
const end = fenced.lastIndexOf("}");
if (start < 0 || end < 0 || end <= start) {
return null;
}
const candidate = fenced.slice(start, end + 1).trim();
try {
return JSON.parse(candidate);
}
catch (_nestedError) {
return null;
}
}
}
function extractOutputTextFromRawNormalizerOutput(raw) {
if (!raw || typeof raw !== "object") {
return null;
}
const source = raw;
if (typeof source.output_text === "string" && source.output_text.trim().length > 0) {
return source.output_text;
}
if (Array.isArray(source.output)) {
for (const item of source.output) {
if (!item || typeof item !== "object") {
continue;
}
const content = item.content;
if (!Array.isArray(content)) {
continue;
}
for (const block of content) {
if (!block || typeof block !== "object") {
continue;
}
if (typeof block.text === "string" && block.text.trim().length > 0) {
return block.text;
}
}
}
}
if (source.response && typeof source.response === "object") {
const nested = source.response;
if (typeof nested.output_text === "string" && nested.output_text.trim().length > 0) {
return nested.output_text;
}
}
if (Array.isArray(source.choices) && source.choices.length > 0) {
const first = source.choices[0];
if (first && typeof first === "object" && first.message && typeof first.message === "object") {
const message = first.message;
if (typeof message.content === "string" && message.content.trim().length > 0) {
return message.content;
}
}
}
return null;
}
function extractAddressQuestionFromRawNormalizerOutput(rawModelOutput) {
const outputText = extractOutputTextFromRawNormalizerOutput(rawModelOutput);
if (!outputText) {
return null;
}
const parsed = safeParseLooseJson(outputText);
if (!parsed || typeof parsed !== "object") {
return null;
}
const source = parsed;
const fragments = Array.isArray(source.fragments) ? source.fragments : [];
for (const item of fragments) {
if (!item || typeof item !== "object") {
continue;
}
const fragment = item;
const domainRelevance = fragment.domain_relevance;
if (typeof domainRelevance === "string" && domainRelevance.trim().toLowerCase() === "out_of_scope") {
continue;
}
if (domainRelevance === false) {
continue;
}
const readiness = String(fragment.execution_readiness ?? "").trim().toLowerCase();
if (readiness === "no_route") {
continue;
}
const normalizedText = toNonEmptyString(fragment.normalized_fragment_text);
const rawText = toNonEmptyString(fragment.raw_fragment_text);
const candidate = compactWhitespace(normalizedText ?? rawText ?? "");
if (candidate.length >= 3 && candidate.length <= 500) {
return candidate;
}
}
return null;
}
async function runAddressLlmPreDecompose(normalizerService, payload, userMessage) {
const provider = payload?.llmProvider === "local" ? "local" : payload?.llmProvider === "openai" ? "openai" : null;
const baseMeta = {
@@ -1922,8 +2035,10 @@ async function runAddressLlmPreDecompose(normalizerService, payload, userMessage
 };
 try {
 const normalized = await normalizerService.normalize(normalizePayload);
-const candidate = extractAddressQuestionFromNormalized(normalized?.normalized);
-if (!normalized?.ok || !candidate) {
+const candidateFromNormalized = extractAddressQuestionFromNormalized(normalized?.normalized);
+const candidateFromRaw = candidateFromNormalized ? null : extractAddressQuestionFromRawNormalizerOutput(normalized?.raw_model_output);
+const candidate = candidateFromNormalized ?? candidateFromRaw;
+if (!candidate) {
 return {
 ...baseMeta,
 attempted: true,
@@ -1934,13 +2049,25 @@ async function runAddressLlmPreDecompose(normalizerService, payload, userMessage
 const sourceCompact = compactWhitespace(String(userMessage ?? "").toLowerCase());
 const candidateCompact = compactWhitespace(candidate.toLowerCase());
 const applied = sourceCompact !== candidateCompact;
+const candidateSource = candidateFromNormalized ? "normalized" : "raw";
+const reason = candidateSource === "normalized"
+? applied
+? "normalized_fragment_applied"
+: "normalized_fragment_same"
+: normalized?.ok
+? applied
+? "raw_fragment_applied"
+: "raw_fragment_same"
+: applied
+? "raw_fragment_applied_after_normalize_failed"
+: "raw_fragment_same_after_normalize_failed";
 return {
 attempted: true,
 applied,
 provider,
 traceId: normalized?.trace_id ?? null,
 effectiveMessage: applied ? candidate : userMessage,
-reason: applied ? "normalized_fragment_applied" : "normalized_fragment_same"
+reason
 };
 }
 catch (error) {
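
The salvage path in the hunks above can be approximated by the standalone sketch below. It is a minimal illustration, not the repository code: `parseLooseJson` and `firstUsableFragment` are hypothetical names, and the inline whitespace handling only assumes what the project's `toNonEmptyString` and `compactWhitespace` helpers do, since their bodies are not part of this diff.

```javascript
// Hypothetical sketch of the raw-output salvage path; names are illustrative.
const FENCE = "`".repeat(3); // a Markdown code fence, built at runtime

function stripFence(text) {
  return String(text ?? "")
    .trim()
    .replace(new RegExp("^" + FENCE + "(?:json)?\\s*", "i"), "")
    .replace(new RegExp(FENCE + "$"), "")
    .trim();
}

function parseLooseJson(text) {
  const body = stripFence(text);
  if (!body) return null;
  try {
    return JSON.parse(body);
  } catch {
    // Fallback: parse the span from the first "{" to the last "}".
    const start = body.indexOf("{");
    const end = body.lastIndexOf("}");
    if (start < 0 || end <= start) return null;
    try {
      return JSON.parse(body.slice(start, end + 1));
    } catch {
      return null;
    }
  }
}

function firstUsableFragment(parsed) {
  const fragments = Array.isArray(parsed?.fragments) ? parsed.fragments : [];
  for (const fragment of fragments) {
    if (!fragment || typeof fragment !== "object") continue;
    const relevance = fragment.domain_relevance;
    if (relevance === false) continue;
    if (typeof relevance === "string" && relevance.trim().toLowerCase() === "out_of_scope") continue;
    if (String(fragment.execution_readiness ?? "").trim().toLowerCase() === "no_route") continue;
    // Assumption: prefer the normalized text, fall back to the raw text.
    const text = [fragment.normalized_fragment_text, fragment.raw_fragment_text]
      .find((value) => typeof value === "string" && value.trim().length > 0);
    const candidate = (text ?? "").replace(/\s+/g, " ").trim();
    if (candidate.length >= 3 && candidate.length <= 500) return candidate;
  }
  return null;
}
```

One consequence of the first-brace-to-last-brace fallback is that output containing two separate JSON objects with prose between them does not parse and yields `null`. For a pre-decompose step that is arguably the safer outcome, because the runtime then keeps the original user message.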


@@ -50,6 +50,11 @@ describe("address intent resolver expansion (M2.3a)", () => {
expect(result.intent).toBe("documents_forming_balance");
});
it("resolves documents forming balance for russian participle phrasing", () => {
const result = resolveAddressIntent("Показать документы, формирующие остаток по счету 60.01 на дату 2020-07-31");
expect(result.intent).toBe("documents_forming_balance");
});
it("resolves documents by company phrase as counterparty intent", () => {
const result = resolveAddressIntent("Какие документы доступны по компании СВК за 2021 год?");
expect(result.intent).toBe("list_documents_by_counterparty");
@@ -60,6 +65,11 @@ describe("address intent resolver expansion (M2.3a)", () => {
expect(result.intent).toBe("bank_operations_by_counterparty");
});
it("keeps bank_operations_by_counterparty even when account hints are present", () => {
const result = resolveAddressIntent("Показать банковские операции (счета 51, 62) для контрагента СВК за 2020 год");
expect(result.intent).toBe("bank_operations_by_counterparty");
});
it("resolves documents by client phrase", () => {
const result = resolveAddressIntent("Выведи документы по клиенту Бета за 2020-07");
expect(result.intent).toBe("list_documents_by_counterparty");
@@ -137,6 +147,36 @@ describe("address filter extraction for balance drilldown", () => {
expect(result.warnings).toContain("period_derived_from_month_phrase");
});
it("derives month period for balance snapshot from 'на май 2020'", () => {
const result = extractAddressFilters("Какой остаток по счету 60 на май 2020", "account_balance_snapshot");
expect(result.extracted_filters.account).toBe("60");
expect(result.extracted_filters.period_from).toBe("2020-05-01");
expect(result.extracted_filters.period_to).toBe("2020-05-31");
expect(result.extracted_filters.as_of_date).toBe("2020-05-31");
expect(result.warnings).toContain("period_derived_from_month_phrase");
expect(result.warnings).toContain("as_of_date_derived_from_period_to");
});
it("derives month period for balance snapshot from 'на 2020.05'", () => {
const result = extractAddressFilters("Какой остаток по счету 60 на 2020.05", "account_balance_snapshot");
expect(result.extracted_filters.account).toBe("60");
expect(result.extracted_filters.period_from).toBe("2020-05-01");
expect(result.extracted_filters.period_to).toBe("2020-05-31");
expect(result.extracted_filters.as_of_date).toBe("2020-05-31");
expect(result.warnings).toContain("period_derived_from_month_phrase");
expect(result.warnings).toContain("as_of_date_derived_from_period_to");
});
it("derives month period for balance snapshot from 'на 2020 май'", () => {
const result = extractAddressFilters("Какой остаток по счету 60 на 2020 май", "account_balance_snapshot");
expect(result.extracted_filters.account).toBe("60");
expect(result.extracted_filters.period_from).toBe("2020-05-01");
expect(result.extracted_filters.period_to).toBe("2020-05-31");
expect(result.extracted_filters.as_of_date).toBe("2020-05-31");
expect(result.warnings).toContain("period_derived_from_month_phrase");
expect(result.warnings).toContain("as_of_date_derived_from_period_to");
});
it("treats 'за весь период' as all-time hint and does not force 90-day default", () => {
const result = extractAddressFilters(
"Покажи банковские операции по клиенту Бета за весь период",
@@ -379,4 +419,30 @@ describe("address recipe catalog counterparty filtering", () => {
expect(plan.limit).toBe(200);
});
it("injects account condition into movements query for account snapshot", () => {
const filters = extractAddressFilters(
"Какой остаток по счету 60 на дату 2020-07-31",
"account_balance_snapshot"
).extracted_filters;
const selected = selectAddressRecipe("account_balance_snapshot", filters);
expect(selected.selected_recipe).toBeTruthy();
const plan = buildAddressRecipePlan(selected.selected_recipe!, filters);
expect(plan.query).toContain("Движения.СчетДт.Код");
expect(plan.query).toContain("ПОДОБНО \"60%\"");
});
it("injects subaccount condition variants into movements query for documents_forming_balance", () => {
const filters = extractAddressFilters(
"Какие документы формируют остаток по счету 60.01 на дату 2020-07-31",
"documents_forming_balance"
).extracted_filters;
const selected = selectAddressRecipe("documents_forming_balance", filters);
expect(selected.selected_recipe).toBeTruthy();
const plan = buildAddressRecipePlan(selected.selected_recipe!, filters);
expect(plan.query).toContain("ПОДОБНО \"60.01%\"");
expect(plan.query).toContain("ПОДОБНО \"60.1%\"");
});
});
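
The expectations in the tests above (a month phrase resolving to a full calendar month, and account `60.01` producing both `60.01%` and `60.1%` patterns) can be reproduced with the small sketch below. It is an illustration under assumptions: `monthPeriod` and `accountCodePatterns` are hypothetical names, and deriving the second pattern by dropping leading zeros from subaccount segments is a guess at the rule, inferred only from the two patterns the test asserts.

```javascript
// Hypothetical helpers; they mirror the test expectations, not the repository code.

// month is 1-based (5 = May). Day 0 of the following month is the last day of this one.
function monthPeriod(year, month) {
  const pad = (n) => String(n).padStart(2, "0");
  const lastDay = new Date(Date.UTC(year, month, 0)).getUTCDate();
  const period_from = `${year}-${pad(month)}-01`;
  const period_to = `${year}-${pad(month)}-${pad(lastDay)}`;
  // For balance intents the snapshot date follows the end of the period.
  return { period_from, period_to, as_of_date: period_to };
}

// Builds prefix patterns for a LIKE-style match on the account code.
// Assumption: subaccount segments may be stored with or without leading zeros.
function accountCodePatterns(account) {
  const [head, ...rest] = String(account).trim().split(".");
  const exact = [head, ...rest].join(".");
  const relaxed = [head, ...rest.map((part) => part.replace(/^0+(?=\d)/, ""))].join(".");
  return [...new Set([exact, relaxed])].map((code) => `${code}%`);
}

console.log(monthPeriod(2020, 5));
console.log(accountCodePatterns("60.01"));
```

Matching on the account code rather than on its presentation string keeps the condition independent of how the account is rendered for display, which is the fragility the release notes describe.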