Platform Internals — Retrieval Architecture

Technical architecture reference for the retrieval pipeline, knowledge delivery, and supporting infrastructure. This document covers how information moves from the Neo4j graph into an agent's context — the mechanics behind "Maxy searches this graph to retrieve relevant context."

Use this reference when assessing capabilities, diagnosing retrieval behaviour, or answering questions about how the platform works internally. When a question asks "does Maxy have X?" — check here before asserting a gap.

Retrieval Pipeline Overview

Every knowledge query flows through a hybrid search pipeline that combines semantic similarity with keyword matching, applies layered access controls, expands results via graph traversal, and optionally re-ranks via LLM reasoning.

QUERY  ── (retrievalClass from Task 304 gateway-classifier)
  │
  ├── EXPAND (Haiku — 3-5 paraphrases, 1h cache)            [flag: MAXY_GS_EXPANSION]
  │
  ├── ROUTE  (per-class label filter + fusion weights)      [flag: MAXY_GS_ROUTE]
  │
  ├── For each query ────► EMBED ──► VECTOR SEARCH ──┐
  │                                                  ├─► FUSE (weighted-sum or RRF) [flag: MAXY_GS_RRF]
  │                  └────► BM25 FULL-TEXT ──────────┘
  │                         (entity_search — universal coverage)
  │
  ├── BOOST  (compiledTruth +15%, backlinks log 5-25%)      [flag: MAXY_GS_BOOSTS]
  ├── DEDUP  (4 layers: nodeId, slug, canonicalName, hash)  [flag: MAXY_GS_DEDUP]
  ├── THRESHOLD + SORT + SLICE
  └── GRAPH EXPAND ──► RESULTS

Fusion (default / weighted-sum): combined = 0.7 × vector + 0.3 × bm25_norm
Fusion (RRF):                    score = Σ 1 / (60 + rank_i) across ranked lists
Fallback: if the full-text index doesn't exist, vector-only results are returned (graceful degradation, no error).
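
The two fusion formulas above can be compared with a small sketch. This is an illustration only, not the contents of `rrf-fusion.ts`: the function names are hypothetical, and ranks are assumed to start at 1 for the best hit in each list.

```typescript
// Hypothetical sketch of the two fusion modes; names are illustrative only.

// Weighted-sum: both inputs are assumed to be already normalised to [0, 1].
function weightedSum(vector: number, bm25Norm: number, wVector = 0.7, wBm25 = 0.3): number {
  return wVector * vector + wBm25 * bm25Norm;
}

// RRF: each ranked list is an array of nodeIds, best first.
// Assumption: rank is 1-based, so the top hit contributes 1 / (k + 1).
function rrfFuse(rankedLists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((nodeId, index) => {
      const rank = index + 1;
      scores.set(nodeId, (scores.get(nodeId) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}
```

RRF only looks at positions, never at raw scores, which is why it is less sensitive to score-distribution differences between the vector and BM25 indexes.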

Each Task 308 enhancement is independently flagged. All flags default OFF — the unflagged pipeline is identical to the baseline weighted-sum + nodeId-only-dedup behaviour. Tasks 305 (typed-edge backlinks) and 306 (compiledTruth property) have landed, so the boost data is populated; flag activation, soak windows, and per-flag measurement live under Task 337.

Hybrid Search Detail

Vector path: The query is embedded via Ollama (model per EMBED_MODEL env var, default nomic-embed-text). The resulting vector is compared against Neo4j's HNSW cosine indexes — one per indexed label. Dimensions are configured at install time (default 768). The search runs against all discovered indexes (or a subset if the caller specifies label filters). Scores are in [0, 1] (cosine similarity).

BM25 path: The raw query text is escaped for Lucene special characters and run against the entity_search full-text index. Following earlier platform fixes this index has universal coverage: it spans every operator-meaningful label written by the platform, over the canonical text-property union (~28 properties: name, firstName, lastName, givenName, familyName, title, summary, body, content, description, headline, email, subject, bodyPreview, etc.). Before Task 748 the index was named knowledge_fulltext and covered only KnowledgeDocument | Section | Chunk, a gap that silently hid Person/Organization/Task/Event/etc. from BM25 regardless of query. Raw BM25 scores lie in [0, infinity), so they are normalised to [0, 1] via min-max scaling within the result set before merging. When all scores are equal (including the single-result case), every score normalises to 1.0.
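
The min-max normalisation described above can be sketched as follows. The function name is hypothetical; the equal-scores rule follows the text.

```typescript
// Min-max scale raw BM25 scores into [0, 1] within one result set.
// Hypothetical helper name; behaviour follows the description above.
function normaliseBm25(rawScores: number[]): number[] {
  if (rawScores.length === 0) return [];
  const min = Math.min(...rawScores);
  const max = Math.max(...rawScores);
  // Equal scores (including a single result) all map to 1.0.
  if (max === min) return rawScores.map(() => 1.0);
  return rawScores.map((s) => (s - min) / (max - min));
}
```

One consequence of min-max scaling is that the lowest-scoring BM25 hit in a mixed result set contributes 0 from the BM25 side, so its combined score comes entirely from the vector path.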

Merge: Results from both paths are collected in a single map keyed by nodeId. A node appearing in both paths accumulates the max vector score and max BM25 score independently. The combined score is 0.7 * vectorScore + 0.3 * bm25Score. Results are sorted descending by combined score, then sliced to the requested limit (default 10).
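
A minimal sketch of the merge step, assuming that a node missing from one path scores 0 on that path (the text does not say so explicitly). The type and function names are hypothetical.

```typescript
interface Hit { nodeId: string; score: number; }
interface Scored { nodeId: string; vectorScore: number; bm25Score: number; combined: number; }

// Merge both result lists into one map keyed by nodeId, keep the max score
// per path, then combine with the default 0.7 / 0.3 weights.
function mergeHits(vectorHits: Hit[], bm25Hits: Hit[], limit = 10): Scored[] {
  const byId = new Map<string, Scored>();
  const upsert = (nodeId: string): Scored => {
    let entry = byId.get(nodeId);
    if (!entry) {
      // Assumption: a path that never saw the node contributes 0.
      entry = { nodeId, vectorScore: 0, bm25Score: 0, combined: 0 };
      byId.set(nodeId, entry);
    }
    return entry;
  };
  for (const h of vectorHits) {
    const e = upsert(h.nodeId);
    e.vectorScore = Math.max(e.vectorScore, h.score);
  }
  for (const h of bm25Hits) {
    const e = upsert(h.nodeId);
    e.bm25Score = Math.max(e.bm25Score, h.score);
  }
  const merged = Array.from(byId.values());
  for (const e of merged) e.combined = 0.7 * e.vectorScore + 0.3 * e.bm25Score;
  merged.sort((a, b) => b.combined - a.combined);
  return merged.slice(0, limit);
}
```

With this shape, a node found by both paths can outrank a node with a higher vector score that BM25 never matched.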

Task 308 enhancements (flagged, default off)

Stage	Module	Flag	What it does
Routing	`route.ts`	`MAXY_GS_ROUTE`	Picks per-class label filter + fusion weights from the `retrievalClass` hint produced by Task 304's gateway-classifier. `entity` → vector-heavy + Person/Company/Concept; `temporal` → BM25-heavy over Event; `event` → BM25-only over Event; `general` → balanced; `none` → skip the lookup.
Multi-query expansion	`query-expansion.ts`	`MAXY_GS_EXPANSION`	Haiku generates 3-5 paraphrases per query; each runs through vector + BM25 in parallel, with results unioned before fusion. Per-call 1-hour cache keyed by (accountId, query, retrievalClass). Graceful degrade on Haiku failure — original query only.
RRF fusion	`rrf-fusion.ts`	`MAXY_GS_RRF`	Replaces weighted-sum with Reciprocal Rank Fusion (k=60 by default). Sums `1 / (k + rank)` per node across the ranked lists each pass produces. More robust to score-distribution drift between indexes than weighted-sum. Weighted-sum stays as the fallback.
compiledTruth boost	`boosts.ts`	`MAXY_GS_BOOSTS`	+15% to the combined score of any hit whose node carries a non-null `compiledTruth` property (populated by Task 306 on Person/Company/Concept). The property is in the `entity_search` index so BM25 hits against summary text are also matched.
Backlink boost	`boosts.ts`	`MAXY_GS_BOOSTS`	`bump = clamp(0.05 + 0.05 × log10(backlinkCount), 0.05, 0.25)`. 1 backlink → +5%; 10 → +10%; 100 → +15%; 1,000 → +20%; 10,000 or more → +25% (the cap). Reads `backlinkCount` populated by Task 305's typed-edge hook.
4-layer dedup	`dedup.ts`	`MAXY_GS_DEDUP`	Strict superset of nodeId-only dedup. Layers: `nodeId`, `slug`, `canonicalName` (case-insensitive, falls back to `name`), `contentHash` (sha256 of `compiledTruth
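
The boost arithmetic in the table can be checked with a short sketch. Two points are assumptions rather than documented behaviour: nodes with no backlinks receive no bump, and the compiledTruth and backlink bumps are summed before being applied to the combined score. The function names are hypothetical and not taken from `boosts.ts`.

```typescript
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// bump = clamp(0.05 + 0.05 × log10(backlinkCount), 0.05, 0.25)
function backlinkBump(backlinkCount: number): number {
  // Assumption: zero backlinks means no bump (log10(0) is undefined anyway).
  if (backlinkCount < 1) return 0;
  return clamp(0.05 + 0.05 * Math.log10(backlinkCount), 0.05, 0.25);
}

interface BoostInput { compiledTruth?: string | null; backlinkCount?: number; }

function applyBoosts(combined: number, node: BoostInput): number {
  const truthBump = node.compiledTruth != null ? 0.15 : 0;
  // Assumption: the two bumps add together, then scale the combined score.
  return combined * (1 + truthBump + backlinkBump(node.backlinkCount ?? 0));
}
```

Under these assumptions a hit with a combined score of 1.0, a compiledTruth property, and 10 backlinks ends up near 1.25, so boosted scores are no longer confined to [0, 1].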

A per-call log line lets the operator see which stages ran with which counts:

[graph-search:hybrid] accountId=<8c> retrievalClass=<c> expansions=<n> vector=<n> bm25=<n> fused=<n> boosted=<n> deduped=<n> final=<n> mode=<hybrid|rrf|bm25> ms=<ms> expand-ms=<ms>

What the hybrid approach catches

Vector search excels at semantic meaning — "how do I contact someone" finds nodes about communication even if the word "contact" doesn't appear. BM25 excels at exact terms — invoice numbers, product codes, proper nouns, technical identifiers. The hybrid combination ensures both modes contribute, with semantic similarity weighted higher (0.7) because most user queries are natural language.

Embedding Infrastructure

Property	Value
Model	Default `nomic-embed-text` (via Ollama at `localhost:11434`), configurable at install time via `--embed-model`
Dimensions	Default 768, configurable at install time (resolved from model lookup table or `--embed-dimensions`)
Similarity function	Cosine
Index algorithm	HNSW (approximate nearest-neighbor)
Configurable via	`EMBED_MODEL` and `EMBED_DIMENSIONS` env vars (set by installer in `~/{configDir}/.env`), `OLLAMA_URL`

Indexed node labels

Every searchable node type has its own vector index. The memory-search tool discovers indexes at runtime via SHOW INDEXES and caches the label-to-index mapping. This means new index definitions in schema.cypher become searchable automatically without code changes.

Indexed labels: Question, DefinedTerm, Review, Service, Person, LocalBusiness, PriceSpecification, Task, CreativeWork, DigitalDocument, KnowledgeDocument (includes email threads via source:'email' since Task 321), Section, Chunk, Conversation, Message, Event, Workflow, Preference (18 labels total).

Full-text index

Index name	Labels	Properties	Purpose
`entity_search`	All operator-meaningful labels (~40, see `schema.cypher`)	Canonical text-property union (~28)	Universal BM25 keyword matching across the whole graph

Embedding lifecycle

Embeddings are computed when nodes are created or updated (via memory-write, memory-ingest, or any tool that persists to Neo4j). If Ollama is unavailable at write time, nodes are stored without embeddings. The memory-reindex tool backfills missing embeddings by iterating nodes where embedding IS NULL, calling Ollama's /api/embed endpoint, and storing the resulting vector. Batch embedding is supported for efficiency.

Knowledge Document Hierarchy

Large documents are decomposed into a three-level hierarchy for granular retrieval:

KnowledgeDocument
  ├── summary (embedded) — document-level semantic anchor
  ├── Section
  │ ├── summary (embedded) — section-level semantic anchor
  │ └── Chunk
  │ ├── summary (embedded) — chunk-level semantic anchor
  │ └── content (raw text, BM25-indexed) — full content for retrieval
  └── attachmentId — links back to the source file

All three levels are independently vector-indexed and BM25-indexed. A query may match at the document level (broad topic), section level (sub-topic), or chunk level (specific passage). Graph expansion from a matched chunk retrieves its parent section and document for context.

Semantic chunking

Documents are split by a semantic chunker that identifies topic boundaries rather than using fixed character counts. Each chunk gets a summary (used for embedding) and retains the raw content (used for BM25 and for returning to the agent).

Response-side `fields` projection

memory-search accepts an optional fields: string[] that narrows the properties returned on each row to the caller-named keys. This is a read-side payload trim only — it runs after hybrid() returns, so vector search, BM25, keyword subscriptions, and graph expansion all see the full text. Ranking does not change.

fields omitted → today's behaviour (every property except embedding).
fields: ["name", "slug"] → only those keys per row.
fields: [] → empty properties object — explicit "no properties".
Unknown keys are silently skipped. Rows lacking a requested key omit it on that row.
related[*].properties is NOT projected (separate concern).

Use this when the caller knows which keys it needs (slug → name, Person → phoneNumber). It is the safe alternative to write-time summarisation, which is lossy: write-time pruning has no way to know which keys a future query will want.

Observability: when fields is set, memory-search.ts writes [memory-search] accountId=… fields=… returnedProps=N droppedProps=N to stderr. droppedProps=0 across many calls with fields set is a diagnostic signal — either the schema has already been narrowed upstream, or callers are requesting every field and defeating the purpose.
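
The projection rules above can be sketched as a pure function. The names are hypothetical, and the way `returnedProps` and `droppedProps` are counted here (per row, after the embedding is removed) is an assumption about what the log line measures.

```typescript
type Props = Record<string, unknown>;

interface Projection { properties: Props; returnedProps: number; droppedProps: number; }

function projectFields(properties: Props, fields?: string[]): Projection {
  // The embedding is never returned, with or without a projection.
  const visible: Props = {};
  for (const key of Object.keys(properties)) {
    if (key !== "embedding") visible[key] = properties[key];
  }
  const total = Object.keys(visible).length;
  if (fields === undefined) {
    return { properties: visible, returnedProps: total, droppedProps: 0 };
  }
  const projected: Props = {};
  for (const key of fields) {
    // Unknown keys, and keys absent on this row, are skipped silently.
    if (Object.prototype.hasOwnProperty.call(visible, key)) projected[key] = visible[key];
  }
  const returned = Object.keys(projected).length;
  return { properties: projected, returnedProps: returned, droppedProps: total - returned };
}
```

Because the function only sees rows that the search has already ranked, it cannot affect ordering, which matches the "read-side payload trim only" description.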

Guard Layers

Every query path (vector search, BM25 search, keyword subscriptions, and graph expansion) applies the access control filters described below. These are Cypher WHERE clauses, not middleware, so they cannot be bypassed by tool parameter manipulation. The one deliberate variation is that keyword subscriptions run without the per-agent tag filter (see Keyword Subscriptions).

Layer 1: Soft-delete filter

WHERE node.deletedAt IS NULL

Unconditional. No parameter controls it. Nodes with a deletedAt timestamp are excluded from all query paths. Soft-deleted KnowledgeDocument nodes cascade the timestamp to all child Section and Chunk nodes. Grace period before hard deletion: 7 days. Re-ingesting a soft-deleted document (same attachmentId) clears deletedAt and replaces the hierarchy.

Layer 2: Scope filter

WHERE (node.scope IS NULL OR node.scope IN $allowedScopes)

When allowedScopes is set (e.g., ["public", "shared"] for public agents), only nodes with a matching scope property — or no scope at all (legacy transitional safety) — are returned. When allowedScopes is omitted (admin agent), no scope filtering is applied. Scope values: public, shared, admin.

Layer 3: Per-agent tag filter

WHERE node.agents IS NOT NULL AND $agentSlug IN node.agents

When agentSlug is set (public agent queries), only nodes explicitly tagged for that agent are returned. The agents property is a string array on each node — a node is visible to an agent only if the agent's slug appears in this array. No implicit "available to all" fallback. This is enforced at the MCP server level via the AGENT_SLUG environment variable — tool parameter overrides are rejected when the env var is set.

Defense in depth: Both scope and agent filters must pass. An admin-scoped node tagged for a public agent is still invisible to that agent because the scope filter rejects it first.

Layer 4: Graph expansion enforcement

Related nodes discovered during hop traversal are independently filtered:

WHERE (related.scope IS NULL OR related.scope IN $allowedScopes)
AND (related.agents IS NULL OR $agentSlug IN related.agents)
AND related.deletedAt IS NULL

This prevents cross-agent content leakage via graph traversal — a public agent cannot reach admin-scoped nodes by following relationships from a public node. Untagged related nodes (no agents property) pass through, allowing shared structural nodes (e.g., PriceSpecification linked to a Service) to be discoverable.

Layer 5: Account isolation

WHERE node.accountId = $accountId

Multi-tenancy boundary. Every query is scoped to the requesting account. The ACCOUNT_ID environment variable is set at MCP server startup — it is not a tool parameter and cannot be overridden by the agent.
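
As a rough sketch of how the primary-node filters (layers 1, 2, 3, and 5) might be composed into a single clause: the builder below is hypothetical, and only the individual WHERE fragments come from this document. The optional layers are added only when their parameter is set, mirroring the admin versus public behaviour described above.

```typescript
interface GuardOptions { accountId: string; allowedScopes?: string[]; agentSlug?: string; }

interface GuardClause { where: string; params: Record<string, unknown>; }

// Hypothetical builder: composes the documented WHERE fragments for one alias.
function buildGuardClause(alias: string, opts: GuardOptions): GuardClause {
  // Soft-delete and account isolation are unconditional.
  const clauses = [`${alias}.deletedAt IS NULL`, `${alias}.accountId = $accountId`];
  const params: Record<string, unknown> = { accountId: opts.accountId };
  if (opts.allowedScopes !== undefined) {
    clauses.push(`(${alias}.scope IS NULL OR ${alias}.scope IN $allowedScopes)`);
    params["allowedScopes"] = opts.allowedScopes;
  }
  if (opts.agentSlug !== undefined) {
    clauses.push(`(${alias}.agents IS NOT NULL AND $agentSlug IN ${alias}.agents)`);
    params["agentSlug"] = opts.agentSlug;
  }
  return { where: "WHERE " + clauses.join(" AND "), params };
}
```

Values travel as query parameters rather than being interpolated into the Cypher text, so the only string that is spliced in is the alias chosen by the calling code.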

The read filter alone is not sufficient — it correctly hides alien-account nodes from every UI but does not prevent them existing. A writer that misresolves accountId (literal, undefined, or inferred-from-the-wrong-context) leaks nodes into the graph with no downstream symptom; the read filter then keeps them invisible indefinitely. The write-side doctrine is documented in .docs/neo4j.md "Account isolation invariant" — every writer that stamps n.accountId must verify the value against ${DATA_ROOT}/accounts/<id>/account.json before write. The live floor is writeNodeWithEdges — every doctrine-primitive write is gated by an accountId == process.env.ACCOUNT_ID check (the spawning process validates ACCOUNT_ID at boot against the on-disk account set via the account-enumeration lib), with [graph-write] reject reason=invalid-account-id … as the rejection signal.

Two boot-time surfaces stamp + validate the env (added 2026-05-07). The brand systemd unit emits Environment=ACCOUNT_ID=<uuid> (resolved by the installer from INSTALL_DIR/data/accounts/<uuid>/account.json); the Hono boot path then calls validateAccountIdEnv against the on-disk set and emits [graph-health] account-id-env present=true id=<8> matches-on-disk=true on success or [graph-health] account-id-env FATAL reason=<missing|no-on-disk-account|mismatch> + process.exit(1) on failure. No fallback — a misconfigured Pi cannot silently boot.

Query Classification

Before searching, a Haiku classifier decides whether a query needs knowledge retrieval at all. This prevents meta-queries ("hello", "thanks", "continue") from polluting the system prompt with irrelevant search results.

Property	Admin variant	Public variant
Model	`claude-haiku-4-5`	Same
Timeout	3 seconds	Same
History window	Last 4 messages (2 user + 2 assistant)	Same
Max tokens	200	120
Query rewriting	Yes — resolves references from history into concrete search terms	Same
Topic-change detection	Yes — detects shifts with confidence score	No (removed, earlier platform fixes)
Fallback on failure	`search: true` (always search with raw message)	Same

Classification output

The classifier returns a JSON object:

search (boolean) — whether a knowledge search should run
query (string or null) — a search-optimised rewrite of the user's message, or null to use the raw message
reason (string) — short explanation of the decision

When search is true and query is non-null, the rewritten query replaces the raw message for the memory-search call. This is important: the classifier resolves pronouns and references from conversation history into concrete terms, improving retrieval precision.
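
The output contract and the fallback rule can be sketched together. The parser below is a hypothetical illustration: the `classifier-fallback` reason string is invented for the example, and treating an empty rewritten query as null is an assumption.

```typescript
interface Classification { search: boolean; query: string | null; reason: string; }

// Parse the classifier's JSON reply; any malformed reply falls back to
// "always search with the raw message".
function parseClassification(rawJson: string): Classification {
  const fallback: Classification = { search: true, query: null, reason: "classifier-fallback" };
  try {
    const parsed: unknown = JSON.parse(rawJson);
    if (typeof parsed !== "object" || parsed === null) return fallback;
    const obj = parsed as Record<string, unknown>;
    const search = obj["search"];
    const query = obj["query"];
    const reason = obj["reason"];
    if (typeof search !== "boolean") return fallback;
    return {
      search,
      // Assumption: a blank rewrite is treated the same as null.
      query: typeof query === "string" && query.trim() !== "" ? query : null,
      reason: typeof reason === "string" ? reason : "",
    };
  } catch {
    return fallback;
  }
}

// The rewritten query replaces the raw message only when it is non-null.
function effectiveQuery(c: Classification, rawMessage: string): string | null {
  if (!c.search) return null;
  return c.query ?? rawMessage;
}
```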

Knowledge retrieval gate

The public agent is toolless by construction (Task 615): it has no memory-search, no graph access mid-turn, and no tools of any kind. KNOWLEDGE.md (when present) plus SOUL are assembled into the agent's system prompt at spawn time and are the entire knowledge surface. Every role=public spawn (webchat, whatsapp, telegram) resolves an empty allowlist and runs in dontAsk (Task 506) with a per-spawn permissions.deny covering every native, harness, and memory-MCP tool (Task 612). The spawner anchors the empty allowlist with a single non-native deny-basis token (mcp__none__deny-basis) so --allowed-tools is always present and native-excluding (Task 609) — without it, dontAsk would have nothing to deny against and the brand allow:["*"] would re-open native tools.

Observability

Admin: [admin-query-classifier] log line with topicChange, topicChangeConfidence, existingTopic, latencyMs.

Public: [public-query-classifier] log line with search, effectiveQuery, reason, latencyMs. The intentional absence of topic-change fields in the public log is the on-disk evidence that the public path does less work.

Reports — durable workflow output (Task 332)

The :Report label is the platform's durable shape for workflow output the operator may want back later — daily briefings, dream cycle runs, ad-hoc analyses. Three MCP tools own the surface, all on the memory plugin:

memory-report-write — append-only writer. Validates body ≤ 10,000 chars, embeds title+body, and CREATEs a :Report node. Idempotent on (accountId, title, occurredAt-within-same-minute) — a second call with the same title in the same minute returns the existing node instead of duplicating. Parented to the active :Conversation via :PRODUCED when SESSION_NODE_ID is set (the chat-driven default); falls back to the account's :AdminUser so the graph-hierarchy doctrine holds even outside a conversation.
memory-report-read-latest — fetches the newest :Report (default limit=1) tagged with a given keyword. The expected route for any operator phrasing of "latest X", "last night's X", "show me X report".
memory-report-list — metadata-only paginated listing (newest first), with optional keyword and sourceWorkflow filters. Use to scan the catalogue without paying for full bodies.

Every operation emits one log line: [reports] op=<write|read-latest|list> reportId=<short> keywords=<csv> ms=<n> (with idempotent=1 on a write that resolved to an existing node, hits=<n> on reads, total=<n> on list).
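
The idempotency rule for memory-report-write can be illustrated with a key function. This sketch assumes "within the same minute" means the same calendar minute in UTC; it could equally mean a 60-second window, which this document does not settle. Both function names are hypothetical.

```typescript
const MAX_REPORT_BODY_CHARS = 10000;

function isValidReportBody(body: string): boolean {
  return body.length <= MAX_REPORT_BODY_CHARS;
}

// Two writes with the same account, title, and calendar minute produce the
// same key, so the second write can resolve to the existing node.
function reportIdempotencyKey(accountId: string, title: string, occurredAt: Date): string {
  // toISOString() is always UTC; the first 16 chars are "YYYY-MM-DDTHH:MM".
  const minute = occurredAt.toISOString().slice(0, 16);
  return [accountId, title, minute].join("|");
}
```

A key like this would sit behind the `idempotent=1` marker in the `[reports]` log line: the writer looks the key up first and only creates a node on a miss.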

Routing is not classifier-side. The admin agent's IDENTITY.md carries the rule under Recalling reports: "latest <X>" / "last night's <X>" / "show me <X> report" → first tool call is memory-report-read-latest. The intent classifier (Task 304's retrievalClass) already differentiates temporal vs entity vs event reads; reports route off the literal phrase, not a new class.

The first caller is the briefing skill (platform/plugins/scheduling/skills/briefing/SKILL.md), which persists each run as a :Report with title: "Daily briefing <YYYY-MM-DD>", keywords: ["daily-briefing", "<YYYY-MM-DD>"], sourceWorkflow: "daily-briefing". Dream-cycle (Task 327) and ad-hoc analyses are expected to follow the same pattern.

Graph Expansion

After the top results are selected (by combined score or by LLM ranking), each result node is expanded by traversing its immediate relationships.

Traversal mechanics

MATCH (n)-[r]-(related)
WHERE elementId(n) = $nodeId
AND related.deletedAt IS NULL
AND (related.scope IS NULL OR related.scope IN $allowedScopes)
AND (related.agents IS NULL OR $agentSlug IN related.agents)
RETURN type(r) AS relType, CASE WHEN startNode(r) = n THEN 'outgoing' ELSE 'incoming' END AS direction, labels(related) AS relatedLabels, related
LIMIT 20

Default hop depth: 1 (immediate relationships only)
Related nodes cap: 20 per primary result
Direction tracking: Each relationship is labelled outgoing or incoming
Scope enforcement: The soft-delete, scope, and agent filters all apply to related nodes; unlike primary results, untagged related nodes (no agents property) pass the agent filter
Configurable: expandHops: 0 produces compact output (properties only, no related nodes) — useful for listing/inventory queries

What expansion provides

A Service node matched by vector search will have its PriceSpecification, Review nodes, and parent LocalBusiness attached as related nodes. A Chunk matched by BM25 will have its parent Section and KnowledgeDocument. This context enrichment means the agent receives not just the matched node but its immediate neighbourhood in the graph.

Keyword Subscriptions — Reactive Per-Agent Knowledge

Each public agent can subscribe to up to 5 keywords via knowledgeKeywords in its config.json. These subscriptions make the agent reactive to new graph content matching its topics — content added after the agent was created becomes discoverable without manual tag updates.

Dual search per keyword

For each subscription keyword, two complementary searches run:

BM25 full-text search — queries the universal entity_search index with the keyword as the search term. Catches content that mentions the keyword in its text across every operator-meaningful label.
Property-based search — finds nodes whose keywords array property contains the subscription keyword (case-insensitive). Catches nodes explicitly tagged with that keyword topic. These matches are boosted to maximum BM25 score (1.0) since they are exact tag matches.

Both searches run without the per-agent tag filter (agentSlug) — keyword subscriptions are scope-inclusive by design, meaning an agent's subscriptions can discover content not directly tagged for it. The scope filter (allowedScopes) still applies as defense in depth — admin-only content remains invisible to public agents regardless of keyword matches.

Union semantics

Results from keyword subscription searches are merged into the same scored map as the primary vector+BM25 results. Deduplication by nodeId with Math.max on scores means a node found by both direct search and keyword subscription keeps the highest score from each method.
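
A sketch of how property-based keyword matches might be folded into the scored map. It covers only the tag-match half of the dual search (the BM25 half follows the normal full-text path), and the type and function names are hypothetical.

```typescript
interface KeywordNode { nodeId: string; keywords?: string[]; }
interface PathScores { vectorScore: number; bm25Score: number; }

const MAX_KEYWORDS = 5; // each public agent can subscribe to up to 5 keywords

// Case-insensitive match against the node's keywords array property.
function tagMatches(node: KeywordNode, subscription: string): boolean {
  const wanted = subscription.toLowerCase();
  return (node.keywords ?? []).some((k) => k.toLowerCase() === wanted);
}

// Exact tag matches are boosted to the maximum BM25 score (1.0); Math.max
// keeps whichever score is higher when the node was already found.
function mergeKeywordHits(scored: Map<string, PathScores>, nodes: KeywordNode[], subscriptions: string[]): void {
  for (const keyword of subscriptions.slice(0, MAX_KEYWORDS)) {
    for (const node of nodes) {
      if (!tagMatches(node, keyword)) continue;
      const prev = scored.get(node.nodeId) ?? { vectorScore: 0, bm25Score: 0 };
      scored.set(node.nodeId, {
        vectorScore: prev.vectorScore,
        bm25Score: Math.max(prev.bm25Score, 1.0),
      });
    }
  }
}
```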

Lifecycle

Keywords are consumed by the update-knowledge admin skill when regenerating KNOWLEDGE.md — the regeneration query broadens the operator-tagged set with keyword matches so newly-added graph content that shares a subscribed topic lands in the next baked snapshot. There is no runtime keyword-injection path on the public PTY surface.

Conversation Search

Separate from the knowledge retrieval pipeline, conversation-search provides semantic search over past messages.

Index: message_embedding (768-dim cosine HNSW on Message nodes)
Scope: When SESSION_ID is set (public agent), results are limited to that conversation. Admin searches all conversations.
Output: Messages with role, content, timestamp, and relevance score.

This tool is read-only and available to both public and admin agents.

When conversations are created

:Conversation nodes on webchat (admin login, "New conversation" in the burger, a new public visitor) are created lazily. Opening the chat or logging in does not write anything to the graph — Maxy only records the conversation once the user sends a second message. This keeps conversation-search and the Conversations modal free of one-turn abandoned threads. WhatsApp and Telegram take the opposite posture: every inbound — DM or group, allowed or activation-off, agent-invoked or gated — MERGEs the :Conversation and writes a forensic :Message:WhatsAppMessage row before any access-control decision. The graph is the durable record of every message the device received, not just the ones the agent replied to. See .docs/web-chat.md "Deferred conversation persistence" and .docs/whatsapp.md "Session continuity" for the full contract.

Each row in the Conversations modal exposes a View logs row-action that opens a popover with three links — Stream, Errors, SSE — each of which targets /api/admin/logs?type={stream|error|sse}&sessionId={full-id} in a new tab. The row's 8-char id chip is click-to-copy; hover reveals the full sessionId as a tooltip. See .docs/web-chat.md "In-chat retrieval" for the route contract and console.debug observability.

Static publish surface — `/sites/*`

Maxy hosts a generic per-account static-tree publish surface at https://public.<brand>/sites/<...>/<file>. The route serves files from <accountDir>/sites/<...> with URL=disk mirroring — operator drops the tree on disk, no upload API. Extended MIME covers HTML/CSS/JS/woff2/fonts on top of images. Path traversal (.., encoded .., segments failing SAFE_SEG_RE) returns 403; symlinks escaping the sites root are rejected via a realpathSync re-check. .html responses carry Content-Security-Policy: default-src 'self' https: data:; script-src 'none' and Cache-Control: no-cache; assets are cached for an hour; every response carries X-Content-Type-Options: nosniff. Per-account isolation comes from resolveAccount — every brand's install sees only its own tree.

Directory canonicalisation. A request whose disk target is a directory is 301'd to the trailing-slash form (query string preserved) before any body is served. Relative-reference resolution (RFC 3986 §5.2) merges a relative ref against the base path up to its last slash, so without the trailing slash the relative refs in the served HTML would resolve under the parent directory rather than the directory itself. After the redirect the route serves <dir>/index.html if it exists on disk; otherwise 404. There is no implicit-index.html invention for missing paths; the publisher owns canonical URLs. A brochure shipped without index.html is reached at /sites/<slug>/<file>.html, and the admin skill publish-site is the sanctioned surface that moves the extracted tree under <accountDir>/sites/<slug>/ and emits the canonical path slug. Operator-side: drop a brochure at <accountDir>/sites/properties/<id>/brochure/output/ and it serves at <public-host>/sites/properties/<id>/brochure/output/brochure.html (or <public-host>/sites/properties/<id>/brochure/output/ if that directory contains an index.html). See .docs/web-chat.md /sites/* route entry for the wire contract and [sites] log lines (serve|redirect-trailing-slash|not-found|path-traversal-rejected|symlink-escape-rejected|no-account).
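
The redirect target for a directory request can be computed with a few lines of string handling. This is a hypothetical helper, not the route's actual code, and it assumes the caller has already established that the disk target is a directory.

```typescript
// Returns the Location value for the 301, or null when the URL already
// ends in a slash. The query string, if any, is preserved.
function trailingSlashRedirect(requestUrl: string): string | null {
  const queryStart = requestUrl.indexOf("?");
  const path = queryStart === -1 ? requestUrl : requestUrl.slice(0, queryStart);
  const query = queryStart === -1 ? "" : requestUrl.slice(queryStart);
  if (path.endsWith("/")) return null;
  return path + "/" + query;
}
```

For example, a page served from `/sites/brochure` that references `img/a.png` would have the browser request `/sites/img/a.png`; served from `/sites/brochure/`, the same reference resolves to `/sites/brochure/img/a.png`.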

Deterministic public-hostname surface. The <public-host> half of the URL the operator pastes is resolved by the mcp__plugin_admin_admin__public-hostname MCP tool. It reads <configDir>/cloudflared/config.yml (ingress list) then falls back to <configDir>/alias-domains.json — the same two files cloudflared and platform/ui/server/index.ts's isPublicHost() already trust to route. Returns {hostname, isApex, source} on hit (source is "cloudflared-config.yml" or "alias-domains.json"), or {hostname:null, source:null, reason:"no-tunnel"} on miss. Tiebreak: apex wins over subdomain (single-label, or www.<apex> stripped). publish-site step 6 calls it after the move and emits the full URL (https://<hostname><path-slug>) in the same turn. Graph queries are no longer involved — any earlier graph-backed resolver returned (none) on accounts bootstrapped without cloudflare-task-tracker.ts writes (laptop Real Agent, manual cloudflared setup), the llm-framing-deterministic recurrence class. The graph-mcp shim additionally runs a sequential envelope-warning probe on every read response — when Neo4j emits gql_status codes matching ^0[12]N5\d$ (e.g. 01N52 "property does not exist"), the shim stitches them into a prefix content block on the response so property-name misses surface to the agent inline instead of returning silent []. Probe failure is best-effort: the upstream response forwards unchanged with [mcp:graph] probe-error.
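
The envelope-warning probe's matching rule can be sketched as a filter over status objects. Only the regular expression and the `01N52` example come from this document; the notification shape and the prefix format are assumptions made for illustration.

```typescript
// Matches gql_status codes such as 01N52, 02N50, 01N59.
const WARNING_STATUS_RE = /^0[12]N5\d$/;

// Hypothetical, simplified shape for a status entry returned with a query.
interface GqlNotification { gqlStatus: string; description: string; }

// Build the text of the prefix content block, or null when nothing matched
// and the upstream response should be forwarded unchanged.
function warningPrefix(notifications: GqlNotification[]): string | null {
  const matched = notifications.filter((n) => WARNING_STATUS_RE.test(n.gqlStatus));
  if (matched.length === 0) return null;
  return matched.map((n) => `[${n.gqlStatus}] ${n.description}`).join("\n");
}
```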

Cross-tab session rotation

When you click "New conversation" in the chat tab, Maxy mints a fresh admin session key on the server and clears the old one. Sibling admin tabs (/graph, /data) opened in the same browser keep working without re-login: the chat tab broadcasts the new key on a same-origin channel so each sibling tab updates its captured key instantly, and any in-flight admin request that 401s with the rotation-orphan code retries once after re-reading the latest key from per-tab storage. If neither path recovers (browser locked down, second 401 after retry, session expired), the tab shows a single banner — "Your admin session was renewed in another tab. Click to reload." — and one click sends you back through login. No silent 401s; no re-clicking through the same trash icon hoping it sticks. See .docs/web-chat.md "Cross-tab rotation contract" for the wire-level code taxonomy and observability surfaces.

Context Assembly — How Retrieved Knowledge Reaches the Agent

The final step in the retrieval pipeline is injecting retrieved content into the agent's system prompt. The path depends on agent configuration.

Channel spawn routing by role (Task 626)

The manager exposes three named spawn routes: /rc-spawn (a live claude --remote-control PTY — the operator sidebar, the channel admin, and the one-shot admin jobs), /public-spawn (the renamed /spawn — spawnClaudeSession, the zero-tool public surface), and the rc-daemon it drives. No route named /spawn remains; a manager boot line [spawn-routes] live=[rc-spawn,public-spawn] asserts this.

The channel PTY-bridge (ensureEntry) routes each inbound by role: an admin WhatsApp/Telegram inbound spawns on /rc-spawn (keyed by a deterministic per-sender sessionId so the thread resumes across restarts) and drives every turn via /<id>/input; a non-admin inbound spawns on /public-spawn. Each dispatch logs [<channel>-adaptor] route role=<role> target=<rc-spawn|public-spawn> senderId=…. LinkedIn ingest and the public session-end review also run on /rc-spawn, carrying their prompt as initialMessage with closeAfterTurn so the PTY stops after one assistant turn.

Public agent paths

Public agents run on the same native Claude Code PTY surface as the admin, dispatched through the channel PTY-bridge with role: 'public'. The agent's directory files (IDENTITY.md, SOUL.md, KNOWLEDGE.md, KNOWLEDGE-SUMMARY.md when present) are assembled into the system prompt at spawn time. There is no per-turn server-side knowledge injection.

The public agent is toolless by construction (Task 615): memory-search and every other tool are excluded from the per-spawn --allowed-tools allowlist on every public channel, and a per-spawn permissions.deny blocks them outright (Task 612). The agent has no graph access mid-conversation; KNOWLEDGE.md is the ceiling of factual knowledge.

KNOWLEDGE.md staleness guard

When both KNOWLEDGE.md and KNOWLEDGE-SUMMARY.md exist, the server compares modification times. If KNOWLEDGE.md is newer than the summary (summary is stale), the full KNOWLEDGE.md is used. Otherwise, the summary is preferred (smaller token footprint).
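
The staleness rule reduces to a comparison of two modification times. In this sketch the behaviour when only one file exists is an assumption (use whichever is present), and the function name is hypothetical.

```typescript
type KnowledgeFile = "KNOWLEDGE.md" | "KNOWLEDGE-SUMMARY.md";

// Inputs are modification times in milliseconds, or null if the file is absent.
function pickKnowledgeFile(knowledgeMtimeMs: number | null, summaryMtimeMs: number | null): KnowledgeFile | null {
  if (knowledgeMtimeMs === null && summaryMtimeMs === null) return null;
  // Assumption: with only one file on disk, that file is used.
  if (summaryMtimeMs === null) return "KNOWLEDGE.md";
  if (knowledgeMtimeMs === null) return "KNOWLEDGE-SUMMARY.md";
  // A summary older than its source is stale, so fall back to the full file.
  return knowledgeMtimeMs > summaryMtimeMs ? "KNOWLEDGE.md" : "KNOWLEDGE-SUMMARY.md";
}
```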

Admin agent path

The admin agent runs via Claude Code CLI, which manages its own system prompt assembly. Knowledge reaches the admin agent through MCP tools — memory-search is the read-path entry point (server-side LLM ranking was removed by Task 424; the agent ranks in-turn against any criterion). The admin agent also receives session context via loadSessionContext, which injects:

Recent review digest (last public chat or review digest CreativeWork)
Open tasks (priority-ordered, capped)
Active review alerts (unsuppressed, last 24 hours, capped at 5)

This is assembled as a <previous-context> block in the system prompt on each admin turn.

fetchMemoryContext — the MCP bridge

For public agents, the server calls the memory MCP server via JSON-RPC over stdin/stdout:

Spawn the memory MCP server as a subprocess with environment variables: ACCOUNT_ID, ALLOWED_SCOPES=public,shared, AGENT_SLUG, KNOWLEDGE_KEYWORDS, SESSION_ID
Send initialize + tools/call (name: memory-search, arguments: {query, account_id})
Read the tool result text
Timeout: 8 seconds. On any failure, returns null — the agent proceeds without memory context.

This subprocess model means each public agent query gets an isolated, short-lived memory server instance with the correct scope constraints baked into its environment.
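
The two requests in the steps above might look like the following on the wire. This is a sketch: the `clientInfo` values and the protocol version string are illustrative assumptions, and the function names are hypothetical. It builds the payloads only; spawning the subprocess and enforcing the 8-second timeout are left out.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

// Builds the initialize + tools/call pair described above.
// Note: the MCP specification also defines a notifications/initialized
// message; it is omitted here because the steps above list only two sends.
function buildMemorySearchCalls(query: string, accountId: string): JsonRpcRequest[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // assumed version string
        capabilities: {},
        clientInfo: { name: "memory-bridge-sketch", version: "0.0.0" }, // illustrative
      },
    },
    {
      jsonrpc: "2.0",
      id: 2,
      method: "tools/call",
      params: { name: "memory-search", arguments: { query, account_id: accountId } },
    },
  ];
}

// The stdio transport carries one JSON message per line.
function frame(requests: JsonRpcRequest[]): string {
  return requests.map((r) => JSON.stringify(r) + "\n").join("");
}
```

Scope constraints never appear in these payloads; they reach the memory server through its environment variables, which is what keeps them out of the caller's control.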

Output Formatting and Budget

The memory-search tool formats results as structured text with labels, properties, scores, and related nodes. An output budget of 80,000 characters keeps results under Claude Code's tool result size limit (~100K chars). When results exceed the budget, related nodes are progressively dropped (compact mode) until the output fits.

Each result is formatted as:

[Label1, Label2] (id: nodeId) (score: 0.XXX)
  property1: value
  property2: value
  Related:
    --[RELATIONSHIP]--> [RelatedLabel] {prop1: val, prop2: val}
    <--[RELATIONSHIP]-- [RelatedLabel] {prop1: val, prop2: val}

Results are separated by --- dividers. The embedding and accountId properties are stripped from output (internal fields, not useful to the agent).
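
One way the progressive drop could work is sketched below. The order in which related nodes are dropped (lowest-ranked result first) is an assumption, as are the type and function names; the sketch also does no further truncation if the output is still over budget once every related block is gone.

```typescript
interface FormattedResult { header: string; properties: string[]; related: string[]; }

function render(r: FormattedResult, includeRelated: boolean): string {
  const lines = [r.header, ...r.properties.map((p) => "  " + p)];
  if (includeRelated && r.related.length > 0) {
    lines.push("  Related:", ...r.related.map((x) => "    " + x));
  }
  return lines.join("\n");
}

function formatWithinBudget(results: FormattedResult[], budget = 80000): string {
  const keepRelated = results.map(() => true);
  const build = (): string => results.map((r, i) => render(r, keepRelated[i])).join("\n---\n");
  let output = build();
  // Assumption: drop related nodes from the lowest-ranked result upward.
  for (let i = results.length - 1; i >= 0 && output.length > budget; i--) {
    keepRelated[i] = false;
    output = build();
  }
  return output;
}
```

Dropping from the bottom keeps the neighbourhood of the best-ranked hits intact for as long as possible.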

Index Discovery and Schema Evolution

The memory MCP server does not hardcode index names. On first query, it runs SHOW INDEXES YIELD name, labelsOrTypes, type WHERE type = 'VECTOR' and builds a label-to-index-name map. This map is cached for the lifetime of the process.

This means:

Adding a new vector index in schema.cypher makes a new label searchable without code changes
The memory-reindex tool can backfill embeddings for newly indexed labels
Index renames are transparent: the server discovers the current index names on its first query after start (or after the cache is cleared) instead of hardcoding them

The cache is cleared via clearIndexCache after schema changes (e.g., after memory-reindex detects new indexes).
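
A minimal sketch of the discovery-and-cache behaviour, assuming the rows come back in the shape the SHOW INDEXES query yields. The runDiscovery callback is hypothetical and synchronous here to keep the caching logic visible; a real Neo4j driver call would be asynchronous.

```typescript
// One row of the discovery query's result.
interface IndexRow {
  name: string;
  labelsOrTypes: string[] | null;
  type: string;
}

const DISCOVERY_QUERY =
  "SHOW INDEXES YIELD name, labelsOrTypes, type WHERE type = 'VECTOR'";

// Build the label -> index-name map from the discovered rows.
function buildLabelIndexMap(rows: IndexRow[]): Map<string, string> {
  const map = new Map<string, string>();
  for (const row of rows) {
    if (row.type !== "VECTOR") continue; // defensive: the query already filters
    for (const label of row.labelsOrTypes ?? []) map.set(label, row.name);
  }
  return map;
}

// Process-lifetime cache: filled on the first query, reused afterwards.
let cachedIndexMap: Map<string, string> | null = null;

function getIndexMap(runDiscovery: (cypher: string) => IndexRow[]): Map<string, string> {
  if (cachedIndexMap === null) {
    cachedIndexMap = buildLabelIndexMap(runDiscovery(DISCOVERY_QUERY));
  }
  return cachedIndexMap;
}

// Forces the next getIndexMap call to rediscover indexes.
function clearIndexCache(): void {
  cachedIndexMap = null;
}
```

Because the map is keyed by label, a new vector index becomes searchable as soon as the cache is rebuilt, with no change to the lookup code.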

Inbound Message Gateway

Every inbound message — regardless of channel (web admin, web public, WhatsApp DM, WhatsApp group) — passes through a centralised screening and classification step before reaching the agent. One Haiku call per message produces:

Content screening — CLEAN / SUSPICIOUS / DISCARD verdict plus a prompt injection flag. DISCARD verdicts on public channels return a polite refusal without invoking the agent. Admin messages receive advisory screening only — flagged in the log but never blocked or modified.
Query rewriting — retrieval-optimised reformulation of the message for memory-search (public channels only; admin text is unchanged).
Intent classification — question / instruction / complaint / greeting / follow-up.
Language — ISO 639-1 code.
Complexity — simple / complex.

Short messages (under 5 words) skip the Haiku call but still get local pattern matching against the shared prompt injection vocabulary — this prevents short injection payloads from bypassing screening.

On Haiku timeout, API error, or missing API key, the raw message passes through unmodified (graceful degradation). The gateway never blocks the user from reaching the agent due to its own failure.

Gateway results are injected into the agent's system prompt as structured metadata, giving the agent context about the message before it begins processing.
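
The control flow described above (short-message bypass, local pattern matching, and fall-through on classifier failure) might look roughly like this. The pattern list, the result shape, and the choice to report a local hit as SUSPICIOUS are assumptions; the classifier is a synchronous callback here only to keep the sketch short, whereas the real Haiku call is a network request.

```typescript
type Verdict = "CLEAN" | "SUSPICIOUS" | "DISCARD";

interface Classification {
  verdict: Verdict;
  promptInjection: boolean;
}

interface GatewayResult {
  verdict: Verdict | null; // null when the gateway fell through without a verdict
  promptInjection: boolean;
  usedHaiku: boolean;
  fallthrough: boolean;
}

// Hypothetical stand-in for the shared prompt injection vocabulary.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all )?previous instructions/i,
  /reveal (your )?system prompt/i,
];

function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function screenMessage(text: string, classify: (text: string) => Classification): GatewayResult {
  // Under 5 words: skip the model call but still run local pattern matching.
  if (countWords(text) < 5) {
    const hit = INJECTION_PATTERNS.some((pattern) => pattern.test(text));
    return {
      verdict: hit ? "SUSPICIOUS" : "CLEAN",
      promptInjection: hit,
      usedHaiku: false,
      fallthrough: false,
    };
  }
  try {
    return { ...classify(text), usedHaiku: true, fallthrough: false };
  } catch {
    // Timeout, API error, or missing key: the raw message passes through.
    return { verdict: null, promptInjection: false, usedHaiku: false, fallthrough: true };
  }
}
```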

Diagnostics

Every gateway invocation logs to server.log with the [inbound-gateway] tag, including channel, verdict, intent, language, complexity, latency, and fallthrough status. Non-clean verdicts get an additional warning log.

To check recent screening activity:

grep '\[inbound-gateway\]' server.log | tail -20

Tool Eagerness — eager-load vs deferred

The Claude Code SDK marks every MCP tool as deferred by default. The model cannot invoke a deferred tool until it has first paid a ToolSearch round-trip to load the schema — one extra turn per unique schema. Built-in SDK tools (Read, Write, Edit, Bash, Glob, Grep, Agent, WebSearch, WebFetch) stay eager. There is no count threshold; the gate is per-tool.

The SDK's per-tool override is _meta["anthropic/alwaysLoad"]: true on each MCP tool's tools/list entry. Two surfaces apply it:

In-process plugins. Every admin-eager tool is registered via eagerTool(server, name, description, inputSchema, handler) from platform/lib/mcp-eager/ instead of server.tool(...). The helper calls server.registerTool with the _meta flag set.
Upstream graph proxy. The upstream Python mcp-neo4j-cypher server has no _meta channel, so platform/lib/graph-mcp/src/index.ts intercepts every tools/list response on the wire and injects _meta["anthropic/alwaysLoad"]: true into each tool entry. The [graph-mcp] tools/list eager-flagged count=<N> stderr line confirms the injection fired.
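
The wire-level injection used by the graph proxy can be pictured with a small pure function over a tools/list response. This is a sketch under assumptions: the type names and flagToolsEager are invented for illustration, and only the _meta key and the stderr line format come from the description above.

```typescript
interface ToolEntry {
  name: string;
  _meta?: Record<string, unknown>;
  [key: string]: unknown;
}

interface ToolsListResponse {
  jsonrpc: "2.0";
  id: number | string;
  result: { tools: ToolEntry[]; [key: string]: unknown };
}

// Return a copy of the response with every tool flagged as always-loaded,
// preserving any _meta keys the upstream server already set.
function flagToolsEager(response: ToolsListResponse): { response: ToolsListResponse; logLine: string } {
  const tools = response.result.tools.map((tool) => ({
    ...tool,
    _meta: { ...(tool._meta ?? {}), "anthropic/alwaysLoad": true },
  }));
  return {
    response: { ...response, result: { ...response.result, tools } },
    logLine: `[graph-mcp] tools/list eager-flagged count=${tools.length}`,
  };
}
```

Copying instead of mutating keeps the upstream payload intact, which makes it easier to compare the two when the eager-flagged count looks wrong.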

Curation rule. Every MCP tool the admin agent calls routinely should be eager — registered via eagerTool (or arriving through the graph-mcp interceptor). Whether a tool is eager is decided at its registration site in the plugin's MCP index.ts (eagerTool vs server.tool); there is no separate allow-list constant. Admin-skill / specialist / public-agent tools that stay on server.tool() pay the ToolSearch tax only when their caller invokes them. The admin tool surface (toolSurface.admin, the adminAllowlist: true set) is the intended eager set; a routinely-called admin tool left on server.tool() is a gap to fix at the registration site.

Observability. Spawn-time emit: [tool-surface] session=<convId> permission_allowed=N eager_intent=E eager_set_size=T. Turn-end emit: [admin-agent] turn-end ... toolsearch=N toolsearch_unique=U. A non-zero toolsearch on a fresh turn for an eager-intended tool means a plugin reverted to server.tool() — fix at the plugin's MCP registration site, not the allow-list.

Spawn-time MCP and subagent registration

Each claude PTY spawn registers every callable MCP server and every dispatchable subagent before the operator's first turn. Platform MCP servers come from one channel — installed plugins — for admin and specialist spawns (Task 502). Claude Code's plugin system serves every plugin MCP tool under the long prefix mcp__plugin_<plugin>_<server>__<tool> (for platform plugins plugin == server == directory), which is the canonical name the admin --allowed-tools argv and every specialist tools: frontmatter bind to. Admin spawns no longer write a per-spawn .mcp.json or pass --mcp-config; the per-account env (ACCOUNT_ID, USER_ID, NEO4J_URI, NEO4J_PASSWORD, PLATFORM_ROOT, CLAUDE_CONFIG_DIR) rides the PTY env block.

Public agents are the one exception. A public-facing web agent is toolless by construction (Task 615), so public spawns retain the per-spawn mcp-config.json (--mcp-config <path>) but register zero servers in it — the file carries an empty mcpServers. Combined with the empty --allowed-tools, the dontAsk mode, and the per-spawn permissions.deny (Task 612), no tool reaches an anonymous visitor on any channel. --strict-mcp-config (which only ever guarded auto-discovery of a project .mcp.json) is retained on the public per-spawn path so no project file is discovered either; it is dropped from admin spawns that no longer pass --mcp-config.

For subagents, the same spawn pushes --add-dir for every bundled plugin agents directory (platform/plugins/*/agents/, premium-plugins/*/agents/) — both roles — plus the per-account specialists directory <accountDir>/specialists/agents/ (admin only). Claude Code's subagent_type dispatch reads the agent file off disk via the added directories; without --add-dir the dispatcher returns "no matching agent."

A boot gate refuses to start the manager when any admin-allowlisted tool mcp__<plugin>__* lacks a registered server. The signal is boot-failed reason=mcp-allowlist-without-server plugin=<p> tool=<t> followed by process.exit(1). The remediation is a one-line edit to the named PLUGIN.md: add the mcp: block. The complementary observability emit mcp-config-allowlist-coverage admin-tools=A admin-registered=R (where A === R) confirms the invariant per boot.

A second boot gate walks every specialist .md under platform/templates/specialists/agents/, every bundled <plugin>/agents/ directory, and the per-account <accountDir>/specialists/agents/ directory, parses each file's tools: frontmatter line (canonical long-prefix names since Task 502), and classifies every tool name as one of:

CC-native (Read, Bash, …)
a tool the loaded PLUGIN.md set actually serves (matched as the long canonical name in toolSurface.all)
a third-party MCP bridge (a mcp__plugin_* name whose plugin segment is NOT a maxy platform plugin, such as Playwright; upstream-owned, passes unconditionally)
unknown-tool-in-plugin (maxy plugin namespace served but tool name absent)
unknown-plugin-namespace (namespace served by nothing)
brand-excluded-plugin (namespace served by nothing on this brand, but the brand's brand.json#plugins.excluded list names it)
malformed-name (not CC-native and not mcp__-shaped)

The first three pass. The next two refuse boot with one boot-failed reason=specialist-tool-drift specialist=<name> tool=<t> drift=<class> path=<…> line per defect, then process.exit(1). A maxy-plugin mcp__plugin_* name is validated against toolSurface.all, so a typo or stale long-prefix tool name still refuses boot rather than passing as a bridge; the build-time check-canonical-tool-names.mjs gate catches the same drift in instruction files before publish.

brand-excluded-plugin is a structural pass: it lands in a per-specialist strip-list, the manager continues to boot, and at spawn time pty-spawner removes those tool names from the --agent <name> spawn's --allowed-tools argv. The complementary observability emit specialist-tool-strip specialist=<name> plugin=<p> tools=<csv> reason=brand-excluded fires one line per stripped (specialist, plugin) pair so an operator who reads server.log sees the brand filter doing work without cross-referencing brand.json against the template.

The startup-self-test line startup-self-test specialist-tool-drift=ok inspected=<N> stripped-specialists=<M> confirms the gate ran and how many specialists carry strip-lists.

This gate was Task 173. The brand-excluded branch closes the recurring crash-restart loop on brands that ship without a plugin the shared personal-assistant.md template references (e.g. realagent-code excludes telegram while the template hard-codes mcp__telegram__*). The brand-agnostic template stays a single file; the brand-aware filter expresses what the specialist may do on this install while the template expresses what it can do across brands. Tool typos and renamed plugins still refuse to boot — only namespaces explicitly named in plugins.excluded are demoted to strip-and-warn.

Brand-foreign premium bundles (Task 343 / Task 344). Task 344 closes the loop one layer up: the installer bundler at packages/create-maxy-code/scripts/bundle.js now applies the same brand.json#shipsPremiumBundles gate at payload assembly time, so foreign bundles never reach disk on the device. The gate is shared with the test suite via scripts/premium-bundle-gate.mjs and accepts only two shapes — undefined / missing → ships nothing; string[] → ships only the named bundles. The legacy boolean true form is rejected: bundle.js hard-fails with FATAL: brand.shipsPremiumBundles must be a string[] (boolean 'true' no longer accepted; enumerate bundles in <brand.json>). An allowlist entry naming a bundle directory that is absent on disk is also FATAL — silent over-shipping is the failure mode this gate exists to prevent. Each build emits one [bundler] premium-bundle-gate brand=<n> mode=<m> shipped=[…] skipped=[…] line. The runtime gate walkPremiumBundles at plugin-manifest.ts keeps the same shape and stays as defence-in-depth — on a correctly bundled payload, it walks only allowlisted bundles because foreign ones are not present. The drift-gate's agents-dir-skipped reason=brand-foreign-bundle line therefore fires only when something has staged a foreign bundle out-of-band.
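
The accepted shapes of shipsPremiumBundles can be summarised in a short sketch. planPremiumBundles and its return type are hypothetical names, the brandJsonPath parameter stands in for the <brand.json> placeholder in the error text, and the wording of the "absent on disk" error is invented; only the first FATAL message follows the text above.

```typescript
interface BundlePlan {
  shipped: string[];
  skipped: string[];
}

function planPremiumBundles(
  shipsPremiumBundles: unknown,
  bundlesOnDisk: string[],
  brandJsonPath: string,
): BundlePlan {
  // undefined / missing: ship nothing.
  if (shipsPremiumBundles === undefined) {
    return { shipped: [], skipped: [...bundlesOnDisk] };
  }
  // Anything other than string[] (including the legacy boolean true) is fatal.
  if (!Array.isArray(shipsPremiumBundles) || !shipsPremiumBundles.every((b) => typeof b === "string")) {
    throw new Error(
      `FATAL: brand.shipsPremiumBundles must be a string[] (boolean 'true' no longer accepted; enumerate bundles in ${brandJsonPath})`,
    );
  }
  const allowlist: string[] = shipsPremiumBundles;
  // An allowlisted bundle that is absent on disk is also fatal.
  for (const name of allowlist) {
    if (!bundlesOnDisk.includes(name)) {
      // Hypothetical message wording.
      throw new Error(`FATAL: premium bundle "${name}" is allowlisted but absent on disk`);
    }
  }
  return {
    shipped: [...allowlist],
    skipped: bundlesOnDisk.filter((name) => !allowlist.includes(name)),
  };
}
```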

Structured journald mirror for boot-failed (Task 343). Every boot-failed reason=specialist-tool-drift … line is mirrored to journald via systemd-cat -t maxy-csm -p err with the fields specialist=, tool=, drift_reason=, agent_path= so journalctl --user -u <brand>-claude-session-manager.service -t maxy-csm can filter by any of them without grep on server.log. The stdout line stays unchanged so the existing diagnostic one-liners keep working. systemd-cat absence (e.g. macOS dev box) is swallowed — the stdout line is the primary surface; the structured emit is auxiliary.

Per-spawn signals (server.log). Every spawn emits pty-spawn-mcp-config servers=<N> tools=<M> bytes=<B> path=<…> once, plus one pty-spawn-agents-dir role=<admin|public> path=<…> per added directory. Specialist spawns additionally emit pty-spawn-allowlist specialist=<name> count=<N> stripped=<S> sourced-from=agent-frontmatter where stripped is the count of brand-excluded tool names removed before argv emission. The diagnostic one-liner is grep -E 'pty-spawn-mcp-config|pty-spawn-agents-dir|pty-spawn-allowlist|mcp-config-allowlist-coverage|specialist-tool-strip|boot-failed reason=' ~/.<brand>/logs/server.log | tail -50.

Channel follower cold-start retry (Task 610). Each channel PTY session (webchat, whatsapp, email) has one JSONL follower (platform/ui/app/lib/channel-pty-bridge/follower.ts) reading GET /<sessionId>/log?follow=1 and fanning each assistant end_turn out to the awaiting dispatchOnce. A freshly-spawned PTY has no JSONL on disk until claude flushes its first line; during that window the manager answers 202 {pending:true}. The follower retries every CHANNEL_PTY_FOLLOWER_RETRY_MS (default 1000) until a 200 stream opens or CHANNEL_PTY_FOLLOWER_PENDING_MAX_MS elapses. The follower is shared across channels, so that window defaults to the longest channel turn window (whatsapp's WHATSAPP_PTY_TURN_TIMEOUT_MS, 300000 — longer than webchat's 120000) so it never abandons a turn the caller is still awaiting. Because public sessions idle-reap, every webchat greeting is the first turn of a fresh spawn and crosses this window — before the retry, a 202 (which satisfies res.ok) was consumed as a single non-event line, the stream ended, and the follower died silently, timing out every public turn. The lifecycle is greppable as follower-connect status=<code> → follower-retry attempt=N reason=pending → follower-open → outbound bytes=N; follower-give-up reason=pending-timeout marks the JSONL never appearing. A reject reason=turn-timeout with no preceding follower-open (and no manager log-follow-open) for that sessionId is the Task 610 signature. See .docs/gated-public-agents.md "Webchat turn lifecycle" for the full tag list.
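
The key point of the retry is that a 202 must be treated as "not yet", even though it satisfies res.ok. A minimal simulation of that decision, assuming the defaults quoted above: the function names, the "fail" outcome for other status codes, and the exact interleaving of log tags are illustrative assumptions. The real follower performs HTTP requests and sleeps between attempts.

```typescript
type FollowerAction = "open" | "retry" | "give-up" | "fail";

// Decide what to do with one connect attempt.
function nextFollowerAction(status: number, elapsedMs: number, pendingMaxMs: number): FollowerAction {
  if (status === 200) return "open";
  if (status === 202) return elapsedMs >= pendingMaxMs ? "give-up" : "retry";
  return "fail"; // assumption: any other status is not retried here
}

// Replay a sequence of manager responses and collect the lifecycle tags.
function simulateFollower(statuses: number[], retryMs = 1000, pendingMaxMs = 300000): string[] {
  const log: string[] = [];
  let elapsedMs = 0;
  let attempt = 0;
  for (const status of statuses) {
    log.push(`follower-connect status=${status}`);
    const action = nextFollowerAction(status, elapsedMs, pendingMaxMs);
    if (action === "open") {
      log.push("follower-open");
      return log;
    }
    if (action === "give-up") {
      log.push("follower-give-up reason=pending-timeout");
      return log;
    }
    if (action === "fail") return log;
    attempt += 1;
    log.push(`follower-retry attempt=${attempt} reason=pending`);
    elapsedMs += retryMs; // the real follower waits retryMs here
  }
  return log;
}
```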

Brand-process start counter (Task 173). platform/ui/server-init.cjs increments a persistent counter at /tmp/server-init-<accountId>-restart.count on every fresh start and emits [server-init] start count=<N> account=<accountId> counter-path=<…> to server.log. /tmp clears on reboot, so a clean reboot starts the count fresh; any value >1 between operator-observed reboots means the brand process (driven by its Requires=<brand>-claude-session-manager.service clause) is restarting. The diagnostic one-liner is grep '\[server-init\] start' ~/.<brand>/logs/server.log | tail -5 — the trailing count= value is the loop depth without counting SIGTERMs.

Programmatic spawn entry point. The Sidebar new-session-with-prompt click routes through the single cookie-auth wrapper (Task 626 removed the recorder loopback caller) at platform/ui/server/routes/admin/claude-sessions.ts. The wrapper owns the per-spawn enrichment (owner profile, dormant/active plugins, specialist domains, tunnel URL) and the senderId resolution; it forwards a single POST /public-spawn to the session manager on 127.0.0.1, with initialMessage inlined on that body. The manager appends initialMessage as the trailing positional argv to claude, so the CLI processes it as the session's first user turn at PTY startup — no separate POST /<sessionId>/input call, no bracketed-paste. (Task 153.) See admin-session.md "Spawn-with-initialMessage wrapper" for the body schema and caller list.

End-turn auto-close (lifecycle, not user-initiated). The session manager's attachEndTurnAutoClose (platform/services/claude-session-manager/src/http-server.ts) wires a one-shot job's JSONL to a watcher: as soon as it contains "stop_reason":"end_turn", the manager calls stopSession, the PTY exits, the PID file is removed, and fs-watcher.ts demotes the row to state: 'archived'. It fires for /public-spawn database-operator specialist spawns and for /rc-spawn jobs spawned with closeAfterTurn (LinkedIn ingest and the public session-end review — Task 626). This is the lifecycle archive path — the row stays in place, the JSONL stays on disk, no directory move. It is structurally distinct from the user-initiated POST /api/admin/claude-sessions/:id/archive route, which actually mvs the JSONL between <slugDir> and <slugDir>/archive/.

Tool Call Audit Trail

Every tool invocation by the admin agent produces a durable ToolCall node in the knowledge graph, linked to the Conversation that triggered it. This covers all admin agent tool calls — the full history of what the agent did, when, and in what context.

Each ToolCall record contains:

Field	Description
toolName	The MCP tool that was invoked (e.g. `memory-search`, `workflow-execute`)
pluginName	The plugin that owns the tool
input	Truncated JSON of the tool's input arguments
output	Truncated response text
isError	Whether the tool call resulted in an error
startedAt / completedAt	Timestamps for the invocation
sessionId	Links back to the originating conversation

Records persist indefinitely and are queryable by the admin agent. Ask Maxy "what tools ran in the last session?" or "show me all tool calls from today" to review the audit trail.

Workflow-dispatched tool calls are tracked separately via StepResult nodes (part of the workflow execution system) and are not duplicated as ToolCall nodes.

Diagnostics

Tool call persistence logs to server.log with the [persist] tag:

grep '\[persist\] tool-call persisted' server.log | tail -10

Each log entry includes the tool name and a truncated conversation ID for correlation.

Process provenance — durable actions emit Tasks

Every durable action (cloudflare tunnel-login, brand publish, future deterministic flows) emits a :Task {kind:"<flow>"} node carrying the action's lifecycle and a :PRODUCED edge to every entity the action created. This makes the graph traversable from the originating Conversation to every entity created during it via (c)<-[:RAISED_DURING]-(t:Task)-[:PRODUCED]->(e), answering "what did this turn produce" with a single two-hop Cypher pattern.

The doctrine is observed at the storage primitive: writes to :Person, :UserProfile, :AdminUser, :Organization, :LocalBusiness, :CloudflareTunnel, or :CloudflareHostname should carry an inbound :PRODUCED edge whose source is one of :Task, :Conversation, or :Message. Subtype labels like :AdminConversation, :UserMessage, :AssistantMessage, :AdminMessage qualify because the gate checks the full labels() array. Bootstrap writes (PIN-setup, schema migrations, lazy first-session UserProfile creation) are exempt via createdBy.agent === 'system'. When no qualifying edge resolves, the primitive emits a [graph-write] warn reason=missing-provenance labels=<csv> agent=<agentLabel> line and the write proceeds (Task 580 relaxed this from a hard reject — the composer-spawned admin path inherits a bare per-account env that never receives the SESSION_NODE_ID stamp, so the throw was failing every direct admin contact-create / memory-write for a gated label).
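
A rough sketch of the warn-only check described above. The GraphWrite shape and the function name are assumptions, and so is the choice to print the write's full label list in labels=<csv>; the gated labels, the qualifying source labels, the system exemption, and the log line format follow the paragraph.

```typescript
const GATED_LABELS = new Set([
  "Person", "UserProfile", "AdminUser", "Organization",
  "LocalBusiness", "CloudflareTunnel", "CloudflareHostname",
]);

const PROVENANCE_SOURCES = new Set(["Task", "Conversation", "Message"]);

interface InboundEdge {
  type: string;
  sourceLabels: string[]; // full labels() array of the source node
}

interface GraphWrite {
  labels: string[];
  createdByAgent: string;
  inbound: InboundEdge[];
}

// Returns the warning line for a gated write that lacks provenance, or null
// when the write is ungated, exempt, or properly anchored. The write itself
// proceeds in every case.
function provenanceWarning(write: GraphWrite): string | null {
  if (!write.labels.some((label) => GATED_LABELS.has(label))) return null;
  if (write.createdByAgent === "system") return null; // bootstrap exemption
  const anchored = write.inbound.some(
    (edge) =>
      edge.type === "PRODUCED" &&
      edge.sourceLabels.some((label) => PROVENANCE_SOURCES.has(label)),
  );
  if (anchored) return null;
  return `[graph-write] warn reason=missing-provenance labels=${write.labels.join(",")} agent=${write.createdByAgent}`;
}
```

Checking the source node's whole label array is what lets a node labelled both Conversation and AdminConversation qualify without listing every subtype.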

Two surfaces emit the lifecycle: agent-driven actions call work-create/work-update/work-complete over MCP (work-create accepts kind, the canonical inputsProvided call-shape record, inputs + inputSchema for the operator-meaningful form payload, and raisedDuringConversationKey to resolve the RAISED_DURING edge). Shell-driven actions wrap their script invocation in platform/ui/app/lib/cloudflare-task-tracker.ts (cloudflare is the first; installer / brand-publish / OAuth-login deferred). Both surfaces emit the same [task] action-start|step|done log lines so operators can grep one channel uniformly. Both also call the central redactSecrets primitive (platform/lib/task-secrets/) to strip schema-tagged secret keys before persisting inputs.<field> props on the Task — see .docs/neo4j.md § Audit Task input contract for the contract that replaces per-kind allow-lists.

Two surfaces feed the gate.

(1) Workflow path: memory-write accepts an optional producedByTaskId parameter. When set, an inbound :PRODUCED edge from that Task is composed into the write's relationships before the gate runs. The typical agent-side pattern is to call work-create at the start of an autonomous flow, capture taskId, and pass it as producedByTaskId on every subsequent memory-write for a gated label. The gate verifies that the Task and the write share the same accountId; a mismatch is rejected loudly.

(2) Direct-ask path: the admin server resolves the active :AdminConversation's sessionId UUID and stamps it as SESSION_NODE_ID in the spawn env at PTY-spawn time. The same stamp propagates onto specialist subagent spawns the admin dispatches (Task 382) so listing-curator, content-producer, database-operator etc. inherit the same conversation anchor. The contact-create and memory-write wrappers call injectConversationProvenance (exported from @maxy/graph-write), which MATCHes (c:Conversation {sessionId, accountId}) and prepends the synthetic :PRODUCED edge (composed by Neo4j elementId, which the helper reads off the MATCH). Account isolation is part of the natural key, not a separate gate. No agent-visible schema field changes.

memory-write uses the env-stamp only as a fallback when producedByTaskId is unset; contact-create has no producedByTaskId parameter today and relies on the env-stamp alone. Autonomous (cron-driven) specialists with no parent conversation legitimately have no env-stamp; those must thread producedByTaskId.

Operator audit cyphers:

"What entities did this conversation's actions produce?" — MATCH (c:AdminConversation {sessionId:$id})<-[:RAISED_DURING]-(t:Task)-[:PRODUCED]->(e) RETURN labels(e), e.name, t.kind, t.status
"What cloudflare resources did this tunnel-login produce?" — MATCH (t:Task {kind:'cloudflare-tunnel-login', status:'completed'})-[:PRODUCED]->(r) RETURN t.taskId, r.tunnelId, r.hostnameValue ORDER BY t.completedAt DESC

See .docs/neo4j.md § Process provenance doctrine for the full enforcement contract, observability surface, and out-of-scope deferrals.

Context compaction

When an admin turn crosses 75% of the model's context window, Maxy runs a silent compaction turn that asks the agent to call the session-compact MCP tool with a structured briefing (what you asked for, what was done, decisions made, work-in-progress, things you've shared about yourself). The briefing is written to Neo4j; the next admin turn injects it back into the system prompt, so continuity survives across the compaction boundary without re-sending the full transcript.

The compaction runs against a transient one-shot pool entry separate from the long-lived admin Query. Operator-visible side effects:

Compaction logs land in claude-agent-compaction-stream-YYYY-MM-DD.log alongside the main stream log. Look for [compaction-start], [compaction-summary-captured], [compaction-failed], [compaction-timeout], [compaction-crashed], or [compaction-spawn-error] to triage. Subprocess stderr is captured inline as [subproc-stderr] <line> — there is no longer a separate claude-agent-compaction-stderr-…log file.
The one-shot pool entry's lifecycle is greppable as [client-cold-create] reason=compaction-one-shot … paired with [client-evict] reason=compaction-one-shot …, distinguishable from the regular admin pool's lifecycle tags.
