
Engineering traceability: why decoupled architecture is a research requirement
Most AI tools trap their logic inside a prompt or a messy frontend. Domain-Driven Design and stateless inference engines make it possible to trace any insight back to its exact source, even three years after the report was delivered.
The vibe-coding problem
There is a pattern in AI tooling that has become so common it now has a name: vibe-coding. The interface looks clean, the outputs sound confident, and the engineering underneath is built around making the UI feel good. Data flows toward the frontend. The database is shaped by what the dashboard needs to display. The prompt is tuned until the answer sounds right.
This works until someone asks a hard question: “Where exactly did this finding come from?” At that point, the architecture has no answer. The insight is embedded in a model’s weights, or cached in a component’s state, or reconstructed on every render from an opaque summarisation call. It cannot be audited. It cannot be reproduced. It cannot be defended.
Citium Tech builds the opposite. The database is the source of truth. The UI is a temporary window into permanent, immutable records. And every insight, every sentence in every report, is traceable to a specific source document, a specific model version, and a specific processing timestamp.
This article explains the engineering that makes that possible.
Domain-Driven Design as a research architecture
Domain-Driven Design (DDD) is typically discussed in the context of large-scale enterprise software. The core idea translates directly to research infrastructure: model the domain accurately, separate business logic from delivery mechanisms, and never let the convenience of the UI dictate the shape of the data.
In a research system, the domain has three layers:
The Source Layer: raw data as it arrived from external systems, immutable, timestamped, hashed. This layer is never modified after ingestion. It is the foundation of the audit trail described in The deterministic RAG guide.
The Insight Layer: derived outputs produced by the inference engine, stored as first-class domain objects with full metadata. An insight is not a string in a database column; it is a structured record with source references, model provenance, and a cryptographic hash of its own content.
The Presentation Layer: the frontend application, in this case a Svelte 5 application. It reads from the Insight Layer. It does not write to it. It does not influence how insights are generated. It is a consumer, not a participant.
This separation is what makes Audit-as-a-Service possible. The Svelte application can be rewritten, redesigned, or replaced entirely. The Insight Layer persists. A report generated three years ago is still traceable, not because the frontend preserved it, but because the domain object never changed.
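A minimal sketch of how these layers can be kept distinct at the type level. The names `SourceRecord`, `InsightRecord`, and `InsightView` are illustrative, not Citium's actual schema:

```typescript
// Illustrative domain types for the three layers. Fields in the first
// two layers are readonly: those records are never mutated after write.

// Source Layer: raw data exactly as it arrived, immutable and hashed.
interface SourceRecord {
  readonly id: string;
  readonly ingestedAt: string;  // ISO 8601, UTC
  readonly contentHash: string; // SHA-256 of the raw payload
  readonly rawContent: string;
}

// Insight Layer: derived outputs with full provenance metadata.
interface InsightRecord {
  readonly id: string;
  readonly sourceIds: readonly string[]; // references into the Source Layer
  readonly modelIdentifier: string;
  readonly processedAt: string;
  readonly outputHash: string;
  readonly summary: string;
}

// Presentation Layer: a read-only projection for the UI to render.
// The UI never sees more than this, and never writes anything back.
type InsightView = Pick<InsightRecord, 'id' | 'summary' | 'processedAt'>;
```

The direction of the `Pick` is the point: the view is derived from the domain object, never the other way around, so replacing the frontend leaves the first two layers untouched.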
Stateless inference: why the engine and the UI must be separate
The inference engine is the NestJS service that runs the deterministic RAG pipeline described in Why deterministic RAG beats generative AI for research and processes data from the self-healing pipelines described in Building self-healing data pipelines for market intelligence. It is stateless by design: it receives a request, processes it against the data layer, returns a structured response, and retains no session state.
This statelessness boundary is not a microservices preference. It is a traceability requirement. If the inference engine held state, an insight produced on Tuesday would depend on context accumulated from Monday’s queries. Reproducing Tuesday’s output would require reconstructing Monday’s state. That is not reproducibility; it is archaeology.
A stateless engine produces outputs that are a pure function of their inputs: the query, the retrieved chunks, the model version, and the temperature setting. Given the same inputs, you get the same output. Logging the inputs is sufficient to reproduce the output.
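That pure-function property can be made concrete by deriving a stable key from the logged inputs: identical inputs yield an identical key, so a run can be matched against its later replay. A sketch, with an illustrative `InferenceInputs` shape:

```typescript
import { createHash } from 'crypto';

// Illustrative shape: everything the output depends on, and nothing else.
// Logging this object is sufficient to reproduce the run later.
interface InferenceInputs {
  query: string;
  chunkIds: string[];      // retrieved chunk IDs
  modelIdentifier: string;
  temperature: number;     // 0 for production runs
}

// A stable key over the inputs. Chunk IDs are sorted so the key does
// not depend on retrieval order.
function inputsKey(inputs: InferenceInputs): string {
  return createHash('sha256')
    .update(
      JSON.stringify({ ...inputs, chunkIds: [...inputs.chunkIds].sort() })
    )
    .digest('hex');
}
```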
The Svelte 5 frontend connects to the NestJS API and displays whatever the domain layer contains. The frontend has no business logic. It does not transform data. It does not make inference decisions. It renders records.
┌─────────────────────┐ ┌────────────────────────────────────┐
│ SVELTE 5 FRONTEND │ │ NESTJS INFERENCE ENGINE │
│ │ │ │
│ • Renders records │ ──────► │ • Stateless request handler │
│ • No data logic │ REST │ • Calls RAG pipeline │
│ • No AI calls │ ◄────── │ • Writes ResearchOutput to DB │
│ • Read-only view │ JSON │ • Returns record ID to frontend │
└─────────────────────┘ └────────────────────────────────────┘
│
▼
┌──────────────────────────┐
│ PostgreSQL + pgvector │
│ Immutable records │
│ Cryptographic hashes │
└──────────────────────────┘

The frontend never receives a raw LLM response. It receives a record ID, fetches the stored ResearchOutput, and displays it. The inference happened once, was stored, and is now served from the database. Every subsequent request for the same insight is a database read. It is deterministic, fast, and auditable.
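As a sketch of how thin that frontend layer is, its data handling reduces to pure projections of stored records into view models. The types and field names here are illustrative, not the actual API shape:

```typescript
// Illustrative record shape as returned by the API.
interface StoredRecord {
  id: string;
  summary: string;
  metadata: { processingTimestamp: string };
}

// What the UI actually renders.
interface ViewModel {
  id: string;
  summary: string;
  generatedAt: string;
}

// Pure projection: select and rename fields for display. No data is
// transformed, no inference decisions are made, nothing is written back.
function toViewModel(record: StoredRecord): ViewModel {
  return {
    id: record.id,
    summary: record.summary,
    generatedAt: record.metadata.processingTimestamp,
  };
}
```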
The ResearchOutput transformer: metadata as a first-class citizen
Every insight that enters the database passes through a Data Transformer in the NestJS service layer. The transformer’s job is to attach cryptographic metadata before the record is written. This is not an afterthought; it is a required part of the domain object schema.
// domain/research-output.transformer.ts
import { createHash, randomUUID } from 'crypto';

// Minimal shapes for the referenced record types; fields beyond those
// the transformer uses are omitted here.
export interface SourceChunk {
  chunkId: string;
  contentHash: string; // SHA-256 of the chunk content at ingestion
}

export interface CitationRecord {
  chunkId: string; // the source chunk this citation points to
  quotedText: string;
}

export interface ResearchOutputMetadata {
  sourceHash: string;          // SHA-256 of all source chunk IDs + their content hashes
  modelIdentifier: string;     // e.g. "claude-sonnet-4-20250514"
  modelTemperature: number;    // always 0 for production runs
  processingTimestamp: string; // ISO 8601, UTC
  schemaVersion: string;       // e.g. "research-output/v2"
  outputHash: string;          // SHA-256 of summary + citations + metadata (excluding outputHash itself)
}

export interface ResearchOutput {
  id: string;
  queryText: string;
  summary: string;
  citations: CitationRecord[];
  sourceChunks: SourceChunk[];
  metadata: ResearchOutputMetadata;
}

export class ResearchOutputTransformer {
  transform(
    queryText: string,
    summary: string,
    citations: CitationRecord[],
    sourceChunks: SourceChunk[],
    modelIdentifier: string,
  ): ResearchOutput {
    const processingTimestamp = new Date().toISOString();
    const modelTemperature = 0;
    const schemaVersion = 'research-output/v2';

    // Hash the source material: chunk IDs + their individual content hashes
    const sourceHash = createHash('sha256')
      .update(
        sourceChunks
          .map(c => `${c.chunkId}:${c.contentHash}`)
          .sort() // sort for determinism regardless of retrieval order
          .join('|')
      )
      .digest('hex');

    // Build metadata, excluding outputHash (that is computed last)
    const partialMetadata: Omit<ResearchOutputMetadata, 'outputHash'> = {
      sourceHash,
      modelIdentifier,
      modelTemperature,
      processingTimestamp,
      schemaVersion,
    };

    // Hash the full output: summary + citations + metadata
    const outputHash = createHash('sha256')
      .update(
        JSON.stringify({ summary, citations, metadata: partialMetadata })
      )
      .digest('hex');

    return {
      id: randomUUID(),
      queryText,
      summary,
      citations,
      sourceChunks,
      metadata: { ...partialMetadata, outputHash },
    };
  }
}

The sourceHash proves that the output was generated from a specific set of source documents. If a source document is later found to have been corrupted or manipulated, the hash mismatch is detectable.
The outputHash proves that the stored record has not been modified since it was written. Any change to the summary, citations, or metadata, even a single character, produces a different hash.
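Verifying a stored record is then a matter of recomputing the hash from the stored fields and comparing it with the recorded value. A minimal sketch, assuming the same serialization and key order used by the transformer shown earlier; `StoredOutput` is an illustrative shape:

```typescript
import { createHash } from 'crypto';

// Illustrative stored-record shape for verification purposes.
interface StoredOutput {
  summary: string;
  citations: unknown[];
  metadata: {
    sourceHash: string;
    modelIdentifier: string;
    modelTemperature: number;
    processingTimestamp: string;
    schemaVersion: string;
    outputHash: string;
  };
}

// Recompute the output hash from the stored fields and compare it with
// the recorded value. Any change to summary, citations, or metadata
// produces a mismatch. Note: this relies on stable JSON key order; a
// canonical JSON serializer would make the check order-independent.
function verifyOutputHash(record: StoredOutput): boolean {
  const { outputHash, ...partialMetadata } = record.metadata;
  const recomputed = createHash('sha256')
    .update(
      JSON.stringify({
        summary: record.summary,
        citations: record.citations,
        metadata: partialMetadata,
      })
    )
    .digest('hex');
  return recomputed === outputHash;
}
```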
The schemaVersion ensures that a record written under the current schema can be distinguished from records written under a future schema, allowing the presentation layer to render them correctly even as the system evolves.
Audit-as-a-Service: what this enables in practice
The decoupled frontend described here, a Svelte 5 application that renders records but holds no logic, is the component that makes the audit trail described in the deterministic RAG guide accessible to non-technical stakeholders.
A compliance officer can open a research report from eighteen months ago and request the provenance of a specific finding. The frontend sends a record ID to the NestJS API, which retrieves the stored ResearchOutput, verifies the outputHash against the stored value, and returns the full source chain: the query text, the retrieved chunks with their origin URLs and timestamps, the model version, and the processing timestamp.
This is not a forensic exercise performed after something goes wrong. It is a routine query against a well-structured domain model. The traceability is not a feature bolted onto the system. It is the system.
The contrast with vibe-coded AI tools is not subtle. Those tools can show you what they said. They cannot show you why they said it, or whether they would say the same thing again tomorrow. Citium’s architecture can do both.