Architecture Overview

Valter is a modular monolith with 4 runtimes (API, Worker, MCP stdio, MCP HTTP), all sharing the same Python codebase with strict layered separation.

Valter follows a modular monolith pattern — not microservices. All runtimes share the same Python package (src/valter/), the same domain models, and the same business logic. What changes between runtimes is the entry point and the transport layer, not the core logic.

This design was chosen for three reasons:

  1. Consistency — a single codebase guarantees that the behavior of an MCP tool and its equivalent REST endpoint are always identical, because they call the same core function.
  2. Simplicity — there is one deployment unit to build, test, and reason about. No inter-service communication, no API contracts between services, no distributed state.
  3. Testability — all four runtimes can be validated by the same test suite, since the tests target core logic rather than transport-specific code.

The runtime is selected by the entry point used to start the process. In production on Railway, multiple instances of the same codebase run with different entry points (API on port 8000, MCP remote on port 8001, ARQ worker for background jobs).

The codebase enforces a strict one-way dependency rule between layers:

api/ → core/ → models/

This means:

  • api/ can import from core/ and models/, but never from stores/.
  • core/ can import from models/, but never from stores/ or api/.
  • models/ has zero internal imports — it is the leaf layer.
  • stores/ implements protocols defined in core/protocols.py and is injected at runtime via FastAPI’s Depends() mechanism, configured in api/deps.py.

This separation ensures that core/ contains pure business logic with no coupling to any concrete database driver. A PostgresDocStore can be swapped for a mock in tests (or a different implementation entirely) without changing a single line in core/.

The API layer is the outermost boundary of the application. It handles HTTP transport, request validation, authentication, and response serialization. Route handlers are intentionally thin — they validate input using Pydantic schemas, call a core function, and return the result.

Key components:

  • 11 FastAPI routers — health, retrieve, verify, enrich, similar, graph, features, factual, ingest, memories, datasets
  • Pydantic v2 schemas — request and response models in api/schemas/, separate from domain models
  • DI container — api/deps.py wires concrete stores into core functions using Depends()
  • Middleware stack — requests pass through 5 middleware layers in order: CORS, Metrics IP Allowlist, Request Tracking (trace_id + Prometheus), Rate Limiter (Redis sliding window), Auth (API key + scopes)
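The onion-like ordering of that middleware stack can be modeled with plain callables (no FastAPI dependency); each layer wraps the next, so the first layer in the list is the outermost one a request meets. The layer names mirror the stack above; the wrapping mechanics are a generic sketch, not Valter's code:

```python
# Each middleware layer wraps the next handler; building the chain from the
# innermost handler outward reproduces the traversal order described above.
from typing import Callable

Handler = Callable[[dict], dict]


def make_layer(name: str, inner: Handler) -> Handler:
    def layer(request: dict) -> dict:
        request.setdefault("trace", []).append(name)  # record traversal order
        return inner(request)
    return layer


def route_handler(request: dict) -> dict:
    return {"trace": request["trace"], "body": "ok"}


# Outermost first, matching: CORS → Metrics → Tracking → RateLimit → Auth
stack = ["cors", "metrics", "tracking", "ratelimit", "auth"]
app: Handler = route_handler
for name in reversed(stack):
    app = make_layer(name, app)

print(app({})["trace"])  # ['cors', 'metrics', 'tracking', 'ratelimit', 'auth']
```

One practical wrinkle when translating this to FastAPI: app.add_middleware() places the most recently added middleware outermost, so the registration order in code is the reverse of the traversal order shown here.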

The core layer contains all business logic. It has approximately 25 modules organized by domain capability:

| Group | Modules | Purpose |
| --- | --- | --- |
| Search | HybridRetriever, DualVectorRetriever, QueryExpander | Hybrid search with BM25 + semantic + KG boost, dual-vector factual search, multi-query expansion |
| Analysis | DocumentEnricher, LegalVerifier, SimilarityFinder, FactualExtractor | IRAC analysis, anti-hallucination verification, case similarity, factual extraction via Groq |
| Workflow | WorkflowOrchestrator, ProjudiOrchestrator, PhaseAnalysis (5 modules) | Full ingestion pipeline from PDF upload to human-reviewed artifacts |
| Infrastructure | Protocols (runtime-checkable interfaces) | Contracts that stores must implement |

Every module in core/ depends only on protocols and models — never on concrete store implementations.
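To make the hybrid-search idea concrete, here is a hypothetical sketch of the kind of score fusion a hybrid retriever performs: normalize a lexical (BM25) score and a semantic score, combine them with fixed weights, and add a knowledge-graph boost. The weights and field names are illustrative assumptions, not Valter's actual values:

```python
# Hypothetical hybrid scoring: weighted sum of BM25 and semantic scores,
# plus an additive knowledge-graph boost. Weights are made up for the demo.
def hybrid_score(bm25: float, semantic: float, kg_boost: float,
                 w_bm25: float = 0.4, w_sem: float = 0.6) -> float:
    return w_bm25 * bm25 + w_sem * semantic + kg_boost


# A keyword-heavy match vs. a semantically close, graph-connected one.
candidates = {
    "doc-a": hybrid_score(bm25=0.9, semantic=0.2, kg_boost=0.0),
    "doc-b": hybrid_score(bm25=0.3, semantic=0.8, kg_boost=0.1),
}
best = max(candidates, key=candidates.get)
print(best)  # → doc-b
```

The takeaway is that a document weak on exact keywords can still win on semantic similarity plus graph evidence, which is precisely why the three signals are fused rather than used in isolation.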

The store layer provides concrete implementations of the protocols defined in core/protocols.py. Each store is specialized for its data backend:

| Store | Backend | Responsibility |
| --- | --- | --- |
| PostgresDocStore | PostgreSQL | Document CRUD, full-text search (BM25) |
| PostgresFeaturesStore | PostgreSQL | AI-extracted features (21 fields per decision) |
| PostgresSTJStore | PostgreSQL | STJ metadata (810K records) |
| PostgresIngestStore | PostgreSQL | Ingestion jobs, workflow state |
| PostgresMemoryStore | PostgreSQL | Session memory with TTL |
| QdrantVectorStore | Qdrant | Semantic search (768-dim vectors, cosine similarity) |
| Neo4jGraphStore | Neo4j | Knowledge graph queries (12+ analytical methods) |
| RedisCacheStore | Redis | Query cache (180s TTL), rate limiting counters |
| GroqLLMClient | Groq API | LLM calls for classification, extraction, query expansion |
| ArtifactStorage | Cloudflare R2 / local | PDF and JSON artifact storage with canary rollout |

The model layer defines domain entities as Pydantic v2 models. These models are shared across all layers and represent the canonical shape of data in the system:

| Module | Models |
| --- | --- |
| document.py | Document, DocumentMetadata |
| chunk.py | Chunk, ChunkMetadata |
| irac.py | IRAC analysis structure (Issue, Rule, Application, Conclusion) |
| graph.py | 30+ graph entity models (divergences, minister profiles, PageRank, communities, etc.) |
| frbr.py | FRBR ontology models (Work, Expression, Manifestation) |
| phase.py | Legal proceeding phase models |
| features.py | AI-extracted document features (21 fields) |
| factual.py | Factual digest and legal thesis |
| stj_metadata.py | STJ tribunal metadata |
| memory.py | Session memory key-value pairs |

All models use model_config = {"strict": False} to allow coercion from database results while maintaining type safety in application code.
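A minimal illustration of what that lax mode buys, assuming a hypothetical model with two fields (not Valter's actual schema): database drivers often hand back strings, and with "strict": False Pydantic v2 coerces them into the declared types at validation time:

```python
# With model_config = {"strict": False} (Pydantic v2 lax mode), compatible
# values such as numeric strings from a DB row are coerced to the field types.
from pydantic import BaseModel


class ChunkMetadata(BaseModel):
    model_config = {"strict": False}  # lax mode: coerce compatible inputs

    page: int
    score: float


# Simulate a row where the driver returned everything as strings.
m = ChunkMetadata(page="12", score="0.87")
print(m.page, m.score)  # 12 0.87
```

In strict mode the same call would raise a ValidationError, which is why lax mode is the better fit at the database boundary while the typed model still guards application code.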

Valter exposes four runtime entry points, all from the same codebase:

| Entry Point | File | Command | Port | Consumers |
| --- | --- | --- | --- | --- |
| REST API | src/valter/main.py | make dev | 8000 | Juca frontend, direct API clients |
| MCP stdio | src/valter/mcp/__main__.py | python -m valter.mcp | (stdio) | Claude Desktop, Claude Code |
| MCP HTTP/SSE | src/valter/mcp/remote_server.py | make mcp-remote | 8001 | ChatGPT Apps via HMAC auth |
| ARQ Worker | src/valter/workers/__main__.py | make worker-ingest | (n/a) | Background ingestion jobs |

In production (Railway), the REST API and MCP HTTP/SSE run as separate services with distinct URLs, while the ARQ Worker runs as a separate process consuming the Redis job queue.

At a high level, every request flows through the same pipeline regardless of entry point:

Consumer (Juca / ChatGPT / Claude)
    ↓
Entry Point (REST API / MCP stdio / MCP HTTP)
    ↓
Middleware Stack (CORS → Metrics → Tracking → RateLimit → Auth)
    ↓
Route Handler (validates input, delegates to core)
    ↓
Core Logic (retriever, enricher, verifier, etc.)
    ↓
Stores (PostgreSQL, Qdrant, Neo4j, Redis, Groq, R2)
    ↓
Response (serialized via Pydantic schema)

MCP tools follow the same path: each tool’s implementation calls core functions, which in turn call stores. The MCP layer adds no business logic — it is a thin adapter that translates MCP tool calls into the same core function calls that REST route handlers make.
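The thin-adapter claim can be made concrete with a sketch: below, a REST handler and an MCP tool are both two-line wrappers over the same core function. The names (retrieve, rest_retrieve, mcp_retrieve_tool) and the stub implementation are illustrative assumptions, not Valter's actual code:

```python
# One core function, two transport adapters. Neither adapter adds logic
# beyond unpacking its transport's argument shape.
def retrieve(query: str, top_k: int = 5) -> dict:
    """Core logic shared by every runtime (stand-in implementation)."""
    return {"query": query, "hits": [f"doc-{i}" for i in range(top_k)]}


# REST route handler (as FastAPI would invoke it after schema validation).
def rest_retrieve(payload: dict) -> dict:
    return retrieve(payload["query"], payload.get("top_k", 5))


# MCP tool (as the MCP server would invoke it on a tool call).
def mcp_retrieve_tool(arguments: dict) -> dict:
    return retrieve(arguments["query"], arguments.get("top_k", 5))


assert rest_retrieve({"query": "penhora"}) == mcp_retrieve_tool({"query": "penhora"})
print("identical results across transports")
```

Because both adapters bottom out in the same function, the consistency guarantee from the design rationale above holds by construction rather than by convention.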

For detailed visual diagrams of component relationships and the search pipeline, see Architecture Diagrams.