Architecture Diagrams
Architecture Diagrams
Section titled “Architecture Diagrams”Visual representations of Valter’s architecture using Mermaid diagrams.
Component Diagram
Section titled “Component Diagram”This diagram shows all major components and their connections, from external consumers down to data stores. The arrows represent runtime dependencies — the direction data flows during a request.
graph TB subgraph Consumidores JUCA[Juca Frontend<br/>Next.js] CHATGPT[ChatGPT<br/>MCP Remoto] CLAUDE[Claude Desktop/Code<br/>MCP stdio] end
subgraph "Valter — Entry Points" API[REST API<br/>FastAPI :8000] MCP_STDIO[MCP Server<br/>stdio] MCP_HTTP[MCP Server<br/>HTTP/SSE :8001] WORKER[ARQ Worker<br/>Background Jobs] end
subgraph "Middleware Stack" CORS[CORS] METRICS_MW[Metrics IP Allowlist] TRACKING[Request Tracking<br/>trace_id + Prometheus] RATELIMIT[Rate Limiter<br/>Redis sliding window] AUTH[Auth Middleware<br/>API Key + Scopes] end
subgraph "API Layer (api/)" ROUTES[11 Routers<br/>health, retrieve, verify,<br/>enrich, similar, graph,<br/>features, factual, ingest,<br/>memories, datasets] SCHEMAS[Pydantic v2 Schemas<br/>Request/Response] DEPS[Dependency Container<br/>DI via Depends()] end
subgraph "Core Layer (core/)" RETRIEVER[HybridRetriever<br/>BM25 + Semantic + KG Boost] DVR[DualVectorRetriever<br/>Facts + Thesis] ENRICHER[DocumentEnricher<br/>IRAC + KG Context] VERIFIER[LegalVerifier<br/>Anti-hallucination] SIMILARITY[SimilarityFinder<br/>70% semantic + 30% structural] FACTUAL[FactualExtractor<br/>Groq LLM] QE[QueryExpander<br/>Multi-query RAG] WORKFLOW_ORCH[WorkflowOrchestrator<br/>Full Case Analysis] PROJUDI[ProjudiOrchestrator<br/>PDF Extraction] PHASE[PhaseAnalysis<br/>Interpreter + Rules] PROTOCOLS[Protocols<br/>Runtime-checkable interfaces] end
subgraph "Store Layer (stores/)" PG_DOC[PostgresDocStore] PG_FEAT[PostgresFeaturesStore] PG_STJ[PostgresSTJStore] PG_INGEST[PostgresIngestStore] PG_MEM[PostgresMemoryStore] QDRANT_STORE[QdrantVectorStore] NEO4J_STORE[Neo4jGraphStore<br/>12+ query methods] REDIS_STORE[RedisCacheStore] GROQ_CLIENT[GroqLLMClient] ARTIFACT[ArtifactStorage<br/>R2 / Local] end
subgraph "Embeddings" ENCODER[SentenceTransformerEncoder<br/>Legal-BERTimbau 768d] RERANKER[CrossEncoderReranker] RAIL_ENC[RailwayEncoder<br/>Remote HTTP] RAIL_RER[RailwayReranker<br/>Remote HTTP] end
subgraph "Data Stores" PG[(PostgreSQL 16)] QD[(Qdrant)] N4J[(Neo4j / Aura)] RD[(Redis 7)] CF[(Cloudflare R2)] GROQ_API[(Groq API)] end
subgraph "Observability" STRUCTLOG[structlog<br/>JSON + trace_id] PROM[Prometheus<br/>/metrics] OTEL[OpenTelemetry<br/>Tracing] end
JUCA --> API CHATGPT --> MCP_HTTP CLAUDE --> MCP_STDIO MCP_STDIO --> API MCP_HTTP --> API
API --> CORS --> METRICS_MW --> TRACKING --> RATELIMIT --> AUTH AUTH --> ROUTES ROUTES --> DEPS DEPS --> RETRIEVER & DVR & ENRICHER & VERIFIER & SIMILARITY & FACTUAL
RETRIEVER --> QDRANT_STORE & NEO4J_STORE & PG_DOC & REDIS_STORE & ENCODER & RERANKER & QE WORKFLOW_ORCH --> PROJUDI & PHASE & PG_INGEST & ARTIFACT
PG_DOC --> PG PG_FEAT --> PG PG_STJ --> PG PG_INGEST --> PG PG_MEM --> PG QDRANT_STORE --> QD NEO4J_STORE --> N4J REDIS_STORE --> RD GROQ_CLIENT --> GROQ_API ARTIFACT --> CF
WORKER --> WORKFLOW_ORCH
TRACKING --> STRUCTLOG & PROM & OTELReading the Diagram
Section titled “Reading the Diagram”The diagram is organized in layers from top to bottom, matching the dependency rule (api/ -> core/ -> models/):
- Consumers at the top connect to entry points. Juca uses the REST API directly. ChatGPT uses MCP over HTTP/SSE with HMAC authentication. Claude uses MCP over stdio locally.
- Entry Points represent the four runtimes. The REST API and both MCP servers ultimately call the same core logic. The ARQ Worker handles background ingestion jobs.
- Middleware Stack processes every HTTP request in sequence. Each middleware adds a capability: CORS headers, metrics access control, request tracing, rate limiting, and authentication.
- API Layer contains 11 routers (one per domain), Pydantic schemas for request/response validation, and the dependency container that wires stores into core functions.
- Core Layer holds all business logic. No module here imports from
stores/directly — they receive store instances through dependency injection via theProtocolsmodule. - Store Layer provides concrete implementations for each data backend. The
GroqLLMClientis here because it is an external data provider, not because it is a traditional database. - Embeddings are handled by either local models (SentenceTransformer, CrossEncoder) or remote Railway-hosted instances, selected by configuration.
- Data Stores are the actual database processes (PostgreSQL, Qdrant, Neo4j, Redis) plus external APIs (Groq, Cloudflare R2).
- Observability collects structured logs, Prometheus metrics, and OpenTelemetry traces from the middleware tracking layer.
Search Pipeline
Section titled “Search Pipeline”The hybrid search pipeline is Valter’s most complex data flow. It combines three search strategies (BM25 keyword search, semantic vector search, and knowledge graph boosting) into a single ranked result set.
flowchart LR Q[Query] --> CACHE{Cache?} CACHE -->|hit| RESULT CACHE -->|miss| QE[Query Expansion<br/>Groq LLM] QE --> ENCODE[Encode<br/>Legal-BERTimbau] ENCODE --> PAR{Parallel} PAR --> BM25[BM25<br/>rank-bm25] PAR --> SEM[Semantic<br/>Qdrant cosine] BM25 --> MERGE[Merge<br/>Weighted / RRF] SEM --> MERGE MERGE --> FETCH[Fetch Docs<br/>PostgreSQL] FETCH --> FILTER[Filters<br/>ministro, data, tipo] FILTER --> KG{KG Boost?} KG -->|yes| BOOST[Neo4j Batch<br/>KG Boost] KG -->|no| RERANK BOOST --> RERANK[Reranking<br/>Cross-Encoder] RERANK --> STORE_CACHE[Store Cache<br/>Redis TTL 180s] STORE_CACHE --> RESULT[Response]Pipeline Stages
Section titled “Pipeline Stages”- Cache check — Redis is queried first with the exact query hash. Cache TTL is 180 seconds. On a hit, the cached result is returned immediately.
- Query expansion — if Groq is enabled, the query is expanded into up to 3 semantic variants using multi-query RAG. This improves recall for ambiguous legal queries.
- Encoding — the query (and its expansions) are encoded into 768-dimensional vectors using Legal-BERTimbau (
rufimelo/Legal-BERTimbau-sts-base). - Parallel search — BM25 keyword search against PostgreSQL and semantic cosine similarity search against Qdrant run concurrently.
- Merge — results from both strategies are combined using either weighted fusion or Reciprocal Rank Fusion (RRF), configurable per request.
- Document fetch — full document data is loaded from PostgreSQL for the merged candidate set.
- Filters — post-retrieval filters apply (minister, date range, decision type, result).
- KG Boost — if enabled, Neo4j is queried in batch to score each candidate based on shared criteria, facts, evidence, and legal devices in the knowledge graph. The KG boost score is blended with the search score.
- Reranking — a cross-encoder model re-scores the top candidates based on query-document relevance, producing the final ranking.
- Cache store — the final result is stored in Redis with a 180-second TTL.
Ingestion Workflow
Section titled “Ingestion Workflow”The ingestion workflow handles the complete lifecycle of a legal case analysis, from PDF upload through AI-powered extraction to human-reviewed artifacts. The workflow is managed by a state machine in core/workflow_state_machine.py.
stateDiagram-v2 [*] --> pending: PDF uploaded
pending --> extracting: Start extraction extracting --> segmented: PROJUDI segments extracted extracting --> failed: Extraction error
segmented --> classifying: Start classification classifying --> classified: Groq classification complete classifying --> failed: Classification error
classified --> analyzing: Start phase analysis analyzing --> analyzed: Phases interpreted analyzing --> failed: Analysis error
analyzed --> matching: Start jurisprudence matching matching --> matched: Cases matched + suggestions generated matching --> failed: Matching error
matched --> reviewing: Submit for human review reviewing --> approved: Human approves reviewing --> rejected: Human rejects rejected --> analyzing: Reprocess from analysis
approved --> generating: Generate artifacts generating --> completed: Artifacts stored (R2 / local) generating --> failed: Generation error
failed --> pending: Retry completed --> [*]Workflow Stages
Section titled “Workflow Stages”- pending — a PDF has been uploaded but no processing has started.
- extracting — the PROJUDI pipeline (
core/projudi_pipeline.py) segments the document, classifying each section with confidence scores and applying sibling inheritance for ambiguous segments. - classifying — Groq LLM extracts 21 structured fields from the document (facts, legal thesis, outcome, minister, category, etc.).
- analyzing — the phase analysis pipeline (
core/phase_interpreter.py+ rules) produces deterministic interpretation of legal proceeding phases with jurisprudence recommendations per phase. - matching — the hybrid retriever finds similar cases, and the system generates suggestions for the lawyer.
- reviewing — a human (lawyer) reviews the analysis and suggestions, approving or rejecting each phase. Rejections trigger reprocessing.
- generating — approved analyses are serialized as JSON and PDF artifacts, stored in Cloudflare R2 (or local filesystem).
- completed — the workflow is finished. All state transitions are recorded as auditable events in a JSONL manifest.
Data Model (FRBR Ontology)
Section titled “Data Model (FRBR Ontology)”The Neo4j knowledge graph is structured around the FRBR (Functional Requirements for Bibliographic Records) ontology, adapted for Brazilian legal documents. This ontology distinguishes between the abstract intellectual content of a law or decision (Work), a specific version of that content (Expression), and its physical form (Manifestation).
Node Types
Section titled “Node Types”| Node Type | Description | Example |
|---|---|---|
Decision | An STJ court decision (the primary entity) | REsp 1.234.567/SP |
Criterion | A legal criterion or test applied in decisions | ”boa-fe objetiva”, “nexo causal” |
LegalDevice | A specific article, paragraph, or clause of legislation | Art. 14 do CDC, Art. 927 do CC |
Precedent | A precedent cited by decisions | Sumula 297/STJ |
Minister | An STJ minister (judge) | Min. Nancy Andrighi |
Category | A legal subject area | Direito do Consumidor, Responsabilidade Civil |
Fact | A factual element relevant to the case | ”defeito do produto”, “dano moral” |
Evidence | A type of evidence referenced | ”prova pericial”, “documento” |
Relationship Types
Section titled “Relationship Types”| Relationship | From | To | Meaning |
|---|---|---|---|
APLICA | Decision | LegalDevice | The decision applies this legal provision |
USA_CRITERIO | Decision | Criterion | The decision uses this legal criterion |
CITA | Decision | Decision | One decision cites another |
CITA_PRECEDENTE | Decision | Precedent | The decision cites this precedent |
RELATOR_DE | Minister | Decision | This minister was the reporting judge |
PERTENCE_A | Decision | Category | The decision belongs to this subject area |
TEM_FATO | Decision | Fact | The decision involves this fact |
TEM_PROVA | Decision | Evidence | The decision references this evidence |
DIVERGE_DE | Decision | Decision | The decisions have divergent outcomes on the same criterion |
The current graph contains approximately 28,482 nodes and 207,163 edges, representing the relational structure of STJ jurisprudence. The 12 graph analytics endpoints in the API expose different traversal and aggregation patterns over this graph.
Deployment Architecture
Section titled “Deployment Architecture”In production, Valter runs on Railway with the following service layout:
Internet │ ┌────────────┼────────────┐ │ │ │ ▼ ▼ ▼ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ REST API │ │MCP Remote│ │ Juca │ │ :8000 │ │ :8001 │ │ (Next.js)│ │ FastAPI │ │ HMAC Auth│ │ │ └────┬─────┘ └────┬─────┘ └────┬─────┘ │ │ │ │ ▼ │ │ ┌──────────┐ │ │ │ REST API │◄──────┘ │ │ (bridge) │ │ └──────────┘ │ ┌────┴─────────────────────────────┐ │ Internal Network │ │ │ │ ┌──────┐ ┌──────┐ ┌──────────┐ │ │ │ PG │ │Qdrant│ │ Redis │ │ │ │ 16 │ │ │ │ 7 │ │ │ └──────┘ └──────┘ └──────────┘ │ │ │ │ ┌──────────┐ │ │ │ARQ Worker│ │ │ │(ingest) │ │ │ └──────────┘ │ └──────────────────────────────────┘
External Services: ┌──────────┐ ┌──────────┐ ┌──────┐ │Neo4j Aura│ │Groq API │ │ R2 │ │(managed) │ │(LLM) │ │(S3) │ └──────────┘ └──────────┘ └──────┘Key Points
Section titled “Key Points”- REST API and MCP Remote are separate Railway services running the same Docker image with different start commands. This provides independent scaling and rollback.
- PostgreSQL, Qdrant, and Redis run as Railway plugins on the internal network, not exposed to the internet.
- Neo4j Aura is a managed external service (not self-hosted). PRs touching graph code must pass the
Aura ValidationCI workflow before merge. - Groq API and Cloudflare R2 are external services accessed over HTTPS. Both are optional — the system degrades gracefully without them (no AI features, local filesystem for artifacts).
- ARQ Worker runs as a separate process consuming the Redis job queue for background ingestion workflows.
- Juca (the Next.js frontend) calls the REST API directly. It is a separate deployment, not part of the Valter codebase.