Milestones

Sequential milestone plan: v1.0 through v2.1. Each milestone depends on the previous one being complete. Estimated timeline runs from March 2026 through late 2026.

v1.0: Stable Production

Objective: Stabilize production, fix premortem vulnerabilities, prevent silent degradation.

Prerequisite: None (current state).

Estimated effort: 2-3 weeks.

| Feature | Priority | Description |
| --- | --- | --- |
| Rate limiter fail-open | P0 | When Redis is down, allow requests from valid API keys instead of blocking all traffic |
| Indexation gap closure | P0 | Batch-index the ~19,700 ementa-only (headnote-only) documents that lack embeddings (3,673 -> 20,000+ vectors) |
| Alerting wiring | P1 | Connect Railway logs to Slack for critical errors and degradation alerts |
| HTTPS fix | P1 | Resolve certificate validation issues on production domain |
| Merge pending PRs | P1 | Close out open pull requests blocking downstream work |
| Privacy policy / terms | P1 | Add required legal pages for App Directory submission |
| Datetime migration | P2 | Migrate naive datetime values to timezone-aware UTC (e.g., datetime.utcnow() -> datetime.now(timezone.utc)) |
| README update | P2 | Update README to reflect current state and setup instructions |
| Absence runbook | P2 | Document operational procedures for when the primary developer is unavailable |
| R2 canary activation | P2 | Activate canary rollout for R2 artifact storage (implementation currently ~90% complete) |
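
The fail-open behavior in the first row can be sketched as follows. This is a minimal illustration, not Valter's actual limiter: `FailOpenRateLimiter`, `CounterStoreDown`, and the `increment` callable are hypothetical names, and the counter backend is a stand-in for whatever Redis call the real code makes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


class CounterStoreDown(Exception):
    """Hypothetical error raised when the counter backend (e.g. Redis) is unreachable."""


@dataclass
class FailOpenRateLimiter:
    # increment(key) returns the request count for the current window.
    # It is a stand-in for the Redis call a real limiter would make.
    increment: Callable[[str], int]
    limit: int
    valid_keys: Set[str] = field(default_factory=set)

    def allow(self, api_key: str) -> bool:
        # Invalid keys are rejected regardless of backend health.
        if api_key not in self.valid_keys:
            return False
        try:
            count = self.increment(api_key)
        except CounterStoreDown:
            # Fail open: the backend is down but the key is valid,
            # so the request goes through unmetered.
            return True
        return count <= self.limit


def make_memory_counter() -> Callable[[str], int]:
    """In-memory counter used only to exercise the sketch."""
    counts: Dict[str, int] = {}

    def increment(key: str) -> int:
        counts[key] = counts.get(key, 0) + 1
        return counts[key]

    return increment


def unreachable_backend(key: str) -> int:
    """Simulates an outage of the counter backend."""
    raise CounterStoreDown("simulated outage")
```

Key validation happens before the backend call, so an outage never widens access to unknown keys; it only suspends metering for keys that were already valid.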

Completion criteria:

  • Rate limiter allows requests when Redis is down (fail-open for valid keys)
  • Qdrant contains >= 20,000 indexed vectors
  • Slack alerts firing on critical errors
  • HTTPS certificate valid on production domain
  • Zero DeprecationWarning from naive datetime usage
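
For the datetime criterion, a small sketch of what the migration might look like. `datetime.utcnow()` is deprecated as of Python 3.12 and emits a DeprecationWarning; the helper names below are hypothetical, and the sketch assumes that any naive values already stored were written in UTC.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


def ensure_aware(value: datetime) -> datetime:
    """Normalize a datetime to timezone-aware UTC.

    Assumption: naive values were originally recorded in UTC, so attaching
    the UTC tzinfo does not shift the wall-clock time.
    """
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    # Already aware: convert to UTC so comparisons are consistent.
    return value.astimezone(timezone.utc)
```

Running `ensure_aware` at the boundary where old records are loaded avoids the `TypeError` Python raises when naive and aware datetimes are compared.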

v1.1: Resilience + Search Quality

Objective: Resilience to partial infrastructure failures and measurable search quality improvements.

Prerequisite: v1.0 complete.

Estimated effort: 2-3 weeks.

| Feature | Priority | Description |
| --- | --- | --- |
| Circuit breaker | P0 | Stop calling Neo4j after repeated failures/timeouts (>5s), allow recovery without blocking requests |
| Connection pool configuration | P1 | Tune PostgreSQL, Neo4j, and Redis connection pools for production load patterns |
| ARQ cron ingestion | P1 | Scheduled background jobs to check for new STJ decisions and ingest automatically |
| Fallback extraction to core | P1 | Move fallback text extraction logic from stores/ into core/ (proper layer) |
| Heuristic maps externalization | P2 | Move hardcoded classification heuristics to configuration files |
| Stopwords unification | P2 | Single stopwords source shared between BM25 and query expansion |
| Fallback metrics | P2 | Prometheus counters for how often fallback paths are exercised |
| Store unit tests | P2 | Unit test coverage for stores/ layer (currently undertested) |

Completion criteria:

  • Circuit breaker active: Neo4j hang > 5s opens circuit, requests proceed without graph features
  • Connection pools configured with explicit limits and timeouts
  • ARQ checks for new decisions at least weekly
  • Fallback extraction logic lives in core/, not stores/
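
The circuit breaker described above could look roughly like this. It is a sketch under assumptions: the class name, thresholds, and `fallback` parameter are illustrative, and the wrapped call is assumed to raise an exception when Neo4j fails or exceeds its 5s timeout (for example, through a timeout configured on the driver).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Opens after consecutive failures and retries after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        # True while open or waiting for a half-open trial call.
        return self._opened_at is not None

    def call(self, fn: Callable[[], T], fallback: T) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                # Open: skip the dependency entirely and degrade gracefully.
                return fallback
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self._failures >= self._threshold or self._opened_at is not None:
                self._opened_at = self._clock()
            return fallback
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Passing an empty result as `fallback` lets a request proceed without graph features while the circuit is open, which matches the first criterion above.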

v1.2: Legal Reasoning Chain

Objective: Transform Valter from a search backend into a reasoning engine. This is the flagship feature.

Prerequisite: v1.1 complete (circuit breaker and connection pools required for heavy multi-store queries).

Estimated effort: 2-3 weeks.

| Feature | Priority | Description |
| --- | --- | --- |
| core/reasoning_chain.py orchestrator | P0 | Server-side orchestrator that composes verified legal arguments from knowledge graph paths |
| POST /v1/reasoning-chain endpoint | P0 | REST endpoint exposing the reasoning chain to frontends |
| compose_legal_argument MCP tool | P0 | MCP tool allowing LLMs to request composed legal arguments with provenance |
| Provenance tracking | P0 | Every step in the reasoning chain links back to specific decisions, with citation counts and graph position |
| Temporal intelligence integration | P1 | Reasoning chain weights recent decisions higher, flags overturned precedents |
| TRF spike (50 decisions) | P1 | Ingest 50 TRF decisions to test multi-tribunal feasibility before committing to v2.0 |

The reasoning chain orchestrator follows this flow:

  1. Query expansion — parse the legal question, identify relevant criteria and legal provisions
  2. Multi-strategy retrieval — hybrid search (BM25 + semantic + KG boost) for relevant decisions
  3. Graph traversal — follow citation paths, shared criteria, and precedent chains in Neo4j
  4. Argument composition — assemble a multi-step legal argument from the strongest graph paths
  5. Verification — every cited decision is verified against real STJ data (anti-hallucination)
  6. Provenance attachment — each step includes the source decision, citation count, recency, and graph connectivity score
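
The six steps above can be sketched as a plain pipeline. Everything here is hypothetical: the function, the dataclasses, and the stage callables are placeholders for illustration and do not describe the planned core/reasoning_chain.py API. The point is only the ordering, in particular that verification runs before provenance is attached, so unverified decisions never reach the output.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Provenance:
    decision_id: str
    citation_count: int
    recency_days: int
    connectivity: float


@dataclass(frozen=True)
class ChainStep:
    claim: str
    provenance: Provenance


# A draft argument is a list of (claim, decision_id) pairs.
Draft = List[Tuple[str, str]]


def build_reasoning_chain(
    question: str,
    expand: Callable[[str], List[str]],
    retrieve: Callable[[List[str]], List[str]],
    traverse: Callable[[List[str]], List[List[str]]],
    compose: Callable[[List[List[str]]], Draft],
    verify: Callable[[str], bool],
    provenance_for: Callable[[str], Provenance],
) -> List[ChainStep]:
    terms = expand(question)        # 1. query expansion
    candidates = retrieve(terms)    # 2. multi-strategy retrieval
    paths = traverse(candidates)    # 3. graph traversal
    draft = compose(paths)          # 4. argument composition
    # 5. verification: drop any step whose decision cannot be confirmed
    verified = [(claim, dec) for claim, dec in draft if verify(dec)]
    # 6. provenance attachment for the steps that survived verification
    return [ChainStep(claim, provenance_for(dec)) for claim, dec in verified]
```

Injecting each stage as a callable keeps the orchestrator testable with stubs, without Neo4j or the search stores running.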

Completion criteria:

  • Reasoning chain returns >= 3 verified steps with full provenance
  • MCP tool functional and tested with Claude and ChatGPT
  • Latency p95 < 5s for reasoning chain requests
  • TRF spike completed with documented breakpoints and feasibility assessment
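
One way to check the p95 latency criterion above against recorded request timings is shown below. The helper names are hypothetical, and the interpolation method is a choice made for this sketch rather than something the plan specifies; `statistics.quantiles` needs at least two samples on most Python versions.

```python
import statistics
from typing import Sequence

P95_BUDGET_SECONDS = 5.0  # taken from the latency criterion above


def p95(latencies_s: Sequence[float]) -> float:
    """95th percentile, interpolated between observed samples."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return statistics.quantiles(latencies_s, n=100, method="inclusive")[94]


def meets_latency_budget(
    latencies_s: Sequence[float], budget_s: float = P95_BUDGET_SECONDS
) -> bool:
    """True when the p95 of the samples is strictly under the budget."""
    return p95(latencies_s) < budget_s
```

The "inclusive" method treats the samples as the full population, so the result never exceeds the slowest observed request.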

v2.0: Multi-Tribunal Platform

Objective: Expand beyond STJ to other Brazilian courts.

Prerequisite: v1.2 complete (TRF spike executed, multi-tribunal breakpoints documented).

Estimated effort: 2-3 months (scope depends on spike results from v1.2).

| Feature | Priority | Description |
| --- | --- | --- |
| Multi-tribunal architecture | P0 | Abstract tribunal-specific logic behind interfaces, support multiple courts in the same deployment |
| TRF support | P0 | Federal Regional Courts, starting with the court identified in the v1.2 spike |
| TST support | P1 | Superior Labor Court |
| STF support | P1 | Supreme Federal Court (constitutional matters) |
| Leci integration | P1 | Integration with Leci (sister product) for enriched legal analysis |
| Juca integration | P1 | Integration with Juca (frontend) for seamless user experience |
| Automatic ingestion pipeline | P1 | Continuous ingestion from multiple tribunal portals without manual intervention |

Completion criteria:

  • At least 1 additional court with searchable, verified data
  • Reasoning chain works across tribunals (e.g., STJ decision citing TRF precedent)
  • Ingestion pipeline running for >= 2 courts
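
As an illustration of what "tribunal-specific logic behind interfaces" might mean in practice, here is a minimal sketch. `TribunalAdapter`, `TribunalRegistry`, and `StaticAdapter` are hypothetical names, and the in-memory adapter exists only to exercise the interface; real adapters would talk to each court's portal.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Decision:
    tribunal: str
    decision_id: str
    text: str


class TribunalAdapter(ABC):
    """Hypothetical interface: one implementation per court."""

    code: str

    @abstractmethod
    def fetch_new_decisions(self) -> List[Decision]:
        ...


class StaticAdapter(TribunalAdapter):
    """Toy adapter backed by an in-memory list, for illustration only."""

    def __init__(self, code: str, texts: List[str]) -> None:
        self.code = code
        self._texts = texts

    def fetch_new_decisions(self) -> List[Decision]:
        return [
            Decision(self.code, f"{self.code}-{i}", text)
            for i, text in enumerate(self._texts, start=1)
        ]


class TribunalRegistry:
    """Holds the adapters enabled in a deployment."""

    def __init__(self) -> None:
        self._adapters: Dict[str, TribunalAdapter] = {}

    def register(self, adapter: TribunalAdapter) -> None:
        self._adapters[adapter.code] = adapter

    def ingest_all(self) -> List[Decision]:
        # Iterate in a stable order so ingestion runs are reproducible.
        decisions: List[Decision] = []
        for code in sorted(self._adapters):
            decisions.extend(self._adapters[code].fetch_new_decisions())
        return decisions
```

With this shape, adding a court means registering one more adapter; the ingestion loop and everything downstream of `Decision` stay unchanged.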

v2.1: Scale + Public Presence

Objective: Multi-consumer platform with SLA guarantees and public ChatGPT App Directory presence.

Prerequisite: v2.0 complete (multi-tribunal working, stable enough for external users).

Estimated effort: Depends on demand and App Directory review timeline.

| Feature | Priority | Description |
| --- | --- | --- |
| ChatGPT App Directory submission | P1 | Submit Valter as a public MCP tool in the ChatGPT App Directory |
| MCP hardening | P0 | Rate limiting per consumer, request validation, abuse prevention |
| Multi-tenancy | P1 | Support multiple organizations with isolated data and billing |
| SLA guarantees | P1 | Documented uptime, latency, and availability targets |
| Load testing | P0 | Validate that the system handles target concurrent load |
| Store test coverage > 80% | P2 | Comprehensive test coverage for all store implementations |

Completion criteria:

  • At least 1 external user (beyond the developer) actively using the system
  • App Directory submission completed (pending review)
  • Load tests validate SLA targets under concurrent load
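
Per-consumer rate limiting, listed under MCP hardening, is often built on a token bucket. The sketch below names that technique explicitly as an assumption; the plan does not say which algorithm Valter will use, and the class names and parameters are illustrative. State is kept in process memory here, whereas a multi-instance deployment would need shared storage.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity`."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Credit tokens for the time elapsed since the last call.
        refill = (now - self._last) * self._rate
        self._tokens = min(self._capacity, self._tokens + refill)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class PerConsumerLimiter:
    """One independent bucket per consumer id."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, consumer_id: str) -> bool:
        bucket = self._buckets.get(consumer_id)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._capacity, self._clock)
            self._buckets[consumer_id] = bucket
        return bucket.try_acquire()
```

Because each consumer has its own bucket, one client exhausting its quota does not affect the others.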

2026-03 v1.0 — Stable Production (~2-3 weeks)
|
2026-03/04 v1.1 — Resilience + Search Quality (~2-3 weeks)
|
2026-04 v1.2 — Legal Reasoning Chain (~2-3 weeks) *** FLAGSHIP ***
|
2026-05/07 v2.0 — Multi-Tribunal Platform (~2-3 months, scope from spike)
|
2026-H2 v2.1 — Scale + Public Presence (depends on demand)