Search Endpoints
Search Endpoints
Section titled “Search Endpoints”Four endpoints for searching and retrieving STJ legal decisions using hybrid strategies, AI-extracted features, and dual-vector analysis.
POST /v1/retrieve
Section titled “POST /v1/retrieve”Hybrid search over the jurisprudence corpus combining BM25 lexical matching with semantic vector similarity, optional knowledge graph boost, and optional cross-encoder reranking.
Parameters
Section titled “Parameters”| Parameter | Type | Default | Description |
|---|---|---|---|
query | string | required | Natural-language legal query (1-1000 chars) |
top_k | integer | 20 | Number of results to retrieve (1-100) |
strategy | string | "weighted" | Scoring strategy: weighted, rrf, bm25, or semantic |
include_kg | boolean | false | Apply knowledge graph relevance boost before final ordering |
rerank | boolean | false | Apply cross-encoder reranking. Improves precision, adds ~100-300ms |
expand_query | boolean | false | Expand query with LLM-generated legal variants. Improves recall, adds ~500-1500ms |
weights | object | null | Custom signal weights (see below) |
filters | object | null | Post-retrieval filters (see below) |
page_size | integer | null | Enable cursor pagination (1-50, must be <= top_k) |
cursor | string | null | Continuation cursor from previous page |
include_stj_metadata | boolean | false | Include STJ metadata via PostgreSQL lookup (~5-20ms extra) |
Weights object
Section titled “Weights object”| Field | Type | Default | Description |
|---|---|---|---|
bm25 | float | 0.5 | BM25 lexical signal weight |
semantic | float | 0.4 | Semantic embedding signal weight |
kg | float | 0.1 | Knowledge graph boost weight |
Filters object
Section titled “Filters object”| Field | Type | Description |
|---|---|---|
ministro | string | Minister name (auto-normalized to uppercase) |
data_inicio | string | Start date filter (YYYYMMDD format) |
data_fim | string | End date filter (YYYYMMDD format) |
tipos_recurso | string[] | Appeal type filter (array) |
resultado | string | Outcome filter: provido, improvido, parcialmente provido |
source | string | Source type filter: corpus, embedding_only, ementa_only |
Example request
Section titled “Example request”curl -X POST http://localhost:8000/v1/retrieve \ -H "Authorization: Bearer $API_KEY" \ -H "Content-Type: application/json" \ -d '{ "query": "dano moral atraso voo overbooking companhia aerea", "top_k": 10, "strategy": "weighted", "rerank": true, "filters": { "resultado": "provido", "data_inicio": "20200101" } }'Example response
Section titled “Example response”{ "data": [ { "id": "doc-stj-resp-1234567", "processo": "REsp 1.234.567/SP", "ministro": "NANCY ANDRIGHI", "data": "20230615", "orgao": "TERCEIRA TURMA", "ementa": "RECURSO ESPECIAL. TRANSPORTE AEREO. ...", "ementa_preview": "RECURSO ESPECIAL. TRANSPORTE AEREO...", "tese": "O atraso significativo de voo gera dano moral presumido...", "razoes_decidir": null, "score": 0.92, "has_integra": true, "score_breakdown": { "bm25": 0.78, "semantic": 0.95, "kg_boost": null, "rerank_score": 0.92 }, "matched_terms": ["dano", "moral", "atraso", "voo"], "stj_metadata": null } ], "meta": { "trace_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "latency_ms": 245.3, "cache_hit": false, "model_version": "legal-bertimbau-v1.0", "expansion_queries": null }, "pagination": { "cursor": null, "has_more": false, "total_estimate": 8 }}POST /v1/similar_cases
Section titled “POST /v1/similar_cases”Find cases similar to a given decision using a blend of 70% semantic similarity and 30% structural knowledge graph overlap.
Parameters
Section titled “Parameters”| Parameter | Type | Default | Description |
|---|---|---|---|
document_id | string | required | Source document ID to compare against |
top_k | integer | 10 | Number of similar cases to return (1-100) |
include_structural | boolean | true | Include KG structural similarity in the score. Disabling uses semantic-only (faster). |
Example request
Section titled “Example request”curl -X POST http://localhost:8000/v1/similar_cases \ -H "Authorization: Bearer $API_KEY" \ -H "Content-Type: application/json" \ -d '{ "document_id": "doc-stj-resp-1234567", "top_k": 5, "include_structural": true }'POST /v1/search/features
Section titled “POST /v1/search/features”Structured search over AI-extracted document features with 9 combinable AND filters. At least one filter is required.
Parameters
Section titled “Parameters”| Parameter | Type | Default | Description |
|---|---|---|---|
categorias | string[] | null | Legal categories (OR/ANY semantics within the list) |
dispositivo_norma | string | null | Legal statute filter (e.g., CDC, CC/2002). Exact containment match. |
resultado | string | null | Outcome filter (exact, case-sensitive) |
unanimidade | boolean | null | Unanimous decision filter |
tipo_decisao | string | null | Decision type (exact, case-sensitive) |
tipo_recurso | string | null | Appeal type (exact, case-sensitive) |
ministro_relator | string | null | Reporting minister (exact, case-sensitive) |
argumento_vencedor | string | null | Winning argument text (partial match, case-insensitive) |
argumento_perdedor | string | null | Losing argument text (partial match, case-insensitive) |
limit | integer | 20 | Results per page (1-100) |
offset | integer | 0 | Pagination offset |
Example request
Section titled “Example request”curl -X POST http://localhost:8000/v1/search/features \ -H "Authorization: Bearer $API_KEY" \ -H "Content-Type: application/json" \ -d '{ "categorias": ["Direito do Consumidor"], "resultado": "provido", "dispositivo_norma": "CDC", "limit": 10 }'Example response
Section titled “Example response”{ "data": [ { "document_id": "doc-stj-resp-9876543", "processo": "REsp 9.876.543/RJ", "ementa_preview": "CONSUMIDOR. PRODUTO DEFEITUOSO...", "categorias": ["Direito do Consumidor"], "resultado": "provido", "tipo_decisao": "Acórdão", "unanimidade": true, "dispositivo_norma": ["CDC", "CC/2002"], "argumento_vencedor": "Responsabilidade objetiva do fornecedor..." } ], "total": 42, "meta": { "trace_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901", "latency_ms": 35.8 }}POST /v1/factual/dual-search
Section titled “POST /v1/factual/dual-search”Dual-vector search that separates facts from legal thesis, searches each independently, then produces a divergence report. The pipeline: text input, LLM extraction (via Groq), encode each digest into separate vectors, vector search, divergence analysis.
Parameters
Section titled “Parameters”| Parameter | Type | Default | Description |
|---|---|---|---|
text | string | null | Legal text for analysis (50-15000 chars). Required if document_id is not provided. |
document_id | string | null | Corpus document ID. Required if text is not provided. |
top_k | integer | 10 | Max results per dimension (1-50) |
filters | object | null | Same filter object as /v1/retrieve (ministro, resultado, source) |
Example request
Section titled “Example request”curl -X POST http://localhost:8000/v1/factual/dual-search \ -H "Authorization: Bearer $API_KEY" \ -H "Content-Type: application/json" \ -d '{ "document_id": "doc-stj-resp-1234567", "top_k": 5 }'Example response
Section titled “Example response”{ "data": { "factual_digest": { "bullets": [ { "index": 0, "text": "Consumidor adquiriu produto com defeito...", "source_excerpt": "...", "uncertainty": false } ], "digest_text": "Consumidor adquiriu produto com defeito de fabricacao...", "extraction_model": "llama-3.3-70b-versatile" }, "thesis_digest": { "thesis_text": "Responsabilidade objetiva do fornecedor por vicio do produto...", "legal_basis": ["CDC art. 12", "CDC art. 18"], "precedents_cited": ["REsp 1.234.567/SP"], "extraction_model": "llama-3.3-70b-versatile" }, "factual_results": [ { "id": "doc-001", "processo": "REsp 111.222/MG", "ministro": "NANCY ANDRIGHI", "data": "20230101", "score": 0.89 } ], "thesis_results": [ { "id": "doc-002", "processo": "REsp 333.444/PR", "ministro": "MARCO BUZZI", "data": "20220615", "score": 0.85 } ], "overlap_ids": [], "fact_only_ids": ["doc-001"], "thesis_only_ids": ["doc-002"], "divergence_summary": "Os fatos sao similares a doc-001 mas a tese juridica diverge. doc-002 compartilha a mesma tese mas com fatos distintos." }, "meta": { "trace_id": "c3d4e5f6-a7b8-9012-cdef-123456789012", "latency_ms": 1850.5 }}The divergence report reveals three categories:
- overlap_ids — cases matching on both facts and thesis (strong precedent).
- fact_only_ids — factually similar but legally different (potential distinguishing).
- thesis_only_ids — same legal thesis but different facts (thematic precedent).