Search Endpoints

Four endpoints for searching and retrieving STJ (Superior Tribunal de Justiça) decisions: hybrid retrieval, similar-case lookup, search over AI-extracted features, and dual-vector analysis of facts versus legal thesis.

POST /v1/retrieve

Hybrid search over the jurisprudence corpus. It combines BM25 lexical matching with semantic vector similarity, and can optionally apply a knowledge graph boost and cross-encoder reranking.

| Parameter | Type | Default | Description |
|---|---|---|---|
| query | string | required | Natural-language legal query (1-1000 chars) |
| top_k | integer | 20 | Number of results to retrieve (1-100) |
| strategy | string | "weighted" | Scoring strategy: weighted, rrf, bm25, or semantic |
| include_kg | boolean | false | Apply knowledge graph relevance boost before final ordering |
| rerank | boolean | false | Apply cross-encoder reranking. Improves precision, adds ~100-300ms |
| expand_query | boolean | false | Expand query with LLM-generated legal variants. Improves recall, adds ~500-1500ms |
| weights | object | null | Custom signal weights (see below) |
| filters | object | null | Post-retrieval filters (see below) |
| page_size | integer | null | Enable cursor pagination (1-50, must be <= top_k) |
| cursor | string | null | Continuation cursor from previous page |
| include_stj_metadata | boolean | false | Include STJ metadata via PostgreSQL lookup (~5-20ms extra) |

Fields of the weights object:

| Field | Type | Default | Description |
|---|---|---|---|
| bm25 | float | 0.5 | BM25 lexical signal weight |
| semantic | float | 0.4 | Semantic embedding signal weight |
| kg | float | 0.1 | Knowledge graph boost weight |

Fields of the filters object:

| Field | Type | Description |
|---|---|---|
| ministro | string | Minister name (auto-normalized to uppercase) |
| data_inicio | string | Start date filter (YYYYMMDD format) |
| data_fim | string | End date filter (YYYYMMDD format) |
| tipos_recurso | string[] | Appeal type filter (array) |
| resultado | string | Outcome filter: provido, improvido, parcialmente provido |
| source | string | Source type filter: corpus, embedding_only, ementa_only |
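
The page does not state how the weighted and rrf strategies combine their signals. As a rough sketch only, the following Python shows the textbook forms of both techniques: a linear combination using the default weights from the table above, and reciprocal rank fusion. The function names, the assumption that signals are normalized to [0, 1], and the RRF constant k=60 are illustrative choices, not confirmed details of this API.

```python
from typing import Dict, List, Optional

# Default signal weights, taken from the weights table above.
DEFAULT_WEIGHTS = {"bm25": 0.5, "semantic": 0.4, "kg": 0.1}


def weighted_score(
    signals: Dict[str, Optional[float]],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Linear combination of per-signal scores (assumed to lie in [0, 1]).

    A signal that is missing or None (e.g. kg when include_kg is false)
    contributes 0. This is an assumption, not documented behaviour.
    """
    w = {**DEFAULT_WEIGHTS, **(weights or {})}
    return sum(w[name] * (signals.get(name) or 0.0) for name in w)


def rrf_scores(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal rank fusion: each ranked list adds 1 / (k + rank) per document.

    Ranks start at 1; k=60 is the conventional constant, assumed here.
    """
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused
```

The weighted form depends on score magnitudes, so it is sensitive to how each signal is normalized. RRF uses only rank positions, which makes it a common choice when the lexical and semantic scores are not directly comparable.
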
Example request:

    curl -X POST http://localhost:8000/v1/retrieve \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "query": "dano moral atraso voo overbooking companhia aerea",
        "top_k": 10,
        "strategy": "weighted",
        "rerank": true,
        "filters": {
          "resultado": "provido",
          "data_inicio": "20200101"
        }
      }'

Example response:

    {
      "data": [
        {
          "id": "doc-stj-resp-1234567",
          "processo": "REsp 1.234.567/SP",
          "ministro": "NANCY ANDRIGHI",
          "data": "20230615",
          "orgao": "TERCEIRA TURMA",
          "ementa": "RECURSO ESPECIAL. TRANSPORTE AEREO. ...",
          "ementa_preview": "RECURSO ESPECIAL. TRANSPORTE AEREO...",
          "tese": "O atraso significativo de voo gera dano moral presumido...",
          "razoes_decidir": null,
          "score": 0.92,
          "has_integra": true,
          "score_breakdown": {
            "bm25": 0.78,
            "semantic": 0.95,
            "kg_boost": null,
            "rerank_score": 0.92
          },
          "matched_terms": ["dano", "moral", "atraso", "voo"],
          "stj_metadata": null
        }
      ],
      "meta": {
        "trace_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "latency_ms": 245.3,
        "cache_hit": false,
        "model_version": "legal-bertimbau-v1.0",
        "expansion_queries": null
      },
      "pagination": {
        "cursor": null,
        "has_more": false,
        "total_estimate": 8
      }
    }
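
The page_size and cursor parameters together with the pagination object in the response suggest a simple paging loop. The sketch below is one way a client might follow cursors until has_more is false. The helper name iter_results and the injected post callable are hypothetical, and it assumes the cursor is sent back in the cursor field alongside the original query, which the page implies but does not spell out.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_results(
    post: Callable[[Dict[str, Any]], Dict[str, Any]],
    payload: Dict[str, Any],
    page_size: int,
) -> Iterator[Dict[str, Any]]:
    """Yield every result of a /v1/retrieve query, one page at a time.

    `post` is any callable that sends the JSON body and returns the parsed
    response; injecting it keeps this sketch independent of an HTTP library.
    """
    top_k = payload.get("top_k", 20)  # documented default for top_k
    if not 1 <= page_size <= 50 or page_size > top_k:
        raise ValueError("page_size must be in 1-50 and <= top_k")

    cursor: Optional[str] = None
    while True:
        body = {**payload, "page_size": page_size}
        if cursor is not None:
            body["cursor"] = cursor  # assumed: cursor is resent with the query
        page = post(body)
        yield from page["data"]

        pagination = page.get("pagination") or {}
        cursor = pagination.get("cursor")
        if not pagination.get("has_more") or cursor is None:
            break
```

With the requests library, post could be a small wrapper such as `lambda body: requests.post(url, json=body, headers=headers).json()`, where url and headers carry the endpoint address and the bearer token.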

POST /v1/similar_cases

Find cases similar to a given decision. The similarity score blends 70% semantic similarity with 30% structural knowledge graph overlap.

| Parameter | Type | Default | Description |
|---|---|---|---|
| document_id | string | required | Source document ID to compare against |
| top_k | integer | 10 | Number of similar cases to return (1-100) |
| include_structural | boolean | true | Include KG structural similarity in the score. Disabling uses semantic-only (faster). |

Example request:

    curl -X POST http://localhost:8000/v1/similar_cases \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "document_id": "doc-stj-resp-1234567",
        "top_k": 5,
        "include_structural": true
      }'
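
To make the 70/30 blend concrete, here is a minimal sketch of how such a score could be computed. The function name is hypothetical, both inputs are assumed to be normalized to [0, 1], and returning the raw semantic score when include_structural is false is an assumption about the semantic-only mode.

```python
from typing import Optional

SEMANTIC_SHARE = 0.7    # share stated in the endpoint description
STRUCTURAL_SHARE = 0.3  # share stated in the endpoint description


def blended_similarity(
    semantic: float,
    structural: Optional[float] = None,
    include_structural: bool = True,
) -> float:
    """Blend semantic and KG structural similarity into a single score."""
    if not include_structural or structural is None:
        # Assumed behaviour: semantic-only mode returns the semantic score as-is.
        return semantic
    return SEMANTIC_SHARE * semantic + STRUCTURAL_SHARE * structural
```

For example, a case with semantic similarity 0.9 and structural overlap 0.5 would score 0.7 * 0.9 + 0.3 * 0.5 = 0.78 under this sketch.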

POST /v1/search/features

Structured search over AI-extracted document features. Nine filters are available and are combined with AND; at least one filter is required.

| Parameter | Type | Default | Description |
|---|---|---|---|
| categorias | string[] | null | Legal categories (OR/ANY semantics within the list) |
| dispositivo_norma | string | null | Legal statute filter (e.g., CDC, CC/2002). Exact containment match. |
| resultado | string | null | Outcome filter (exact, case-sensitive) |
| unanimidade | boolean | null | Unanimous decision filter |
| tipo_decisao | string | null | Decision type (exact, case-sensitive) |
| tipo_recurso | string | null | Appeal type (exact, case-sensitive) |
| ministro_relator | string | null | Reporting minister (exact, case-sensitive) |
| argumento_vencedor | string | null | Winning argument text (partial match, case-insensitive) |
| argumento_perdedor | string | null | Losing argument text (partial match, case-insensitive) |
| limit | integer | 20 | Results per page (1-100) |
| offset | integer | 0 | Pagination offset |

Example request:

    curl -X POST http://localhost:8000/v1/search/features \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "categorias": ["Direito do Consumidor"],
        "resultado": "provido",
        "dispositivo_norma": "CDC",
        "limit": 10
      }'

Example response:

    {
      "data": [
        {
          "document_id": "doc-stj-resp-9876543",
          "processo": "REsp 9.876.543/RJ",
          "ementa_preview": "CONSUMIDOR. PRODUTO DEFEITUOSO...",
          "categorias": ["Direito do Consumidor"],
          "resultado": "provido",
          "tipo_decisao": "Acórdão",
          "unanimidade": true,
          "dispositivo_norma": ["CDC", "CC/2002"],
          "argumento_vencedor": "Responsabilidade objetiva do fornecedor..."
        }
      ],
      "total": 42,
      "meta": {
        "trace_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
        "latency_ms": 35.8
      }
    }
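
The matching rules in the parameter table mix several semantics: OR within categorias, AND across filters, exact versus partial matching. The following sketch applies them to a single document on the client side, purely to illustrate how they compose. The matches helper is hypothetical, and reading "exact containment" as "the value is an exact element of the document's dispositivo_norma list" is an interpretation based on the example response.

```python
from typing import Any, Dict

# The nine filter fields from the table; limit and offset are pagination only.
FILTER_FIELDS = {
    "categorias", "dispositivo_norma", "resultado", "unanimidade",
    "tipo_decisao", "tipo_recurso", "ministro_relator",
    "argumento_vencedor", "argumento_perdedor",
}
PARTIAL_FIELDS = {"argumento_vencedor", "argumento_perdedor"}


def matches(doc: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True if `doc` satisfies every active filter (AND semantics)."""
    active = {k: v for k, v in filters.items()
              if k in FILTER_FIELDS and v is not None}
    if not active:
        raise ValueError("at least one filter is required")

    for name, wanted in active.items():
        value = doc.get(name)
        if name == "categorias":
            # OR/ANY within the list: one shared category is enough.
            ok = any(cat in (value or []) for cat in wanted)
        elif name == "dispositivo_norma":
            # Assumed reading of "exact containment": exact list element.
            ok = wanted in (value or [])
        elif name in PARTIAL_FIELDS:
            # Partial, case-insensitive substring match.
            ok = wanted.lower() in (value or "").lower()
        else:
            # Exact, case-sensitive equality (also covers the boolean filter).
            ok = value == wanted
        if not ok:
            return False
    return True
```

Because the exact filters are case-sensitive, a request with "resultado": "Provido" would not match a document stored as "provido" under this reading.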

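POST /v1/factual/dual-search

Dual-vector search that separates a decision's facts from its legal thesis, searches each dimension independently, and then produces a divergence report. The pipeline runs in five stages: text input, LLM extraction of a factual digest and a thesis digest (via Groq), encoding of each digest into its own vector, one vector search per digest, and divergence analysis.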

| Parameter | Type | Default | Description |
|---|---|---|---|
| text | string | null | Legal text for analysis (50-15000 chars). Required if document_id is not provided. |
| document_id | string | null | Corpus document ID. Required if text is not provided. |
| top_k | integer | 10 | Max results per dimension (1-50) |
| filters | object | null | Same filter object as /v1/retrieve (ministro, resultado, source) |

Example request:

    curl -X POST http://localhost:8000/v1/factual/dual-search \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "document_id": "doc-stj-resp-1234567",
        "top_k": 5
      }'

Example response:

    {
      "data": {
        "factual_digest": {
          "bullets": [
            { "index": 0, "text": "Consumidor adquiriu produto com defeito...", "source_excerpt": "...", "uncertainty": false }
          ],
          "digest_text": "Consumidor adquiriu produto com defeito de fabricacao...",
          "extraction_model": "llama-3.3-70b-versatile"
        },
        "thesis_digest": {
          "thesis_text": "Responsabilidade objetiva do fornecedor por vicio do produto...",
          "legal_basis": ["CDC art. 12", "CDC art. 18"],
          "precedents_cited": ["REsp 1.234.567/SP"],
          "extraction_model": "llama-3.3-70b-versatile"
        },
        "factual_results": [
          { "id": "doc-001", "processo": "REsp 111.222/MG", "ministro": "NANCY ANDRIGHI", "data": "20230101", "score": 0.89 }
        ],
        "thesis_results": [
          { "id": "doc-002", "processo": "REsp 333.444/PR", "ministro": "MARCO BUZZI", "data": "20220615", "score": 0.85 }
        ],
        "overlap_ids": [],
        "fact_only_ids": ["doc-001"],
        "thesis_only_ids": ["doc-002"],
        "divergence_summary": "Os fatos sao similares a doc-001 mas a tese juridica diverge. doc-002 compartilha a mesma tese mas com fatos distintos."
      },
      "meta": {
        "trace_id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
        "latency_ms": 1850.5
      }
    }

The divergence report sorts the results into three categories:

  • overlap_ids: cases matching on both facts and thesis (strong precedent).
  • fact_only_ids: factually similar but legally different (potential distinguishing).
  • thesis_only_ids: same legal thesis but different facts (thematic precedent).
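
The three categories above amount to set operations over the two result lists. As an illustrative sketch (the function name is hypothetical, and the server may derive these lists differently), they could be recomputed from factual_results and thesis_results like this:

```python
from typing import Any, Dict, List


def divergence_ids(
    factual_results: List[Dict[str, Any]],
    thesis_results: List[Dict[str, Any]],
) -> Dict[str, List[str]]:
    """Split result IDs into overlap, fact-only, and thesis-only groups.

    Each list keeps the ranking order of the result list it comes from.
    """
    fact_ids = [r["id"] for r in factual_results]
    thesis_ids = [r["id"] for r in thesis_results]
    fact_set, thesis_set = set(fact_ids), set(thesis_ids)
    return {
        "overlap_ids": [i for i in fact_ids if i in thesis_set],
        "fact_only_ids": [i for i in fact_ids if i not in thesis_set],
        "thesis_only_ids": [i for i in thesis_ids if i not in fact_set],
    }
```

Applied to the example response, where the factual search returns doc-001 and the thesis search returns doc-002, this yields an empty overlap_ids, doc-001 in fact_only_ids, and doc-002 in thesis_only_ids, which agrees with the values shown there.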