197 Reputation rollup sample thresholds

Status: Accepted
Date: 2026-02-15
Context:
- Motivation: complete ERD reputation rollup behavior for job_run_reputation_rollup_v1 so source_reputation only records trusted windows and uses ERD sample semantics.
- Constraints: runtime data access stays in stored procedures, no new dependencies, and behavior must be validated through existing data-layer tests.
Decision:
- Added migration 0089_indexer_reputation_rollup_thresholds.sql to redefine job_run_reputation_rollup_v1.
- Request samples now use outbound_request_log rows for request types (caps, search, tvsearch, moviesearch, rss, probe), exclude error_class=rate_limited, and count success as outcome='success' AND parse_ok=true.
- Rollups now upsert only for eligible windows (request_count >= 30 or acquisition_count >= 10) per ERD trusted-sample thresholds.
- Rollups are scoped to active indexers (indexer_instance.deleted_at IS NULL) and preserve fake/dmca numerator semantics from acquisition attempts plus reported_fake actions.
- Added job-runner tests in crates/revaer-data/src/indexers/jobs.rs for insufficient-sample skip behavior and eligible-window rate calculations.
- Alternatives considered: retaining previous always-upsert behavior (rejected: violates ERD “sufficient samples” requirement) and computing trust filtering in Rust instead of SQL (rejected: rollup ownership is database-side).
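
The sample and eligibility rules above can be restated as a small Rust sketch. This is illustrative only: the real logic lives in the job_run_reputation_rollup_v1 stored procedure, and the `RequestRow` struct, the function names, and the `"download"` request type used in the usage note are hypothetical, introduced here to make the rules concrete.

```rust
/// Request types that count toward the request sample (taken from the decision text).
const SAMPLED_REQUEST_TYPES: [&str; 6] =
    ["caps", "search", "tvsearch", "moviesearch", "rss", "probe"];

/// Trusted-sample thresholds stated in the decision text.
const MIN_REQUEST_SAMPLES: u32 = 30;
const MIN_ACQUISITION_SAMPLES: u32 = 10;

/// Hypothetical in-memory view of one outbound_request_log row.
pub struct RequestRow<'a> {
    pub request_type: &'a str,
    pub error_class: Option<&'a str>,
    pub outcome: &'a str,
    pub parse_ok: bool,
}

/// A row is a sample if its type is in the sampled set and it was not rate limited.
pub fn is_sample(row: &RequestRow<'_>) -> bool {
    SAMPLED_REQUEST_TYPES.iter().any(|t| *t == row.request_type)
        && row.error_class != Some("rate_limited")
}

/// A sample is a success only if the outcome is 'success' and the response parsed.
pub fn is_success(row: &RequestRow<'_>) -> bool {
    row.outcome == "success" && row.parse_ok
}

/// Returns (request_count, success_count) over the rows that qualify as samples.
pub fn tally(rows: &[RequestRow<'_>]) -> (u32, u32) {
    let mut request_count = 0;
    let mut success_count = 0;
    for row in rows.iter().filter(|r| is_sample(r)) {
        request_count += 1;
        if is_success(row) {
            success_count += 1;
        }
    }
    (request_count, success_count)
}

/// A window is eligible for upsert when either threshold is met.
pub fn is_eligible(request_count: u32, acquisition_count: u32) -> bool {
    request_count >= MIN_REQUEST_SAMPLES || acquisition_count >= MIN_ACQUISITION_SAMPLES
}
```

Under these rules a rate-limited search row and a row of an unlisted type (for example `"download"`) are both excluded from request_count, while an rss row with outcome='success' but parse_ok=false counts as a request and not as a success.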
Consequences:
- Positive outcomes: source_reputation rows now align with ERD trust gating and sample definitions, reducing noisy low-sample rollups.
- Risks or trade-offs: sparse indexers may not get fresh reputation rows until enough traffic exists, so scoring may fall back to neutral more often.
Follow-up:
- Implement the remaining Phase 9 derived refresh jobs and add dedicated tests for the cadence behavior of the 24h and 7d reputation windows.
- Revisit whether stale reputation rows should be actively pruned when an indexer drops below sample thresholds.
Design notes:
- The procedure uses indexer_scope, combined, and eligible CTEs so threshold gating is explicit and unit-testable.
- min_samples now records the active trust threshold (10 when acquisition-driven, otherwise 30).
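
As a rough illustration of the min_samples note, the sketch below picks the recorded threshold for a window. It assumes "acquisition-driven" means the window met the acquisition threshold (acquisition_count >= 10); the source does not define the term, so the precedence when both thresholds are met is a guess, and the function and constant names are hypothetical.

```rust
/// Thresholds from the decision text.
const ACQUISITION_THRESHOLD: u32 = 10;
const REQUEST_THRESHOLD: u32 = 30;

/// Returns the trust threshold to record in min_samples, or None when the
/// window is not eligible and no row would be upserted.
/// Assumption: a window counts as acquisition-driven whenever it meets the
/// acquisition threshold, even if it also meets the request threshold.
pub fn recorded_min_samples(request_count: u32, acquisition_count: u32) -> Option<u32> {
    if acquisition_count >= ACQUISITION_THRESHOLD {
        Some(ACQUISITION_THRESHOLD)
    } else if request_count >= REQUEST_THRESHOLD {
        Some(REQUEST_THRESHOLD)
    } else {
        None
    }
}
```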
Test coverage summary:
- Added:
  - job_run_reputation_rollup_skips_insufficient_samples
  - job_run_reputation_rollup_writes_rates_for_eligible_samples
Observability updates:
- None (stored-procedure behavior change only).
Risk & rollback plan:
- Roll back by reverting migration 0089_indexer_reputation_rollup_thresholds.sql.
Dependency rationale:
- No new dependencies. Alternatives considered: not applicable.
