197 Reputation rollup sample thresholds

  • Status: Accepted
  • Date: 2026-02-15
  • Context:
    • Motivation: complete ERD reputation rollup behavior for job_run_reputation_rollup_v1 so source_reputation only records trusted windows and uses ERD sample semantics.
    • Constraints: runtime data access stays in stored procedures, no new dependencies, and behavior must be validated through existing data-layer tests.
  • Decision:
    • Added migration 0089_indexer_reputation_rollup_thresholds.sql to redefine job_run_reputation_rollup_v1.
    • Request samples are now drawn from outbound_request_log rows whose request type is one of caps, search, tvsearch, moviesearch, rss, or probe. Rows with error_class='rate_limited' are excluded, and a row counts as a success only when outcome='success' AND parse_ok=true.
    • Rollups now upsert only for eligible windows (request_count >= 30 or acquisition_count >= 10) per ERD trusted-sample thresholds.
    • Rollups are scoped to active indexers (indexer_instance.deleted_at IS NULL) and preserve fake/dmca numerator semantics from acquisition attempts plus reported_fake actions.
    • Added job-runner tests in crates/revaer-data/src/indexers/jobs.rs for insufficient-sample skip behavior and eligible-window rate calculations.
    • Alternatives considered: retaining previous always-upsert behavior (rejected: violates ERD “sufficient samples” requirement) and computing trust filtering in Rust instead of SQL (rejected: rollup ownership is database-side).
  • Consequences:
    • Positive outcomes: source_reputation rows now align with ERD trust gating and sample definitions, reducing noisy low-sample rollups.
    • Risks or trade-offs: sparse indexers may not get fresh reputation rows until enough traffic exists, so they may fall back to neutral scoring more often.
  • Follow-up:
    • Implement the remaining Phase 9 derived refresh jobs and add dedicated tests for the cadence behavior of the 24h and 7d reputation windows.
    • Revisit whether stale reputation rows should be actively pruned when an indexer drops below sample thresholds.
  • Design notes:
    • The procedure uses indexer_scope, combined, and eligible CTEs so threshold gating is explicit and unit-testable.
    • min_samples now records the active trust threshold (10 when acquisition-driven, otherwise 30).
  • Test coverage summary:
    • Added:
      • job_run_reputation_rollup_skips_insufficient_samples
      • job_run_reputation_rollup_writes_rates_for_eligible_samples
  • Observability updates:
    • None (stored-procedure behavior change only).
  • Risk & rollback plan:
    • Roll back by reverting migration 0089_indexer_reputation_rollup_thresholds.sql.
  • Dependency rationale:
    • No new dependencies. Alternatives considered: not applicable.
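
The sample and threshold rules from the Decision and Design notes can be sketched in Rust for illustration. This is not the stored procedure from migration 0089; the real logic lives in SQL. The struct RequestRow and the functions is_sample, is_success, count_requests, and trusted_min_samples are hypothetical names invented for this sketch. It also assumes that "acquisition-driven" means the acquisition threshold takes precedence when both thresholds are met, which the record does not state explicitly.

```rust
/// Trusted-sample thresholds quoted in the Decision section.
const MIN_REQUEST_SAMPLES: u32 = 30;
const MIN_ACQUISITION_SAMPLES: u32 = 10;

/// Hypothetical in-memory view of one outbound_request_log row.
/// Only the columns named in this record are modeled.
struct RequestRow<'a> {
    request_type: &'a str,
    error_class: Option<&'a str>,
    outcome: &'a str,
    parse_ok: bool,
}

/// A row is a sample if its request type is listed and it was not rate limited.
fn is_sample(row: &RequestRow<'_>) -> bool {
    let listed = matches!(
        row.request_type,
        "caps" | "search" | "tvsearch" | "moviesearch" | "rss" | "probe"
    );
    listed && row.error_class != Some("rate_limited")
}

/// A sample is a success only when outcome is 'success' and parsing succeeded.
fn is_success(row: &RequestRow<'_>) -> bool {
    row.outcome == "success" && row.parse_ok
}

/// Returns (request_count, success_count), counting sample rows only.
fn count_requests(rows: &[RequestRow<'_>]) -> (u32, u32) {
    let mut request_count = 0;
    let mut success_count = 0;
    for row in rows.iter().filter(|r| is_sample(r)) {
        request_count += 1;
        if is_success(row) {
            success_count += 1;
        }
    }
    (request_count, success_count)
}

/// Returns the min_samples value to record for an eligible window,
/// or None when the window is below both thresholds and should be skipped.
/// Assumption: the acquisition threshold wins when both are met.
fn trusted_min_samples(request_count: u32, acquisition_count: u32) -> Option<u32> {
    if acquisition_count >= MIN_ACQUISITION_SAMPLES {
        Some(MIN_ACQUISITION_SAMPLES)
    } else if request_count >= MIN_REQUEST_SAMPLES {
        Some(MIN_REQUEST_SAMPLES)
    } else {
        None
    }
}
```

Under these assumptions, a window with 29 requests and 9 acquisitions yields None (no upsert), while a window with 30 requests and no acquisitions yields Some(30). Returning an Option keeps the skip decision and the recorded min_samples value in one place, mirroring how the eligible CTE is described as making the gate explicit.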