Search-run retry behavior coverage for rate-limited and transient errors

  • Status: Accepted
  • Date: 2026-02-25
  • Context:
    • Motivation: ERD acceptance item 584 requires explicit verification that search-run retry behavior matches ERD rules for both rate-limited and transient failures.
    • Constraints:
      • Validation must happen at stored-proc behavior boundaries.
      • Keep changes test-focused, with no dependency additions.
    • Dependency rationale:
      • No new dependencies were added.
      • Alternative considered: infer behavior from migration SQL only. Rejected because acceptance requires executable regression tests.
  • Decision:
    • Added stored-proc tests in crates/revaer-data/src/indexers/search_requests.rs:
      • search_indexer_run_mark_failed_rate_limited_uses_retry_and_scope
      • search_indexer_run_mark_failed_transient_retries_before_limit
      • search_indexer_run_mark_failed_transient_reaches_retry_limit
    • Added local test helpers to create request/instance run scopes and assert run state transitions.
    • Marked item 584 in ERD_INDEXERS_CHECKLIST.md as complete.
  • Consequences:
    • Positive outcomes:
      • Rate-limited retry semantics are now explicitly validated:
        • queued retry state
        • incremented attempt_count and rate_limited_attempt_count
        • required last_rate_limit_scope
      • Transient failure semantics are explicitly validated:
        • queued retry before max retries
        • terminal failed state when retry limit is reached
    • Risks or trade-offs:
      • Tests focus on stored-proc state transitions and do not duplicate higher-level orchestrator behavior already covered elsewhere.
  • Follow-up:
    • Test coverage summary:
      • The new tests run as part of the normal just ci and just ui-e2e quality gates.
    • Observability updates:
      • No observability schema changes required.
    • Risk and rollback:
      • Rollback is low risk and limited to test/docs/checklist updates.
    • Review checkpoints:
      • Continue with remaining unchecked acceptance items (Cloudflare transitions, streaming behavior, explainability).
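
The retry semantics listed under Consequences can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the stored procedure or the tests in search_requests.rs: the Run struct, the RunStatus and FailureKind enums, the mark_failed function, and the max_attempts parameter are all hypothetical names. The sketch also assumes that transient failures increment attempt_count and that rate-limited failures always requeue; the record above does not state either point.

```rust
/// Hypothetical run states; the real schema may use different names.
#[derive(Debug, Clone, PartialEq, Eq)]
enum RunStatus {
    Running,
    Queued,
    Failed,
}

/// Hypothetical failure classification passed to `mark_failed`.
#[derive(Debug, Clone, PartialEq, Eq)]
enum FailureKind {
    /// A rate-limited failure must carry the scope that was limited.
    RateLimited { scope: String },
    Transient,
}

/// Hypothetical in-memory stand-in for a search-indexer run row.
#[derive(Debug, Clone)]
struct Run {
    status: RunStatus,
    attempt_count: u32,
    rate_limited_attempt_count: u32,
    last_rate_limit_scope: Option<String>,
}

impl Run {
    fn new() -> Self {
        Run {
            status: RunStatus::Running,
            attempt_count: 0,
            rate_limited_attempt_count: 0,
            last_rate_limit_scope: None,
        }
    }
}

/// Applies one failure to the run.
/// Assumption: `max_attempts` is the transient retry limit, and a run
/// becomes terminal once `attempt_count` reaches it.
fn mark_failed(run: &mut Run, kind: FailureKind, max_attempts: u32) {
    run.attempt_count += 1;
    match kind {
        FailureKind::RateLimited { scope } => {
            // Rate-limited: requeue, bump the dedicated counter, record scope.
            run.rate_limited_attempt_count += 1;
            run.last_rate_limit_scope = Some(scope);
            run.status = RunStatus::Queued;
        }
        FailureKind::Transient => {
            // Transient: requeue until the limit is reached, then fail.
            run.status = if run.attempt_count >= max_attempts {
                RunStatus::Failed
            } else {
                RunStatus::Queued
            };
        }
    }
}
```

In this model, a rate-limited failure leaves the run queued with both counters incremented and the scope recorded, while repeated transient failures leave the run queued until attempt_count reaches max_attempts, at which point the run becomes failed. These are the same three cases the stored-proc tests named in the Decision section cover.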