canonical_prune_low_confidence_v1 needed to match ERD_INDEXERS.md pruning semantics for low-confidence fallback canonicals.
The ERD requires preserving candidates when their durable sources are also tied to non-pruned canonicals.
The existing logic inferred source ties via identity joins, which could diverge from the persisted canonical/source linkage used by scoring and best-source derivations.
Decision:
Redefine canonical_prune_low_confidence_v1 to evaluate source ties from persisted canonical/source linkage tables:
canonical_torrent_source_base_score
canonical_torrent_source_context_score
canonical_torrent_best_source_global
canonical_torrent_best_source_context
Keep existing candidate eligibility guards:
title_size_fallback with identity_confidence <= 0.60
created_at older than 30 days
no acquisition attempts by canonical ID or hashes
no user_result_action with selected or downloaded
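The eligibility guards above can be read as a single predicate over one canonical row. The following is a minimal Python sketch for illustration only, not the actual implementation. The `Canonical` record, its field names (including `identity_strategy` as the holder of `title_size_fallback`), and the `is_prune_candidate` helper are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Canonical:
    # Hypothetical in-memory view of one canonical row; field names are illustrative.
    canonical_id: int
    identity_strategy: str            # assumed to hold e.g. "title_size_fallback"
    identity_confidence: float
    created_at: datetime
    has_acquisition_attempt: bool     # any attempt by canonical ID or hashes
    has_selected_or_downloaded: bool  # any user_result_action of selected/downloaded


def is_prune_candidate(c: Canonical, now: datetime) -> bool:
    """Return True only when every eligibility guard holds."""
    return (
        c.identity_strategy == "title_size_fallback"
        and c.identity_confidence <= 0.60
        and c.created_at < now - timedelta(days=30)
        and not c.has_acquisition_attempt
        and not c.has_selected_or_downloaded
    )


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
stale = Canonical(1, "title_size_fallback", 0.55, now - timedelta(days=45), False, False)
print(is_prune_candidate(stale, now))
```

Passing the guards only makes a canonical a candidate; whether it is actually pruned still depends on the source-tie check.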
Prune a candidate only when none of its linked sources is tied to a non-candidate canonical.
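The source-tie rule can be sketched as a set computation. This is a hedged illustration in Python, not the procedure's actual code. It assumes the rows of the four linkage tables have already been unioned into plain `(canonical_id, source_id)` pairs, and the function name `select_prunable` is hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple


def select_prunable(
    candidates: Set[int],
    links: Iterable[Tuple[int, Hashable]],
) -> Set[int]:
    """Return the candidates whose linked sources touch no non-candidate canonical.

    candidates: canonical IDs that already passed the eligibility guards.
    links: (canonical_id, source_id) pairs, assumed to be the union of the
           persisted canonical/source linkage tables.
    """
    links = list(links)
    # Sources tied to at least one canonical that is not a prune candidate.
    protected_sources = {src for cid, src in links if cid not in candidates}
    # Candidates that share at least one protected source must be kept.
    retained = {
        cid for cid, src in links
        if cid in candidates and src in protected_sources
    }
    return candidates - retained


links = [
    (1, "src-a"), (2, "src-a"),   # candidates 1 and 2 share a source only with each other
    (3, "src-b"), (10, "src-b"),  # candidate 3 shares src-b with non-candidate 10
]
# Candidate 4 has no rows in the link tables at all.
print(sorted(select_prunable({1, 2, 3, 4}, links)))  # prints [1, 2, 4]
```

In this sketch, candidates 1 and 2 are pruned together, candidate 3 is retained because of its tie to canonical 10, and candidate 4 is pruned because it has no persisted links.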
Alternatives considered:
Keep identity-join inference only: rejected because it does not consistently reflect persisted canonical/source ties.
Add a new canonical-source mapping table: deferred to avoid schema expansion in this step.
Consequences:
Positive outcomes:
Pruning behavior now aligns with ERD source-linkage policy.
Candidate groups linked only to other candidates can be pruned together.
Candidates sharing sources with non-candidates are retained.
Risks or trade-offs:
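Legacy rows without persisted link-table ties may be treated as having no source links, so an eligible legacy candidate could be pruned even if it shares a source with a non-candidate canonical.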