Revaer
Centralized torrent orchestration with hot-reloadable configuration, consistent CLI/API surfaces, and observability-first defaults.
Revaer is a Rust workspace that coordinates torrent ingestion, filesystem operations, and operational guardrails from a PostgreSQL-backed control plane. The revaer-app binary composes focused crates covering the API, CLI, filesystem pipeline, telemetry, and libtorrent adapter.
What You’ll Find Here
- Roadmap & Specs – Track the current Phase One scope and remaining delivery deltas.
- Platform Interfaces – Configuration schema, HTTP API endpoints, and CLI command reference that match the current codebase.
- Operational Guides – Runbook, release checklist, and setup flows for operators.
- Architecture Decisions – ADRs documenting trade-offs across configuration, security, and engine integration.
- API Reference – Generated OpenAPI description and usage guidance for the control plane surface.
Use the sidebar navigation (or [ and ] shortcuts) to explore individual topics. Most pages include headings that double as tags for machine-readable manifests generated by the docs indexer.
Contributing Updates
Documentation lives next to the code. Add or edit Markdown files under docs/, then run:
just docs
This builds the mdBook site and refreshes the LLM manifests that power the documentation search experience.
Phase One Roadmap
Last updated: 2025-10-16
This document captures the current delta between the Phase One objective and the existing codebase. It should be kept in sync as work progresses across the eight workstreams.
Snapshot
| Workstream | Current State | Key Gaps | Immediate Actions |
|---|---|---|---|
| Control Plane & Setup | Postgres schema, ConfigService watcher, setup CLI/API, immutable-key guard, history logging; loopback enforcement + RFC7807 pointers live | Engine hot-reload not yet exercising throttles; setup token lifecycle/error telemetry still thin | Add watcher-driven throttle tests, expand setup diagnostics and rate-limit guardrails |
| Torrent Domain & Adapter | Event bus, orchestrator scaffold, enriched torrent DTOs, stub session worker now persists resume metadata/fastresume, reconciles selection/sequential flags, enforces throttle guard rails, and surfaces degraded health | Real libtorrent FFI binding and alert pump still pending; need to exercise live fast-resume blobs and real libtorrent rate/health controls | Replace stub session with libtorrent bindings, translate real alerts, and validate against native libtorrent in the feature-gated suite |
| File Selection & FsOps | FsOpsService emits lifecycle events and validates library root | No extraction/flatten/move-perms/cleanup pipeline, no .revaer.meta, no allow-list enforcement | Model FsOps plan, implement idempotent steps with allow-list + metadata tracking, add fixtures/tests |
| Public HTTP API & SSE | Admin setup/settings/torrent CRUD, SSE stream, metrics stub, OpenAPI generator | /v1/torrents/* family absent, no cursor pagination/filters, SSE replay lacks Last-Event-ID tests, health endpoints minimal | Define domain DTOs, implement public routes + filtering, extend SSE replay handling/tests, flesh out health |
| CLI Parity | Supports setup start/complete, settings patch, admin torrent add/remove (magnet-aware), status | Missing select, action, ls, status detail view, tail SSE client, richer validation | Extend CLI command surface to mirror API, add reconnecting SSE tailer, flesh out filtering and exit-code contract |
| Security & Observability | API key storage hashed, tracing initialised, metrics registry struct | No per-key rate limits, no X-RateLimit headers, magnet/body bounds missing, tracing not propagated to engine/fsops, metrics unused | Introduce token-bucket middleware, enforce payload bounds, propagate spans through orchestrator/fsops, export Prometheus counters |
| CI & Packaging | Workspace compiles, justfile for fmt/lint/test | No CI workflows, cargo-deny/audit missing, no env access guard, no Docker packaging or healthcheck | Author GitHub Actions (lint, security, tests, build), enforce env guard lint, build minimal non-root container with HEALTHCHECK |
| Operational End-to-End | Bootstrap skeleton and event bus exist | Torrent download, fs pipeline, restart resume, throttling, degraded health scenarios unimplemented | Sequence implementation/testing to satisfy runbook once engine/fsops/API parity are in place |
Remaining Scope Specification
1. Torrent Engine Integration
- Swap the stubbed LibtSession for the real libtorrent binding so the existing worker drives a native session while continuing to process commands for add/pause/resume/remove, sequential toggles, rate limits, selection updates, reannounce, and force-recheck.
- Validate persisted fast-resume payloads, priorities, target directories, and sequential flags against the live session on startup; continue emitting reconciliation events when divergence is detected.
- Translate libtorrent alerts into EventBus messages (FilesDiscovered, Progress, StateChanged, Completed, Failure) while respecting the ≤10 Hz per-torrent coalescing rule; recover from alert polling failures by degrading health and attempting bounded restarts.
- Ensure global and per-torrent rate caps driven by engine_profile updates are enforced by libtorrent within two seconds, with audit logs surfaced when caps change.
- Extend the feature-gated integration suite to execute against the native libtorrent build (resume restore, rate-limit enforcement, alert mapping) in addition to the in-process stub.
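The ≤10 Hz per-torrent coalescing rule can be illustrated with a small sketch. Everything here is hypothetical (the ProgressCoalescer type, numeric torrent IDs, millisecond timestamps passed in by the caller); it only shows one way to forward at most one progress sample per torrent every 100 ms while keeping the newest suppressed value for a later flush.

```rust
use std::collections::HashMap;

/// Minimum spacing between forwarded progress events for one torrent.
/// 100 ms corresponds to the 10 Hz ceiling named in the roadmap.
const MIN_INTERVAL_MS: u64 = 100;

/// Hypothetical per-torrent progress coalescer (not Revaer's actual type).
#[derive(Default)]
pub struct ProgressCoalescer {
    /// Timestamp (ms) of the last progress event forwarded per torrent.
    last_emit_ms: HashMap<u64, u64>,
    /// Newest suppressed progress value per torrent, waiting to be flushed.
    pending: HashMap<u64, u64>,
}

impl ProgressCoalescer {
    /// Offers a sample; returns `Some(bytes)` if it should be forwarded now,
    /// or `None` if it was coalesced into the pending slot.
    pub fn offer(&mut self, torrent: u64, now_ms: u64, bytes_done: u64) -> Option<u64> {
        let due = match self.last_emit_ms.get(&torrent) {
            Some(&last) => now_ms.saturating_sub(last) >= MIN_INTERVAL_MS,
            None => true,
        };
        if due {
            self.last_emit_ms.insert(torrent, now_ms);
            self.pending.remove(&torrent);
            Some(bytes_done)
        } else {
            // Keep only the newest sample; older ones are superseded.
            self.pending.insert(torrent, bytes_done);
            None
        }
    }

    /// Flushes a pending sample once the interval has elapsed, so the final
    /// value is not lost when alerts stop arriving.
    pub fn flush(&mut self, torrent: u64, now_ms: u64) -> Option<u64> {
        let last = *self.last_emit_ms.get(&torrent)?;
        if now_ms.saturating_sub(last) < MIN_INTERVAL_MS {
            return None;
        }
        let bytes = self.pending.remove(&torrent)?;
        self.last_emit_ms.insert(torrent, now_ms);
        Some(bytes)
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic under test; a real worker would presumably read a monotonic clock instead.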
2. File Selection & FsOps Pipeline
- Implement include/exclude glob logic with skip-fluff presets backed by the allow-list; synchronize selection changes to libtorrent file priorities and issue corresponding EventBus notifications.
- Build the FsOps pipeline stages: extraction (7z), optional flattening, move/hardlink/copy into library roots, chmod/chown/umask adjustments, metadata capture, cleanup, and optional checksum calculation; each stage must record outcomes in .revaer.meta for idempotency.
- Enforce DB-driven allow-lists, refusing to access paths outside permitted roots and emitting structured errors when policies block execution.
- Degrade pipeline health when dependencies are missing (e.g., extractor binaries), ensuring both EventBus and /health/full reflect the condition; resume normal health once remediation succeeds.
- Back the pipeline with unit coverage for rule parsing and integration coverage for an end-to-end torrent completion to library handoff, including restart scenarios that reuse .revaer.meta.
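As an illustration of the allow-list requirement, the following sketch checks whether a candidate path falls under one of the permitted roots. The function names are hypothetical, and the check is purely lexical: it resolves `.` and `..` without touching the filesystem and does not follow symlinks, which a production guard would likely also need to handle.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalizes a path: drops `.` and resolves `..` without touching
/// the filesystem. Symlinks are NOT resolved (a limitation of this sketch).
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}
            Component::ParentDir => {
                // At the root this is a no-op, so `/..` stays `/`.
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Returns true when `candidate` is absolute and sits under one of the roots.
/// `starts_with` compares whole components, so `/data/library2` does not
/// match the root `/data/library`.
pub fn is_allowed(candidate: &Path, allow_paths: &[PathBuf]) -> bool {
    if !candidate.is_absolute() {
        return false;
    }
    let normalized = normalize(candidate);
    allow_paths
        .iter()
        .any(|root| normalized.starts_with(normalize(root)))
}
```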
3. Public HTTP API & SSE
- Ship /v1/torrents endpoints: POST (magnet or multipart .torrent), GET collection with cursor pagination and filters (name, state, tracker, extension), GET detail, POST /select, and POST /action (pause/resume/remove with optional data deletion, reannounce, recheck, sequential toggle); enforce validation aligned with domain rules.
- Adopt Problem+JSON responses that include JSON Pointer references for every validation failure; extend shared error helpers so CLI can mirror the structure.
- Enhance SSE with Last-Event-ID replay, duplicate suppression, resumable connections, and explicit event type exposure for new workflow outputs.
- Expand health reporting to /health/full, surfacing engine, FsOps, and database status with latency measurements, dependency readiness, and revision metadata.
- Update OpenAPI specs and golden request/response samples to cover the new surfaces; add integration tests that exercise pagination, filters, and SSE replay.
4. CLI Parity
- Add commands revaer ls, status, select, action, and tail, mirroring API filters, selection arguments (include/exclude/skip-fluff), sequential toggles, and data deletion flags.
- Implement an SSE tailer that reconnects on failure, honors Last-Event-ID, and avoids duplicate terminal output.
- Standardize exit codes (0 success, 2 validation, >2 runtime failures) and surface RFC7807 payloads, including pointer metadata, in human-readable CLI output.
- Provide CLI integration tests that run against the API fixture stack, covering filter combinations, sequential toggles, and tail reconnection behaviour.
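The exit-code contract (0 success, 2 validation, >2 runtime failures) could be modelled roughly as below. The CliOutcome type and the choice of which HTTP statuses count as validation errors (400 and 422 here) are assumptions made for illustration; the only fixed points taken from this document are the three code classes and the runbook's expectation that a 429 exits with code 3.

```rust
/// Hypothetical outcome of a CLI invocation (not the real revaer-cli type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CliOutcome {
    Success,
    /// Input rejected as invalid.
    Validation { detail: String },
    /// Anything else: throttling, server faults, transport errors.
    Runtime { http_status: Option<u16>, detail: String },
}

/// Maps an HTTP status to an outcome. Treating 400 and 422 as validation
/// failures is an assumption of this sketch.
pub fn classify(status: u16, detail: &str) -> CliOutcome {
    match status {
        200..=299 => CliOutcome::Success,
        400 | 422 => CliOutcome::Validation { detail: detail.to_string() },
        other => CliOutcome::Runtime {
            http_status: Some(other),
            detail: detail.to_string(),
        },
    }
}

/// Process exit code: 0 success, 2 validation, 3 for runtime failures.
pub fn exit_code(outcome: &CliOutcome) -> i32 {
    match outcome {
        CliOutcome::Success => 0,
        CliOutcome::Validation { .. } => 2,
        CliOutcome::Runtime { .. } => 3,
    }
}
```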
5. Security & Observability
- Introduce API key lifecycle endpoints (issue, rotate, revoke) with hashed-at-rest storage, returning secrets only once; enforce per-key token-bucket rate limiting and include X-RateLimit-* headers.
- Harden inputs by bounding magnet length, multipart size, filter glob counts, and header values; return Problem+JSON validation errors without panics for malformed requests.
- Propagate tracing spans (request IDs) through the API, engine, and FsOps layers; ensure metrics cover HTTP status, event flow, queue depth, libtorrent transfer, and FsOps step durations, exposed via /metrics.
- Reflect degraded health when tools are missing, engine sessions fault, or queue depth exceeds thresholds; emit corresponding SettingsChanged and HealthChanged events.
- Document operational expectations for rate limiting, key rotation, and observability dashboards.
6. CI & Packaging
- Create GitHub Actions (or equivalent) workflows for formatting (cargo fmt), linting (cargo clippy -D warnings), security scans (cargo deny, cargo audit), tests (unit/integration with Postgres and libtorrent behind an opt-in guard), and cross-compilation artifacts for Linux x86_64 and aarch64.
- Enforce an environment-access lint that fails CI if std::env reads occur outside the composition root (excluding DATABASE_URL).
- Produce a non-root Docker image with read-only root filesystem, declared volumes, and a healthcheck hitting /health; ensure runtime documentation validates within the image.
- Publish build artifacts and container digests with provenance metadata; wire CI status into the roadmap release checklist.
7. Operational Runbook Automation
- Author a script to execute the full phase objective on both x86_64 and aarch64: bootstrap using DATABASE_URL, complete setup token flow, add a magnet, monitor FilesDiscovered/Progress/Completed, run FsOps, simulate crash/restart with fast-resume recovery, adjust throttles, and validate degraded health when extractors are absent.
- Capture assertions and logs for each phase, producing artifacts suitable for runbook review and CI retention; ensure failures mark the engine or pipeline health accordingly.
- Include cleanup routines to return environments to a reusable state while retaining diagnostic logs.
8. Documentation & Final Polish
- Update docs/phase-one-roadmap.md continuously and add ADRs covering engine architecture, FsOps design, API/CLI contracts, and security posture.
- Regenerate docs/api/openapi.json alongside illustrative request/response examples for new endpoints.
- Extend user-facing guides for CLI usage, health/metrics references, and operational setup covering API keys, rate limits, and degraded-mode recovery.
- Provide a final Phase One release checklist that ties documentation, runbook, and CI artifacts together.
Next Steps Tracking
- Land setup/network hardening and control-plane polish.
- Replace the stub worker with a real libtorrent session, resume store, and alert-driven event bridge.
- Implement FsOps pipeline with allow-listed execution and metadata.
- Expose /v1/* APIs + CLI parity and reinforce security/observability.
- Stand up CI, packaging, and full runbook validation.
Phase One Remaining Engineering Specification
Objectives
- Deliver a production-ready public interface (HTTP API, SSE, CLI) for torrent orchestration.
- Ship FsOps-backed artefacts through API, CLI, telemetry, and documentation with demonstrable reliability.
- Produce release artefacts (containers, binaries, documentation) that satisfy existing security, observability, and quality gates.
Scope Overview
- Public HTTP API & SSE Enhancements
  - /v1/torrents CRUD-style endpoints with cursor pagination, filtering, torrent actions, file selection updates, rate adjustments, and Problem+JSON responses.
  - SSE stream upgrades: Last-Event-ID replay, subscription filters, duplicate suppression, jitter-tolerant reconnect logic.
  - /health/full exposing engine/FsOps/config readiness, dependency metrics, and revision metadata.
  - Regenerated OpenAPI (JSON + examples) reflecting the full public surface.
- CLI Parity
  - Commands covering list/status/select/action/tail flows with shared filtering + pagination options.
  - SSE-backed tail command with Last-Event-ID resume, dedupe, and retry semantics aligned with the API.
  - Problem+JSON error output, structured exit codes (0 success, 2 validation, >2 runtime failures).
- Packaging & Documentation
  - Release-ready Docker image (non-root, readonly FS, volumes, healthcheck) bundling API server + docs.
  - Provenance-signed binaries for supported architectures, plus GitHub Actions workflows for build, docker, msrv, and coverage gates.
  - Updated ADRs, runbook, user guides, OpenAPI artefacts, and release checklist referencing the telemetry and security posture.
  - Documentation of new metrics/traces/guardrails (config watcher latency, FsOps events, API counters).
Security & Observability Requirements (Cross-Cutting)
- All new API routes enforce API-key authentication with per-key rate limiting and guard-rail metrics.
- Problem+JSON responses are mandatory; eliminate unwrap/panic paths and include invalid_params pointers on validation failure.
- Trace propagation from API → engine → FsOps; CLI should emit/propagate TraceId when available.
- Metrics: extend existing Prometheus registry with route labels, FsOps step counters, config watcher latency/failure gauges, and rate-limiter guardrails.
- Health degradation events (Event::HealthChanged) must accompany any new guard-rail/latency breach or pipeline failure.
- CLI commands should mask secrets in logs and optionally emit telemetry when configured (REVAER_TELEMETRY_ENDPOINT).
Detailed Work Breakdown
1. Public API & SSE
Design Considerations
- Introduce DTO module (api::models) for request/response structs to share with the CLI.
- Cursor pagination: encode UUID/timestamp as opaque cursor in next token; align Last-Event-ID semantics with event stream IDs.
- Filtering: support state, tracker, extension, tags, and name substring; guard invalid combinations with Problem+JSON.
- SSE filtering: permit query parameters for torrent subset, replays based on event type/state.
Implementation Tasks
- Routes:
  - POST /v1/torrents – magnet or .torrent upload (streamed, payload size guard).
  - GET /v1/torrents – cursor pagination + filters.
  - GET /v1/torrents/{id} – detail view with FsOps metadata.
  - POST /v1/torrents/{id}/select – file selection update with validation.
  - POST /v1/torrents/{id}/action – pause/resume/remove (with data), reannounce, recheck, sequential toggle, rate limits.
- SSE:
  - Accept Last-Event-ID header, deduplicate by event ID, filter streams by torrent ID/state.
  - Simulate jitter/disconnects in tests (tokio::time::pause, transport::Stream).
- Health endpoint:
  - Aggregate config watcher metrics (latency, failures), FsOps status, engine guardrails, revision hash.
- Problem+JSON mapping for all new errors with invalid_params pointer data.
- OpenAPI:
  - Regenerate spec covering new endpoints, Problem responses, SSE details, and sample payloads.
- Testing:
  - Unit tests for filter parsing, DTO validation, Problem+JSON outputs.
  - Integration tests using tower::Service harness for each route.
  - SSE reconnection tests with simulated delays and Last-Event-ID resume.
  - /health/full integration test verifying new fields and degraded scenarios.
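To make the Last-Event-ID replay and dedupe tasks more concrete, here is a minimal sketch of a bounded replay buffer plus a client-side duplicate guard. The names (ReplayBuffer, StoredEvent, accept), the use of monotonically increasing numeric IDs, and the decision that a connection without Last-Event-ID receives no backlog are all assumptions, not the behaviour of revaer-api.

```rust
use std::collections::VecDeque;

/// Hypothetical stored event with a monotonically increasing ID.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredEvent {
    pub id: u64,
    pub kind: String,
    pub payload: String,
}

/// Bounded in-memory buffer that keeps the most recent events for replay.
pub struct ReplayBuffer {
    capacity: usize,
    next_id: u64,
    events: VecDeque<StoredEvent>,
}

impl ReplayBuffer {
    pub fn new(capacity: usize) -> Self {
        let capacity = capacity.max(1);
        Self { capacity, next_id: 1, events: VecDeque::with_capacity(capacity) }
    }

    /// Stores an event, evicting the oldest one when full; returns its ID.
    pub fn publish(&mut self, kind: &str, payload: &str) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        if self.events.len() == self.capacity {
            self.events.pop_front();
        }
        self.events.push_back(StoredEvent { id, kind: kind.into(), payload: payload.into() });
        id
    }

    /// Events newer than `last_event_id`, oldest first. Without an ID the
    /// subscriber starts live and gets no backlog (an assumption).
    pub fn replay_after(&self, last_event_id: Option<u64>) -> Vec<StoredEvent> {
        match last_event_id {
            Some(floor) => self.events.iter().filter(|e| e.id > floor).cloned().collect(),
            None => Vec::new(),
        }
    }
}

/// Client-side guard: accept an event only if its ID is newer than the last
/// one seen, which suppresses duplicates after a reconnect.
pub fn accept(last_seen: &mut u64, id: u64) -> bool {
    if id > *last_seen {
        *last_seen = id;
        true
    } else {
        false
    }
}
```

If a client's stored ID has already been evicted, this sketch simply replays whatever is still buffered; how the real stream should signal such a gap is left open here.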
2. CLI Parity
Design Considerations
- Reuse DTOs from API models; consider shared crate/module for request structs and Problem+JSON parsing.
- Introduce output formatting with optional JSON/pretty table modes.
- Provide configuration via env vars and CLI flags; align defaults with API (e.g., REVAER_API_URL, REVAER_API_KEY).
Implementation Tasks
- Commands:
  - revaer ls – list torrents, support pagination (--cursor, --limit), filters (state/tracker/extension/tags).
  - revaer status <id> – torrent detail view, optional follow mode.
  - revaer select <id> – send selection rules from file/JSON (validate before submit).
  - revaer action <id> – actions (pause, resume, remove, remove-data, reannounce, recheck, sequential, rate).
  - revaer tail – SSE tail with Last-Event-ID persist (local file) and dedupe.
- Problem+JSON handling:
  - Standardised pretty printer summarising title, detail, invalid_params; respect exit codes.
- Telemetry:
  - Optional metrics emission (success/failure counters) when telemetry endpoint configured.
- Testing:
  - Integration tests using httpmock to assert HTTP interactions and exit codes.
  - SSE tail tests with mocked stream delivering duplicates/disconnects.
  - Snapshot tests for JSON outputs (ensuring deterministic fields).
3. Packaging & Documentation
Design Considerations
- Multi-stage Docker build: compile with Rust image, run on minimal base (distroless/alpine/ubi) with non-root user.
- Healthcheck script hitting /health/full with timeout.
- Release workflows should run on GitHub Actions with provenance metadata (supply-chain compliance).
Implementation Tasks
- Dockerfile + Makefile/just target:
  - Build release binary, copy docs/api/openapi.json, set /app as workdir.
  - Define volumes for data/config, create user revaer, configure entrypoint.
- GitHub Actions (update .github/workflows):
  - build-release: run just build-rel, just api-export, attach binaries/docs.
  - docker: build image, run docker scan (trivy/grype), and push on release tags.
  - msrv: run just fmt lint test with pinned toolchain (documented in workflow).
  - cov: ensure just cov gate passes (≥80% lines/functions).
- Documentation:
  - ADRs: update 003-libtorrent-session-runner, add FsOps design ADR, API/CLI contract ADR, security posture update (API keys, rate limits).
  - Runbook: scripted scenario covering bootstrap → torrent add → FsOps pipeline → restart resume → rate throttle adjustments → degraded health simulation → recovery.
  - User guides: CLI usage, metrics/telemetry reference, operational setup (keys, rate limits, config watcher health).
  - OpenAPI: regenerate JSON, include sample Problem+JSON payloads and SSE description.
  - Release checklist: steps to run just ci, verify coverage, run docker scan, execute runbook, and tag release.
- Testing:
  - Validate Docker container runtime (healthcheck, volume mounts, non-root permissions).
  - Perform coverage review ensuring new tests bring line/function coverage ≥80%.
  - Execute runbook; capture logs/metrics and link in docs.
Cross-Cutting Deliverables
- API key lifecycle (issue/rotate/revoke) extended with per-key rate limiting, recorded in telemetry and docs.
- Config watcher telemetry integrated into /health/full and metrics registry.
- CLI and API emit guard-rail telemetry on violations (loopback enforcement, FsOps errors, rate-limit breaches).
- All new code paths covered by unit/integration tests; follow-up to update just cov gating.
- Documentation kept up-to-date with implementation details and tested flows.
Sequencing (Suggested)
- Build API models and endpoints (foundation for CLI).
- Implement SSE enhancements while adding API integration tests.
- Extend CLI commands leveraging shared DTOs.
- Embed telemetry (metrics/traces) throughout API/CLI/FsOps changes.
- Stand up Docker build + CI workflows.
- Update ADRs, runbook, user guides, OpenAPI, and release checklist.
- Execute full QA cycle (coverage, docker scan, runbook, manual verification) and prepare for release tagging.
Acceptance Criteria
- just lint, just test, just cov, and full just ci pass locally and in CI.
- Coverage (lines + functions) ≥ 80% across workspace.
- Docker image passes security scan with zero unwaived high severity findings.
- Runbook executed end-to-end; results referenced in documentation.
- OpenAPI specification and CLI docs match implemented behaviour.
- Release checklist completed with artefacts attached (binaries, Docker image, OpenAPI, docs).
Phase One Runbook
This runbook exercises the end-to-end control plane, validating FsOps, telemetry, and guard rails.
Prerequisites
- Docker image revaer:ci (built via just docker-build) or a local revaer-app binary (just build-rel).
- PostgreSQL instance accessible to the application.
- API key with a conservative rate limit (e.g., burst 5, period 60s).
- CLI configured with REVAER_API_URL, REVAER_API_KEY, and optional REVAER_TELEMETRY_ENDPOINT.
Scenario
- Bootstrap
  - Issue a setup token: revaer setup start --issued-by runbook.
  - Complete configuration with CLI secrets and directories: revaer setup complete --instance runbook --bind 127.0.0.1 --resume-dir /data/resume --download-root /data/downloads --library-root /data/library --api-key-label runbook --passphrase <pass>.
  - Confirm /health/full returns status=ok and guardrail_violations_total=0.
- Add Torrent & Observe FsOps
  - Add a torrent: revaer torrent add <magnet> --name runbook.
  - Tail events: revaer tail --event torrent_added,progress,state_changed --resume-file /tmp/revaer.tail.
  - Verify that FsOps emits fsops_started and fsops_completed, and that the Prometheus counter fsops_steps_total increases.
- Restart & Resume
  - Stop the application, restart it, and ensure the torrent catalog repopulates.
  - Confirm that SelectionReconciled is emitted (if metadata diverges) and that HealthChanged clears once resume succeeds.
- Rate Limit Guard-Rail
  - Apply a tight API key limit (burst 1 / per_seconds 60) via config apply.
  - Execute three rapid CLI calls (e.g., revaer status <id>). The third should exit with code 3, displaying a 429 Problem+JSON response.
  - Inspect /metrics to verify api_rate_limit_throttled_total incremented and /health/full reflects degraded=["api_rate_limit_guard"].
- Recovery
  - Restore the API key limit to an acceptable value.
  - Re-run revaer status <id> to confirm success, guardrail_violations_total stops increasing, and degraded returns to [].
- FsOps Failure Simulation
  - Temporarily revoke write permissions on the library directory and re-run a completion.
  - Observe fsops_failed events, HealthChanged with ["fsops"], and guard-rail telemetry.
  - Restore permissions and confirm recovery events.
Verification Artifacts
- Archive CLI telemetry emitted to REVAER_TELEMETRY_ENDPOINT.
- Capture Prometheus scrapings (/metrics) before and after the run.
- Record /health/full JSON snapshots for each phase.
Successful completion of this runbook satisfies the operational validation gate defined in AGENT.md.
Phase One Release Checklist
- Branch Hygiene
  - Ensure main is green (CI pipeline complete).
  - Review outstanding ADRs and docs for freshness.
- Build & Test
  - just ci
  - just build-rel
  - just api-export
- Artefact Verification
  - Binary: target/release/revaer-app
  - Checksum: sha256sum target/release/revaer-app
  - OpenAPI: docs/api/openapi.json
  - Docker image: just docker-build && just docker-scan
- Runbook Execution
  - Follow docs/runbook.md
  - Archive CLI telemetry, /metrics, /health/full snapshots.
- Documentation Refresh
  - Verify ADRs 005–007 reflect current design.
  - Update user guides (docs/api/guides/*.md) with any behavioural changes.
- Tag & Publish
  - Create annotated tag: git tag -a vX.Y.Z -m "Phase One release"
  - Push tag: git push origin vX.Y.Z
  - Attach artefacts generated by the build-release workflow.
- Post-Release Monitoring
  - Watch rate-limit and guard-rail metrics.
  - Confirm HealthChanged events return to empty degraded set.
  - Validate automation telemetry for CLI success rates.
Configuration Surface
Canonical reference for the PostgreSQL-backed settings documents that drive Revaer’s runtime behaviour.
Revaer persists all operator-facing configuration inside the settings_* tables. The API (ConfigService) exposes strongly-typed snapshots that are consumed by the API server, torrent engine, filesystem pipeline, and CLI. Every change flows through a SettingsChangeset, ensuring a single validation path whether commands originate from the setup flow or the admin API.
Snapshot Components
The /.well-known/revaer.json endpoint and revaer setup complete CLI command both return the same structure:
{
  "revision": 42,
  "app_profile": { /* see below */ },
  "engine_profile": { /*…*/ },
  "fs_policy": { /*…*/ },
  "api_keys": [
    { "key_id": "admin", "label": "bootstrap", "enabled": true, "rate_limit": null }
  ]
}
App Profile (settings_app_profile)
| Field | Type | Description |
|---|---|---|
| id | UUID | Singleton identifier for the current document. |
| instance_name | string | Human readable label surfaced in the CLI after setup. |
| mode | "setup" or "active" | Gatekeeper for the authentication middleware. Setup requests are rejected once the system enters active. |
| version | integer | Optimistic locking counter maintained by ConfigService. |
| http_port | integer | Published TCP port for the API server. |
| bind_addr | string (IPv4/IPv6) | Listen address for the API server. |
| telemetry | object | Free-form map for logging + metrics toggles (e.g. log_level, prometheus). |
| features | object | Feature switches such as fs_extract, par2, sse_backpressure. |
| immutable_keys | array | List of fields that cannot be mutated via patches (ConfigError::ImmutableField). |
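The immutable_keys guard can be pictured with the sketch below. Only the ConfigError::ImmutableField name comes from this document; the variant's payload, the check_immutable helper, the assumption that immutable_keys holds bare field names, and the pointer format are hypothetical.

```rust
/// Illustrative error type. The variant name mirrors the one mentioned in
/// the table; its payload is an assumption.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    ImmutableField { pointer: String },
}

/// Rejects a patch that touches any field listed in `immutable_keys`,
/// reporting the first offender as a JSON Pointer such as
/// `/app_profile/mode`.
pub fn check_immutable(
    section: &str,
    patch_fields: &[&str],
    immutable_keys: &[&str],
) -> Result<(), ConfigError> {
    for field in patch_fields {
        if immutable_keys.contains(field) {
            return Err(ConfigError::ImmutableField {
                pointer: format!("/{}/{}", section, field),
            });
        }
    }
    Ok(())
}
```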
Engine Profile (settings_engine_profile)
| Field | Type | Description |
|---|---|---|
| implementation | string | Currently libtorrent. Used to select the torrent workflow implementation. |
| listen_port | integer? | Optional external listen port override for the engine. |
| dht | bool | Enables/disables the DHT module. |
| encryption | string | Encryption requirement (require, prefer, etc.). |
| max_active | integer? | Cap on concurrently-active torrents; null means unlimited. |
| max_download_bps / max_upload_bps | integer? | Global rate limits applied by the engine. |
| sequential_default | bool | Default sequential downloading behaviour for new torrents. |
| resume_dir | string | Filesystem location where fast-resume artefacts are stored. |
| download_root | string | Directory used for in-progress torrent payloads. |
| tracker | object | Tracker configuration (user-agent, announce overrides). |
Filesystem Policy (settings_fs_policy)
| Field | Type | Description |
|---|---|---|
| library_root | string | Destination directory for completed artefacts. |
| extract | bool | Whether completed payloads are extracted. |
| par2 | string | off, verify, or repair depending on PAR2 behaviour. |
| flatten | bool | Collapses single-file directories when moving into the library. |
| move_mode | string | copy, move, or hardlink semantics for the FsOps pipeline. |
| cleanup_keep / cleanup_drop | array | Glob patterns retaining or removing files during cleanup. |
| chmod_file / chmod_dir | string? | Optional octal permissions applied to outputs. |
| owner / group | string? | Optional ownership override for the library root. |
| umask | string? | Umask applied during FsOps. |
| allow_paths | array | Allowed staging/library paths the pipeline accepts. |
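Two of the string-typed fields above lend themselves to early validation. The sketch below parses an octal permission string (as used by chmod_file / chmod_dir) and the three move_mode values; the function and enum names are hypothetical, and the accepted syntax (one to four octal digits, no prefix) is an assumption.

```rust
/// Parses an octal permission string such as "644" or "0755" into mode bits.
/// Accepting one to four octal digits with no prefix is an assumption.
pub fn parse_octal_mode(raw: &str) -> Result<u32, String> {
    let valid = !raw.is_empty()
        && raw.len() <= 4
        && raw.bytes().all(|b| (b'0'..=b'7').contains(&b));
    if !valid {
        return Err(format!("invalid octal mode: {:?}", raw));
    }
    u32::from_str_radix(raw, 8).map_err(|e| format!("invalid octal mode {:?}: {}", raw, e))
}

/// The three move_mode values listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MoveMode {
    Copy,
    Move,
    Hardlink,
}

/// Maps the configuration string onto the enum, rejecting anything else.
pub fn parse_move_mode(raw: &str) -> Result<MoveMode, String> {
    match raw {
        "copy" => Ok(MoveMode::Copy),
        "move" => Ok(MoveMode::Move),
        "hardlink" => Ok(MoveMode::Hardlink),
        other => Err(format!("unknown move_mode: {:?}", other)),
    }
}
```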
API Keys & Secrets
Patches can create, update, or revoke keys and named secrets. The request format mirrors SettingsChangeset:
{
  "api_keys": [
    {
      "op": "upsert",
      "key_id": "admin",
      "label": "primary",
      "enabled": true,
      "secret": "optional-override",
      "rate_limit": { "burst": 10, "per_seconds": 1 }
    }
  ],
  "secrets": [
    { "op": "set", "name": "libtorrent.passphrase", "value": "..." }
  ]
}
The API server enforces bucketed rate limits if rate_limit is supplied (burst per per_seconds). Invalid field names or mutations against immutable_keys yield RFC9457 ProblemDetails responses with an invalid_params array matching the JSON pointer returned by ConfigError.
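A token bucket driven by the rate_limit document might look like the following sketch. It reads "burst per per_seconds" as a bucket of burst tokens that refills at burst tokens every per_seconds seconds, which is an interpretation rather than a documented guarantee; the TokenBucket type and the caller-supplied millisecond clock are hypothetical. Integer fixed-point arithmetic is used so the results are exact.

```rust
/// Mirrors the `rate_limit` object: `burst` requests per `per_seconds`.
#[derive(Debug, Clone, Copy)]
pub struct RateLimit {
    pub burst: u32,
    pub per_seconds: u64,
}

/// Hypothetical token bucket. Tokens are stored scaled by `period_ms` so
/// refills stay in exact integer arithmetic.
pub struct TokenBucket {
    burst: u64,
    period_ms: u64,
    scaled_tokens: u64,
    last_ms: u64,
}

impl TokenBucket {
    /// Starts with a full bucket at `now_ms`.
    pub fn new(limit: RateLimit, now_ms: u64) -> Self {
        let burst = u64::from(limit.burst.max(1));
        let period_ms = limit.per_seconds.max(1) * 1000;
        Self { burst, period_ms, scaled_tokens: burst * period_ms, last_ms: now_ms }
    }

    /// `Ok(remaining)` when a request is admitted, otherwise
    /// `Err(retry_after_ms)` until the next token becomes available.
    pub fn try_acquire(&mut self, now_ms: u64) -> Result<u64, u64> {
        let elapsed = now_ms.saturating_sub(self.last_ms);
        self.last_ms = self.last_ms.max(now_ms);
        let cap = self.burst * self.period_ms;
        // Each elapsed millisecond adds `burst` scaled units, up to the cap.
        let refilled = self.scaled_tokens.saturating_add(elapsed.saturating_mul(self.burst));
        self.scaled_tokens = refilled.min(cap);
        if self.scaled_tokens >= self.period_ms {
            self.scaled_tokens -= self.period_ms;
            Ok(self.scaled_tokens / self.period_ms)
        } else {
            let deficit = self.period_ms - self.scaled_tokens;
            // Ceiling division: milliseconds until one whole token exists.
            Err((deficit + self.burst - 1) / self.burst)
        }
    }
}
```

In a middleware, the Ok value could feed a remaining-requests header and the Err value a retry hint on the 429 response; the exact header names are outside this sketch.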
Change Workflows
- Setup – POST /admin/setup/start issues a one-time token. POST /admin/setup/complete consumes that token, applies the provided SettingsChangeset, forces app_profile.mode to active, and returns the hydrated snapshot along with the generated API key (also echoed in the CLI output).
- Ongoing updates – PATCH /admin/settings (CLI: revaer settings patch --file changes.json) requires an API key and supports partial documents. Any field omitted from the payload remains untouched.
- Snapshot access – GET /.well-known/revaer.json (no auth) and GET /health/full both return the revision and enable automation to verify configuration drift. Automation and dashboards can poll these endpoints without authenticating.
Revaer publishes SettingsChanged events on every successful mutation, ensuring subscribers refresh in-memory caches without polling.
HTTP API
REST + SSE surface exposed by revaer-api. The OpenAPI document at /docs/openapi.json is generated by just api-export.
Authentication
- Setup flow – Requests to /admin/setup/start are open. /admin/setup/complete requires the x-revaer-setup-token header with the one-time token returned by setup_start. The server refuses setup calls once the app profile switches to active.
- Operator actions – All /admin/* (after setup) and /v1/* endpoints require x-revaer-api-key: {key_id}:{secret}. The middleware validates the key via ConfigService, enforces per-key rate limiting, and rejects calls while the instance remains in setup mode.
- Request correlation – An optional x-request-id header is echoed into tracing spans and surfaced on SSE traffic. The CLI auto-populates this header per invocation.
Error responses follow RFC9457 (ProblemDetails), populated with invalid_params entries whenever validation pinpoints a JSON pointer within the payload.
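The x-revaer-api-key value has the form {key_id}:{secret}. A minimal parsing sketch follows; the ApiKeyCredentials type and function name are hypothetical, and splitting on the first colon (so that the secret itself may contain colons) is an assumption about how the real middleware behaves.

```rust
/// Hypothetical parsed form of the `x-revaer-api-key` header value.
#[derive(Debug, PartialEq, Eq)]
pub struct ApiKeyCredentials<'a> {
    pub key_id: &'a str,
    pub secret: &'a str,
}

/// Splits `key_id:secret` on the first colon and rejects empty halves.
pub fn parse_api_key_header(value: &str) -> Result<ApiKeyCredentials<'_>, &'static str> {
    let (key_id, secret) = value
        .trim()
        .split_once(':')
        .ok_or("expected `key_id:secret`")?;
    if key_id.is_empty() || secret.is_empty() {
        return Err("key_id and secret must be non-empty");
    }
    Ok(ApiKeyCredentials { key_id, secret })
}
```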
Endpoint Inventory
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /health | none | Lightweight readiness probe returning mode + database status. |
| GET | /health/full | none | Extended health snapshot with build SHA, metrics counters, and torrent queue depth. |
| GET | /.well-known/revaer.json | none | Full configuration snapshot (ConfigSnapshot) including current revision. |
| POST | /admin/setup/start | none | Issues a setup token; optionally accepts issued_by + ttl_seconds. |
| POST | /admin/setup/complete | setup token | Applies a SettingsChangeset, promotes the instance to active, consumes the token, and returns the hydrated snapshot. |
| PATCH | /admin/settings | API key | Applies partial configuration updates (SettingsChangeset) and broadcasts SettingsChanged. |
| GET | /admin/torrents | API key | Same as GET /v1/torrents; retained for admin tooling. |
| POST | /admin/torrents | API key | Alias for POST /v1/torrents. |
| GET | /admin/torrents/:id | API key | Alias for GET /v1/torrents/:id. |
| DELETE | /admin/torrents/:id | API key | Alias for invoking the remove action. |
| GET | /v1/torrents | API key | Cursor-paginated torrent summaries with filtering (state, tracker, extension, tags, name). |
| POST | /v1/torrents | API key | Submits a magnet URI or base64-encoded .torrent, optional tags/trackers, file rules, and per-torrent rate limits. |
| GET | /v1/torrents/:id | API key | Detailed torrent view including file metadata when available. |
| POST | /v1/torrents/:id/select | API key | Adjusts inclusion/exclusion globs, fluff skipping, and per-file priorities. |
| POST | /v1/torrents/:id/action | API key | Lifecycle management (pause, resume, remove, reannounce, recheck, sequential, rate). |
| GET | /v1/events | API key | Server-sent events stream (alias: /v1/torrents/events). Supports filtering by torrent ID, state, and event kind. |
| GET | /metrics | none | Prometheus-formatted metrics from revaer-telemetry. |
| GET | /docs/openapi.json | none | Static OpenAPI document used by the docs site and clients. |
All torrent-managing endpoints ensure the torrent workflow is wired. If the engine is unavailable, the API returns 503 Service Unavailable.
Torrent Submission (POST /v1/torrents)
Required headers: x-revaer-api-key. Provide either magnet or metainfo; the server rejects payloads missing both. Optional fields:
- download_dir – Overrides the engine profile’s staging directory.
- sequential – Enables sequential downloading for this torrent only.
- tags / trackers – Stored alongside the torrent for filtering and bookkeeping.
- include / exclude / skip_fluff – File selection bootstrap applied before metadata fetch completes.
- max_download_bps / max_upload_bps – Per-torrent rate limits (bps) passed to the workflow.
On success the server returns 202 Accepted after dispatching TorrentWorkflow::add_torrent. The torrent ID in the payload becomes the canonical identifier.
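The magnet-or-metainfo rule can be illustrated with a small validation step. This is a minimal sketch, not the server's code: `TorrentSubmission`, `SubmissionError`, and `validate` are hypothetical names, only the `magnet` and `metainfo` field names come from the text above, and treating blank strings as missing is an assumption.

```rust
/// Hypothetical request model; only `magnet` and `metainfo` are named in the docs.
#[derive(Debug, Default)]
pub struct TorrentSubmission {
    pub magnet: Option<String>,
    /// Base64-encoded .torrent payload.
    pub metainfo: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum SubmissionError {
    /// Neither `magnet` nor `metainfo` was supplied.
    MissingSource,
}

/// Rejects payloads that carry neither a magnet URI nor metainfo.
/// Blank strings count as missing (an assumption of this sketch).
pub fn validate(req: &TorrentSubmission) -> Result<(), SubmissionError> {
    let has = |field: &Option<String>| {
        field.as_deref().map_or(false, |s| !s.trim().is_empty())
    };
    if has(&req.magnet) || has(&req.metainfo) {
        Ok(())
    } else {
        Err(SubmissionError::MissingSource)
    }
}
```

The sketch accepts a payload that carries both fields; how the server resolves that case is not stated here.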
Listing & Filtering (GET /v1/torrents)
Query parameters:
- limit (default 50, max 200)
- cursor – Base64 token returned in next.
- state, tracker, extension, tags, name – Comma-separated filters (case-insensitive).
The response body is TorrentListResponse with an optional next cursor when additional pages exist.
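The documented bounds and filter format can be sketched as two helpers. The function names are hypothetical, and clamping an oversized limit (rather than rejecting it) is an assumption; the server may respond with a validation error instead.

```rust
/// Documented bounds for the `limit` query parameter.
pub const DEFAULT_LIMIT: u32 = 50;
pub const MAX_LIMIT: u32 = 200;

/// Resolves the effective page size. Clamping out-of-range values and
/// treating 0 as 1 are assumptions of this sketch.
pub fn effective_limit(requested: Option<u32>) -> u32 {
    requested.unwrap_or(DEFAULT_LIMIT).clamp(1, MAX_LIMIT)
}

/// Splits a comma-separated filter value into lowercase terms,
/// mirroring the case-insensitive matching described above.
pub fn parse_filter(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(|term| term.trim().to_lowercase())
        .filter(|term| !term.is_empty())
        .collect()
}
```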
Torrent Actions (POST /v1/torrents/:id/action)
type determines the shape of the body:
{ "type": "remove", "delete_data": true }
{ "type": "sequential", "enable": false }
{ "type": "rate", "download_bps": 1048576, "upload_bps": null }
Failures propagate engine errors as 500 Internal Server Error with a descriptive message in detail.
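A client might model the three bodies shown above as a tagged enum. The sketch below is illustrative only: `TorrentAction` and `to_json` are hypothetical names, the JSON is rendered by hand to stay dependency-free (a real client would more likely use a serialization crate), and the other action types are omitted.

```rust
/// Hypothetical client-side model of the three action bodies shown above.
#[derive(Debug, Clone, PartialEq)]
pub enum TorrentAction {
    Remove { delete_data: bool },
    Sequential { enable: bool },
    Rate { download_bps: Option<u64>, upload_bps: Option<u64> },
}

/// Renders an optional number as a JSON number or `null`.
fn opt_to_json(value: Option<u64>) -> String {
    value.map_or_else(|| "null".to_string(), |v| v.to_string())
}

impl TorrentAction {
    /// Produces the request body; the `type` field selects the shape.
    pub fn to_json(&self) -> String {
        match self {
            TorrentAction::Remove { delete_data } => {
                format!(r#"{{"type":"remove","delete_data":{}}}"#, delete_data)
            }
            TorrentAction::Sequential { enable } => {
                format!(r#"{{"type":"sequential","enable":{}}}"#, enable)
            }
            TorrentAction::Rate { download_bps, upload_bps } => format!(
                r#"{{"type":"rate","download_bps":{},"upload_bps":{}}}"#,
                opt_to_json(*download_bps),
                opt_to_json(*upload_bps)
            ),
        }
    }
}
```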
SSE Stream (GET /v1/events)
Headers:
- x-revaer-api-key
- Optional Last-Event-ID – resumes from a previously stored ID (the CLI stores this via --resume-file).
Query parameters:
- torrent – Comma-separated UUIDs.
- event – Comma-separated event kinds. Valid values: torrent_added, files_discovered, progress, state_changed, completed, fsops_started, fsops_progress, fsops_completed, fsops_failed, settings_changed, health_changed, selection_reconciled.
- state – Comma-separated torrent states (downloading, completed, etc.).
The server maintains a 20-second keep-alive ping and enforces filtering before events hit the wire.
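Server-side filtering combined with Last-Event-ID resumption can be sketched as follows. Everything here is an assumption made for illustration: the `Event` and `EventFilter` types, numeric event IDs, and the rule that an empty filter set means "no restriction". The `state` filter is left out for brevity.

```rust
use std::collections::HashSet;

/// Hypothetical event envelope; numeric IDs are an assumption.
pub struct Event {
    pub id: u64,
    /// None for global events such as settings_changed.
    pub torrent_id: Option<String>,
    pub kind: String,
}

/// Filter built from the `torrent` and `event` query parameters.
/// An empty set means "no restriction" in this sketch.
#[derive(Default)]
pub struct EventFilter {
    pub torrents: HashSet<String>,
    pub kinds: HashSet<String>,
}

impl EventFilter {
    pub fn matches(&self, event: &Event) -> bool {
        let kind_ok = self.kinds.is_empty() || self.kinds.contains(&event.kind);
        let torrent_ok = self.torrents.is_empty()
            || event
                .torrent_id
                .as_ref()
                .map_or(false, |id| self.torrents.contains(id));
        kind_ok && torrent_ok
    }
}

/// Returns the events newer than `last_event_id` that pass the filter,
/// so filtering happens before anything is written to the stream.
pub fn replay<'a>(
    backlog: &'a [Event],
    last_event_id: Option<u64>,
    filter: &EventFilter,
) -> Vec<&'a Event> {
    backlog
        .iter()
        .filter(|e| last_event_id.map_or(true, |last| e.id > last))
        .filter(|e| filter.matches(e))
        .collect()
}
```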
Health & Metrics
- GET /health – Primary readiness probe used by orchestration systems. Adds database to the degraded list if PostgreSQL is unreachable.
- GET /health/full – Returns the deployment revision, build SHA (build_sha()), metrics snapshot (config_watch_latency_ms, guardrail_violations_total, rate_limit_throttled_total, etc.), and torrent queue depth.
- GET /metrics – Exposes the same counters for Prometheus scraping.
For the complete schema definitions, consult the generated OpenAPI (just api-export).
CLI Reference
revaer-cli provides parity with the API for setup, configuration management, torrent lifecycle, and observability.
Global Flags & Environment
| Flag | Environment | Default | Description |
|---|---|---|---|
| --api-url <URL> | REVAER_API_URL | http://127.0.0.1:7070 | Base URL for API requests. |
| --api-key <key_id:secret> | REVAER_API_KEY | none | Required for all post-setup commands that mutate or read torrents. |
| --timeout <secs> | REVAER_HTTP_TIMEOUT_SECS | 10 | Per-request HTTP timeout. |
Each invocation bubbles a unique x-request-id through the API; the CLI also emits optional telemetry events when REVAER_TELEMETRY_ENDPOINT is set.
Setup Flow
revaer setup start [--issued-by <label>] [--ttl-seconds <secs>]
- Calls POST /admin/setup/start.
- Prints the plaintext token followed by its ISO8601 expiry.
- Use --issued-by to tag the token source (defaults to api).
revaer setup complete --instance <name> --bind <addr> --port <port> --resume-dir <path> --download-root <path> --library-root <path> --api-key-label <label> [--api-key-id <id>] [--passphrase <value>] [--token <token>]
- Loads the setup token either from --token or REVAER_SETUP_TOKEN.
- Builds a SettingsChangeset containing the app profile, engine profile, filesystem policy, API key, and optional secret.
- Forces app_profile.mode = "active".
- Echoes the generated API key (key_id:secret) on success; store it securely before continuing.
Configuration Maintenance
revaer settings patch --file <path>
- Reads a JSON file containing a partial SettingsChangeset.
- Requires an API key.
- Returns a formatted ProblemDetails message if validation fails (immutable fields, unknown keys, etc.).
Torrent Lifecycle
revaer torrent add <magnet|.torrent> [--name <label>] [--id <uuid>]
- Accepts a magnet URI or a filesystem path to a .torrent.
- Automatically base64-encodes torrent files for the API.
- Optional overrides: --name sets the human-friendly label; --id lets you supply a deterministic UUID instead of the auto-generated value.
revaer torrent remove <uuid>
- Issues POST /v1/torrents/{id}/action with { "type": "remove" }.
- Use the more general action command for delete_data semantics.
revaer ls [--limit <n>] [--cursor <token>] [--state <state>] [--tracker <url>] [--extension <ext>] [--tags <tag1,tag2>] [--name <fragment>] [--format table|json]
- Lists torrents with the same filters supported by the REST API.
- Default output is a table summarising id, name, state, and progress.
- JSON output matches TorrentListResponse.
revaer status <uuid> [--format table|json]
- Returns a detailed view of a single torrent.
- JSON output is the full TorrentDetail (including file metadata when available).
revaer select <uuid> [--include <glob,glob>] [--exclude <glob,glob>] [--skip-fluff] [--priority index=priority,…]
- Updates file-selection rules via POST /v1/torrents/{id}/select.
- --priority accepts repeated index=priority pairs (skip|low|normal|high) mapped onto the engine’s FilePriority.
revaer action <uuid> <pause|resume|remove|reannounce|recheck|sequential|rate> [--delete-data] [--enable <bool>] [--download <bps>] [--upload <bps>]
- One-stop entry point for all torrent actions.
- sequential toggles sequential downloads via --enable true|false.
- rate updates per-torrent bandwidth caps (bps). Provide --download and/or --upload.
- remove honours --delete-data.
Event Streaming
revaer tail [--torrent <id,id>] [--event <kind,kind>] [--state <state,state>] [--resume-file <path>] [--retry-secs <n>]
- Connects to /v1/events using SSE.
- Filters match the API query parameters and enforce UUID/event-kind validation before the request is made.
- When --resume-file is supplied, the CLI persists the last event ID across reconnects so the stream can resume after transient failures.
- --retry-secs controls the backoff between reconnect attempts (default: 5 seconds).
All torrent commands require an API key. The CLI surfaces API problems exactly as the server returns them, including RFC9457 validation errors and rate-limit responses (429 Too Many Requests with retry metadata in the body).
API Documentation
This directory hosts HTTP API specifications, generated OpenAPI documents, and usage guides for the Revaer control plane.
Contents
- schema/ – Published OpenAPI payloads and supporting artefacts.
- guides/ – Scenario-based walkthroughs (bootstrap, hot reload validation, torrent lifecycle).
- examples/ – HTTP request/response samples captured from real workflows.
Current Coverage
- Setup & configuration – /admin/setup/* and /admin/settings flows with CLI parity.
- Orchestration – /admin/torrents (POST/DELETE/GET) for submitting or removing torrents, plus /admin/torrents/{id} for status inspection.
- Observability – /v1/events SSE stream (tested for replay/keep-alive) and /metrics Prometheus surface with torrent gauges.
See guides/bootstrap.md for an end-to-end description of the bootstrap lifecycle, background workers, and error handling expectations.
Next Steps
- Capture worked examples for torrent status reconciliation (list + selective GET).
- Provide troubleshooting recipes for common workflow failures (engine unavailable, filesystem policy rejection).
- Expand SSE consumer documentation with incremental backfill strategies.
OpenAPI Reference
Canonical machine-readable description of the Revaer control plane surface.
The generated OpenAPI specification lives alongside the documentation at docs/api/openapi.json. Regenerate it with:
just api-export
Once refreshed, rebuild the documentation (just docs) to publish the updated schema to the static site and LLM manifests. API consumers can download the JSON directly from the deployed documentation site or via the repository.
Architecture Decision Records
ADR documents capture the rationale behind significant technical decisions.
Suggested Workflow
- Create a new ADR using the template in docs/adr/template.md.
- Give it a sequential identifier (e.g., 001, 002) and a concise title.
- Capture context, decision, consequences, and follow-up actions.
- Reference ADRs from code comments or docs where the decision applies.
001 – Global Configuration Revisioning
- Status: Proposed
- Date: 2025-02-23
Context
- All runtime configuration must be hot-reloadable across multiple crates.
- Consumers need a consistent ordering guarantee for applying changes received via LISTEN/NOTIFY, with a fallback to polling.
- We require a DB-native mechanism that can be incremented from triggers without race conditions and that carries across deployments.
Decision
- Introduce a singleton settings_revision table with an ever-incrementing revision counter.
- Wrap updates to configuration tables (app_profile, engine_profile, fs_policy, auth_api_keys, query_presets) in triggers that:
  - Update settings_revision.revision = revision + 1.
  - Emit NOTIFY revaer_settings_changed, '<table>:<revision>:<op>'.
- ConfigService exposes ConfigSnapshot to materialize a consistent view (revision + documents) for the application bootstrap path.
- The revision remains monotonic even if polling is used (consumers record the last seen revision and request deltas if they miss notifications).
- Mutation APIs validate payloads server-side, applying field-level type checks and respecting app_profile.immutable_keys. Violations surface as structured errors with section/field metadata, preventing silent drift.
Consequences
- Multi-table updates executed inside a transaction surface as a single revision bump, preserving ordering for consumers.
- LISTEN subscribers that drop their connection can reconcile by reloading settings_revision and querying deltas > last_seen_revision.
- Trigger-level logic slightly increases write cost but keeps business code free of manual revision management.
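On the consumer side, handling the '<table>:<revision>:<op>' payload might look like the sketch below. The type and function names are hypothetical, the signed 64-bit revision type is an assumption, and the values carried in `op` are not specified in this ADR.

```rust
/// Parsed form of a '<table>:<revision>:<op>' notification payload.
#[derive(Debug, PartialEq)]
pub struct SettingsNotice {
    pub table: String,
    pub revision: i64,
    pub op: String,
}

/// Returns None when the payload does not have three colon-separated
/// parts or the revision is not an integer.
pub fn parse_notice(payload: &str) -> Option<SettingsNotice> {
    let mut parts = payload.splitn(3, ':');
    let table = parts.next()?.to_string();
    let revision = parts.next()?.parse::<i64>().ok()?;
    let op = parts.next()?.to_string();
    Some(SettingsNotice { table, revision, op })
}

/// A consumer only needs to act on revisions newer than the last one it
/// applied; stale or duplicate notifications are ignored. The same check
/// works for polling, because the counter is monotonic.
pub fn needs_reload(last_seen_revision: i64, notice: &SettingsNotice) -> bool {
    notice.revision > last_seen_revision
}
```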
Follow-up
- Implement apply_changeset to write history rows with the associated revision.
- Add integration tests that exercise transactionally updating multiple tables and verifying a single revision increment.
002 – Setup Token Lifecycle & Secrets Bootstrap
- Status: Proposed
- Date: 2025-02-23
Context
- Initial deployments must boot in a locked-down "Setup Mode" where only a one-time token grants access to the setup API.
- Tokens should be observable/auditable, expire automatically, and support regeneration without requiring an application restart.
- A follow-on requirement is to collect an encryption passphrase or server-side key for pgcrypto-backed secrets before exiting Setup Mode.
Decision
- Store tokens in the setup_tokens table with token_hash, issued_at, expires_at, consumed_at, and issued_by.
- Enforce at most one active token via a partial unique index on rows where consumed_at IS NULL.
- ConfigService will:
  - Generate tokens using cryptographically secure randomness.
  - Persist only a hashed representation (argon2id) along with metadata.
  - Emit history entries and NOTIFY events on token creation/consumption.
- The CLI/API surfaces token issuance and completion flows; the process prints the token to stdout only at generation time.
- During completion, the caller must supply the encryption materials (passphrase or reference to pgcrypto role). The handler verifies secrets are persisted before flipping app_profile.mode to active.
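The lifecycle implied by expires_at and consumed_at can be sketched as a small state check. This is illustrative only: the struct carries a subset of the columns, timestamps are plain Unix seconds, all names are hypothetical, and hashing (argon2id) is left out because it needs an external crate.

```rust
/// Subset of the setup_tokens columns named above; timestamps are
/// Unix seconds in this sketch.
pub struct SetupToken {
    pub issued_at: u64,
    pub expires_at: u64,
    pub consumed_at: Option<u64>,
}

#[derive(Debug, PartialEq)]
pub enum TokenState {
    Active,
    Expired,
    Consumed,
}

/// Issues a token that expires `ttl_seconds` after `now`.
pub fn issue(now: u64, ttl_seconds: u64) -> SetupToken {
    SetupToken {
        issued_at: now,
        expires_at: now.saturating_add(ttl_seconds),
        consumed_at: None,
    }
}

/// Consumption takes precedence over expiry so that a used token is
/// always reported as consumed.
pub fn token_state(token: &SetupToken, now: u64) -> TokenState {
    if token.consumed_at.is_some() {
        TokenState::Consumed
    } else if now >= token.expires_at {
        TokenState::Expired
    } else {
        TokenState::Active
    }
}
```

Distinguishing Expired from Consumed is what would let the API return different problem details for each case.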
Consequences
- Operators can recover by issuing a new token if the previous one expires without restarting the service.
- Tokens are auditable; failed attempts can be recorded against the hashed token id (future enhancement).
- The bootstrap path ensures secrets exist before runtime modules that require them start, preventing a partially configured system.
Follow-up
- Implement argon2id hashing helpers and audit logging in revaer-config.
- Define the CLI workflow (revaer-cli setup) that wraps token issuance and completion for headless environments.
- Add problem detail responses for expired/consumed tokens in the API.
003 – Libtorrent Session Runner Architecture
- Status: Accepted
- Date: 2025-10-16
Context
- The current revaer-torrent-libt crate is a stub that simulates torrent actions without touching libtorrent, preventing real downloads, fast-resume, or alert handling.
- Phase One requires a production-grade engine: a single async task must own the libtorrent session, persist fast-resume data/selection state, debounce high-volume alerts, and surface health to the event bus.
- The engine must enforce rate limits and selections within libtorrent, react within two seconds of configuration changes, and survive restarts by restoring torrents from resume_dir.
Decision
- Introduce a dedicated SessionWorker spawned by LibtorrentEngine::new. It owns the libtorrent Session, receives EngineCommand messages, and emits EngineEvents via an internal channel that feeds the shared EventBus.
- Wrap the libtorrent FFI in a thin adapter trait (LibtSession) to encapsulate blocking calls (add_torrent, pause, set_sequential, apply_rate_limits, file_priorities, alert polling). The real implementation uses tokio::task::spawn_blocking to call into C++ safely.
- Add a FastResumeStore service that reads/writes .fastresume blobs plus JSON metadata (selection, priorities, download directory, sequential flag) inside resume_dir. On startup the worker loads the store, attempts to match existing handles, and emits reconciliation events if the stored state diverges.
- Run an AlertPump loop that waits on libtorrent alerts_wait notify, drains all alerts, and funnels them through an AlertTranslator that converts them into domain EngineEvents (FilesDiscovered, Progress, StateChanged, Completed, Error). A ProgressCoalescer throttles updates to 10 Hz per torrent.
- Integrate health tracking: fatal session errors transition the engine into a degraded state and emit both HealthChanged and per-torrent Error events. The worker attempts limited restarts with exponential back-off before marking the engine unhealthy.
- Rate limit updates from EngineCommand::UpdateLimits and configuration watcher updates call into libtorrent immediately; a watchdog verifies application within two seconds and logs warnings if the session reports stale caps.
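The 10 Hz per-torrent throttle can be sketched as a map of last-emit timestamps. This is a simplified stand-in for the ProgressCoalescer named above, not its actual design: the method name and millisecond clock are assumptions, and updates inside the window are dropped here, whereas a production coalescer would more likely keep the latest one and flush it later.

```rust
use std::collections::HashMap;

/// 10 Hz per torrent means at most one progress event every 100 ms.
const MIN_INTERVAL_MS: u64 = 100;

/// Simplified stand-in for the coalescer described above.
#[derive(Default)]
pub struct ProgressCoalescer {
    last_emit_ms: HashMap<String, u64>,
}

impl ProgressCoalescer {
    /// Returns true when a progress update for `torrent_id` observed at
    /// `now_ms` should be forwarded, false when it falls inside the
    /// 100 ms window and should be dropped.
    pub fn should_emit(&mut self, torrent_id: &str, now_ms: u64) -> bool {
        let due = self
            .last_emit_ms
            .get(torrent_id)
            .map_or(true, |&last| now_ms.saturating_sub(last) >= MIN_INTERVAL_MS);
        if due {
            self.last_emit_ms.insert(torrent_id.to_string(), now_ms);
        }
        due
    }
}
```

Passing the clock in as a parameter keeps the throttle deterministic under test, which fits the goal of testing components in isolation.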
Consequences
- The engine crate gains clear separation between command handling, libtorrent FFI, alert translation, and persistence, making it easier to test components in isolation using mock LibtSession implementations.
- Persisted state in resume_dir enables crash-restart flows to resume downloads, leveraging libtorrent fastresume and our own selection metadata.
- Debouncing progress events reduces SSE pressure while preserving responsiveness; coalescing happens before events hit the shared bus.
- Health reporting integrates with the existing telemetry crate, providing operators visibility into session failures or missing dependencies (e.g., absent resume directory).
Follow-up
- Maintain regression coverage for the libtorrent feature path, ensuring fast-resume reconciliation and guard-rail health events remain stable.
- Track upstream libtorrent upgrades and refresh the operator documentation whenever the resume layout or dependency expectations shift.
004 – Phase One Delivery Track
- Status: Accepted
- Date: 2025-10-17
Motivation
Phase One bundles the remaining work required to transition Revaer from the current stubs into a production-ready torrent orchestration platform. This record captures the implementation notes, decisions, and verification evidence for each workstream item enumerated in docs/phase-one-roadmap.md.
Design Notes
- Follow the library-first structure outlined in AGENT.md with crate-specific modules for configuration, engine integration, filesystem operations, public API, CLI, security, and packaging.
- Apply tight configuration validation and hot-reload behaviour to guarantee that throttle and policy updates propagate within two seconds.
- Emit guard-rail telemetry whenever global throttles are disabled, driven to zero, or configured above the 5 Gbps warning threshold so operators can react quickly.
- Replace the stub libtorrent adapter with a session worker that owns state, persists fast-resume metadata, and surfaces alert-driven events with bounded fan-out.
- Persist resume metadata and fastresume payloads via FastResumeStore, reconcile on startup, and emit SelectionReconciled events plus health degradations when store contents diverge or writes fail.
- Build deterministic include/exclude rule evaluation and an idempotent FsOps pipeline anchored by .revaer.meta.
- Expose a consistent Problem+JSON contract across HTTP and CLI surfaces, including pagination and SSE replay support.
- Enforce observability invariants: structured tracing with context propagation, bounded rate limits, Prometheus metrics, and degraded health signalling when dependencies fail.
- Ensure every workflow is reproducible via just targets and validated in CI, with container packaging aligned to the non-root, read-only expectations.
- Follow the canonical just recipe surface (fmt, lint, test, ci, etc.). Coloned variants are mapped to hyphenated recipe names (fmt-fix, build-rel, api-export) because just 1.43.0 rejects colons in recipe identifiers without unstable modules; the semantics remain identical.
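The three guard-rail conditions for global throttles (disabled, zero, above the warning threshold) can be sketched as a classification function. The names are hypothetical, and the unit handling is an assumption: the sketch takes the limit in bits per second and reads 5 Gbps as 5,000,000,000 bits per second.

```rust
/// 5 Gbps warning threshold, assuming decimal giga and bits per second.
const WARN_THRESHOLD_BITS_PER_SEC: u64 = 5_000_000_000;

#[derive(Debug, PartialEq)]
pub enum GuardRail {
    /// No global throttle is configured.
    Disabled,
    /// The throttle is set to zero.
    Zero,
    /// The throttle exceeds the warning threshold.
    AboveWarnThreshold,
}

/// Returns the guard rail that a throttle setting trips, if any.
/// `None` as input means the throttle is disabled.
pub fn check_throttle(limit_bits_per_sec: Option<u64>) -> Option<GuardRail> {
    match limit_bits_per_sec {
        None => Some(GuardRail::Disabled),
        Some(0) => Some(GuardRail::Zero),
        Some(v) if v > WARN_THRESHOLD_BITS_PER_SEC => Some(GuardRail::AboveWarnThreshold),
        Some(_) => None,
    }
}
```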
Test Coverage Summary
- just ci serves as the baseline verification target. Each workstream delivers focused unit tests, integration coverage, and feature-flagged live tests (for libtorrent, Postgres, FsOps).
- Coverage gates are enforced via cargo llvm-cov with --fail-under 80 across library crates.
- Integration suites will rely on testcontainers (Postgres, libtorrent) and workspace-specific fixtures for FsOps pipelines and API/CLI flows, including the configuration watcher hot-reload test and new libtorrent-feature tests for resume restoration and fastresume persistence.
Outcome
- All public surfaces now enforce API-key authentication with token-bucket rate limiting, 429 Problem+JSON responses, and telemetry counters exported via Prometheus and /health/full.
- SSE endpoints honour the same auth and Last-Event-ID semantics, with CLI resume support persisting state between reconnects.
- The CLI propagates x-request-id, standardises exit codes (0 success, 2 validation, 3 runtime), and emits optional telemetry events to REVAER_TELEMETRY_ENDPOINT.
- A release-ready Docker image (Dockerfile) packages the API binary and documentation on a non-root, read-only-friendly runtime with health checks and volume mounts for config/data.
- CI now publishes release artefacts (revaer-app, OpenAPI) and runs MSRV and container security jobs via just targets; binaries are checksummed alongside provenance metadata.
- Documentation additions cover FsOps design, API/CLI contracts, security posture, operator runbook, telemetry reference, and the phase-one release checklist.
Observability Updates
- Telemetry enhancements include structured logs for setup token issuance/consumption, loopback enforcement failures, configuration watcher updates, rate-limit guard-rail decisions, and resume store degradation/recovery.
- Metrics will expand to track HTTP request outcomes, SSE fan-out, event queue depth, torrent throughput, FsOps step durations, and health degradation counts.
- /health/full will report engine, FsOps, and database readiness with latency measurements and revision hashes, mirrored by CLI status commands.
Risk & Rollback Plan
- Maintain incremental commits gated by just ci to isolate regressions. Any new dependency introductions require explicit justification and fallbacks documented here.
- Where feature flags guard libtorrent integration, provide mockable interfaces so tests can fall back to stub implementations if the environment lacks native bindings.
- Persist fast-resume metadata and .revaer.meta files so failed deployments can roll back without corrupting state; ensure migrations remain additive.
Dependency Rationale
No new dependencies have been added yet. Future additions (e.g., libtorrent bindings, glob evaluators, archive tools) must include:
- Why the crate/tool is necessary.
- Alternatives considered (including bespoke implementations) and why they were rejected.
- Security and maintenance assessment (license compatibility, release cadence).
005 – FsOps Pipeline Hardening
- Status: Accepted
- Date: 2025-10-17
Context
- Phase One promotes filesystem post-processing from a best-effort helper to a first-class workflow with explicit health semantics.
- The orchestrator must ensure every completed torrent flows through a deterministic FsOps state machine, emitting structured telemetry and reconciling mismatches with persisted metadata.
- Operators require visibility into FsOps latency, failures, and guard-rail breaches (e.g., missing extraction tools, permission errors) via /health/full, Prometheus, and the shared EventBus.
Decision
- FsOps responsibilities live inside revaer-fsops, invoked by the orchestrator (TorrentOrchestrator::apply_fsops) whenever a Completed event surfaces.
- Each pipeline step (extract, par2, move, cleanup) records start/completion/failure events and increments Prometheus counters via Metrics::inc_fsops_step.
- Metadata is persisted alongside .revaer.meta to reconcile selection overrides and resume directories across restarts; mismatches trigger SelectionReconciled events plus guard-rail telemetry.
- Health degradation is published when FsOps detects latency guard rails, missing tools, or unrecoverable IO errors; recovery clears the fsops component from the degrade set.
Consequences
- FsOps execution becomes observable and retry-friendly, enabling operator runbooks to diagnose stuck jobs with concrete metrics and events.
- Pipeline regressions now fail CI thanks to targeted unit/integration tests under revaer-fsops and orchestrator-level tests driving the shared event bus.
- The orchestration layer remains single-owner of FsOps invocation, simplifying future extensions (e.g., checksum verification, media tagging) without leaking concerns into the API.
Verification
- just test exercises FsOps unit cases, while orchestrator integration tests validate event emission, degradation flows, and metadata reconciliation.
- /health/full and Prometheus snapshots display FsOps metrics during the runbook, confirming latency guard rails and failure counters behave as expected.
006 – Unified API & CLI Contract
- Status: Accepted
- Date: 2025-10-17
Context
- Phase One requires parity between the public HTTP interface and the administrative CLI so operators can automate without reverse engineering payloads.
- Prior iterations lacked shared DTOs, consistent Problem+JSON responses, and stable pagination/SSE semantics across API and CLI.
- New rate limiting and telemetry features must surface identically on both surfaces to satisfy observability and security requirements.
Decision
- Shared request/response models live in revaer-api::models and are re-exported to the CLI, ensuring identical JSON encoding/decoding paths.
- All routes return RFC9457 Problem+JSON payloads on validation/runtime errors, including invalid_params pointers for user-correctable mistakes; the CLI pretty-prints these problems and maps validation to exit code 2.
- Cursor pagination, filter semantics, and SSE replay (Last-Event-ID) are implemented once in the API and exercised by dedicated CLI commands (ls, status, tail).
- The CLI propagates x-request-id headers, emits structured telemetry events to REVAER_TELEMETRY_ENDPOINT, and redacts secrets in logs; runtime failures exit with code 3 to distinguish from validation issues.
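The exit-code contract (0 success, 2 validation, 3 runtime) can be sketched as a mapping from a command result. `CliError` and `exit_code` are hypothetical names, and how the CLI decides which failures count as validation is not covered by this sketch.

```rust
/// Hypothetical error classification used by the CLI.
#[derive(Debug)]
pub enum CliError {
    /// User-correctable input problems (for example a Problem+JSON
    /// response carrying invalid_params).
    Validation(String),
    /// Everything else: transport failures, server errors, and so on.
    Runtime(String),
}

/// Maps a command outcome onto the documented exit codes.
pub fn exit_code(result: &Result<(), CliError>) -> i32 {
    match result {
        Ok(()) => 0,
        Err(CliError::Validation(_)) => 2,
        Err(CliError::Runtime(_)) => 3,
    }
}
```

A binary's entry point would typically finish with something like `std::process::exit(exit_code(&result))`, giving scripts a stable signal to branch on.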
Consequences
- Changes to the API contract require updates in a single module (revaer-api::models), reducing the risk of CLI drift.
- Downstream tooling can rely on deterministic exit codes and Problem+JSON payloads, simplifying automation.
- Telemetry pipelines receive consistent trace identifiers regardless of whether requests originate from the CLI or other clients.
Verification
- Integration tests cover pagination, filter validation, SSE replay, and CLI HTTP interactions via httpmock, ensuring behaviour remains in lockstep.
- just api-export regenerates docs/api/openapi.json, and CI asserts the CLI uses the shared DTOs by compiling with the workspace feature set.
007 – API Key Security & Rate Limiting
- Status: Accepted
- Date: 2025-10-17
Context
- API keys were previously verified but not throttled, allowing abusive clients to starve the control plane and masking guard-rail violations.
- Operators need guard-rail metrics, health events, and documentation describing key lifecycle, rate limits, and rotation workflows.
- CLI tooling must respect the same security posture, including masking secrets and surfacing authentication failures with actionable errors.
Decision
- Each API key stores a JSON rate limit (burst, per_seconds) validated by ConfigService; token-bucket state is maintained per key inside the API layer.
- Requests exceeding the configured budget return 429 Too Many Requests Problem+JSON responses, increment Prometheus counters (api_rate_limit_throttled_total), and emit HealthChanged events when guard rails (e.g., unlimited keys) are breached.
- CLI authentication mandates key_id:secret, redacts secrets in logs, and propagates x-request-id so operators can correlate requests with server-side traces.
- CI enforces MSRV and Docker security gates to ensure build artefacts respect the security baseline.
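A per-key token bucket driven by burst and per_seconds might look like the sketch below. The interpretation is an assumption: the bucket holds up to `burst` tokens and regains `burst` tokens every `per_seconds` seconds. The struct and method names are hypothetical, and time is passed in as seconds so the behaviour is deterministic.

```rust
/// Token bucket for one API key. Assumed semantics: capacity is `burst`
/// and the bucket refills `burst` tokens every `per_seconds` seconds.
pub struct TokenBucket {
    burst: f64,
    per_seconds: f64,
    tokens: f64,
    last_refill_secs: f64,
}

impl TokenBucket {
    /// Starts with a full bucket. A zero `per_seconds` is treated as 1
    /// to avoid dividing by zero.
    pub fn new(burst: u32, per_seconds: u64, now_secs: f64) -> Self {
        TokenBucket {
            burst: f64::from(burst),
            per_seconds: per_seconds.max(1) as f64,
            tokens: f64::from(burst),
            last_refill_secs: now_secs,
        }
    }

    /// Refills based on elapsed time, then tries to take one token.
    /// Returning false corresponds to a 429 response.
    pub fn try_acquire(&mut self, now_secs: f64) -> bool {
        let elapsed = (now_secs - self.last_refill_secs).max(0.0);
        let refill = elapsed * self.burst / self.per_seconds;
        self.tokens = (self.tokens + refill).min(self.burst);
        self.last_refill_secs = self.last_refill_secs.max(now_secs);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Keeping one bucket per key (for example in a map keyed by key_id) is what contains a single runaway client without throttling the others.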
Consequences
- Compromised or runaway keys are contained, preventing control-plane denial-of-service and providing clear telemetry for incident response.
- Documentation now includes API key rotation steps, rate-limit expectations, and remediation guidance for guard-rail events.
- The API and CLI remain aligned by sharing auth context types and telemetry primitives.
Verification
- Unit tests cover rate-limit parsing and token-bucket behaviour; integration tests assert 429 responses and CLI exit codes.
- /health/full exposes rate-limit metrics, and the Docker image runs as a non-root user with health checks hitting the authenticated endpoints.
008 – Phase One Remaining Delivery (Task Record)
- Status: In Progress
- Date: 2025-10-17
Motivation
- Implement the outstanding Phase One scope: per-key rate limiting, CLI parity (telemetry, exit codes), packaging, documentation, and CI gates required by docs/phase-one-remaining-spec.md and AGENT.md.
Design Notes
- Introduced ConfigService::authenticate_api_key returning rate-limit metadata, validated JSON payloads, and persisted canonical token-bucket configuration.
- Added ApiState::enforce_rate_limit with per-key token buckets, guard-rail health publication, Prometheus counters, and Problem+JSON 429 responses.
- CLI now builds reqwest clients with default x-request-id, standardises exit codes (0/2/3), and emits optional telemetry events when REVAER_TELEMETRY_ENDPOINT is set.
- Created a multi-stage Dockerfile (non-root runtime, healthcheck, docs bundling) with just recipes for building and scanning.
- Expanded CI with release artefact, Docker, and MSRV jobs that call the new just targets.
Test Coverage Summary
- Added unit tests for rate-limit parsing and token-bucket behaviour (revaer-config, revaer-api).
- Existing integration suites exercise Problem+JSON responses, SSE replay, and CLI HTTP interactions.
- Runbook (docs/runbook.md) supports manual verification of FsOps, rate limits, and guard rails.
Observability Updates
- Prometheus now exposes api_rate_limit_throttled_total; /health/full includes the counter and degrades when guard rails fire.
- CLI telemetry emits JSON events (command, outcome, trace id, exit code) to configurable endpoints.
- Documentation adds telemetry reference, operations guide, and release checklist for operators.
Risk & Rollback
- Rate-limit enforcement is isolated to require_api_key; roll back by removing the enforce_rate_limit call if unexpected throttles occur.
- Docker image/builder changes are gated via just docker-build and just docker-scan; revert by removing the Docker packaging.
- CI additions run after core jobs and can be disabled via workflow changes if they fail unexpectedly.
Dependency Rationale
- No new Rust crates were introduced. Docker scanning uses trivy via CI and a manual recipe; it is optional for local development.
- Status: {Proposed|Accepted|Superseded}
- Date: {YYYY-MM-DD}
- Context:
- What problem are we solving?
- What constraints or forces shape the decision?
- Decision:
- Summary of the choice made.
- Alternatives considered.
- Consequences:
- Positive outcomes.
- Risks or trade-offs.
- Follow-up:
- Implementation tasks.
- Review checkpoints.