Skip to content

Features

Zart provides everything you need to build reliable, long-running workflows in Rust. Here is a detailed look at each feature.


Every step’s result is serialized and written to the database before the workflow advances. On any restart — planned or unplanned — completed steps are replayed from storage.

How it works:

  1. Worker calls ctx.step("charge", || async { stripe.charge(...).await }).
  2. Zart checks zart_steps for a row with (execution_id, "charge").
  3. Hit: deserializes and returns the stored value. Closure is never called.
  4. Miss: runs the closure, serializes the result, inserts the row, returns the value.

What this means for you:

  • A payment is never charged twice, even if your process crashes after the charge but before the receipt.
  • Database inserts, API calls, and emails become safe to “retry” — the retry only runs incomplete work.
  • No idempotency logic needed inside the step closure for most cases.
// This can crash anywhere — Zart recovers from the last completed step
ctx.step("step-1", || async { do_a().await }).await?;
ctx.step("step-2", || async { do_b().await }).await?; // <-- crash here
ctx.step("step-3", || async { do_c().await }).await?; // resumes here

Configure retry behaviour at both the task level and the individual step level.

Task-level retries via fn max_retries():

fn max_retries(&self) -> usize { 3 }

Step-level retries via ctx.step_with_retry() or z_step_with_retry!:

ctx.step_with_retry(
"flaky-external-call",
RetryConfig::exponential(5, Duration::from_secs(1)),
|| async { external.call().await },
).await?;

RetryConfig variants:

ConfigBehaviour
RetryConfig::none()No retries — fail immediately
RetryConfig::fixed(n, delay)n retries, constant delay between each
RetryConfig::exponential(n, base)n retries, delay doubles each attempt
RetryConfig::exponential_with_max(n, base, cap)Same, but delay capped at cap

Jitter is applied automatically to exponential backoff to prevent thundering herd.


Fan out work to multiple concurrent sub-tasks using ctx.schedule_step() and ctx.wait_all().

Each scheduled step becomes an independent task in the scheduler — it can run on a different worker node. wait_all durably suspends the parent until all children complete.

let h1 = ctx.schedule_step("email", || async { send_email().await });
let h2 = ctx.schedule_step("billing", || async { setup_billing().await });
let h3 = ctx.schedule_step("provision",|| async { provision_aws().await });
// Parent suspends here — no thread is blocked
ctx.wait_all(vec![h1, h2, h3]).await?;

If the parent process restarts while waiting, the children continue running independently and the parent resumes when they’re done.

See Parallel Steps for full documentation.


Pause a workflow without occupying a thread or blocking a worker.

The execution is suspended and re-queued after the duration expires. Zero threads blocked.

ctx.sleep(Duration::from_secs(3600)).await?; // wait 1 hour

Sleep until an absolute point in time.

let next_monday = compute_next_monday();
ctx.sleep_until(next_monday).await?;

Suspend until an external signal arrives.

let payload: Approval = ctx.wait_for_event(
"hr-approval",
Some(Duration::from_secs(5 * 86400)),
).await?;

See Wait for Event for full documentation.


Zart prevents duplicate executions at the scheduling layer.

Execution IDs: every execution has a unique ID. Scheduling with an explicit ID that already exists is a no-op — the existing execution continues.

// Safe to call multiple times — only one execution is created
scheduler.schedule_with_id(
"checkout",
&format!("checkout-order-{order_id}"),
CheckoutData { order_id },
).await?;

Step-level idempotency: within an execution, step names act as idempotency keys. A step that has already completed will never re-run its closure.

External call idempotency: use ctx.execution_id() to derive stable keys for external APIs:

let idempotency_key = format!("stripe-charge-{}", ctx.execution_id());
stripe.charge_idempotently(&card, amount, &idempotency_key).await?;

Multiple worker instances can run against the same database with zero configuration. Zart uses SELECT … FOR UPDATE SKIP LOCKED to claim tasks — each execution is processed by exactly one worker at a time.

Worker A ─── polls ──→ claims exec-001, exec-003
Worker B ─── polls ──→ claims exec-002, exec-004
Worker C ─── polls ──→ claims exec-005 (skips locked rows)

Workers can be added or removed at any time. There is no leader election, no ZooKeeper, no coordination service.

let worker = Worker::new(scheduler, registry, WorkerConfig {
poll_interval: Duration::from_secs(5),
max_tasks_per_poll: 10,
max_concurrent_tasks: 16, // tokio tasks per worker
shutdown_timeout: Duration::from_secs(30),
});

Swap databases without changing any workflow code. All storage backends implement the same Scheduler trait.

BackendCrateUse Case
PostgreSQLzart-postgresProduction — recommended
SQLitezart-sqliteDev, embedded, edge
MySQLzart-mysqlEnterprise / existing infra
Customzart (trait)Bring your own storage

Implement the Scheduler trait yourself to integrate with any data store:

use zart::Scheduler;
struct MyCustomScheduler { /* ... */ }
impl Scheduler for MyCustomScheduler {
// implement poll_due, claim, complete, fail, etc.
}

External systems can deliver signals to waiting workflows via offer_event.

Rust:

scheduler.offer_event(execution_id, "payment-received", payload).await?;

HTTP:

POST /api/v1/events/{execution_id}/{event_name}

CLI:

Terminal window
zart event <exec-id> <event-name> --data '{...}'

Events can carry arbitrary JSON payloads. The waiting wait_for_event call deserializes the payload into the expected type automatically.

Events time out gracefully — if no event arrives within the specified window, wait_for_event returns Err(TaskError::EventTimeout), allowing the workflow to take a fallback path.


Every execution and every step attempt is tracked in the database.

Execution status:

  • pending — scheduled, not yet started
  • running — actively being processed
  • waiting — durably paused (sleep or event wait)
  • completed — finished successfully
  • failed — exhausted retries

Step records (zart_steps table): stores the step name, attempt count, result JSON, and per-attempt error messages for debugging.

Query execution status via the Rust API:

let exec = scheduler.get_execution("exec-abc-123").await?;
println!("Status: {:?}", exec.status);
println!("Steps: {}", exec.completed_steps);

Or via CLI (coming soon):

Terminal window
zart status exec-abc-123 --verbose