Features

Zart provides everything you need to build reliable, long-running workflows in Rust. Here is a detailed look at each feature.

1. State Recovery

Every step’s result is serialized and written to the database before the workflow advances. On any restart — planned or unplanned — completed steps are replayed from storage.

How it works:

Worker calls ctx.step("charge", || async { stripe.charge(...).await }).
Zart checks zart_steps for a row with (execution_id, "charge").
Hit: deserializes and returns the stored value. Closure is never called.
Miss: runs the closure, serializes the result, inserts the row, returns the value.

What this means for you:

A payment is never charged twice, even if your process crashes after the charge but before the receipt.
Database inserts, API calls, and emails become safe to “retry” — the retry only runs incomplete work.
No idempotency logic needed inside the step closure for most cases.

// This can crash anywhere — Zart recovers from the last completed step
ctx.step("step-1", || async { do_a().await }).await?;
ctx.step("step-2", || async { do_b().await }).await?; // <-- crash here
ctx.step("step-3", || async { do_c().await }).await?; // resumes here

2. Retry Policies

Configure retry behaviour at both the task level and the individual step level.

Task-level retries via fn max_retries():

fn max_retries(&self) -> usize { 3 }

Step-level retries via ctx.step_with_retry() or z_step_with_retry!:

ctx.step_with_retry(
    "flaky-external-call",
    RetryConfig::exponential(5, Duration::from_secs(1)),
    || async { external.call().await },
).await?;

RetryConfig variants:

Config	Behaviour
`RetryConfig::none()`	No retries — fail immediately
`RetryConfig::fixed(n, delay)`	`n` retries, constant `delay` between each
`RetryConfig::exponential(n, base)`	`n` retries, delay doubles each attempt
`RetryConfig::exponential_with_max(n, base, cap)`	Same, but delay capped at `cap`

Jitter is applied automatically to exponential backoff to prevent thundering herd.

3. Parallel Execution

Fan out work to multiple concurrent sub-tasks using ctx.schedule_step() and ctx.wait_all().

Each scheduled step becomes an independent task in the scheduler — it can run on a different worker node. wait_all durably suspends the parent until all children complete.

let h1 = ctx.schedule_step("email",    || async { send_email().await });
let h2 = ctx.schedule_step("billing",  || async { setup_billing().await });
let h3 = ctx.schedule_step("provision",|| async { provision_aws().await });

// Parent suspends here — no thread is blocked
ctx.wait_all(vec![h1, h2, h3]).await?;

If the parent process restarts while waiting, the children continue running independently and the parent resumes when they’re done.

See Parallel Steps for full documentation.

4. Durable Waits

Pause a workflow without occupying a thread or blocking a worker.

`ctx.sleep(duration)`

The execution is suspended and re-queued after the duration expires. Zero threads blocked.

ctx.sleep(Duration::from_secs(3600)).await?; // wait 1 hour

`ctx.sleep_until(timestamp)`

Sleep until an absolute point in time.

let next_monday = compute_next_monday();
ctx.sleep_until(next_monday).await?;

`ctx.wait_for_event(name, timeout)`

Suspend until an external signal arrives.

let payload: Approval = ctx.wait_for_event(
    "hr-approval",
    Some(Duration::from_secs(5 * 86400)),
).await?;

See Wait for Event for full documentation.

5. Idempotency

Zart prevents duplicate executions at the scheduling layer.

Execution IDs: every execution has a unique ID. Scheduling with an explicit ID that already exists is a no-op — the existing execution continues.

// Safe to call multiple times — only one execution is created
scheduler.schedule_with_id(
    "checkout",
    &format!("checkout-order-{order_id}"),
    CheckoutData { order_id },
).await?;

Step-level idempotency: within an execution, step names act as idempotency keys. A step that has already completed will never re-run its closure.

External call idempotency: use ctx.execution_id() to derive stable keys for external APIs:

let idempotency_key = format!("stripe-charge-{}", ctx.execution_id());
stripe.charge_idempotently(&card, amount, &idempotency_key).await?;

6. Concurrent Workers

Multiple worker instances can run against the same database with zero configuration. Zart uses SELECT … FOR UPDATE SKIP LOCKED to claim tasks — each execution is processed by exactly one worker at a time.

Worker A ─── polls ──→ claims exec-001, exec-003
Worker B ─── polls ──→ claims exec-002, exec-004
Worker C ─── polls ──→ claims exec-005 (skips locked rows)

Workers can be added or removed at any time. There is no leader election, no ZooKeeper, no coordination service.

let worker = Worker::new(scheduler, registry, WorkerConfig {
    poll_interval:        Duration::from_secs(5),
    max_tasks_per_poll:   10,
    max_concurrent_tasks: 16,  // tokio tasks per worker
    shutdown_timeout:     Duration::from_secs(30),
});

7. Pluggable Storage

Swap databases without changing any workflow code. All storage backends implement the same Scheduler trait.

Backend	Crate	Use Case
PostgreSQL	`zart-postgres`	Production — recommended
SQLite	`zart-sqlite`	Dev, embedded, edge
MySQL	`zart-mysql`	Enterprise / existing infra
Custom	`zart` (trait)	Bring your own storage

Implement the Scheduler trait yourself to integrate with any data store:

use zart::Scheduler;

struct MyCustomScheduler { /* ... */ }

impl Scheduler for MyCustomScheduler {
    // implement poll_due, claim, complete, fail, etc.
}

8. Events

External systems can deliver signals to waiting workflows via offer_event.

Rust:

scheduler.offer_event(execution_id, "payment-received", payload).await?;

HTTP:

POST /api/v1/events/{execution_id}/{event_name}

CLI:

zart event <exec-id> <event-name> --data '{...}'

Events can carry arbitrary JSON payloads. The waiting wait_for_event call deserializes the payload into the expected type automatically.

Events time out gracefully — if no event arrives within the specified window, wait_for_event returns Err(TaskError::EventTimeout), allowing the workflow to take a fallback path.

9. Observability

Every execution and every step attempt is tracked in the database.

Execution status:

pending — scheduled, not yet started
running — actively being processed
waiting — durably paused (sleep or event wait)
completed — finished successfully
failed — exhausted retries

Step records (zart_steps table): stores the step name, attempt count, result JSON, and per-attempt error messages for debugging.

Query execution status via the Rust API:

let exec = scheduler.get_execution("exec-abc-123").await?;
println!("Status: {:?}", exec.status);
println!("Steps:  {}", exec.completed_steps);

Or via CLI (coming soon):

zart status exec-abc-123 --verbose