Durable Execution for Rust

Agents that survive failure

Give your Rust AI agents durable execution out of the box. Each LLM tool call result is persisted before the next one runs. Agent crashes, API timeouts, and restarts become non-events — completed work is never repeated, and agents resume from their last successful step.

Write payment flows where every step is persisted before moving on. A charge that succeeds is never repeated — even if your process crashes immediately after. Completed steps are replayed from the database, never re-executed.

Build onboarding flows where each step — welcome email, billing setup, resource provisioning — persists its result. If the process dies mid-flow, restart from exactly where it stopped. No duplicate emails, no double-charges.

Get Started View on GitHub

brewery_finder.rs

use zart::prelude::*;
use zart::{zart_durable, zart_step};

#[zart_step("extract", retry = "exponential(3, 2s)")]
async fn extract_city(query: &str) -> Result<City> {
    LlmFunction::<City>::new(llm).run(query).await
}

#[zart_step("fetch", retry = "exponential(3, 1s)")]
async fn fetch_breweries(city: &str) -> Result<Vec<Brewery>> {
    reqwest::get(&api_url(city)).await?.json().await
}

#[zart_durable("ai-brewery-finder", timeout = "10m")]
async fn brewery_finder(query: String) -> Result<Report> {
    let city = extract_city(&query).await?;
    let data = fetch_breweries(&city.name).await?;
    Ok(Report::summarize(&data))
}

checkout_flow.rs

use zart::prelude::*;
use zart::{zart_durable, zart_step};

#[zart_step("charge")]
async fn charge_card(order: &Order) -> Result<Charge> {
    stripe.charge(&order.card, order.total).await
}

#[zart_step("receipt")]
async fn send_receipt(charge: &Charge, email: &str) -> Result<()> {
    mailer.send(email, charge).await
}

#[zart_durable("checkout", timeout = "10m")]
async fn checkout(order: Order) -> Result<Receipt> {
    let charge = charge_card(&order).await?;
    send_receipt(&charge, &order.email).await?;
    let _: Shipped = zart::wait_for_event("shipped", Some(Duration::from_secs(86400))).await?;
    Ok(Receipt::from(charge))
}

onboarding.rs

use zart::prelude::*;
use zart::{zart_durable, zart_step};

#[zart_step("welcome")]
async fn send_welcome(email: &str) -> Result<()> {
    mailer.send(email, "Welcome!").await
}

#[zart_step("billing", retry = "exponential(3, 2s)")]
async fn setup_billing(user: &NewUser) -> Result<CustomerId> {
    stripe.create_customer(user).await
}

#[zart_durable("onboarding", timeout = "30m")]
async fn onboard(user: NewUser) -> Result<Welcome> {
    send_welcome(&user.email).await?;
    setup_billing(&user).await?;
    let _: Verified = zart::wait_for_event("email-confirmed", Some(Duration::from_secs(172800))).await?;
    provision(&user).await?;
    Ok(Welcome::for_user(&user))
}

Why Zart

Everything durable execution needs

Built for production Rust services. Every feature is designed to make distributed workflows reliable by default.

State Recovery

Each step's result is persisted before moving on. On restart, completed steps are replayed from storage — never re-executed.

Crash at 3am? Resume from step 4, not step 1

Smart Retries

Per-step retry policies with fixed, linear, or exponential backoff. Configure max attempts, delay, and jitter independently per step.

Transient network blip? Self-heals before anyone notices

Concurrent Workers

Multiple workers compete for tasks using SKIP LOCKED — zero contention, no coordination required, horizontal scaling out of the box.

Add more replicas. No config, no coordinator, no drama

Timeouts & Deadlines

Set per-workflow timeouts declaratively. A workflow that exceeds its deadline is cancelled cleanly — no stuck tasks, no leaked resources.

No zombie tasks eating DB connections at 100% CPU

Idempotency Built-in

Execution IDs prevent duplicate scheduling. A charge step that already succeeded will never fire again — even if the scheduler is called twice.

Double-charge bug in prod? That's a chargeback. Not anymore

Events & Async Waits

Workflows can durably wait for external signals — webhooks, human approvals, or other services — with configurable timeouts and automatic resumption.

Pause 3 days for human approval. Resume in milliseconds

Two styles

Use Zart your way

Whether you prefer ergonomic macros or explicit trait implementations — Zart has you covered.

use zart::prelude::*;
use zart::{zart_durable, zart_step};

/// Define each step as a standalone async function
#[zart_step("send-welcome", retry = "exponential(3, 2s)")]
async fn send_welcome(email: &str) -> Result<(), StepError> {
    send_email(email).await
}

#[zart_step("setup-billing")]
async fn setup_billing(email: &str) -> Result<String, StepError> {
    create_stripe_customer(email).await
}

/// Compose steps — no ctx threading, just plain .await
#[zart_durable("onboarding", timeout = "30m")]
async fn onboarding(data: OnboardingData) -> Result<(), TaskError> {
    send_welcome(&data.email).await?;
    let _customer_id = setup_billing(&data.email).await?;

    // Durable wait — workflow suspends, survives any restart
    let _: VerifyPayload =
        zart::wait_for_event("email-verified", Some(Duration::from_secs(172_800))).await?;

    Ok(())
}

use zart::prelude::*;
use async_trait::async_trait;
use std::borrow::Cow;

/// Define a step struct — holds your dependencies
struct SendEmailStep{ email: String }

#[async_trait]
impl ZartStep for SendEmailStep {
    type Output = ();
    fn step_name(&self) -> Cow<'static, str> { Cow::Borrowed("send-email") }
    fn retry_config(&self) -> Option<RetryConfig> {
        Some(RetryConfig::exponential(3, Duration::from_secs(2)))
    }
    // No ctx parameter — use zart::context() for introspection
    async fn run(&self) -> Result<(), StepError> {
        send_email(&self.email).await
    }
}

/// Compose steps — no ctx, no threading boilerplate
struct OnboardingTask;

#[async_trait]
impl DurableExecution for OnboardingTask {
    type Data   = OnboardingData;
    type Output = ();

    async fn run(&self, data: OnboardingData) -> Result<(), TaskError> {
        zart::step(SendEmailStep { email: data.email }).await?;
        let _: VerifyPayload = zart::wait_for_event(
            "email-verified",
            Some(Duration::from_secs(172_800)),
        ).await?;
        Ok(())
    }
}

Agents that survive failure

Everything durable execution needs

Use Zart your way

Ready to build?