Timeouts and Deadlines

Core Concept: Deadlines, Not Durations

A timeout is a deadline. When you say timeout = "5m", you’re saying “this must finish by first_attempt_time + 5m.”

The framework records the deadline when a step or execution is first scheduled. On every subsequent attempt — retry, worker restart, or reschedule — the framework checks now() >= deadline rather than starting a fresh countdown. The deadline is persisted in the task’s metadata so it survives across worker restarts and retries.

Execution-Level Timeouts

Set a top-level deadline on your durable handler with the timeout attribute. The deadline is computed as scheduled_at + timeout_duration when the execution is first started.

#[zart_durable("order-fulfillment", timeout = "30m")]
async fn order_fulfillment(data: OrderInput) -> Result<Receipt, TaskError> {
    // Must complete within 30 minutes of when the execution was first scheduled.
    validate_address(&data.order_id).await?;
    process_payment(&data.order_id).await?;
    ship_order(&data.order_id).await?;
    Ok(receipt)
}

If the execution deadline is exceeded, the worker invokes your on_failure handler (if defined) before marking the execution as failed. This gives you a chance to compensate — refund a payment, notify the user, etc.

Duration strings support the following units:

Format	Example	Meaning
`Xs`	`30s`	Seconds
`Xm`	`30m`	Minutes
`Xh`	`2h`	Hours
`Xd`	`3d`	Days

No Timeout

Omit the timeout attribute entirely for executions that may run indefinitely (e.g., long-lived approval workflows).

#[zart_durable("approval-workflow")]
async fn approval_workflow(req: ApprovalRequest) -> Result<ApprovalOutput, TaskError> {
    // No timeout — waits indefinitely for manager approval.
    let decision = zart::wait_for_event("manager-approval", Some(Duration::from_secs(604800))).await?;
    // ...
}

Step-Level Timeouts and Timeout Scope

Individual steps can have their own timeouts via the #[zart_step] attribute:

#[zart_step("call-external-api", timeout = "5m", retry = "fixed(3, 1s)")]
async fn call_external_api() -> Result<Response, StepError> {
    // ...
}

Timeout Scope: `global` (default) vs `per_attempt`

The timeout_scope attribute controls how the timeout behaves across retries:

Scope	Behavior	Example
`global` (default)	Deadline is calculated from the first attempt. All retries share the same window.	`timeout = "5m"` → 5 minutes total from the first attempt
`per_attempt`	Each retry attempt gets a fresh countdown. No deadline is persisted.	`timeout = "30s", timeout_scope = "per_attempt"` → each attempt gets 30 seconds

// Global scope (default): 5 minutes total across all retries
#[zart_step("call-api", timeout = "5m", retry = "fixed(3, 1s)")]
async fn call_api() -> Result<Response, StepError> {
    // First attempt at 10:21:30 → deadline = 10:26:30
    // If attempt 1 fails at 10:22:00 and retries, attempt 2 has ~4m29s remaining
    client.get("https://api.example.com").send().await
}

// Per-attempt scope: each attempt gets a fresh 30 seconds
#[zart_step("call-api", timeout = "30s", timeout_scope = "per_attempt", retry = "fixed(3, 1s)")]
async fn call_api_per_attempt() -> Result<Response, StepError> {
    // Each of the 3 attempts gets its own 30-second countdown
    client.get("https://api.example.com").send().await
}

Timeout Is Always Terminal

Important: When a step times out — whether under global or per_attempt scope — it is terminal. The retry policy is never consulted for a timeout. Timeout means the step already broke a policy; retry is reserved for business errors only.

Scenario	Retries remain?	Behavior
Step times out	—	Terminal `TimedOut`. Body resumes with `StepOutcome::ZartErr(TimedOut)`.
Business error	Yes	Retry. Under `global` scope, the retry carries the deadline forward.
Business error	No	Terminal `RetryExhausted`. Body resumes.

If a retry is scheduled under global scope but the deadline has already passed when the worker picks it up, the step immediately completes with TimedOut — the lambda is never executed.

Timeout + Execution Deadline Hierarchy

The execution-level timeout is the ultimate ceiling. Even if individual steps have generous timeouts or no timeouts at all, the entire execution fails when the execution deadline is exceeded.

Execution timeout (30m)
├── Step A (timeout: 5m, scope: global, retry: fixed(3, 1s))
│   └── All retries must complete within 5m of the first attempt
├── Step B (timeout: 30s, scope: per_attempt, retry: fixed(3, 1s))
│   └── Each attempt gets 30s, but capped by the 30m execution deadline
├── Step C (no timeout)
│   └── Still bounded by the 30m execution deadline
├── zart::sleep("wait", 1h)
│   └── Sleep duration is independent, but execution deadline still applies
└── zart::wait_for_event("signal", Some(7d))
    └── Event-wait deadline is independent of step/execution deadlines

When a step runs under global scope, its effective timeout is capped by whichever deadline is sooner: the step’s own deadline or the execution’s deadline.

Event Wait Timeouts

zart::wait_for_event accepts an optional Duration. If no event arrives before the timeout, the wait fails with DeadlineExceeded.

// Wait up to 7 days for approval
let decision: ApprovalDecision = zart::wait_for_event(
    "manager-approval",
    Some(Duration::from_secs(7 * 86400)),
).await?;

Pass None to wait indefinitely (until cancelled or the execution-level timeout is exceeded):

let event: ShipmentEvent = zart::wait_for_event("shipment-ready", None).await?;

Cancelling an Execution

Stop a running execution from the outside:

let cancelled = durable.cancel("signup-user-42").await?;

A cancelled execution stops at its next scheduling point.
A step currently running will finish, but no new steps will be scheduled.
If the execution is durably sleeping or waiting for an event, it is cancelled immediately.
The status becomes Cancelled.

Admin Operations and Deadlines

When an administrator retries a failed/timed-out step via retry_step, the framework clears the persisted deadline from the step’s task metadata before creating the new task.

This means:

The retried step gets a fresh timeout budget — it is not bound by the original deadline that already expired.
If the step has a configured timeout, the timeout is applied as a raw duration (same behavior as PerAttempt scope) rather than a remaining-time calculation.
This is intentional: an admin retry is an explicit operator decision that the step deserves another chance. Carrying forward an already-expired deadline would cause the retry to silently time out immediately.

Note: restart and rerun_steps start entirely new runs. Steps are re-scheduled through the normal path, which computes fresh deadlines from now() + timeout_duration. Pause/resume also naturally gets fresh deadlines since the body replays and re-schedules steps.

Next Steps

Free Functions — reference for sleep, sleep_until, wait_for_event
Execution Management — DurableScheduler reference
Error Handling — the three-way outcome model