Concurrency

dusk’s concurrency arrived across the 0.3.x releases: threads and atomics in 0.3.0, channels in 0.3.1, mutexes and condition variables in 0.3.2, and the thread pool plus non-blocking channel operations in 0.3.3. A thread is an OS thread, and the primitives are ordinary stdlib structs over runtime shims. There is no separate concurrency paradigm to opt into. This guide walks the pieces in the order you are likely to reach for them. The precise rules live in the concurrency reference, and the modules themselves are documented under std.concurrent.

Spawning and joining threads

spawn starts an OS thread and join waits for it. Both are always-available builtins like alloc and read_file: no paradigm gates them, and no import is needed.

func main() -> int32 {
    t, e := spawn(lambda () -> void {
        println("worker")
    })
    if e.exists() {
        printerr(e)
        return 1
    }
    je := join(t)
    je.ignore()
    println("main")
    return 0
}

spawn(f: () -> void) -> (thread, error) takes a lambda literal written at the call site. The error fires when the operating system refuses the thread, and dusk’s must-handle rule means you face it like any other error (see Errors). A closure stored in a variable cannot be spawned, since only the literal site knows the environment layout the runtime copies, so wrap the call in a literal instead.
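
As a minimal sketch of the literal-only rule, the program below runs a named function on its own thread by calling it from inside a literal, the same way the mutex example later in this guide calls add_one. The function name greet is hypothetical and exists only for illustration.

```dusk
// greet is a hypothetical helper, named here only for illustration.
func greet() -> void {
    println("worker")
}

func main() -> int32 {
    // spawn takes a lambda literal written at the call site,
    // so the literal wraps the call to greet.
    t, e := spawn(lambda () -> void {
        greet()
    })
    if e.exists() {
        printerr(e)
        return 1
    }
    je := join(t)
    je.ignore()
    return 0
}
```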

join(t: thread) -> error blocks until the body returns. The thread handle is a record in the generational heap and join retires it, so joining the same handle twice faults deterministically, caught by the same check that catches a use after free. Join everything you spawn: a thread still running when main returns dies mid-work.

What a thread can capture

A spawned lambda captures outer variables by immutable copy, like every lambda, and the copies live in a private heap block the runtime frees when the body returns. A thread never reads another thread’s stack and never mutates another thread’s locals.

Scalars, strings, fixed arrays, structs, enums, tuples, raw pointers, and handle structs such as AtomicInt, Channel<T>, and Mutex cross freely. Capturing a slice, a closure, or an interface value is a compile error, wherever it sits, including buried in a struct or enum field, because each may view the spawning frame. A captured managed *T becomes a borrow inside the thread: the thread can read through it, but freeing or moving the binding there is a compile error.

Capture-by-copy also makes the classic captured-loop-variable bug inexpressible: each iteration’s spawn copies that iteration’s value, so the threads never all end up seeing the loop variable’s final value.
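
A small sketch of that guarantee, using only calls that appear elsewhere in this guide: three threads are spawned from one loop, and each prints the value its own iteration copied. The binding name id is an illustration, and the sketch assumes println accepts an int64 inside a spawned body as it does in main.

```dusk
@paradigm procedural
@import std.vector

func main() -> int32 {
    handles: *Vector<thread> = alloc(vec_new())
    mut i: int64 = 0
    while i < 3 {
        id := i   // fresh binding per iteration; the lambda copies it
        t, e := spawn(lambda () -> void {
            println(id)
        })
        if e.exists() {
            printerr(e)
            return 1
        }
        vec_push(handles, t)
        i = i + 1
    }
    // Join every thread before main returns.
    mut k: int64 = 0
    while k < vec_len(handles) {
        je := join(vec_get(handles, k))
        je.ignore()
        k = k + 1
    }
    vec_free(handles)
    free(handles)
    return 0
}
```

Under the capture rule each of 0, 1, and 2 should appear once, in whatever order the scheduler produces.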

Channels

std.concurrent.channel provides Channel<T>, a bounded, thread-safe queue: an ordinary generic struct, not a compiler type. It holds at most the capacity given at construction, and that capacity is always at least one.

@paradigm procedural
@import std.concurrent.channel
@import std.vector

func main() -> int32 {
    parts: Channel<int64> = chan_new(4)
    handles: *Vector<thread> = alloc(vec_new())
    mut w: int64 = 0
    while w < 4 {
        lo := w * 10 + 1
        t, e := spawn(lambda () -> void {
            mut sum: int64 = 0
            mut i := lo
            while i < lo + 10 {
                sum = sum + i
                i = i + 1
            }
            se := chan_send(parts, sum)
            se.ignore()
        })
        if e.exists() {
            printerr(e)
            return 1
        }
        vec_push(handles, t)
        w = w + 1
    }
    mut total: int64 = 0
    mut got: int64 = 0
    while got < 4 {
        v, e := chan_recv(parts)
        e.ignore()
        total = total + v
        got = got + 1
    }
    println(total)
    mut k: int64 = 0
    while k < vec_len(handles) {
        je := join(vec_get(handles, k))
        je.ignore()
        k = k + 1
    }
    chan_free(parts)
    vec_free(handles)
    free(handles)
    return 0
}

Four workers each sum their slice of 1 through 40 and send one partial down the shared channel; main folds exactly four receives, so the total is always 820 no matter which worker lands first. The channel handle is one word and copies freely, including into a spawned lambda’s captures, and every copy names the same channel. Unlike a managed pointer, a channel is a sharing point: aliasing it is its purpose, so it sits outside the single-owner rule.

The core operations:

chan_new<T>(cap) needs the element type from the binding annotation, so write jobs: Channel<int64> = chan_new(8), not jobs := chan_new(8).
chan_send(c, x) copies the value in, blocking while the channel is full; its error means the channel is closed.
chan_recv(c) copies the oldest value out, blocking while the channel is empty; its error appears only once the channel is closed and drained, so a loop breaking on e.exists() consumes everything that was sent.
chan_close(c) wakes every blocked sender and receiver, discarding nothing already buffered; chan_free(c) frees the channel.

An element type must be safe to carry to another thread, the same rule spawn captures follow. Shutdown follows one order: close the channel, join every thread that touches it, then chan_free it. The exact contract for each operation, including what is fatal versus an error, is in the concurrency reference.
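
The close-and-drain pattern described above can be sketched as follows. This is an illustration under assumptions, not the reference contract: it assumes chan_close(c) can be called as a plain statement with nothing to handle, and it uses an int64 flag named open to end the loop because this guide does not show a loop-exit statement.

```dusk
@paradigm procedural
@import std.concurrent.channel

func main() -> int32 {
    jobs: Channel<int64> = chan_new(2)
    t, e := spawn(lambda () -> void {
        mut i: int64 = 1
        while i < 4 {
            se := chan_send(jobs, i)
            se.ignore()
            i = i + 1
        }
        // Assumption: chan_close returns nothing that must be handled.
        chan_close(jobs)
    })
    if e.exists() {
        printerr(e)
        return 1
    }
    mut sum: int64 = 0
    mut open: int64 = 1
    while open == 1 {
        v, re := chan_recv(jobs)
        if re.exists() {
            open = 0          // closed and drained: stop receiving
        } else {
            sum = sum + v     // folds 1, 2, and 3 as they arrive
        }
    }
    println(sum)
    // Shutdown order: the sender has closed, so join, then free.
    je := join(t)
    je.ignore()
    chan_free(jobs)
    return 0
}
```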

Non-blocking and timed operations

Added in 0.3.3, three operations refuse instead of parking:

e := chan_try_send(c, x)        // "channel is full" instead of waiting
v, e := chan_try_recv(c)        // "channel is empty" instead of waiting
v, e := chan_recv_timeout(c, 5) // parks at most 5 ms, "receive timed out"

chan_recv_timeout measures against a monotonic clock, and each variant still reports the closed message its blocking twin uses. A tick loop parks on chan_recv_timeout, does a round of work, and loops back in. That is the event loop shape the async releases build on.
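
A single-threaded sketch of the refusing variants, assuming a capacity-one channel so that the second send has nowhere to go. The binding names first and second are illustrations.

```dusk
@paradigm procedural
@import std.concurrent.channel

func main() -> int32 {
    c: Channel<int64> = chan_new(1)
    first := chan_try_send(c, 1)    // the single slot is free, so this fits
    first.ignore()
    second := chan_try_send(c, 2)   // the slot is taken: refused, not parked
    if second.exists() {
        printerr(second)            // reports the "channel is full" error
    }
    v, e := chan_try_recv(c)        // takes the buffered value without waiting
    e.ignore()
    println(v)
    chan_free(c)
    return 0
}
```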

Moving ownership across a channel

Ownership crosses a thread boundary by moving a managed pointer through a channel. chan_send(c, move(p)) kills the sender’s name at compile time through the ordinary argument-position move, and the receiver binds a fresh owner through the ordinary call-returns-ownership rule, so exactly one thread owns the record at every instant.

@import std.concurrent.channel

func main() -> int32 {
    hand: Channel<*int64> = chan_new(1)
    t, e := spawn(lambda () -> void {
        q, re := chan_recv(hand)
        re.ignore()
        println(*q + 1)
        free(q)
    })
    if e.exists() {
        printerr(e)
        return 1
    }
    p: *int64 = alloc(40)
    se := chan_send(hand, move(p))
    se.ignore()
    je := join(t)
    je.ignore()
    chan_free(hand)
    return 0
}

Sending without move leaves the sender holding a live name, so sender and receiver then share the record with no ordering between their accesses. Two shapes leak: a moved send that the closed channel refuses, and managed pointers still buffered when chan_free runs. Neither is corruption, and neither happens in the sanctioned protocol where senders finish before the close; the concurrency reference walks through both. See Memory for the ownership rules this builds on.

Mutexes

std.concurrent.sync carries Mutex and Condvar. The pattern for shared mutable state is a *raw buffer guarded by one mutex (lock, touch the buffer, unlock), and putting defer unlock(m) right after lock(m) releases on every return path.

@paradigm procedural
@import std.concurrent.sync

func add_one(m: Mutex, buf: *raw int64) -> void {
    lock(m)
    defer unlock(m)
    buf[0] = buf[0] + 1
}

func main() -> int32 {
    m := mutex_new()
    buf: *raw int64 = alloc_bytes(8)
    buf[0] = 0
    t1, e1 := spawn(lambda () -> void {
        mut i: int64 = 0
        while i < 2500 {
            add_one(m, buf)
            i = i + 1
        }
    })
    e1.ignore()
    t2, e2 := spawn(lambda () -> void {
        mut i: int64 = 0
        while i < 2500 {
            add_one(m, buf)
            i = i + 1
        }
    })
    e2.ignore()
    j1 := join(t1)
    j1.ignore()
    j2 := join(t2)
    j2.ignore()
    println(buf[0])
    mutex_free(m)
    free(buf)
    return 0
}

This prints exactly 5000 because an unlock happens before the lock that next acquires the same mutex, the ordering that makes the guarded memory safe to touch. The Mutex handle is one word and copies freely; every copy names the same lock.

The mutex is the error-checking kind, so the classic pthread misuses (relocking a mutex the thread already holds, unlocking a mutex the thread does not hold, freeing a held mutex, operating on a mutex already freed) fault by name instead of hanging or corrupting.

Condition variables

cond_wait(cv, m) releases the mutex while it sleeps (until cond_signal(cv) wakes one waiter or cond_broadcast(cv) wakes all) and reacquires it before returning. Wakeups can be spurious, so a wait always sits in a loop that rechecks its predicate under the lock:

lock(m)
while buf[5] == 0 {
    cond_wait(notempty, m)
}
// consume under the lock, then
unlock(m)

Free a condition variable only after every waiter has left it. Freeing one a thread still waits on is fatal by name. There is no timed condition wait, so a wait whose predicate never becomes true blocks forever; when you need a timeout, use chan_recv_timeout instead. The bounded.dusk and pingpong.dusk programs in the repo’s examples build a bounded buffer and a strict-alternation protocol from these primitives.
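
As a hedged sketch of the wait-in-a-loop rule, the two functions below hand one value from a producer to a consumer. Everything named here is hypothetical: put, take, and the buffer layout (buf[0] holds the value, buf[1] is a "full" flag) are illustrations, and the handles arrive as parameters because this guide does not show how a Condvar is constructed. The sketch also assumes cond_signal can be called as a plain statement.

```dusk
@paradigm procedural
@import std.concurrent.sync

// Hypothetical layout: buf[0] is the value, buf[1] is 1 while a value waits.
func put(m: Mutex, notempty: Condvar, buf: *raw int64, x: int64) -> void {
    lock(m)
    defer unlock(m)
    buf[0] = x
    buf[1] = 1
    cond_signal(notempty)   // wake one waiter; it rechecks buf[1] itself
}

func take(m: Mutex, notempty: Condvar, buf: *raw int64) -> int64 {
    lock(m)
    defer unlock(m)
    // Recheck the predicate after every wakeup, spurious or not.
    while buf[1] == 0 {
        cond_wait(notempty, m)
    }
    v := buf[0]
    buf[1] = 0
    return v
}
```

Both functions touch the buffer only between lock and unlock, so the flag and the value always change together from the other thread's point of view.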

Atomics

std.concurrent.atomic carries AtomicInt, a sequentially consistent int64 over a heap word: atomic_new, atomic_load, atomic_store, atomic_add (returns the new value), atomic_cas (true when the swap happened), and atomic_free. The handle copies freely; every copy names the same word.

@paradigm procedural
@import std.concurrent.atomic

func main() -> int32 {
    c := atomic_new(0)
    t1, e1 := spawn(lambda () -> void {
        mut i: int64 = 0
        while i < 10000 {
            atomic_add(c, 1)
            i = i + 1
        }
    })
    e1.ignore()
    t2, e2 := spawn(lambda () -> void {
        mut i: int64 = 0
        while i < 10000 {
            atomic_add(c, 1)
            i = i + 1
        }
    })
    e2.ignore()
    j1 := join(t1)
    j1.ignore()
    j2 := join(t2)
    j2.ignore()
    println(atomic_load(c))
    atomic_free(c)
    return 0
}

The total is exactly 20000 because every atomic_add is an indivisible read-modify-write, so no increment is lost; a plain shared int64 could not promise that.
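
Because atomic_add returns the new value, it can also hand out unique numbers. The sketch below is an illustration with hypothetical names (next, out, ticket): two threads each take a ticket and report it through a channel.

```dusk
@paradigm procedural
@import std.concurrent.atomic
@import std.concurrent.channel

func main() -> int32 {
    next := atomic_new(0)
    out: Channel<int64> = chan_new(2)
    t1, e1 := spawn(lambda () -> void {
        ticket := atomic_add(next, 1)   // the new value, unique to this add
        se := chan_send(out, ticket)
        se.ignore()
    })
    e1.ignore()
    t2, e2 := spawn(lambda () -> void {
        ticket := atomic_add(next, 1)
        se := chan_send(out, ticket)
        se.ignore()
    })
    e2.ignore()
    j1 := join(t1)
    j1.ignore()
    j2 := join(t2)
    j2.ignore()
    a, ea := chan_recv(out)
    ea.ignore()
    b, eb := chan_recv(out)
    eb.ignore()
    println(a + b)
    atomic_free(next)
    chan_free(out)
    return 0
}
```

If atomic_add behaves as described above, the two tickets are 1 and 2 in some order, so the printed sum is 3 whichever thread runs first.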

The thread pool

Added in 0.3.3, the pool is a process singleton of OS threads that runs fire-and-forget tasks, the substrate the async releases schedule onto. submit is an always-available builtin like spawn and shares its whole argument rule: one lambda literal of type () -> void, captures copied to a private heap block, the same slice/closure/interface capture ban, and a captured managed pointer borrowed, not owned. It returns only an error; the pool owns the task, so results flow back through a channel.

@paradigm procedural
@import std.concurrent.channel
@import std.concurrent.pool

func main() -> int32 {
    pe := pool_start(ncpu())
    if pe.exists() {
        printerr(pe)
        return 1
    }
    done: Channel<int64> = chan_new(4)
    mut n: int64 = 0
    while n < 3 {
        v := n * 10
        se := submit(lambda () -> void {
            we := chan_send(done, v)
            we.ignore()
        })
        se.ignore()
        n = n + 1
    }
    mut sum: int64 = 0
    mut got: int64 = 0
    while got < 3 {
        v, e := chan_recv_timeout(done, 5)
        if e.exists() {
            // A quiet tick: do a round of other work, then park again.
        } else {
            sum = sum + v
            got = got + 1
        }
    }
    println(sum)
    pool_shutdown()
    chan_free(done)
    return 0
}

pool_start(workers) -> error in std.concurrent.pool starts the singleton with a fixed worker count; ncpu() -> int64 is the natural count. A process gets exactly one successful start, and after pool_shutdown() (which drains everything already queued, joins the workers, and is safe to call from racing threads) the pool stays down for good, so shut it down once, before main returns.

A submit never blocks the submitter; its error exists only when the pool is not running, in which case the task body never runs. Tasks run on many workers at once, so completion order is unrelated to submission order, which is why the example above folds results out of a channel instead of assuming an order. The precise start, submit, and shutdown contracts, including which misuses are fatal, are in the concurrency reference.
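
A short sketch of the not-running case, assuming only what the paragraphs above state: once pool_shutdown() has returned the pool stays down, so a later submit reports an error and its body never runs.

```dusk
@paradigm procedural
@import std.concurrent.pool

func main() -> int32 {
    pe := pool_start(1)
    if pe.exists() {
        printerr(pe)
        return 1
    }
    pool_shutdown()
    // The pool is down for good, so this task is refused.
    se := submit(lambda () -> void {
        println("never runs")
    })
    if se.exists() {
        printerr(se)   // the expected path in this sketch
    }
    return 0
}
```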

The memory model

dusk does not detect data races: two threads touching the same memory, at least one writing, with no sanctioned ordering path between them is undefined behavior, exactly as in C. The sanctioned paths are the ones this guide already used: spawn captures copy, a receive happens after the send that delivered its value, an unlock happens before the next lock of the same mutex, atomics order the accesses they mediate, join orders a whole thread before the joiner continues, and queuing a pool task happens before its body runs. Sharing built by hand out of *raw T buffers has no ordering of its own: unless one of these paths, most often a mutex guarding every touch, orders the accesses, it is a race.
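
To make one sanctioned path concrete, the sketch below relies only on join ordering: the thread writes a raw buffer with no mutex, and main reads it after the join. The helper store_value is hypothetical, written in the style of add_one above, and the sketch assumes the write main makes before the spawn is visible to the spawned thread.

```dusk
@paradigm procedural

// store_value is a hypothetical helper, in the style of add_one above.
func store_value(buf: *raw int64, v: int64) -> void {
    buf[0] = v
}

func main() -> int32 {
    buf: *raw int64 = alloc_bytes(8)
    buf[0] = 0
    t, e := spawn(lambda () -> void {
        store_value(buf, 42)   // the only touch while the thread runs
    })
    if e.exists() {
        printerr(e)
        free(buf)
        return 1
    }
    // Reading buf[0] here, before the join, would race with the write.
    je := join(t)
    je.ignore()
    println(buf[0])            // ordered after the thread's write by join
    free(buf)
    return 0
}
```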

The generational heap stays thread safe (alloc, free, and the dereference check all work from any thread), and in a program whose frees and uses are ordered by a sanctioned path it keeps its guarantee that a use after free faults instead of corrupting memory. In a program that races, the check degrades to a best-effort backstop. The concurrency reference states the full memory model.

Where to go next

Concurrency reference: the precise rules for spawn, join, submit, and the memory model.
std.concurrent: full API listing for channel, sync, atomic, pool, and thread (which carries sleep_ms).
Memory: ownership, move, and the generational heap the thread rules build on.
Errors: the must-handle rule every (T, error) return here follows.