Bench 3da6ffaa77 benches: expose preemption knobs + sweep runner

Config API changes (src/preempt.rs, src/runtime.rs):
- preempt: promote ALLOC_INTERVAL and TIMESLICE_CYCLES from bare consts to
  DEFAULT_ALLOC_INTERVAL / DEFAULT_TIMESLICE_CYCLES; store active values in
  thread-locals set on each actor resume so multiple runtimes can use
  different settings concurrently.
- runtime: add alloc_interval / timeslice_cycles fields to Config; add
  Config::alloc_interval(n) and Config::timeslice_cycles(c) builder methods;
  thread the values through RuntimeInner to the reset_timeslice() call in
  schedule_loop.
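
A minimal, std-only sketch of the shape described above: defaults as consts, builder methods on Config, and active values held in thread-locals. The field types, the default numbers, and the helper names install_knobs / active_knobs are assumptions made for illustration; the real code threads the values through RuntimeInner and reset_timeslice(), which is not modelled here.

```rust
use std::cell::Cell;

// Placeholder defaults for illustration; smarm's real values may differ.
pub const DEFAULT_ALLOC_INTERVAL: u64 = 128;
pub const DEFAULT_TIMESLICE_CYCLES: u64 = 300_000;

thread_local! {
    // Active knob values for whichever runtime is driving this OS thread.
    static ALLOC_INTERVAL: Cell<u64> = Cell::new(DEFAULT_ALLOC_INTERVAL);
    static TIMESLICE_CYCLES: Cell<u64> = Cell::new(DEFAULT_TIMESLICE_CYCLES);
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Config {
    threads: usize,
    alloc_interval: u64,
    timeslice_cycles: u64,
}

impl Config {
    /// Start from the defaults with an exact worker-thread count.
    pub fn exact(threads: usize) -> Self {
        Config {
            threads,
            alloc_interval: DEFAULT_ALLOC_INTERVAL,
            timeslice_cycles: DEFAULT_TIMESLICE_CYCLES,
        }
    }

    pub fn threads(&self) -> usize {
        self.threads
    }

    /// Builder: override the allocation interval.
    pub fn alloc_interval(mut self, n: u64) -> Self {
        self.alloc_interval = n;
        self
    }

    /// Builder: override the timeslice length in cycles.
    pub fn timeslice_cycles(mut self, c: u64) -> Self {
        self.timeslice_cycles = c;
        self
    }
}

/// Hypothetical stand-in for the per-resume step: copy this runtime's
/// knobs into the current thread's thread-locals.
pub fn install_knobs(cfg: &Config) {
    ALLOC_INTERVAL.with(|v| v.set(cfg.alloc_interval));
    TIMESLICE_CYCLES.with(|v| v.set(cfg.timeslice_cycles));
}

/// Read back the values active on the current thread.
pub fn active_knobs() -> (u64, u64) {
    (
        ALLOC_INTERVAL.with(|v| v.get()),
        TIMESLICE_CYCLES.with(|v| v.get()),
    )
}
```

Because the active values are per-thread, a worker that has called install_knobs for one runtime does not disturb a worker thread belonging to another runtime, which is the property the thread-local storage is meant to provide.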

Bench changes:
- Add bench_cfg(threads) helper to general/tokio_favored/smarm_favored that
  wraps Config::exact and reads SMARM_ALLOC_INTERVAL / SMARM_TIMESLICE_CYCLES
  env vars, so the sweep script can vary knobs without recompiling.
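
A sketch of how such a helper might read the two variables, using only std. The function names parse_knob, env_knob and bench_knobs, the default numbers, and the choice to fall back silently on a missing or malformed value are all assumptions; the real bench_cfg wraps Config::exact and may behave differently.

```rust
use std::env;

// Placeholder defaults for illustration; not taken from smarm.
const DEFAULT_ALLOC_INTERVAL: u64 = 128;
const DEFAULT_TIMESLICE_CYCLES: u64 = 300_000;

/// Parse one knob value; a missing or unparsable value falls back to `default`.
pub fn parse_knob(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(default)
}

/// Read a knob from the environment by variable name.
pub fn env_knob(name: &str, default: u64) -> u64 {
    parse_knob(env::var(name).ok().as_deref(), default)
}

/// Hypothetical stand-in for bench_cfg: returns
/// (threads, alloc_interval, timeslice_cycles) instead of a real Config.
pub fn bench_knobs(threads: usize) -> (usize, u64, u64) {
    (
        threads,
        env_knob("SMARM_ALLOC_INTERVAL", DEFAULT_ALLOC_INTERVAL),
        env_knob("SMARM_TIMESLICE_CYCLES", DEFAULT_TIMESLICE_CYCLES),
    )
}
```

With a helper of this shape, a sweep point would presumably be selected by setting the variables in the environment of an already-built bench binary, so no rebuild is needed between points.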

Sweep tooling (benches/sweep.py):
- 'run':     run the 3-file bench suite once; --save-baseline persists JSON
- 'regress': compare current run against baseline.json, exit 1 on any bench
             that regresses >10% vs stored medians
- 'sweep':   run the full SWEEP_GRID (10 points), print comparison table,
             optional --save-csv; binaries pre-built so no recompile per point
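
The 'regress' rule can be sketched as follows. This is a Rust illustration of the comparison only, not the Python in benches/sweep.py; the function names, the map-of-medians representation, and the handling of benches missing from the current run (skipped) are assumptions.

```rust
use std::collections::BTreeMap;

/// A bench regresses when its median is more than 10% slower than baseline.
pub const REGRESS_THRESHOLD: f64 = 0.10;

/// Return (bench name, relative slowdown) for every bench whose current
/// median exceeds the baseline median by more than the threshold.
/// Benches absent from `current`, or with a non-positive baseline, are skipped.
pub fn regressions(
    baseline: &BTreeMap<String, f64>,
    current: &BTreeMap<String, f64>,
) -> Vec<(String, f64)> {
    let mut out = Vec::new();
    for (name, &base) in baseline {
        if let Some(&cur) = current.get(name) {
            if base > 0.0 {
                let delta = (cur - base) / base;
                if delta > REGRESS_THRESHOLD {
                    out.push((name.clone(), delta));
                }
            }
        }
    }
    out
}

/// Process exit code: 1 if anything regressed, 0 otherwise.
pub fn exit_code(regs: &[(String, f64)]) -> i32 {
    if regs.is_empty() {
        0
    } else {
        1
    }
}
```

Comparing medians rather than means keeps a single noisy iteration from tripping the check, which matters on a shared sandbox.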

Sweep results (10-point grid, 1-CPU sandbox):
- The preemption knobs have very little effect on this single-CPU machine.
  Most benches move <5% across the entire grid.
- Longer timeslices (tc=600k, tc=1200k) reliably hurt spawn_storm_busy
  (+11-15%) and catch_unwind_panics (+10-12%) because actors hold the
  scheduler mutex longer per timeslice, stalling the storm of joinable tasks.
- Shorter timeslices (tc=150k) give a small improvement on many_timers
  (3-4% faster) and are a wash everywhere else.
- yield_in_hot_loop and uncontended_channel are essentially flat across all
  knobs — both are scheduling-dominated and call yield_now explicitly, so
  the RDTSC-driven preemption path is irrelevant.
- Conclusion: on one CPU the knobs barely matter. They are expected to
  matter mainly under multi-core contention, which this sandbox cannot
  exercise. Re-run the sweep on a multi-core machine before drawing tuning
  conclusions.
2026-05-25 13:04:58 +00:00

smarm

Silly Marks Abstract Rust Machine. A prototype green-thread actor runtime for Rust.

Implements the core ideas in LOOM.md: green-thread actors on a shared heap, scheduled cooperatively, communicating only by Send messages. Erlang's isolation model without Erlang's copying GC, Rust's zero-copy ownership transfers without async's function colouring.

The scheduler is multi-threaded — one OS thread per available CPU, all drawing from a shared run queue. The single-threaded run() entry point is kept as a convenience wrapper around runtime::init(Config::exact(1)).run(f).

What's here

Module      What it does
stack       mmap'd growable stack with guard page; SIGSEGV on overflow
context     #[naked] x86-64 context-switch shims, callee-saved regs only
preempt     Allocator-driven preemption; check!() macro for no-alloc loops
pid         (index, generation) PIDs; stale handles are detectable, not silent
actor       Trampoline + catch_unwind boundary at the actor entry point
scheduler   Run queue, slot table, spawn/join, parking, idle path
channel     Unbounded MPSC channel; recv parks the actor
mutex       Mutex<T> with mandatory timeout; FIFO waiters; parks the green thread
timer       Min-heap of (deadline, reason); Sleep and WaitTimeout reasons
io          block_on_io for blocking work; wait_readable/wait_writable + read/write via epoll
supervisor  Signal::Exit / Signal::Panic delivered to a parent actor's mailbox
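
To illustrate the (index, generation) idea from the pid row, here is a small std-only sketch of a generational slot table. It is not smarm's pid or scheduler code; the type names, field widths, and free-list policy are assumptions chosen to show why a stale handle is detected rather than silently aliasing a newer actor.

```rust
/// A handle made of a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Pid {
    index: u32,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

/// Hypothetical slot table: slots are reused, generations are bumped on removal.
pub struct SlotTable<T> {
    slots: Vec<Slot<T>>,
    free: Vec<u32>,
}

impl<T> SlotTable<T> {
    pub fn new() -> Self {
        SlotTable { slots: Vec::new(), free: Vec::new() }
    }

    /// Store a value, reusing a free slot if one exists.
    pub fn insert(&mut self, value: T) -> Pid {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index as usize];
            slot.value = Some(value);
            Pid { index, generation: slot.generation }
        } else {
            let index = self.slots.len() as u32;
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Pid { index, generation: 0 }
        }
    }

    /// Look up a value; a handle from an older generation yields None.
    pub fn get(&self, pid: Pid) -> Option<&T> {
        let slot = self.slots.get(pid.index as usize)?;
        if slot.generation == pid.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }

    /// Remove a value and bump the generation so old handles go stale.
    pub fn remove(&mut self, pid: Pid) -> Option<T> {
        let slot = self.slots.get_mut(pid.index as usize)?;
        if slot.generation != pid.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation = slot.generation.wrapping_add(1);
        self.free.push(pid.index);
        Some(value)
    }
}
```

After a remove, the same index can be handed out again, but the old Pid carries the previous generation, so get and remove return None for it instead of touching the new occupant.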

Quick taste

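use smarm::{run, spawn, channel};

run(|| {
    // Unbounded MPSC channel carrying i64 values.
    let (tx, rx) = channel::<i64>();
    // The spawned actor takes ownership of the receiver.
    let h = spawn(move || {
        for _ in 0..3 {
            // recv parks this actor until a message arrives.
            let v = rx.recv().unwrap();
            println!("got {v}");
        }
    });
    for v in 1..=3i64 {
        tx.send(v).unwrap();
    }
    // Wait for the actor to finish.
    h.join().unwrap();
});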

Layout

src/
  stack.rs context.rs preempt.rs pid.rs actor.rs
  scheduler.rs channel.rs mutex.rs timer.rs io.rs supervisor.rs
  lib.rs
tests/
  per-module integration tests
benches/
  primes.rs    fan-out/fan-in compute, vs tokio current_thread
LOOM.md        design intent

Building and running

Standard Cargo. Requires Rust 1.95 or newer (naked functions, written #[unsafe(naked)], went stable in 1.88; we use a few unrelated post-1.88 features). x86-64 Linux only: ARM64 and macOS are on the deferred list because of the assembly shim and the epoll dependency.

cargo test                # all tests
cargo test --test mutex   # one module
cargo bench               # primes benchmark vs tokio

What's not here

See the Defer section of LOOM.md. Notable absences: supervisor restart-intensity caps, join! for handle groups, stack growth via remap, hierarchical timer wheel, fd-wait timeouts, Signal::Timeout. Each is a mechanism we know how to add; none belongs in this iteration.

Description
SMARM - Smarm, Marks Actor Runtime Machinery