
Benchandsmarm 3da6ffaa77 benches: expose preemption knobs + sweep runner

Config API changes (src/preempt.rs, src/runtime.rs):
- preempt: promote ALLOC_INTERVAL and TIMESLICE_CYCLES from bare consts to
  DEFAULT_ALLOC_INTERVAL / DEFAULT_TIMESLICE_CYCLES; store active values in
  thread-locals set on each actor resume so multiple runtimes can use
  different settings concurrently.
- runtime: add alloc_interval / timeslice_cycles fields to Config; add
  Config::alloc_interval(n) and Config::timeslice_cycles(c) builder methods;
  thread the values through RuntimeInner to the reset_timeslice() call in
  schedule_loop.
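
A minimal, self-contained sketch of the shape described above, not the code in src/preempt.rs or src/runtime.rs. The default values, the integer types, and the helpers `install_knobs` and `active_knobs` are assumptions for illustration; only the names `Config::exact`, `Config::alloc_interval`, `Config::timeslice_cycles`, `DEFAULT_ALLOC_INTERVAL` and `DEFAULT_TIMESLICE_CYCLES` come from the commit message.

```rust
use std::cell::Cell;

// Placeholder defaults; the real values live in src/preempt.rs.
pub const DEFAULT_ALLOC_INTERVAL: u32 = 128;
pub const DEFAULT_TIMESLICE_CYCLES: u64 = 300_000;

thread_local! {
    // Active knob values for whichever runtime is driving this OS thread.
    static ALLOC_INTERVAL: Cell<u32> = Cell::new(DEFAULT_ALLOC_INTERVAL);
    static TIMESLICE_CYCLES: Cell<u64> = Cell::new(DEFAULT_TIMESLICE_CYCLES);
}

#[derive(Clone, Copy, Debug)]
pub struct Config {
    threads: usize,
    alloc_interval: u32,
    timeslice_cycles: u64,
}

impl Config {
    /// Fixed thread count, default preemption knobs.
    pub fn exact(threads: usize) -> Self {
        Config {
            threads,
            alloc_interval: DEFAULT_ALLOC_INTERVAL,
            timeslice_cycles: DEFAULT_TIMESLICE_CYCLES,
        }
    }

    /// Builder: override the allocation-count knob.
    pub fn alloc_interval(mut self, n: u32) -> Self {
        self.alloc_interval = n;
        self
    }

    /// Builder: override the timeslice length in cycles.
    pub fn timeslice_cycles(mut self, c: u64) -> Self {
        self.timeslice_cycles = c;
        self
    }

    pub fn threads(&self) -> usize {
        self.threads
    }
}

/// Hypothetical stand-in for the step done on each actor resume:
/// copy the runtime's knobs into this thread's thread-locals.
pub fn install_knobs(cfg: &Config) {
    ALLOC_INTERVAL.with(|v| v.set(cfg.alloc_interval));
    TIMESLICE_CYCLES.with(|v| v.set(cfg.timeslice_cycles));
}

/// Hypothetical accessor: the knobs currently active on this thread.
pub fn active_knobs() -> (u32, u64) {
    (
        ALLOC_INTERVAL.with(|v| v.get()),
        TIMESLICE_CYCLES.with(|v| v.get()),
    )
}
```

Because the active values are per-thread, a second runtime on another OS thread keeps its own settings, which is the property the commit relies on.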

Bench changes:
- Add bench_cfg(threads) helper to general/tokio_favored/smarm_favored that
  wraps Config::exact and reads SMARM_ALLOC_INTERVAL / SMARM_TIMESLICE_CYCLES
  env vars, so the sweep script can vary knobs without recompiling.
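
A hedged sketch of what such a helper could look like. The `Config` here is a stand-in struct so the example compiles on its own; `parse_knob` and `bench_cfg_from` are hypothetical names, and treating an unset or unparsable variable as "use the default" is an assumption, not confirmed behaviour.

```rust
use std::env;

/// Stand-in for the runtime's Config; `None` means "keep the default".
#[derive(Debug, PartialEq)]
pub struct Config {
    pub threads: usize,
    pub alloc_interval: Option<u64>,
    pub timeslice_cycles: Option<u64>,
}

/// Parse one knob; missing or malformed input yields `None`.
pub fn parse_knob(raw: Option<&str>) -> Option<u64> {
    raw.and_then(|s| s.trim().parse().ok())
}

/// Pure core of the helper, separated from the environment for testing.
pub fn bench_cfg_from(threads: usize, ai: Option<&str>, tc: Option<&str>) -> Config {
    Config {
        threads,
        alloc_interval: parse_knob(ai),
        timeslice_cycles: parse_knob(tc),
    }
}

/// Read the two env vars named in the commit and build a Config.
pub fn bench_cfg(threads: usize) -> Config {
    let ai = env::var("SMARM_ALLOC_INTERVAL").ok();
    let tc = env::var("SMARM_TIMESLICE_CYCLES").ok();
    bench_cfg_from(threads, ai.as_deref(), tc.as_deref())
}
```

Reading the knobs at bench start-up is what lets a pre-built binary be re-run at each grid point with only the environment changed.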

Sweep tooling (benches/sweep.py):
- 'run':     run the 3-file bench suite once; --save-baseline persists JSON
- 'regress': compare current run against baseline.json, exit 1 on any bench
             that regresses >10% vs stored medians
- 'sweep':   run the full SWEEP_GRID (10 points), print comparison table,
             optional --save-csv; binaries pre-built so no recompile per point
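
The 'regress' rule can be illustrated as follows. The real tool is benches/sweep.py; this is a Rust rendering of the comparison only, with a hypothetical function name and a map of bench name to median time standing in for baseline.json. Skipping benches that are missing from the current run is an assumption.

```rust
use std::collections::BTreeMap;

/// Return (bench name, relative slowdown) for every bench whose current
/// median exceeds the baseline median by more than `threshold` (0.10 = 10%).
pub fn regressions(
    baseline: &BTreeMap<String, f64>,
    current: &BTreeMap<String, f64>,
    threshold: f64,
) -> Vec<(String, f64)> {
    let mut out = Vec::new();
    for (name, &base) in baseline {
        // Benches absent from the current run are skipped (assumption).
        if let Some(&cur) = current.get(name) {
            if base > 0.0 {
                let delta = (cur - base) / base;
                if delta > threshold {
                    out.push((name.clone(), delta));
                }
            }
        }
    }
    out
}
```

A caller would exit with status 1 when the returned list is non-empty, matching the behaviour described for 'regress'.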

Sweep results (10-point grid, 1-CPU sandbox):
- The preemption knobs have very little effect on this single-CPU machine.
  Most benches move <5% across the entire grid.
- Longer timeslices (tc=600k, tc=1200k) reliably hurt spawn_storm_busy
  (+11-15%) and catch_unwind_panics (+10-12%) because actors hold the
  scheduler mutex longer per timeslice, stalling the storm of joinable tasks.
- Shorter timeslices (tc=150k) give a small improvement on many_timers
  (-3-4%) and a wash everywhere else.
- yield_in_hot_loop and uncontended_channel are essentially flat across all
  knobs — both are scheduling-dominated and call yield_now explicitly, so
  the RDTSC-driven preemption path is irrelevant.
- Conclusion: the knobs matter primarily under contention (multi-core).
  Re-run sweep on a multi-core machine before drawing tuning conclusions.

2026-05-25 13:04:58 +00:00

benches

benches: expose preemption knobs + sweep runner

2026-05-25 13:04:58 +00:00

src

fix: stress testing & stability (v0.6.5)

2026-05-24 07:03:45 +00:00

tests

fix: stress testing & stability (v0.6.5)

2026-05-24 07:03:45 +00:00

.gitignore

fix: stress testing & stability (v0.6.5)

2026-05-24 07:03:45 +00:00

BENCHMARKS_AND_TUNING.md

docs: BENCHMARKS_AND_TUNING.md — bench results, knob recommendations, arch guidance

2026-05-25 13:04:50 +00:00

benchmarks.md

benches: baseline results

2026-05-25 13:04:54 +00:00

Cargo.toml

feat: full runtime redesign (v0.6)

2026-05-23 16:09:35 +00:00

LOOM.md

feat: full runtime redesign (v0.6)

2026-05-23 16:09:35 +00:00

README.md

feat: full runtime redesign (v0.6)

2026-05-23 16:09:35 +00:00

README.md

smarm

Silly Marks Abstract Rust Machine. A prototype green-thread actor runtime for Rust.

Implements the core ideas in LOOM.md: green-thread actors on a shared heap, scheduled cooperatively, communicating only by Send messages. Erlang's isolation model without Erlang's copying GC, Rust's zero-copy ownership transfers without async's function colouring.

The scheduler is multi-threaded — one OS thread per available CPU, all drawing from a shared run queue. The single-threaded run() entry point is kept as a convenience wrapper around runtime::init(Config::exact(1)).run(f).

What's here

| Module | What it does |
| --- | --- |
| `stack` | `mmap`'d growable stack with guard page; SIGSEGV on overflow |
| `context` | `#[naked]` x86-64 context-switch shims, callee-saved regs only |
| `preempt` | Allocator-driven preemption; `check!()` macro for no-alloc loops |
| `pid` | `(index, generation)` PIDs; stale handles are detectable, not silent |
| `actor` | Trampoline + `catch_unwind` boundary at the actor entry point |
| `scheduler` | Run queue, slot table, spawn/join, parking, idle path |
| `channel` | Unbounded MPSC channel; `recv` parks the actor |
| `mutex` | `Mutex<T>` with mandatory timeout; FIFO waiters; parks the green thread |
| `timer` | Min-heap of `(deadline, reason)`; `Sleep` and `WaitTimeout` reasons |
| `io` | `block_on_io` for blocking work; `wait_readable`/`wait_writable` + `read`/`write` via epoll |
| `supervisor` | `Signal::Exit` / `Signal::Panic` delivered to a parent actor's mailbox |
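
To make the `pid` row concrete, here is a minimal sketch of how an `(index, generation)` handle lets a slot table detect stale PIDs. It illustrates the general generational-index technique and is not taken from smarm's source; `SlotTable`, `allocate`, `release` and `is_live` are hypothetical names.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Pid {
    index: u32,
    generation: u32,
}

struct Slot {
    generation: u32,
    occupied: bool,
}

pub struct SlotTable {
    slots: Vec<Slot>,
    free: Vec<u32>, // indices available for reuse
}

impl SlotTable {
    pub fn new() -> Self {
        SlotTable { slots: Vec::new(), free: Vec::new() }
    }

    /// Hand out a PID, reusing a freed slot when one exists.
    pub fn allocate(&mut self) -> Pid {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index as usize];
            slot.occupied = true;
            Pid { index, generation: slot.generation }
        } else {
            let index = self.slots.len() as u32;
            self.slots.push(Slot { generation: 0, occupied: true });
            Pid { index, generation: 0 }
        }
    }

    /// Free a slot and bump its generation so old handles stop matching.
    /// Returns false if the PID was already stale.
    pub fn release(&mut self, pid: Pid) -> bool {
        if !self.is_live(pid) {
            return false;
        }
        let slot = &mut self.slots[pid.index as usize];
        slot.occupied = false;
        slot.generation = slot.generation.wrapping_add(1);
        self.free.push(pid.index);
        true
    }

    /// A PID is live only if its slot is occupied by the same generation.
    pub fn is_live(&self, pid: Pid) -> bool {
        self.slots
            .get(pid.index as usize)
            .map_or(false, |s| s.occupied && s.generation == pid.generation)
    }
}
```

After a slot is reused, the old handle has the same index but an older generation, so a lookup fails explicitly instead of silently addressing the new occupant.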

Quick taste

use smarm::{run, spawn, channel};

run(|| {
    let (tx, rx) = channel::<i64>();
    let h = spawn(move || {
        for _ in 0..3 {
            let v = rx.recv().unwrap();
            println!("got {v}");
        }
    });
    for v in 1..=3i64 {
        tx.send(v).unwrap();
    }
    h.join().unwrap();
});

Layout

src/
  stack.rs context.rs preempt.rs pid.rs actor.rs
  scheduler.rs channel.rs mutex.rs timer.rs io.rs supervisor.rs
  lib.rs
tests/
  per-module integration tests
benches/
  primes.rs    fan-out/fan-in compute, vs tokio current_thread
LOOM.md        design intent

Building and running

Standard Cargo. Requires Rust 1.95 or newer (the #[naked] attribute went stable in 1.88; we use a few unrelated post-1.88 features). x86-64 Linux only — ARM64 and macOS are on the deferred list because of the assembly shim and the epoll dependency.

cargo test                # all tests
cargo test --test mutex   # one module
cargo bench               # primes benchmark vs tokio

What's not here

See the Defer section of LOOM.md. Notable absences: supervisor restart-intensity caps, join! for handle groups, stack growth via remap, hierarchical timer wheel, fd-wait timeouts, Signal::Timeout. Each is a mechanism we know how to add; none belongs in this iteration.