15 Commits

Author SHA1 Message Date
smarm
d432349f99 Update the documentation 2026-05-25 22:14:07 +02:00
smarm
2b85ef60b2 Make preemption knobs configurable; fix unused-variable warnings
Add `Config::alloc_interval()` and `Config::timeslice_cycles()` so
callers can tune preemption sensitivity at runtime. The values flow
through `RuntimeInner` and are written into per-scheduler-thread locals
via a new `configure_preempt()` call at thread startup, keeping the hot
path free of cross-thread coherency traffic.

Fix unused-variable warnings in channel.rs by inlining `current_pid()`
directly into `te!` macro arguments — since the no-op macro arm never
evaluates its argument, no binding is needed at the call site.

Clean up a handful of dead imports exposed by the refactor.
2026-05-25 21:52:16 +02:00
Bench
3da6ffaa77 benches: expose preemption knobs + sweep runner
Config API changes (src/preempt.rs, src/runtime.rs):
- preempt: promote ALLOC_INTERVAL and TIMESLICE_CYCLES from bare consts to
  DEFAULT_ALLOC_INTERVAL / DEFAULT_TIMESLICE_CYCLES; store active values in
  thread-locals set on each actor resume so multiple runtimes can use
  different settings concurrently.
- runtime: add alloc_interval / timeslice_cycles fields to Config; add
  Config::alloc_interval(n) and Config::timeslice_cycles(c) builder methods;
  thread the values through RuntimeInner to the reset_timeslice() call in
  schedule_loop.

Bench changes:
- Add bench_cfg(threads) helper to general/tokio_favored/smarm_favored that
  wraps Config::exact and reads SMARM_ALLOC_INTERVAL / SMARM_TIMESLICE_CYCLES
  env vars, so the sweep script can vary knobs without recompiling.

Sweep tooling (benches/sweep.py):
- 'run':     run the 3-file bench suite once; --save-baseline persists JSON
- 'regress': compare current run against baseline.json, exit 1 on any bench
             that regresses >10% vs stored medians
- 'sweep':   run the full SWEEP_GRID (10 points), print comparison table,
             optional --save-csv; binaries pre-built so no recompile per point

Sweep results (10-point grid, 1-CPU sandbox):
- The preemption knobs have very little effect on this single-CPU machine.
  Most benches move <5% across the entire grid.
- Longer timeslices (tc=600k, tc=1200k) reliably hurt spawn_storm_busy
  (+11-15%) and catch_unwind_panics (+10-12%) because actors hold the
  scheduler mutex longer per timeslice, stalling the storm of joinable tasks.
- Shorter timeslices (tc=150k) give a small improvement on many_timers
  (-3-4%) and a wash everywhere else.
- yield_in_hot_loop and uncontended_channel are essentially flat across all
  knobs — both are scheduling-dominated and call yield_now explicitly, so
  the RDTSC-driven preemption path is irrelevant.
- Conclusion: the knobs matter primarily under contention (multi-core).
  Re-run sweep on a multi-core machine before drawing tuning conclusions.
2026-05-25 13:04:58 +00:00
Bench
6d1c59fb99 benches: baseline results
Two compile fixes:
- tokio_favored.rs bench_mpsc_smarm: consumer spawn closure returned u64 via
  bare 'count' tail expression; smarm::Runtime::run() requires FnOnce()->().
  Fixed to 'let _ = count;'. Same fix on the consumer.join() call site.
- smarm_favored.rs bench_unc_smarm: same pattern, same fix.

Baseline run: Intel Xeon @ 2.80GHz, 1 core, kernel 6.18.5, rustc 1.95.0,
smarm 0.3.0, no RUSTFLAGS. Single-CPU sandbox — N-thread rows identical to
1-thread; scaling sweep limited to 1 thread.

Notable findings:
- deep_recursion: tokio wins (22 vs 62 us); mmap stack alloc cost dominates
  for single-use actors at depth 500.
- yield_in_hot_loop: tokio wins (138 vs 182 ms); smarm mutex overhead on
  yield_now exceeds expected naked-switch advantage on 1 CPU.
- mpsc_contention/uncontended_channel/catch_unwind_panics: smarm wins as
  predicted.
- spawn_storm_busy: smarm 47x slower; global mutex saturated by bg yielders.
2026-05-25 13:04:54 +00:00
Bench
4b348d12be docs: BENCHMARKS_AND_TUNING.md — bench results, knob recommendations, arch guidance 2026-05-25 13:04:50 +00:00
smarm
aeacaf6118 fix: stress testing & stability (v0.6.5)
Improve reliability under high load:
- tests/stress.rs: New comprehensive stress test suite (448 lines)
- Fine-tune I/O & runtime scheduling edge cases
- Pin versions & fix MSRV compatibility
2026-05-24 07:03:45 +00:00
Claude
978678a46e feat: full runtime redesign (v0.6)
Complete rewrite with improved architecture & correctness:
- src/runtime.rs: Simplified task scheduling with proper state transitions
- src/scheduler.rs: Decoupled from runtime, pure task queue logic
- src/io.rs, src/mutex.rs: Refactored for clarity & performance
- New actor model framework (src/actor.rs, src/context.rs)
- Channel primitives (src/channel.rs) & process IDs (src/pid.rs)
- Preemption framework (src/preempt.rs) for fair timeslicing
- Expanded benchmarks & tests (multi_scheduler, primes, runtime)
2026-05-23 16:09:35 +00:00
Claude
078447539c chore: reset working tree (v0.5)
Temporary commit clearing working tree for v0.6 rebuild
2026-05-23 16:09:35 +00:00
Claude
e9fdbb1160 refactor: centralize runtime logic (v0.4)
Extract scheduler responsibilities into a dedicated Runtime component:
- src/runtime.rs: New centralized control flow (669 lines)
- src/scheduler.rs: Simplified to task queue & preemption management
- tests/runtime.rs: Comprehensive runtime test suite
- benches/multi_scheduler.rs: Multi-runtime scheduling benchmarks
- Improves modularity and enables per-runtime configuration
2026-05-23 16:09:32 +00:00
Claude
8cbef1dfc1 feat: I/O and mutex support (v0.3)
Add epoll-based non-blocking I/O and kernel-like mutexes:
- src/io.rs: Complete epoll backend with timeout & error handling
- src/mutex.rs: Fair mutex with waiter queues & parking integration
- Enhanced scheduler to support synchronous I/O blocking
- Comprehensive test suites for I/O (epoll) and mutex behavior
- Documentation: LOOM.md concurrency model & README
2026-05-23 16:09:29 +00:00
Claude
d3ab81b833 preempt: explicit check!() macro for no-alloc loops
Stable Rust emits stack probes inline (subq/movq/jne loop) rather than
calling __rust_probestack, so there's no transparent hook for stack-
frame preemption. Override of __rust_probestack links cleanly but never
runs. Falling back to an explicit check!() that users drop into hot
compute loops.

check!() decrements the same ALLOC_COUNT counter as the heap path, so
both event sources fire timeslice checks at the same rate. Documents
the prep-to-park invariant on maybe_preempt — library code that
registers a wakeup and then parks must keep that window alloc-free and
check-free, or a preemption-driven yield in the middle would lose the
wakeup.
2026-05-22 05:37:04 +00:00
Claude
51bfccc3c2 feat: I/O and mutex support (v0.3)
Add epoll-based non-blocking I/O and kernel-like mutexes:
- src/io.rs: Complete epoll backend with timeout & error handling
- src/mutex.rs: Fair mutex with waiter queues & parking integration
- Enhanced scheduler to support synchronous I/O blocking
- Comprehensive test suites for I/O (epoll) and mutex behavior
- Documentation: LOOM.md concurrency model & README
2026-05-22 05:32:24 +00:00
Claude
2cf75febdc timer: sleep(duration) via min-heap of (deadline, pid)
Adds a BinaryHeap of timer entries on SchedulerState. sleep() inserts
an entry and parks; schedule_loop pops due entries each iteration and
unparks them. When the run queue is empty but timers are pending, the
OS thread sleeps until the soonest deadline.

Single-threaded only; thread::sleep is fine because no other thread
can wake us. The IO thread coming next will need a Condvar or pipe
wakeup to break this OS-sleep early.
2026-05-22 05:22:55 +00:00
Claude
6c48caecab preempt: pub maybe_preempt for shared counter across event sources
v0.2 will add stack-frame entries as a second preemption event source.
Both routes share ALLOC_COUNT so the timeslice check rate is the same
whether the actor is alloc-heavy, frame-heavy, or mixed.
2026-05-22 05:15:12 +00:00
Claude
0e9d9d7d5f v0.1: green-thread actors, supervision, channels, benchmark
Hand-rolled context switching on mmap'd stacks with guard pages,
allocator-driven RDTSC preemption, unbounded MPSC channels, supervision
via per-slot Signal mailboxes, root supervisor as sentinel PID.

Lib + tests + benches clean check/clippy. All 29 tests pass.
Bench: smarm 3.4% over serial baseline, within 160us of tokio
current-thread on prime-counting fan-out.
2026-05-22 05:01:51 +00:00