benches: expose preemption knobs + sweep runner

Config API changes (src/preempt.rs, src/runtime.rs):
- preempt: promote ALLOC_INTERVAL and TIMESLICE_CYCLES from bare consts to
  DEFAULT_ALLOC_INTERVAL / DEFAULT_TIMESLICE_CYCLES; store active values in
  thread-locals set on each actor resume so multiple runtimes can use
  different settings concurrently.
- runtime: add alloc_interval / timeslice_cycles fields to Config; add
  Config::alloc_interval(n) and Config::timeslice_cycles(c) builder methods;
  thread the values through RuntimeInner to the reset_timeslice() call in
  schedule_loop.
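
A minimal sketch of the builder-plus-thread-local pattern described above.
The default values, the field types, and the install_knobs / active_knobs
helpers are assumptions made for illustration; they are not the actual
contents of src/preempt.rs or src/runtime.rs.

```rust
use std::cell::Cell;

// Placeholder defaults (assumed); the real values live in src/preempt.rs.
pub const DEFAULT_ALLOC_INTERVAL: u32 = 128;
pub const DEFAULT_TIMESLICE_CYCLES: u64 = 300_000;

thread_local! {
    // Active knob values for whichever runtime is driving this thread.
    static ALLOC_INTERVAL: Cell<u32> = Cell::new(DEFAULT_ALLOC_INTERVAL);
    static TIMESLICE_CYCLES: Cell<u64> = Cell::new(DEFAULT_TIMESLICE_CYCLES);
}

/// Stand-in for the runtime Config; only the two new knobs are modelled.
#[derive(Clone, Copy, Debug)]
pub struct Config {
    alloc_interval: u32,
    timeslice_cycles: u64,
}

impl Default for Config {
    fn default() -> Self {
        Config {
            alloc_interval: DEFAULT_ALLOC_INTERVAL,
            timeslice_cycles: DEFAULT_TIMESLICE_CYCLES,
        }
    }
}

impl Config {
    /// Builder method: override the allocation-count interval.
    pub fn alloc_interval(mut self, n: u32) -> Self {
        self.alloc_interval = n;
        self
    }

    /// Builder method: override the timeslice length in cycles.
    pub fn timeslice_cycles(mut self, c: u64) -> Self {
        self.timeslice_cycles = c;
        self
    }
}

/// Hypothetical hook: the scheduler would call this on each actor resume so
/// the preemption checks see the settings of the runtime that owns the actor.
pub fn install_knobs(cfg: &Config) {
    ALLOC_INTERVAL.with(|v| v.set(cfg.alloc_interval));
    TIMESLICE_CYCLES.with(|v| v.set(cfg.timeslice_cycles));
}

/// Read back the values currently active on this thread.
pub fn active_knobs() -> (u32, u64) {
    (
        ALLOC_INTERVAL.with(|v| v.get()),
        TIMESLICE_CYCLES.with(|v| v.get()),
    )
}
```

Because the active values are per-thread, a second runtime running on other
threads keeps its own settings, which is the property the change relies on.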

Bench changes:
- Add bench_cfg(threads) helper to general/tokio_favored/smarm_favored that
  wraps Config::exact and reads SMARM_ALLOC_INTERVAL / SMARM_TIMESLICE_CYCLES
  env vars, so the sweep script can vary knobs without recompiling.
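
The env-var lookup could look roughly like the sketch below. Knobs,
parse_knob and knobs_from_env are hypothetical names; the real bench_cfg
wraps Config::exact, which is not reproduced here. The sketch assumes that
an unset or unparsable variable falls back to the default.

```rust
use std::env;
use std::str::FromStr;

/// Optional overrides read from the environment (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Knobs {
    pub alloc_interval: Option<u32>,
    pub timeslice_cycles: Option<u64>,
}

/// Parse one raw value; a missing or malformed value yields None so the
/// caller can keep the compiled-in default.
pub fn parse_knob<T: FromStr>(raw: Option<&str>) -> Option<T> {
    raw.and_then(|s| s.trim().parse().ok())
}

/// Read both knobs from the variables named in this commit.
pub fn knobs_from_env() -> Knobs {
    Knobs {
        alloc_interval: parse_knob(env::var("SMARM_ALLOC_INTERVAL").ok().as_deref()),
        timeslice_cycles: parse_knob(env::var("SMARM_TIMESLICE_CYCLES").ok().as_deref()),
    }
}
```

A bench_cfg(threads) helper would then start from Config::exact(threads) and
apply each Some value through the corresponding builder method.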

Sweep tooling (benches/sweep.py):
- 'run':     run the 3-file bench suite once; --save-baseline persists JSON
- 'regress': compare current run against baseline.json, exit 1 on any bench
             that regresses >10% vs stored medians
- 'sweep':   run the full SWEEP_GRID (10 points), print comparison table,
             optional --save-csv; binaries pre-built so no recompile per point
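
The 'regress' rule (fail when any median is more than 10% slower than the
stored baseline) is sketched below in Rust to match the rest of the crate.
sweep.py itself is Python; the function names and the map-based data layout
are assumptions, not its actual code.

```rust
use std::collections::BTreeMap;

/// Fractional slowdown above which a bench counts as a regression.
pub const REGRESS_THRESHOLD: f64 = 0.10;

/// Return (bench name, fractional slowdown) for every bench whose current
/// median exceeds the baseline median by more than the threshold.
/// Benches missing from the current run are skipped.
pub fn regressions(
    baseline: &BTreeMap<String, f64>,
    current: &BTreeMap<String, f64>,
) -> Vec<(String, f64)> {
    let mut out = Vec::new();
    for (name, &base) in baseline {
        if let Some(&cur) = current.get(name) {
            // Skip non-positive baselines to avoid dividing by zero.
            if base > 0.0 {
                let delta = (cur - base) / base;
                if delta > REGRESS_THRESHOLD {
                    out.push((name.clone(), delta));
                }
            }
        }
    }
    out
}

/// Process exit status: 1 if anything regressed, 0 otherwise.
pub fn exit_code(regs: &[(String, f64)]) -> i32 {
    if regs.is_empty() { 0 } else { 1 }
}
```

For example, a baseline median of 127075 µs against a made-up current median
of 146136 µs is about 15% slower and would be flagged, while any run that is
faster than its baseline never is.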

Sweep results (10-point grid, 1-CPU sandbox):
- The preemption knobs have very little effect on this single-CPU machine.
  Most benches move <5% across the entire grid.
- Longer timeslices (tc=600k, tc=1200k) reliably slow down spawn_storm_busy
  (11-15% slower) and catch_unwind_panics (10-12% slower) because actors hold
  the scheduler mutex longer per timeslice, stalling the storm of joinable
  tasks.
- Shorter timeslices (tc=150k) give a small improvement on many_timers
  (3-4% faster) and are a wash everywhere else.
- yield_in_hot_loop and uncontended_channel are essentially flat across all
  knobs: both are scheduling-dominated and call yield_now explicitly, so
  the RDTSC-driven preemption path is irrelevant.
- Conclusion: on a single CPU the knobs barely matter. They probably matter
  mainly under multi-core contention, which this sandbox cannot exercise, so
  re-run the sweep on a multi-core machine before drawing tuning conclusions.
Bench
2026-05-24 11:48:15 +00:00
committed by smarm
parent 6d1c59fb99
commit 3da6ffaa77
15 changed files with 2315 additions and 8 deletions


@@ -0,0 +1,126 @@
smarm general benchmarks
available parallelism: 1 threads
ITERS=15 (+1 warmup, discarded)
CHAIN_DEPTH=1000, YIELD_TASKS=200×1000, PRIME_N=400000/64 workers, PP_ROUNDS=1000
================================================================================
chained_spawn: depth 1000
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 1000 | 8469 | 8414 | 8717
smarm 1-thread | 1000 | 8625 | 8479 | 10212
tokio current_thread | 1000 | 124 | 123 | 175
tokio multi-thread | 1000 | 194 | 184 | 317
================================================================================
yield_many: 200 tasks × 1000 yields
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 200000 | 41949 | 41419 | 43784
smarm 1-thread | 200000 | 42005 | 41491 | 45224
tokio current_thread | 200000 | 15139 | 15049 | 16352
tokio multi-thread | 200000 | 15985 | 15931 | 16306
================================================================================
fan_out_compute: primes in [2, 400000) across 64
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 33860 | 29640 | 29515 | 31229
smarm 1-thread | 33860 | 29777 | 29642 | 30056
tokio current_thread | 33860 | 28704 | 28584 | 30317
tokio multi-thread | 33860 | 34870 | 34569 | 35876
================================================================================
ping_pong_oneshot: 1000 rounds
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 1000 | 17098 | 16968 | 18688
smarm 1-thread | 1000 | 16918 | 16736 | 17326
tokio current_thread | 1000 | 915 | 882 | 1000
tokio multi-thread | 1000 | 4371 | 4265 | 4834
smarm tokio-favored benchmarks
available parallelism: 1 threads
ITERS=15 (+1 warmup, discarded)
STORM_BACKGROUND=8, STORM_SPAWN=10000, MPSC=32×10000, TIMER_ACTORS=10000 (110 ms), SCALING_N=400000/64
================================================================================
spawn_storm_busy: 8 bg yielders + 10000 zero-work spawns
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 10000 | 127075 | 124760 | 130259
smarm 1-thread | 10000 | 125976 | 125121 | 128728
tokio current_thread | 10000 | 2703 | 2646 | 2807
tokio multi-thread | 10000 | 7201 | 4267 | 12853
================================================================================
mpsc_contention: 32 producers × 10000 msgs → 1 consumer
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 320000 | 9116 | 8985 | 9237
smarm 1-thread | 320000 | 9062 | 8947 | 10648
tokio current_thread | 320000 | 17380 | 17192 | 18363
tokio multi-thread | 320000 | 17854 | 17554 | 18219
================================================================================
many_timers: 10000 actors sleeping 110 ms
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 10000 | 137944 | 132081 | 141862
smarm 1-thread | 10000 | 143773 | 137448 | 153703
tokio current_thread | 10000 | 14174 | 13751 | 15079
tokio multi-thread | 10000 | 15244 | 14625 | 16700
================================================================================
multi_thread_scaling: primes in [2, 400000) across 64 workers
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 33860 | 30832 | 30082 | 33360
tokio multi 1-thread | 33860 | 29736 | 29321 | 29958
smarm smarm-favored benchmarks
available parallelism: 1 threads
ITERS=15 (+1 warmup, discarded)
RECURSE_DEPTH=500, HOT_YIELDS=500000×2, UNCONT_MSGS=1000000, PANIC_TASKS=10000
================================================================================
deep_recursion: depth 500
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 1 | 84 | 78 | 122
smarm 1-thread | 1 | 90 | 79 | 157
tokio current_thread | 1 | 25 | 25 | 31
tokio multi-thread | 1 | 48 | 47 | 62
================================================================================
yield_in_hot_loop: 2 actors × 500000 yields (single thread)
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 1000000 | 190830 | 188562 | 196621
tokio current_thread | 1000000 | 151537 | 150038 | 165825
================================================================================
uncontended_channel: 1→1, 1000000 msgs (single thread)
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 1000000 | 27265 | 26969 | 29317
tokio current_thread | 1000000 | 53894 | 53380 | 56189
================================================================================
catch_unwind_panics: 10000 tasks, 50% panic
================================================================================
runtime | result | median µs | min µs | max µs
--------------------------------------------------------------------------------
smarm 1-thread | 10000 | 145006 | 144092 | 149002
smarm 1-thread | 10000 | 144417 | 142000 | 148224
tokio current_thread | 10000 | 265376 | 260227 | 272279
tokio multi-thread | 10000 | 277432 | 270860 | 283266