Compare commits
2 Commits
3da6ffaa77
...
master
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
d432349f99 | ||
|
|
2b85ef60b2 |
1
.gitignore
vendored
1
.gitignore
vendored
@@ -1,2 +1,3 @@
|
|||||||
target
|
target
|
||||||
Cargo.lock
|
Cargo.lock
|
||||||
|
smarm_trace.json
|
||||||
|
|||||||
@@ -12,7 +12,7 @@ libc = "0.2"
|
|||||||
|
|
||||||
[dev-dependencies]
|
[dev-dependencies]
|
||||||
libc = "0.2"
|
libc = "0.2"
|
||||||
tokio = { version = "1", features = ["rt", "rt-multi-thread", "macros", "sync"] }
|
tokio = { version = "1", features = ["rt", "rt-multi-thread", "macros", "sync", "time"] }
|
||||||
|
|
||||||
[profile.dev]
|
[profile.dev]
|
||||||
panic = "unwind"
|
panic = "unwind"
|
||||||
|
|||||||
26
README.md
26
README.md
@@ -1,8 +1,8 @@
|
|||||||
# smarm
|
# smarm
|
||||||
|
|
||||||
> Silly Marks Abstract Rust Machine. A prototype green-thread actor runtime for Rust.
|
> SMARM — Smarm, Marks Actor Runtime Machinery. A proof-of-concept green-thread actor runtime for Rust.
|
||||||
|
|
||||||
Implements the core ideas in [`LOOM.md`](./LOOM.md): green-thread actors on a
|
Implements the core ideas in [`Achitecture.md`](.docs/Architecture.md): green-thread actors on a
|
||||||
shared heap, scheduled cooperatively, communicating only by `Send` messages.
|
shared heap, scheduled cooperatively, communicating only by `Send` messages.
|
||||||
Erlang's isolation model without Erlang's copying GC, Rust's zero-copy
|
Erlang's isolation model without Erlang's copying GC, Rust's zero-copy
|
||||||
ownership transfers without async's function colouring.
|
ownership transfers without async's function colouring.
|
||||||
@@ -58,7 +58,6 @@ tests/
|
|||||||
per-module integration tests
|
per-module integration tests
|
||||||
benches/
|
benches/
|
||||||
primes.rs fan-out/fan-in compute, vs tokio current_thread
|
primes.rs fan-out/fan-in compute, vs tokio current_thread
|
||||||
LOOM.md design intent
|
|
||||||
```
|
```
|
||||||
|
|
||||||
## Building and running
|
## Building and running
|
||||||
@@ -76,7 +75,26 @@ cargo bench # primes benchmark vs tokio
|
|||||||
|
|
||||||
## What's not here
|
## What's not here
|
||||||
|
|
||||||
See the **Defer** section of `LOOM.md`. Notable absences: supervisor
|
See the **Defer** section of `Architecture.md`.
|
||||||
restart-intensity caps, `join!` for handle groups, stack growth via remap,
|
restart-intensity caps, `join!` for handle groups, stack growth via remap,
|
||||||
hierarchical timer wheel, fd-wait timeouts, `Signal::Timeout`. Each is
|
hierarchical timer wheel, fd-wait timeouts, `Signal::Timeout`. Each is
|
||||||
mechanism we know how to add; none belongs in this iteration.
|
mechanism we know how to add; none belongs in this iteration.
|
||||||
|
|
||||||
|
## Docs
|
||||||
|
|
||||||
|
| Document | What it covers |
|
||||||
|
|---|---|
|
||||||
|
| [`Architecture.md`](./docs/Architecture.md) | Design intent, runtime model, and deferred work |
|
||||||
|
| [`smarm - Deep Dive.html`](./docs/smarm%20-%20Deep%20Dive.html) | Generated walkthrough of the system; good starting point |
|
||||||
|
| [`BENCHMARKS_AND_TUNING.md`](./docs/BENCHMARKS_AND_TUNING.md) | Where smarm wins and loses vs tokio, preemption knob recommendations |
|
||||||
|
| [`benchmarks.md`](./docs/benchmarks.md) | Raw benchmark results, methodology, and tuning experiment log |
|
||||||
|
|
||||||
|
## Contributing
|
||||||
|
|
||||||
|
This is a personal proof-of-concept. There's no PR workflow — if you fork it
|
||||||
|
and do something interesting, just send me an email. I'd genuinely like to
|
||||||
|
hear about it.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
<sub>The name is a recursive acronym. The M is for Marks, as in the BEAM — Bogdan/Björn's Erlang Abstract Machine, the virtual machine that runs Erlang and Elixir. smarm is not the BEAM. It just admires it from a safe distance.</sub>
|
||||||
|
|||||||
@@ -1,4 +1,4 @@
|
|||||||
# Loom
|
# SMARM Architecture
|
||||||
|
|
||||||
> Erlang-style actor concurrency for Rust, without the copies, the colors, or the GC pauses.
|
> Erlang-style actor concurrency for Rust, without the copies, the colors, or the GC pauses.
|
||||||
|
|
||||||
@@ -11,7 +11,7 @@ draws the boundary, the borrow checker already enforces it. What it lacks is an
|
|||||||
async/await is IO-centric, colors your functions, and trades stack simplicity for state-machine complexity;
|
async/await is IO-centric, colors your functions, and trades stack simplicity for state-machine complexity;
|
||||||
OS threads are too heavy to spawn per actor.
|
OS threads are too heavy to spawn per actor.
|
||||||
|
|
||||||
Loom adds a third option: **green-thread actors on a shared heap**, scheduled cooperatively, with
|
SMARM adds a third option: **green-thread actors on a shared heap**, scheduled cooperatively, with
|
||||||
message-passing as the only cross-actor communication primitive. You get Erlang's isolation model without
|
message-passing as the only cross-actor communication primitive. You get Erlang's isolation model without
|
||||||
Erlang's copying GC, and you get Rust's zero-copy ownership transfers without async's cognitive overhead.
|
Erlang's copying GC, and you get Rust's zero-copy ownership transfers without async's cognitive overhead.
|
||||||
No function coloring. No `Box<dyn Future>`. Just actors, messages, and the borrow checker doing what it
|
No function coloring. No `Box<dyn Future>`. Just actors, messages, and the borrow checker doing what it
|
||||||
@@ -24,14 +24,14 @@ already does.
|
|||||||
### Actors and scheduling
|
### Actors and scheduling
|
||||||
|
|
||||||
Each actor is a lightweight green thread with its own heap-allocated, growable stack. Stacks are
|
Each actor is a lightweight green thread with its own heap-allocated, growable stack. Stacks are
|
||||||
allocated via `mmap` with a guard page below the region; overflow is detected by the OS without Loom
|
allocated via `mmap` with a guard page below the region; overflow is detected by the OS without SMARM
|
||||||
polling for it. Initial stacks are small and grow by remapping on demand.
|
polling for it. Initial stacks are small and grow by remapping on demand.
|
||||||
|
|
||||||
The scheduler runs one OS thread per CPU. Each scheduler thread loops against a single global
|
The scheduler runs one OS thread per CPU. Each scheduler thread loops against a single global
|
||||||
`Mutex<HashMap>` queue shared across all schedulers. If queue contention becomes a measured bottleneck
|
`Mutex<HashMap>` queue shared across all schedulers. If queue contention becomes a measured bottleneck
|
||||||
this can be revisited; the interface will not change.
|
this can be revisited; the interface will not change.
|
||||||
|
|
||||||
Loom requires `panic = unwind`. Users who set `panic = abort` accept that supervision and actor
|
SMARM requires `panic = unwind`. Users who set `panic = abort` accept that supervision and actor
|
||||||
isolation are silently degraded to process death.
|
isolation are silently degraded to process death.
|
||||||
|
|
||||||
### Process descriptor
|
### Process descriptor
|
||||||
@@ -84,11 +84,11 @@ threshold is exceeded the actor yields. The workloads that starve a scheduler
|
|||||||
data transformation — are precisely the ones doing frequent allocations, so this approximation is
|
data transformation — are precisely the ones doing frequent allocations, so this approximation is
|
||||||
correct by construction.
|
correct by construction.
|
||||||
|
|
||||||
`RDTSC` is not monotonic across core migration; a slightly wrong timeslice is acceptable. Loom is
|
`RDTSC` is not monotonic across core migration; a slightly wrong timeslice is acceptable. SMARM is
|
||||||
not a real-time scheduler.
|
not a real-time scheduler.
|
||||||
|
|
||||||
Known failure mode: tight no-alloc loops are invisible to this mechanism. Actors doing sustained
|
Known failure mode: tight no-alloc loops are invisible to this mechanism. Actors doing sustained
|
||||||
allocation-free compute must call `loom::yield_now()` explicitly, or offload to a thread pool
|
allocation-free compute must call `smarm::yield_now()` explicitly, or offload to a thread pool
|
||||||
outside the actor scheduler (e.g. rayon). This is documented and acceptable — such loops are rare
|
outside the actor scheduler (e.g. rayon). This is documented and acceptable — such loops are rare
|
||||||
in message-passing workloads.
|
in message-passing workloads.
|
||||||
|
|
||||||
@@ -99,12 +99,12 @@ An actor yields at:
|
|||||||
- **Channel send/recv** — the primary communication primitive
|
- **Channel send/recv** — the primary communication primitive
|
||||||
- **Mutex contention** — attempting to lock a held `Arc<Mutex<>>` parks the actor
|
- **Mutex contention** — attempting to lock a held `Arc<Mutex<>>` parks the actor
|
||||||
- **IO** — blocking on a socket or file descriptor parks the actor until the IO thread signals readiness
|
- **IO** — blocking on a socket or file descriptor parks the actor until the IO thread signals readiness
|
||||||
- **`loom::sleep(duration)`** — parks the actor; the timer wheel re-queues it on expiry
|
- **`smarm::sleep(duration)`** — parks the actor; the timer wheel re-queues it on expiry
|
||||||
- **`loom::yield_now()`** — explicit cooperative yield
|
- **`smarm::yield_now()`** — explicit cooperative yield
|
||||||
- **Allocator preemption** — as above
|
- **Allocator preemption** — as above
|
||||||
- **Spawn** — does not yield by default; the new actor is queued and the spawner continues
|
- **Spawn** — does not yield by default; the new actor is queued and the spawner continues
|
||||||
|
|
||||||
`std::thread::sleep` inside an actor blocks the entire OS thread and should never be used. Loom
|
`std::thread::sleep` inside an actor blocks the entire OS thread and should never be used. SMARM
|
||||||
may emit a warning if it can detect this.
|
may emit a warning if it can detect this.
|
||||||
|
|
||||||
### IO thread
|
### IO thread
|
||||||
@@ -112,7 +112,7 @@ may emit a warning if it can detect this.
|
|||||||
A single dedicated IO thread runs an `epoll`/`kqueue` loop. Actors blocking on IO register their
|
A single dedicated IO thread runs an `epoll`/`kqueue` loop. Actors blocking on IO register their
|
||||||
file descriptor and PID; the IO thread moves them back into the global queue when the fd is ready.
|
file descriptor and PID; the IO thread moves them back into the global queue when the fd is ready.
|
||||||
A `HashMap<RawFd, Pid>` maps fds to parked actors. Cancellation (actor dies while waiting on IO)
|
A `HashMap<RawFd, Pid>` maps fds to parked actors. Cancellation (actor dies while waiting on IO)
|
||||||
deregisters the fd. This is intentionally simple and not pluggable; Loom is not a general async
|
deregisters the fd. This is intentionally simple and not pluggable; SMARM is not a general async
|
||||||
executor.
|
executor.
|
||||||
|
|
||||||
### Communication
|
### Communication
|
||||||
@@ -155,7 +155,7 @@ sensible global default.
|
|||||||
|
|
||||||
### Mutex timeout
|
### Mutex timeout
|
||||||
|
|
||||||
Every `loom::mutex` lock attempt is mediated by the scheduler. If the lock is not acquired within
|
Every `smarm::mutex` lock attempt is mediated by the scheduler. If the lock is not acquired within
|
||||||
a configurable timeout, the actor receives a `LockTimeout` error rather than parking forever. This
|
a configurable timeout, the actor receives a `LockTimeout` error rather than parking forever. This
|
||||||
is a hard runtime guarantee, not a convention. Default timeout is global and configurable;
|
is a hard runtime guarantee, not a convention. Default timeout is global and configurable;
|
||||||
individual locks and individual call sites can override it.
|
individual locks and individual call sites can override it.
|
||||||
@@ -165,9 +165,9 @@ individual locks and individual call sites can override it.
|
|||||||
Actors can spawn children and wait on a group of handles:
|
Actors can spawn children and wait on a group of handles:
|
||||||
|
|
||||||
```rust
|
```rust
|
||||||
let h1 = loom::spawn(|| compute_a());
|
let h1 = smarm::spawn(|| compute_a());
|
||||||
let h2 = loom::spawn(|| compute_b());
|
let h2 = smarm::spawn(|| compute_b());
|
||||||
let (a, b) = loom::join!(h1, h2);
|
let (a, b) = smarm::join!(h1, h2);
|
||||||
```
|
```
|
||||||
|
|
||||||
`join!` parks the calling actor until all handles complete. The last child to finish re-queues the
|
`join!` parks the calling actor until all handles complete. The last child to finish re-queues the
|
||||||
@@ -176,7 +176,7 @@ parent. This is a countdown in the parent's descriptor; no polling, no waker reg
|
|||||||
|
|
||||||
### Timer wheel
|
### Timer wheel
|
||||||
|
|
||||||
`loom::sleep` and supervision timeouts are driven by a timer wheel in the scheduler. Sleeping
|
`smarm::sleep` and supervision timeouts are driven by a timer wheel in the scheduler. Sleeping
|
||||||
actors are parked and re-queued by the timer thread on expiry. The timer wheel is internal
|
actors are parked and re-queued by the timer thread on expiry. The timer wheel is internal
|
||||||
infrastructure; its design is an implementation detail.
|
infrastructure; its design is an implementation detail.
|
||||||
|
|
||||||
@@ -189,22 +189,29 @@ infrastructure; its design is an implementation detail.
|
|||||||
- **Queue contention** — if `Mutex<HashMap>` proves to be a bottleneck under profiling, evaluate
|
- **Queue contention** — if `Mutex<HashMap>` proves to be a bottleneck under profiling, evaluate
|
||||||
`DashMap` or a lock-free work-stealing deque (e.g. `crossbeam-deque`). Not before.
|
`DashMap` or a lock-free work-stealing deque (e.g. `crossbeam-deque`). Not before.
|
||||||
- **AVX-512 context save** — extend `ContextSaveArea` when there is a concrete use case.
|
- **AVX-512 context save** — extend `ContextSaveArea` when there is a concrete use case.
|
||||||
- **`loom::sleep` vs raw sleep semantics** — further control knobs deferred until the basic sleep
|
- **`smarm::sleep` vs raw sleep semantics** — further control knobs deferred until the basic sleep
|
||||||
is working and real use cases are understood.
|
is working and real use cases are understood.
|
||||||
- **Supervision tree API** — the contract is defined; the recursive hierarchy, restart strategies,
|
- **Supervision tree API** — the contract is defined; the recursive hierarchy, restart strategies,
|
||||||
and introspection API are implementation work.
|
and introspection API are implementation work.
|
||||||
- **no_std support** — the assembly shim is no_std friendly but the IO thread and allocator require
|
- **no_std support** — the assembly shim is no_std friendly but the IO thread and allocator require
|
||||||
OS primitives. Target is no_std + `alloc` on hosted platforms; bare metal is out of scope.
|
OS primitives. Target is no_std + `alloc` on hosted platforms; bare metal is out of scope.
|
||||||
- **Distribution** — Loom is a single-process runtime. No distribution protocol, no BEAM-style
|
- **Distribution** — SMARM is a single-process runtime. No distribution protocol, no BEAM-style
|
||||||
clustering.
|
clustering.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## What Loom is Not
|
## What SMARM is Not
|
||||||
|
|
||||||
- Not a drop-in replacement for Tokio. Loom does not implement `Future` or the async executor interface.
|
- Not a drop-in replacement for Tokio. SMARM does not implement `Future` or the async executor interface.
|
||||||
- Not a general allocator. Loom manages actor stacks; heap allocation for actor data goes through
|
- Not a general allocator. SMARM manages actor stacks; heap allocation for actor data goes through
|
||||||
the system allocator.
|
the system allocator.
|
||||||
- Not Erlang. No hot code reloading, no distribution protocol, no BEAM bytecode. Loom is a
|
- Not Erlang. No hot code reloading, no distribution protocol, no BEAM bytecode. SMARM is a
|
||||||
concurrency runtime, not a platform.
|
concurrency runtime, not a platform.
|
||||||
- Not a real-time scheduler. Timeslice accuracy is best-effort.
|
- Not a real-time scheduler. Timeslice accuracy is best-effort.
|
||||||
|
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## On names
|
||||||
|
|
||||||
|
<sub>The name is a recursive acronym. The M is for Marks, as in the BEAM — Bogdan/Björn's Erlang Abstract Machine, the virtual machine that runs Erlang and Elixir. smarm is not the BEAM. It just admires it from a safe distance.</sub>
|
||||||
1297
docs/smarm - Deep Dive.html
Normal file
1297
docs/smarm - Deep Dive.html
Normal file
File diff suppressed because it is too large
Load Diff
@@ -98,12 +98,10 @@ impl<T> Sender<T> {
|
|||||||
g.parked_receiver.take()
|
g.parked_receiver.take()
|
||||||
};
|
};
|
||||||
if let Some(pid) = unpark {
|
if let Some(pid) = unpark {
|
||||||
let me = crate::actor::current_pid();
|
crate::te!(crate::trace::Event::Send { sender: crate::actor::current_pid().unwrap_or(crate::pid::Pid::new(u32::MAX, u32::MAX)), receiver: Some(pid) });
|
||||||
crate::te!(crate::trace::Event::Send { sender: me.unwrap_or(crate::pid::Pid::new(u32::MAX, u32::MAX)), receiver: Some(pid) });
|
|
||||||
crate::scheduler::unpark(pid);
|
crate::scheduler::unpark(pid);
|
||||||
} else {
|
} else {
|
||||||
let me = crate::actor::current_pid();
|
crate::te!(crate::trace::Event::Send { sender: crate::actor::current_pid().unwrap_or(crate::pid::Pid::new(u32::MAX, u32::MAX)), receiver: None });
|
||||||
crate::te!(crate::trace::Event::Send { sender: me.unwrap_or(crate::pid::Pid::new(u32::MAX, u32::MAX)), receiver: None });
|
|
||||||
}
|
}
|
||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
@@ -132,9 +130,7 @@ impl<T> Receiver<T> {
|
|||||||
// Release the lock before parking — the unparker will need it.
|
// Release the lock before parking — the unparker will need it.
|
||||||
crate::scheduler::park_current();
|
crate::scheduler::park_current();
|
||||||
// Woken up — record it before looping to check the queue.
|
// Woken up — record it before looping to check the queue.
|
||||||
if let Some(me) = crate::actor::current_pid() {
|
crate::te!(crate::trace::Event::RecvWake(crate::actor::current_pid().unwrap()));
|
||||||
crate::te!(crate::trace::Event::RecvWake(me));
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -28,23 +28,42 @@
|
|||||||
use std::alloc::{GlobalAlloc, Layout, System};
|
use std::alloc::{GlobalAlloc, Layout, System};
|
||||||
use std::cell::Cell;
|
use std::cell::Cell;
|
||||||
|
|
||||||
const ALLOC_INTERVAL: u32 = 128;
|
pub const DEFAULT_ALLOC_INTERVAL: u32 = 128;
|
||||||
const TIMESLICE_CYCLES: u64 = 300_000; // ≈ 100µs on a 3 GHz CPU
|
pub const DEFAULT_TIMESLICE_CYCLES: u64 = 300_000; // ≈ 100µs on a 3 GHz CPU
|
||||||
|
|
||||||
thread_local! {
|
thread_local! {
|
||||||
/// While `false`, the allocator hook is a no-op.
|
/// While `false`, the allocator hook is a no-op.
|
||||||
pub static PREEMPTION_ENABLED: Cell<bool> = const { Cell::new(false) };
|
pub static PREEMPTION_ENABLED: Cell<bool> = const { Cell::new(false) };
|
||||||
|
|
||||||
/// Countdown to next RDTSC check. Reset to `ALLOC_INTERVAL` on resume.
|
/// Countdown to next RDTSC check. Reset to `ALLOC_INTERVAL` on resume.
|
||||||
static ALLOC_COUNT: Cell<u32> = const { Cell::new(ALLOC_INTERVAL) };
|
static ALLOC_COUNT: Cell<u32> = const { Cell::new(DEFAULT_ALLOC_INTERVAL) };
|
||||||
|
|
||||||
/// RDTSC value written by the scheduler on every actor resume.
|
/// RDTSC value written by the scheduler on every actor resume.
|
||||||
static TIMESLICE_START: Cell<u64> = const { Cell::new(0) };
|
static TIMESLICE_START: Cell<u64> = const { Cell::new(0) };
|
||||||
|
|
||||||
|
/// Per-thread copy of the configured alloc interval, written once at
|
||||||
|
/// scheduler-thread startup. Kept in a thread-local so the hot path
|
||||||
|
/// (`maybe_preempt`) pays only a TLS load, with no cache-coherency traffic.
|
||||||
|
static CONFIGURED_ALLOC_INTERVAL: Cell<u32> = const { Cell::new(DEFAULT_ALLOC_INTERVAL) };
|
||||||
|
|
||||||
|
/// Per-thread copy of the configured timeslice, written once at
|
||||||
|
/// scheduler-thread startup.
|
||||||
|
static CONFIGURED_TIMESLICE_CYCLES: Cell<u64> = const { Cell::new(DEFAULT_TIMESLICE_CYCLES) };
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Called once per scheduler thread at startup (before any actor runs).
|
||||||
|
/// Writes the runtime-configured preemption knobs into thread-locals so the
|
||||||
|
/// hot path reads them without any cross-thread coherency cost.
|
||||||
|
pub fn configure_preempt(alloc_interval: u32, timeslice_cycles: u64) {
|
||||||
|
CONFIGURED_ALLOC_INTERVAL.with(|c| c.set(alloc_interval));
|
||||||
|
CONFIGURED_TIMESLICE_CYCLES.with(|c| c.set(timeslice_cycles));
|
||||||
|
// Also prime the countdown so the first resume uses the right interval.
|
||||||
|
ALLOC_COUNT.with(|c| c.set(alloc_interval));
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Arm the timeslice. Called by the scheduler on every resume.
|
/// Arm the timeslice. Called by the scheduler on every resume.
|
||||||
pub fn reset_timeslice() {
|
pub fn reset_timeslice() {
|
||||||
ALLOC_COUNT.with(|c| c.set(ALLOC_INTERVAL));
|
ALLOC_COUNT.with(|c| c.set(CONFIGURED_ALLOC_INTERVAL.with(|i| i.get())));
|
||||||
TIMESLICE_START.with(|c| c.set(rdtsc()));
|
TIMESLICE_START.with(|c| c.set(rdtsc()));
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -102,10 +121,10 @@ pub fn maybe_preempt() {
|
|||||||
ALLOC_COUNT.with(|c| {
|
ALLOC_COUNT.with(|c| {
|
||||||
let n = c.get();
|
let n = c.get();
|
||||||
if n == 0 {
|
if n == 0 {
|
||||||
c.set(ALLOC_INTERVAL);
|
c.set(CONFIGURED_ALLOC_INTERVAL.with(|i| i.get()));
|
||||||
if PREEMPTION_ENABLED.with(|e| e.get()) {
|
if PREEMPTION_ENABLED.with(|e| e.get()) {
|
||||||
let start = TIMESLICE_START.with(|s| s.get());
|
let start = TIMESLICE_START.with(|s| s.get());
|
||||||
if rdtsc().saturating_sub(start) > TIMESLICE_CYCLES {
|
if rdtsc().saturating_sub(start) > CONFIGURED_TIMESLICE_CYCLES.with(|t| t.get()) {
|
||||||
// SAFETY: reachable only inside an actor (the scheduler
|
// SAFETY: reachable only inside an actor (the scheduler
|
||||||
// sets PREEMPTION_ENABLED on resume and clears it on
|
// sets PREEMPTION_ENABLED on resume and clears it on
|
||||||
// return). The scheduler stack is therefore valid.
|
// return). The scheduler stack is therefore valid.
|
||||||
|
|||||||
@@ -31,8 +31,8 @@
|
|||||||
//! becomes a measured bottleneck.
|
//! becomes a measured bottleneck.
|
||||||
|
|
||||||
use crate::actor::{
|
use crate::actor::{
|
||||||
clear_current_pid, current_pid, is_actor_done, reset_actor_done,
|
clear_current_pid, is_actor_done, reset_actor_done, set_current_actor_box,
|
||||||
set_current_actor_box, set_current_pid, take_last_outcome, Actor, Outcome,
|
set_current_pid, take_last_outcome, Actor, Outcome,
|
||||||
};
|
};
|
||||||
use crate::channel::Sender;
|
use crate::channel::Sender;
|
||||||
use crate::context::{get_actor_sp, set_actor_sp, switch_to_actor};
|
use crate::context::{get_actor_sp, set_actor_sp, switch_to_actor};
|
||||||
@@ -70,13 +70,19 @@ pub struct Config {
|
|||||||
min: usize,
|
min: usize,
|
||||||
max: usize,
|
max: usize,
|
||||||
exact: Option<usize>,
|
exact: Option<usize>,
|
||||||
|
alloc_interval: u32,
|
||||||
|
timeslice_cycles: u64,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Config {
|
impl Config {
|
||||||
/// Exact thread count; takes precedence over min/max.
|
/// Exact thread count; takes precedence over min/max.
|
||||||
pub fn exact(n: usize) -> Self {
|
pub fn exact(n: usize) -> Self {
|
||||||
assert!(n >= 1, "scheduler thread count must be ≥ 1");
|
assert!(n >= 1, "scheduler thread count must be ≥ 1");
|
||||||
Self { min: n, max: n, exact: Some(n) }
|
Self {
|
||||||
|
min: n, max: n, exact: Some(n),
|
||||||
|
alloc_interval: crate::preempt::DEFAULT_ALLOC_INTERVAL,
|
||||||
|
timeslice_cycles: crate::preempt::DEFAULT_TIMESLICE_CYCLES,
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Bounded range. Thread count = clamp(available_parallelism, min, max).
|
/// Bounded range. Thread count = clamp(available_parallelism, min, max).
|
||||||
@@ -86,7 +92,28 @@ impl Config {
|
|||||||
if let Some(e) = exact {
|
if let Some(e) = exact {
|
||||||
assert!(e >= 1, "exact must be ≥ 1");
|
assert!(e >= 1, "exact must be ≥ 1");
|
||||||
}
|
}
|
||||||
Self { min, max, exact }
|
Self {
|
||||||
|
min, max, exact,
|
||||||
|
alloc_interval: crate::preempt::DEFAULT_ALLOC_INTERVAL,
|
||||||
|
timeslice_cycles: crate::preempt::DEFAULT_TIMESLICE_CYCLES,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// How many allocations (or `smarm::check!()` calls) between RDTSC checks.
|
||||||
|
/// Lower = more responsive preemption, higher = less overhead.
|
||||||
|
/// Default: 128.
|
||||||
|
pub fn alloc_interval(mut self, n: u32) -> Self {
|
||||||
|
assert!(n >= 1, "alloc_interval must be ≥ 1");
|
||||||
|
self.alloc_interval = n;
|
||||||
|
self
|
||||||
|
}
|
||||||
|
|
||||||
|
/// How many TSC cycles constitute one timeslice.
|
||||||
|
/// Default: 300_000 (≈ 100µs on a 3 GHz CPU).
|
||||||
|
pub fn timeslice_cycles(mut self, n: u64) -> Self {
|
||||||
|
assert!(n >= 1, "timeslice_cycles must be ≥ 1");
|
||||||
|
self.timeslice_cycles = n;
|
||||||
|
self
|
||||||
}
|
}
|
||||||
|
|
||||||
/// The number of scheduler threads this config resolves to.
|
/// The number of scheduler threads this config resolves to.
|
||||||
@@ -106,7 +133,11 @@ impl Default for Config {
|
|||||||
let avail = thread::available_parallelism()
|
let avail = thread::available_parallelism()
|
||||||
.map(|n| n.get())
|
.map(|n| n.get())
|
||||||
.unwrap_or(1);
|
.unwrap_or(1);
|
||||||
Self { min: 1, max: avail, exact: None }
|
Self {
|
||||||
|
min: 1, max: avail, exact: None,
|
||||||
|
alloc_interval: crate::preempt::DEFAULT_ALLOC_INTERVAL,
|
||||||
|
timeslice_cycles: crate::preempt::DEFAULT_TIMESLICE_CYCLES,
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -270,10 +301,13 @@ pub(crate) struct RuntimeInner {
|
|||||||
/// Global counters for RFC 000 primitives.
|
/// Global counters for RFC 000 primitives.
|
||||||
pub(crate) io_parked: AtomicU32,
|
pub(crate) io_parked: AtomicU32,
|
||||||
pub(crate) sleeping: AtomicU32,
|
pub(crate) sleeping: AtomicU32,
|
||||||
|
/// Preemption knobs, written into each scheduler thread's locals on startup.
|
||||||
|
pub(crate) alloc_interval: u32,
|
||||||
|
pub(crate) timeslice_cycles: u64,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl RuntimeInner {
|
impl RuntimeInner {
|
||||||
fn new(thread_count: usize) -> Arc<Self> {
|
fn new(thread_count: usize, alloc_interval: u32, timeslice_cycles: u64) -> Arc<Self> {
|
||||||
let stats = (0..thread_count).map(|_| SchedulerStats::new()).collect();
|
let stats = (0..thread_count).map(|_| SchedulerStats::new()).collect();
|
||||||
Arc::new(Self {
|
Arc::new(Self {
|
||||||
shared: Mutex::new(SharedState::new()),
|
shared: Mutex::new(SharedState::new()),
|
||||||
@@ -281,6 +315,8 @@ impl RuntimeInner {
|
|||||||
stats,
|
stats,
|
||||||
io_parked: AtomicU32::new(0),
|
io_parked: AtomicU32::new(0),
|
||||||
sleeping: AtomicU32::new(0),
|
sleeping: AtomicU32::new(0),
|
||||||
|
alloc_interval,
|
||||||
|
timeslice_cycles,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -295,18 +331,6 @@ impl RuntimeInner {
|
|||||||
crate::preempt::PREEMPTION_ENABLED.with(|c| c.set(prev));
|
crate::preempt::PREEMPTION_ENABLED.with(|c| c.set(prev));
|
||||||
result
|
result
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Returns `None` when the mutex is poisoned.
|
|
||||||
/// Used in `unpark` / channel Drop which can fire after teardown.
|
|
||||||
pub(crate) fn try_with_shared<R>(&self, f: impl FnOnce(&mut SharedState) -> R) -> Option<R> {
|
|
||||||
let prev = crate::preempt::PREEMPTION_ENABLED.with(|c| c.replace(false));
|
|
||||||
let result = match self.shared.lock() {
|
|
||||||
Ok(mut g) => Some(f(&mut g)),
|
|
||||||
Err(p) => Some(f(&mut p.into_inner())),
|
|
||||||
};
|
|
||||||
crate::preempt::PREEMPTION_ENABLED.with(|c| c.set(prev));
|
|
||||||
result
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// ---------------------------------------------------------------------------
|
// ---------------------------------------------------------------------------
|
||||||
@@ -322,7 +346,7 @@ pub struct Runtime {
|
|||||||
pub fn init(config: Config) -> Runtime {
|
pub fn init(config: Config) -> Runtime {
|
||||||
let n = config.resolved_thread_count();
|
let n = config.resolved_thread_count();
|
||||||
Runtime {
|
Runtime {
|
||||||
inner: RuntimeInner::new(n),
|
inner: RuntimeInner::new(n, config.alloc_interval, config.timeslice_cycles),
|
||||||
thread_count: n,
|
thread_count: n,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -526,6 +550,7 @@ fn finalize_actor(inner: &Arc<RuntimeInner>, pid: Pid, outcome: Outcome) {
|
|||||||
// ---------------------------------------------------------------------------
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
fn schedule_loop(inner: &Arc<RuntimeInner>, slot: usize) {
|
fn schedule_loop(inner: &Arc<RuntimeInner>, slot: usize) {
|
||||||
|
crate::preempt::configure_preempt(inner.alloc_interval, inner.timeslice_cycles);
|
||||||
let stats = &inner.stats[slot];
|
let stats = &inner.stats[slot];
|
||||||
|
|
||||||
loop {
|
loop {
|
||||||
|
|||||||
@@ -11,7 +11,7 @@ use crate::actor::current_pid;
|
|||||||
use crate::channel::Sender;
|
use crate::channel::Sender;
|
||||||
use crate::pid::Pid;
|
use crate::pid::Pid;
|
||||||
use crate::runtime::{
|
use crate::runtime::{
|
||||||
self, RuntimeInner, YieldIntent, ROOT_PID, RUNTIME,
|
self, RuntimeInner, YieldIntent, RUNTIME,
|
||||||
};
|
};
|
||||||
use crate::supervisor::Signal;
|
use crate::supervisor::Signal;
|
||||||
use std::sync::Arc;
|
use std::sync::Arc;
|
||||||
|
|||||||
@@ -15,10 +15,7 @@
|
|||||||
//! - Panic on one scheduler thread doesn't kill others
|
//! - Panic on one scheduler thread doesn't kill others
|
||||||
|
|
||||||
use smarm::{channel, runtime::{Config, Runtime}, spawn, yield_now, JoinHandle};
|
use smarm::{channel, runtime::{Config, Runtime}, spawn, yield_now, JoinHandle};
|
||||||
use std::sync::{
|
use std::sync::{atomic::{AtomicBool, AtomicU64, Ordering}, Arc};
|
||||||
atomic::{AtomicBool, AtomicU64, AtomicUsize, Ordering},
|
|
||||||
Arc, Barrier,
|
|
||||||
};
|
|
||||||
use std::time::Duration;
|
use std::time::Duration;
|
||||||
use std::collections::HashSet;
|
use std::collections::HashSet;
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user