feat: I/O and mutex support (v0.3)

Add epoll-based non-blocking I/O and kernel-like mutexes:
- src/io.rs: Complete epoll backend with timeout & error handling
- src/mutex.rs: Fair mutex with waiter queues & parking integration
- Enhanced scheduler to support synchronous I/O blocking
- Comprehensive test suites for I/O (epoll) and mutex behavior
- Documentation: LOOM.md concurrency model & README
This commit is contained in:
Claude
2026-05-23 16:09:29 +00:00
parent d3ab81b833
commit 8cbef1dfc1
11 changed files with 2032 additions and 146 deletions

445
src/io.rs
View File

@@ -1,35 +1,73 @@
//! Off-scheduler blocking work.
//! Off-scheduler IO: blocking-work offload and epoll-based fd readiness.
//!
//! `block_on_io(closure)` runs `closure` on a dedicated worker OS thread,
//! parks the calling actor in the meantime, and returns the closure's
//! value when it completes. Lets actors call into blocking C libraries,
//! synchronous file IO, or anything else that would otherwise stall the
//! scheduler thread.
//! synchronous file IO, or anything else that doesn't fit the readiness
//! model.
//!
//! `wait_readable(fd)` / `wait_writable(fd)` register interest in an fd
//! with epoll and park the calling actor. When the fd becomes ready, the
//! epoll thread unparks the actor. The actual `read(2)`/`write(2)` syscall
//! runs back on the scheduler thread, *inside* the actor — buffer never
//! leaves the actor, no copying through an intermediary thread. Built on
//! these are the conveniences `read(fd, &mut buf)` and `write(fd, &buf)`.
//!
//! Architecture
//! ============
//! Per `run()`:
//! - one worker OS thread, started by `run()` and joined at shutdown;
//! - a request channel (`mpsc::Sender<Request>`) from scheduler → worker;
//! - a completion queue (`Mutex<VecDeque<Completion>>`) worker → scheduler;
//! - a wake pipe: when the worker pushes a completion it writes one byte
//! to the pipe; the scheduler polls the pipe (with timeout) when it
//! would otherwise be idle.
//! Per `run()`, two OS threads:
//! - **epoll thread**: owns the epollfd. Loops in `epoll_wait`. On a
//! ready fd, pushes `Completion::FdReady { pid, fd, events }` to the
//! shared completion queue and writes the scheduler-wake pipe. On the
//! shutdown pipe (also registered in epollfd), exits.
//! - **pool thread**: blocks on the request mpsc. Runs the closure
//! inside `catch_unwind`, pushes `Completion::Blocking { pid, result }`,
//! writes the scheduler-wake pipe.
//!
//! For v0.2 the worker is a single thread, so concurrent `block_on_io`
//! calls are serialised. v0.3 can replace it with a thread pool behind
//! the same request channel.
//! Both threads share a single `completions: Arc<Mutex<VecDeque<Completion>>>`
//! and the same scheduler-wake pipe.
//!
//! `epoll_ctl` (register/unregister fd interest) is called by the
//! scheduler thread *directly* on the epollfd. That's well-defined per
//! `epoll_ctl(2)`: a thread may be calling `epoll_wait` on the epollfd
//! while another thread calls `epoll_ctl`. Avoids needing a second mpsc
//! and a second wake mechanism.
//!
//! Epoll mode
//! ==========
//! Level-triggered with EPOLLONESHOT. After a wakeup the kernel
//! auto-disarms the fd, so we never get two wakeups for one
//! `wait_readable` call. The scheduler explicitly `EPOLL_CTL_DEL`s the fd
//! on completion to free the slot for re-registration. Net effect: each
//! `wait_readable(fd)` is one ADD, one wakeup, one DEL — symmetric and
//! stateless between calls.
//!
//! Fd hygiene
//! ==========
//! If an actor dies while waiting on an fd, the registration is leaked
//! (the fd stays in the epollfd, armed). EPOLLONESHOT bounds the damage:
//! at most one stale wakeup, after which the kernel disarms. The stale
//! wakeup hits a dead pid in `waiters` and is dropped. Acceptable for v0.2;
//! a future pass should DEL on actor death.
//!
//! Buffers used with `read`/`write` should be on fds opened with
//! `O_NONBLOCK`. If they aren't, the syscall may block the scheduler
//! thread despite the readiness notification (the fd reporting readable
//! doesn't guarantee the syscall completes without blocking — e.g. a
//! signal could be delivered). Documented; not enforced.
//!
//! Panic handling
//! ==============
//! The worker runs the closure inside `catch_unwind` and ships either the
//! return value or the panic payload back to the scheduler. `block_on_io`
//! resumes the panic on the calling actor's stack, so the actor's
//! supervisor sees a real `Signal::Panic` as if the work had run inline.
//! The pool worker runs the closure inside `catch_unwind` and ships either
//! the return value or the panic payload back to the scheduler.
//! `block_on_io` resumes the panic on the calling actor's stack, so the
//! actor's supervisor sees a real `Signal::Panic` as if the work had run
//! inline. Fd-wait primitives don't run user code on the IO thread, so
//! they have no equivalent panic-propagation path.
use crate::pid::Pid;
use std::any::Any;
use std::collections::VecDeque;
use std::collections::{HashMap, VecDeque};
use std::io;
use std::os::fd::RawFd;
use std::panic;
@@ -41,7 +79,7 @@ use std::thread::JoinHandle as OsJoinHandle;
// Wire types
// ---------------------------------------------------------------------------
/// What the worker stores while computing a result. `Ok` is the closure's
/// What the pool stores while computing a result. `Ok` is the closure's
/// return value (boxed as `Any`); `Err` is the panic payload.
pub type IoResult = Result<Box<dyn Any + Send>, Box<dyn Any + Send>>;
@@ -51,9 +89,17 @@ struct Request {
work: Box<dyn FnOnce() -> IoResult + Send>,
}
struct Completion {
pid: Pid,
result: IoResult,
/// Completion message from either IO thread back to the scheduler.
pub enum Completion {
/// A `block_on_io` closure has finished (Ok = return value, Err = panic
/// payload).
Blocking { pid: Pid, result: IoResult },
/// An fd registered via `wait_readable`/`wait_writable` is ready. The
/// scheduler looks up the parked pid in `waiters`, unparks it, and
/// removes the entry. `pid` isn't in this variant because the epoll
/// thread doesn't have access to the `waiters` map; the scheduler
/// thread owns that.
FdReady { fd: RawFd, events: u32 },
}
// ---------------------------------------------------------------------------
@@ -61,61 +107,146 @@ struct Completion {
// ---------------------------------------------------------------------------
pub struct IoThread {
/// Channel into the worker.
// ----- Channels & queues -----
/// Submission queue into the blocking-work pool.
tx: mpsc::Sender<Request>,
/// Shared completion queue. The worker pushes; the scheduler drains.
/// Shared completion queue, fed by both the pool and the epoll thread.
completions: Arc<Mutex<VecDeque<Completion>>>,
/// Pipe used as a one-bit wakeup. `wake_read` is what the scheduler
/// polls; `wake_write` is what the worker writes to.
/// Pipe the scheduler polls in its idle path. Both IO threads write to
/// `wake_write` after pushing a completion.
wake_read: RawFd,
wake_write: RawFd,
/// Worker thread handle, joined on shutdown.
worker: Option<OsJoinHandle<()>>,
/// Number of requests in-flight (sent but not yet drained as a
/// completion). Used by the scheduler's idle path to decide whether
/// to wait on the pipe or exit.
// ----- Epoll machinery -----
/// The epollfd, owned by `IoThread`. Callable cross-thread via
/// `epoll_ctl` per the man page.
epollfd: RawFd,
/// Pipe used to signal the epoll thread to exit. Registered inside the
/// epollfd so a single `epoll_wait` covers both fd readiness and
/// shutdown.
shutdown_read: RawFd,
shutdown_write: RawFd,
/// One parked actor per registered fd. Populated by `wait_readable` /
/// `wait_writable` and drained by the scheduler when a `FdReady`
/// completion is processed.
pub waiters: HashMap<RawFd, Pid>,
// ----- Threads -----
pool_thread: Option<OsJoinHandle<()>>,
epoll_thread: Option<OsJoinHandle<()>>,
/// Number of `block_on_io` requests in-flight. Used by the scheduler's
/// idle path to decide whether to wait on the pipe or exit. Fd waits
/// are not counted here; they're counted by `waiters.len()`.
pub outstanding: u32,
}
impl IoThread {
pub fn start() -> io::Result<Self> {
// Scheduler-facing wake pipe.
let (wake_read, wake_write) = make_pipe()?;
// Pool submission channel + shared completion queue.
let (tx, rx) = mpsc::channel::<Request>();
let completions: Arc<Mutex<VecDeque<Completion>>> =
Arc::new(Mutex::new(VecDeque::new()));
let comps_worker = completions.clone();
let worker = std::thread::Builder::new()
.name("smarm-io".into())
.spawn(move || worker_loop(rx, comps_worker, wake_write))?;
// Epoll machinery.
let epollfd = unsafe { libc::epoll_create1(libc::EPOLL_CLOEXEC) };
if epollfd < 0 {
// Best-effort fd cleanup before bailing.
unsafe {
libc::close(wake_read);
libc::close(wake_write);
}
return Err(io::Error::last_os_error());
}
let (shutdown_read, shutdown_write) = match make_pipe() {
Ok(p) => p,
Err(e) => {
unsafe {
libc::close(epollfd);
libc::close(wake_read);
libc::close(wake_write);
}
return Err(e);
}
};
// Register the shutdown pipe in epollfd. We use a sentinel `data`
// value to recognise shutdown events. RawFd values are non-negative,
// so u64::MAX is unambiguously not a real fd-data encoding.
let mut shutdown_ev = libc::epoll_event {
events: libc::EPOLLIN as u32,
u64: SHUTDOWN_EPOLL_TOKEN,
};
if unsafe {
libc::epoll_ctl(
epollfd,
libc::EPOLL_CTL_ADD,
shutdown_read,
&mut shutdown_ev as *mut _,
)
} < 0
{
let e = io::Error::last_os_error();
unsafe {
libc::close(epollfd);
libc::close(shutdown_read);
libc::close(shutdown_write);
libc::close(wake_read);
libc::close(wake_write);
}
return Err(e);
}
// Spawn pool thread.
let pool_comps = completions.clone();
let pool_thread = std::thread::Builder::new()
.name("smarm-io-pool".into())
.spawn(move || pool_loop(rx, pool_comps, wake_write))?;
// Spawn epoll thread.
let epoll_comps = completions.clone();
let epoll_thread = std::thread::Builder::new()
.name("smarm-io-epoll".into())
.spawn(move || epoll_loop(epollfd, epoll_comps, wake_write))?;
Ok(Self {
tx,
completions,
wake_read,
wake_write,
worker: Some(worker),
epollfd,
shutdown_read,
shutdown_write,
waiters: HashMap::new(),
pool_thread: Some(pool_thread),
epoll_thread: Some(epoll_thread),
outstanding: 0,
})
}
/// Hand a request to the worker. Increments `outstanding`.
/// Hand a request to the pool. Increments `outstanding`.
pub fn submit(&mut self, pid: Pid, work: Box<dyn FnOnce() -> IoResult + Send>) {
self.outstanding += 1;
// Send can only fail if the worker has hung up, which only happens
// Send can only fail if the pool has hung up, which only happens
// on shutdown. submit during shutdown is a bug.
self.tx
.send(Request { pid, work })
.expect("io worker hung up unexpectedly");
.expect("io pool hung up unexpectedly");
}
/// Drain every available completion. Caller is responsible for
/// decrementing `outstanding` and routing the results.
pub fn drain_completions(&mut self) -> Vec<(Pid, IoResult)> {
/// Drain every available completion. Caller (the scheduler) routes the
/// results and updates `outstanding` / `waiters` accordingly.
pub fn drain_completions(&mut self) -> Vec<Completion> {
let mut q = self.completions.lock().unwrap();
let mut out = Vec::with_capacity(q.len());
while let Some(c) = q.pop_front() {
out.push((c.pid, c.result));
out.push(c);
}
out
}
@@ -123,78 +254,222 @@ impl IoThread {
pub fn wake_fd(&self) -> RawFd {
self.wake_read
}
/// Register interest in `fd` becoming readable/writable; record `pid`
/// as the parked waiter. The epoll thread will push a `FdReady`
/// completion when the kernel signals.
///
/// EPOLLONESHOT: one wakeup per registration. The scheduler must
/// `epoll_del` on completion to free the slot for re-registration.
pub fn epoll_register(
&mut self,
fd: RawFd,
pid: Pid,
readable: bool,
writable: bool,
) -> io::Result<()> {
// Two actors waiting on the same fd would be a misuse: the kernel
// delivers exactly one EPOLLONESHOT wakeup, so the second waiter
// would hang. Reject up front.
if self.waiters.contains_key(&fd) {
return Err(io::Error::new(
io::ErrorKind::AlreadyExists,
"fd already has a parked waiter",
));
}
// Defensive cleanup: if a previous actor died while waiting on this
// fd, the kernel-side registration was leaked (we don't walk all
// waiters on actor death). A bare DEL is harmless if the fd isn't
// registered (ENOENT), and removes any leak.
unsafe {
libc::epoll_ctl(self.epollfd, libc::EPOLL_CTL_DEL, fd, std::ptr::null_mut());
}
let mut events: u32 = libc::EPOLLONESHOT as u32;
if readable {
events |= libc::EPOLLIN as u32;
}
if writable {
events |= libc::EPOLLOUT as u32;
}
let mut ev = libc::epoll_event {
events,
u64: fd as u64,
};
let r = unsafe {
libc::epoll_ctl(self.epollfd, libc::EPOLL_CTL_ADD, fd, &mut ev as *mut _)
};
if r < 0 {
return Err(io::Error::last_os_error());
}
self.waiters.insert(fd, pid);
Ok(())
}
/// Remove `fd` from the epollfd. Called by the scheduler after a
/// `FdReady` completion, so the next `wait_readable(fd)` can ADD again.
///
/// Does NOT touch `waiters` — that's the scheduler's bookkeeping; this
/// is purely the kernel-side cleanup.
pub fn epoll_deregister(&mut self, fd: RawFd) {
// EPOLL_CTL_DEL of an already-removed fd returns ENOENT; ignore.
unsafe {
libc::epoll_ctl(self.epollfd, libc::EPOLL_CTL_DEL, fd, std::ptr::null_mut());
}
}
}
impl Drop for IoThread {
fn drop(&mut self) {
// Hang up the request channel; the worker will exit its loop.
// We must drop `tx` before joining. Take it out by moving.
// mpsc::Sender doesn't have explicit `disconnect`; dropping it
// (after this scope) causes the receiver to return Err.
//
// Trick: replace self.tx with a fresh dead one so we can drop it.
// 1. Signal the epoll thread to exit by writing the shutdown pipe.
unsafe {
let buf: [u8; 1] = [0];
// Single byte; we don't care about EINTR retry here — worst
// case the epoll thread blocks until process exit, which is
// fine because we then close fds out from under it.
libc::write(self.shutdown_write, buf.as_ptr() as *const _, 1);
}
// 2. Hang up the pool's request channel so the pool thread exits.
let (dead_tx, _) = mpsc::channel::<Request>();
let real_tx = std::mem::replace(&mut self.tx, dead_tx);
drop(real_tx);
if let Some(h) = self.worker.take() {
// Best-effort join. If the worker panicked, ignore.
// 3. Join both threads.
if let Some(h) = self.epoll_thread.take() {
let _ = h.join();
}
if let Some(h) = self.pool_thread.take() {
let _ = h.join();
}
// Close the pipe.
// 4. Close fds.
unsafe {
libc::close(self.epollfd);
libc::close(self.shutdown_read);
libc::close(self.shutdown_write);
libc::close(self.wake_read);
libc::close(self.wake_write);
}
}
}
/// Sentinel `epoll_event.u64` distinguishing the shutdown pipe from
/// registered actor fds. RawFd values fit in i32, so the high bits are
/// available for a marker; we use u64::MAX which can't be a valid fd.
const SHUTDOWN_EPOLL_TOKEN: u64 = u64::MAX;
// ---------------------------------------------------------------------------
// Worker loop
// Pool loop
// ---------------------------------------------------------------------------
fn worker_loop(
fn pool_loop(
rx: mpsc::Receiver<Request>,
completions: Arc<Mutex<VecDeque<Completion>>>,
wake_write: RawFd,
) {
while let Ok(Request { pid, work }) = rx.recv() {
let result: IoResult = match panic::catch_unwind(panic::AssertUnwindSafe(work)) {
Ok(r) => r,
Ok(r) => r,
Err(payload) => Err(payload),
};
completions.lock().unwrap().push_back(Completion { pid, result });
// Write one byte to the pipe to wake the scheduler. If the pipe
// buffer is full (scheduler isn't draining), the write may return
// EAGAIN — we'll ignore it because there's already an outstanding
// wakeup that hasn't been consumed yet.
let buf: [u8; 1] = [0];
unsafe {
// EINTR is the only retryable case worth handling.
loop {
let n = libc::write(wake_write, buf.as_ptr() as *const _, 1);
if n < 0 {
let e = *libc::__errno_location();
if e == libc::EINTR { continue; }
}
break;
completions
.lock()
.unwrap()
.push_back(Completion::Blocking { pid, result });
wake_scheduler(wake_write);
}
}
// ---------------------------------------------------------------------------
// Epoll loop
// ---------------------------------------------------------------------------
fn epoll_loop(
epollfd: RawFd,
completions: Arc<Mutex<VecDeque<Completion>>>,
wake_write: RawFd,
) {
// Buffer for epoll_wait. 64 is plenty for our scale; if a real load
// appears that needs more, this is a one-line change.
const MAX_EVENTS: usize = 64;
let mut events: [libc::epoll_event; MAX_EVENTS] = unsafe { std::mem::zeroed() };
loop {
let n = unsafe {
libc::epoll_wait(
epollfd,
events.as_mut_ptr(),
MAX_EVENTS as libc::c_int,
-1,
)
};
if n < 0 {
let e = unsafe { *libc::__errno_location() };
if e == libc::EINTR {
continue;
}
// Anything else here is a programming error (EBADF on epollfd
// after we've closed it from Drop — the close races with us).
// Treat as shutdown.
return;
}
let mut shutdown_requested = false;
let mut pushed_any = false;
{
let mut q = completions.lock().unwrap();
for ev in events.iter().take(n as usize) {
if ev.u64 == SHUTDOWN_EPOLL_TOKEN {
shutdown_requested = true;
continue;
}
let fd = ev.u64 as RawFd;
q.push_back(Completion::FdReady {
fd,
events: ev.events,
});
pushed_any = true;
}
}
if pushed_any {
wake_scheduler(wake_write);
}
if shutdown_requested {
return;
}
}
}
/// Write one byte to the scheduler's wake pipe. Retries on EINTR; ignores
/// EAGAIN (pipe full means there's already an outstanding wake we haven't
/// consumed yet, which is sufficient).
fn wake_scheduler(wake_write: RawFd) {
let buf: [u8; 1] = [0];
unsafe {
loop {
let n = libc::write(wake_write, buf.as_ptr() as *const _, 1);
if n < 0 {
let e = *libc::__errno_location();
if e == libc::EINTR {
continue;
}
}
break;
}
}
}
// ---------------------------------------------------------------------------
// Pipe helpers
// Pipe helpers (unchanged from v0.2)
// ---------------------------------------------------------------------------
fn make_pipe() -> io::Result<(RawFd, RawFd)> {
let mut fds: [libc::c_int; 2] = [0; 2];
// O_CLOEXEC so children don't inherit, O_NONBLOCK on the read side
// so the scheduler's drain can `read` without blocking.
let r = unsafe {
libc::pipe2(fds.as_mut_ptr(), libc::O_CLOEXEC | libc::O_NONBLOCK)
};
let r = unsafe { libc::pipe2(fds.as_mut_ptr(), libc::O_CLOEXEC | libc::O_NONBLOCK) };
if r != 0 {
return Err(io::Error::last_os_error());
}
@@ -208,7 +483,6 @@ pub fn drain_wake_pipe(fd: RawFd) {
loop {
let n = unsafe { libc::read(fd, buf.as_mut_ptr() as *mut _, buf.len()) };
if n <= 0 {
// EAGAIN (would block) or EOF — done.
break;
}
}
@@ -220,17 +494,26 @@ pub fn poll_wake(fd: RawFd, timeout: Option<std::time::Duration>) {
let timeout_ms: libc::c_int = match timeout {
None => -1,
Some(d) => {
// Cap at i32::MAX milliseconds; poll's argument is c_int.
let ms = d.as_millis();
if ms > i32::MAX as u128 { i32::MAX } else { ms as i32 }
if ms > i32::MAX as u128 {
i32::MAX
} else {
ms as i32
}
}
};
let mut pfd = libc::pollfd { fd, events: libc::POLLIN, revents: 0 };
let mut pfd = libc::pollfd {
fd,
events: libc::POLLIN,
revents: 0,
};
loop {
let r = unsafe { libc::poll(&mut pfd as *mut _, 1, timeout_ms) };
if r < 0 {
let e = unsafe { *libc::__errno_location() };
if e == libc::EINTR { continue; }
if e == libc::EINTR {
continue;
}
}
break;
}

View File

@@ -2,11 +2,14 @@
//!
//! Erlang-style green-thread actor concurrency for Rust.
//!
//! v0.1 is single-threaded. One scheduler, one OS thread. The scheduler
//! Single-threaded for now: one scheduler, one OS thread. The scheduler
//! cooperatively interleaves green-thread actors with hand-rolled context
//! switches. Actors communicate by sending `Send` messages over channels;
//! every actor has a supervisor, which is itself just an actor with a
//! `Receiver<Signal>`.
//! `Receiver<Signal>`. Synchronisation primitives — `Mutex<T>` with
//! mandatory lock timeouts, channel `recv`, `sleep`, and epoll-backed
//! `wait_readable`/`wait_writable` — all park the green thread, never
//! the OS thread.
//!
//! See `LOOM.md` for the design intent and the deferred-for-later list.
@@ -20,6 +23,7 @@ pub mod scheduler;
pub mod supervisor;
pub mod timer;
pub mod io;
pub mod mutex;
// ---------------------------------------------------------------------------
// Global allocator
@@ -37,10 +41,16 @@ static ALLOCATOR: preempt::PreemptingAllocator = preempt::PreemptingAllocator;
// ---------------------------------------------------------------------------
pub use channel::{channel, Receiver, RecvError, Sender};
pub use mutex::{LockTimeout, Mutex, MutexGuard};
pub use pid::Pid;
pub use scheduler::{
block_on_io, run, self_pid, sleep, spawn, spawn_under, yield_now, JoinError, JoinHandle,
block_on_io, run, self_pid, sleep, spawn, spawn_under, wait_readable, wait_writable,
yield_now, JoinError, JoinHandle,
};
// `read` and `write` would shadow heavily-used names if re-exported at the
// crate root; users reach for them as `smarm::scheduler::read` /
// `smarm::scheduler::write` instead. May reshuffle into a `smarm::io`
// surface in a future pass.
pub use supervisor::Signal;
// ---------------------------------------------------------------------------

318
src/mutex.rs Normal file
View File

@@ -0,0 +1,318 @@
//! Actor-aware mutex with mandatory timeout.
//!
//! `loom::Mutex<T>` looks like `std::sync::Mutex<T>` but its `lock()` parks
//! the calling *green* thread on contention rather than blocking the OS
//! thread — and every lock attempt is bounded by a timeout. If the lock is
//! not acquired within the timeout, `lock()` returns `Err(LockTimeout)`.
//! This is a hard runtime guarantee (the spec calls it out): no actor can
//! be parked on a mutex forever.
//!
//! ```ignore
//! let m = loom::Mutex::new(42);
//! let guard = m.lock()?; // default timeout
//! let guard = m.lock_timeout(Duration::from_millis(50))?;
//! ```
//!
//! Fairness
//! ========
//! Waiters are granted the lock in FIFO order. The spec prizes fairness:
//! starvation under contention is precisely the kind of failure mode
//! supervision can't recover from cleanly. LIFO would be faster on cache
//! locality and is not offered.
//!
//! Poisoning
//! =========
//! Unlike `std::sync::Mutex`, `loom::Mutex` does not poison on panic. If a
//! holder panics while holding the lock, the next waiter receives the
//! (now-untouched) value. Rationale: supervision handles the panic at the
//! actor level; a separate poisoning channel is redundant and adds an
//! error case to every `lock()`. Users who care about "the value may be in
//! an inconsistent state after a panic" should encode that in `T` itself
//! (e.g. `Mutex<Option<State>>` and `take()` the value at the start of
//! each critical section).
//!
//! Reentrance
//! ==========
//! Not reentrant. An actor that already holds the lock and calls `lock()`
//! again on the same mutex will wait on its own grant — and time out. This
//! is a bug in the caller, not a feature.
//!
//! Multi-threading note
//! ====================
//! The current implementation uses `Rc<RefCell<…>>` internals because the
//! v0.2 scheduler is single-threaded. The public API is identical to what
//! the eventual multi-threaded version will expose; the migration replaces
//! the `Rc<RefCell>` with `Arc<sync::Mutex>` around bookkeeping (waiters
//! queue, holder pid) — the *value* itself never goes through a blocking
//! OS-level lock, because contention always parks the green thread first.
//! No `unsafe impl Send` games today: `loom::Mutex<T>` is `!Send` on v0.2,
//! which is correct given there is only one OS thread.
use crate::pid::Pid;
use crate::scheduler;
use crate::timer::{self, TimerTarget};
use std::cell::{Cell, RefCell};
use std::collections::VecDeque;
use std::rc::Rc;
use std::time::Duration;
/// 30 seconds. Override per-call with `lock_timeout`, or per-mutex (TODO)
/// once the supervisor-level policy hook lands.
pub const DEFAULT_TIMEOUT: Duration = Duration::from_secs(30);
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct LockTimeout;
impl std::fmt::Display for LockTimeout {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "mutex lock timed out")
}
}
impl std::error::Error for LockTimeout {}
// ---------------------------------------------------------------------------
// Internals
// ---------------------------------------------------------------------------
/// A pending lock attempt. Sits in `MutexCore::state.waiters` from the
/// moment an actor parks until it is either granted the lock (popped by
/// `MutexGuard::drop`) or times out (popped by `on_timeout`).
struct Wait {
pid: Pid,
/// Per-mutex monotonic sequence. Lets `on_timeout` recognise "this
/// specific wait" vs. "a later wait by the same pid on the same
/// mutex" — important because a single actor can re-acquire and then
/// re-wait, and we don't want a stale timer firing to disturb the new
/// wait.
seq: u64,
}
/// The non-generic part of the mutex. Lives inside `Rc<>` so it can also
/// be stashed (as `Rc<dyn TimerTarget>`) inside a timer entry.
struct MutexCore {
state: RefCell<MutexState>,
default_timeout: Cell<Duration>,
}
struct MutexState {
holder: Option<Pid>,
waiters: VecDeque<Wait>,
next_seq: u64,
}
impl MutexCore {
fn new(default_timeout: Duration) -> Self {
Self {
state: RefCell::new(MutexState {
holder: None,
waiters: VecDeque::new(),
next_seq: 0,
}),
default_timeout: Cell::new(default_timeout),
}
}
}
impl TimerTarget for MutexCore {
fn on_timeout(&self, pid: Pid, wait_seq: u64) {
// Remove the waiter with this seq, if it's still queued. If it's
// gone the lock was already granted to this actor before the timer
// popped — the actor will return normally; do nothing.
let removed = {
let mut st = self.state.borrow_mut();
if let Some(pos) = st.waiters.iter().position(|w| w.seq == wait_seq) {
st.waiters.remove(pos);
true
} else {
false
}
};
if removed {
// The actor is parked, waiting on us. Wake it up; `lock_timeout`
// will resume, observe `holder != Some(self)`, and return
// LockTimeout.
scheduler::unpark(pid);
}
}
}
// ---------------------------------------------------------------------------
// Public API
// ---------------------------------------------------------------------------
pub struct Mutex<T> {
core: Rc<MutexCore>,
/// `None` while the lock is held; `Some(T)` while free or while a
/// grantee is in the gap between unpark and resumption.
value: Rc<RefCell<Option<T>>>,
}
impl<T> Mutex<T> {
pub fn new(value: T) -> Self {
Self {
core: Rc::new(MutexCore::new(DEFAULT_TIMEOUT)),
value: Rc::new(RefCell::new(Some(value))),
}
}
/// Set the default timeout used by `lock()`. Does not affect in-flight
/// `lock_timeout` calls.
pub fn set_default_timeout(&self, timeout: Duration) {
self.core.default_timeout.set(timeout);
}
/// Acquire the lock, blocking the calling actor until it's granted or
/// the default timeout expires.
pub fn lock(&self) -> Result<MutexGuard<'_, T>, LockTimeout> {
self.lock_timeout(self.core.default_timeout.get())
}
/// Acquire the lock with an explicit timeout.
pub fn lock_timeout(&self, timeout: Duration) -> Result<MutexGuard<'_, T>, LockTimeout> {
let me = scheduler::self_pid();
// Fast path: nobody holds it. Mark ourselves as holder, take the
// value out, return a guard.
{
let mut st = self.core.state.borrow_mut();
if st.holder.is_none() {
st.holder = Some(me);
drop(st);
let value = self
.value
.borrow_mut()
.take()
.expect("Mutex: value missing on free fast path");
return Ok(MutexGuard {
mutex: self,
value: Some(value),
});
}
}
// Slow path: register as a waiter, schedule a timeout, park.
// No preemption during prep-to-park — see scheduler::NoPreempt.
let _np = scheduler::NoPreempt::enter();
let seq = {
let mut st = self.core.state.borrow_mut();
let seq = st.next_seq;
st.next_seq = st.next_seq.wrapping_add(1);
st.waiters.push_back(Wait { pid: me, seq });
seq
};
let target: Rc<dyn TimerTarget> = self.core.clone();
let deadline = timer::deadline_from_now(timeout);
scheduler::insert_wait_timer(deadline, me, target, seq);
scheduler::park_current();
// Resumed. Two possibilities:
// (a) MutexGuard::drop on the previous holder popped us off the
// waiters queue, set core.holder = me, and unparked us.
// => self.value is Some, we take it and return Ok.
// (b) on_timeout fired: it removed us from waiters and unparked
// us, but did NOT set holder. core.holder is whatever it was
// (Some(other) or None). => return Err.
let is_holder = self.core.state.borrow().holder == Some(me);
if is_holder {
let value = self
.value
.borrow_mut()
.take()
.expect("Mutex: value missing after grant");
Ok(MutexGuard {
mutex: self,
value: Some(value),
})
} else {
Err(LockTimeout)
}
}
/// Non-blocking attempt. Returns `Some` if the lock was free, `None`
/// otherwise. Useful as a fast path before a long-running computation,
/// or for tests.
pub fn try_lock(&self) -> Option<MutexGuard<'_, T>> {
let mut st = self.core.state.borrow_mut();
if st.holder.is_some() {
return None;
}
let me = scheduler::self_pid();
st.holder = Some(me);
drop(st);
let value = self
.value
.borrow_mut()
.take()
.expect("Mutex: value missing on try_lock free path");
Some(MutexGuard {
mutex: self,
value: Some(value),
})
}
}
impl<T> Clone for Mutex<T> {
/// Cloning a `Mutex<T>` clones the handle, not the protected value —
/// both clones refer to the same lock state and the same `T`.
fn clone(&self) -> Self {
Self {
core: self.core.clone(),
value: self.value.clone(),
}
}
}
// ---------------------------------------------------------------------------
// Guard
// ---------------------------------------------------------------------------
pub struct MutexGuard<'a, T> {
mutex: &'a Mutex<T>,
/// The protected value, taken out of `mutex.value` while the guard is
/// alive. `Option` only so `Drop` can put it back; in normal use this
/// is always `Some` while the guard is observable.
value: Option<T>,
}
impl<T> std::ops::Deref for MutexGuard<'_, T> {
type Target = T;
fn deref(&self) -> &T {
self.value.as_ref().expect("MutexGuard: value missing")
}
}
impl<T> std::ops::DerefMut for MutexGuard<'_, T> {
fn deref_mut(&mut self) -> &mut T {
self.value.as_mut().expect("MutexGuard: value missing")
}
}
impl<T> Drop for MutexGuard<'_, T> {
fn drop(&mut self) {
// Put the value back into the mutex.
let v = self.value.take().expect("MutexGuard: double drop");
*self.mutex.value.borrow_mut() = Some(v);
// Pick the next waiter (if any) and grant it the lock by writing
// its pid into `holder` *before* unparking. The grantee, on
// resumption, will see `holder == self_pid` and take the value.
let next_pid = {
let mut st = self.mutex.core.state.borrow_mut();
let next = st.waiters.pop_front();
match next {
Some(w) => {
st.holder = Some(w.pid);
Some(w.pid)
}
None => {
st.holder = None;
None
}
}
};
if let Some(pid) = next_pid {
scheduler::unpark(pid);
}
}
}

View File

@@ -344,16 +344,69 @@ pub fn park_current() {
unsafe { crate::context::switch_to_scheduler() };
}
/// RAII guard that disables allocator-driven preemption for its lifetime.
///
/// The "prep-to-park" hazard described in `preempt.rs`: a primitive that
/// (a) registers an unparker (channel waiter slot, fd waiter map, mutex
/// waiter queue, …) and then (b) calls `park_current()` must not yield
/// between (a) and (b). If it does, an early unpark fires while the actor
/// is still Runnable, the unpark no-ops, and then the actor parks with no
/// one to wake it.
///
/// Library code wraps the prep + park in `let _g = NoPreempt::enter();`
/// and the guard is held until just after `park_current` returns (or
/// dropped earlier, immediately before `park_current`, since `park_current`
/// itself returns control to the scheduler which disables preemption on
/// its own path).
pub struct NoPreempt(bool);
impl NoPreempt {
pub fn enter() -> Self {
let prev = crate::preempt::PREEMPTION_ENABLED.with(|c| c.replace(false));
NoPreempt(prev)
}
}
impl Drop for NoPreempt {
fn drop(&mut self) {
crate::preempt::PREEMPTION_ENABLED.with(|c| c.set(self.0));
}
}
/// Park the current actor for at least `duration`. A zero duration behaves
/// like `yield_now` (the deadline is immediately in the past, so the timer
/// pops on the next scheduler iteration).
pub fn sleep(duration: std::time::Duration) {
let me = current_pid().expect("sleep() called outside an actor");
let _np = NoPreempt::enter();
let deadline = crate::timer::deadline_from_now(duration);
with_sched(|s| s.timers.insert(deadline, me));
with_sched(|s| s.timers.insert_sleep(deadline, me));
park_current();
}
/// Insert a `WaitTimeout` timer entry. Library code (`Mutex::lock_timeout`
/// today, future bounded-wait primitives) calls this just before
/// `park_current()` so that if the wait isn't satisfied by `deadline`,
/// `target.on_timeout(pid, wait_seq)` will fire.
///
/// Cancellation: not needed. If the wait is satisfied early, the entry is
/// still in the heap and will pop in due course; `on_timeout` is expected
/// to be idempotent on stale-seq.
pub fn insert_wait_timer(
deadline: std::time::Instant,
pid: Pid,
target: std::rc::Rc<dyn crate::timer::TimerTarget>,
wait_seq: u64,
) {
with_sched(|s| {
s.timers.insert(
deadline,
pid,
crate::timer::Reason::WaitTimeout { target, wait_seq },
);
});
}
/// Run `f` on the IO worker thread, park the current actor while it runs,
/// and return `f`'s value when it completes. Panics inside `f` propagate
/// to the calling actor.
@@ -377,12 +430,14 @@ where
Ok(Box::new(v) as Box<dyn std::any::Any + Send>)
});
with_sched(|s| {
let io = s.io.as_mut().expect("io thread not started");
io.submit(me, work);
});
park_current();
{
let _np = NoPreempt::enter();
with_sched(|s| {
let io = s.io.as_mut().expect("io thread not started");
io.submit(me, work);
});
park_current();
}
// On resume, our slot has a result waiting.
let result = with_sched(|s| {
@@ -401,6 +456,83 @@ where
}
}
// ---------------------------------------------------------------------------
// Fd-readiness primitives.
//
// `wait_readable(fd)` / `wait_writable(fd)` register interest with the
// epoll thread, park the calling actor, and return when the kernel
// signals readiness. The subsequent syscall (`read`/`write`) is done on
// the actor's own thread by the caller — no buffer crosses an actor
// boundary.
//
// Fds passed in should be O_NONBLOCK; see io.rs module docs.
// ---------------------------------------------------------------------------
/// Park the calling actor until `fd` is readable.
pub fn wait_readable(fd: std::os::fd::RawFd) -> std::io::Result<()> {
wait_fd(fd, /*readable=*/ true, /*writable=*/ false)
}
/// Park the calling actor until `fd` is writable.
pub fn wait_writable(fd: std::os::fd::RawFd) -> std::io::Result<()> {
wait_fd(fd, /*readable=*/ false, /*writable=*/ true)
}
fn wait_fd(fd: std::os::fd::RawFd, readable: bool, writable: bool) -> std::io::Result<()> {
let me = current_pid().expect("wait_*() called outside an actor");
// Register with the epoll thread. If registration fails (bad fd,
// already-parked waiter, OOM in the kernel), return the error
// without parking — the actor never went to sleep.
let _np = NoPreempt::enter();
with_sched(|s| {
let io = s.io.as_mut().expect("io thread not started");
io.epoll_register(fd, me, readable, writable)
})?;
park_current();
// On resume, the scheduler has already removed `fd` from `waiters`
// and DEL'd it from epollfd. There is no per-call return value;
// success here just means "fd is ready, go do your syscall".
//
// Note: there is no error path on resume because v0.2 doesn't time
// out fd waits and doesn't otherwise spurious-wake. If those are
// added, this function grows a non-trivial return.
Ok(())
}
/// Wait until `fd` is readable, then run a single `read(2)`. Returns the
/// number of bytes read, or an `io::Error` from the syscall.
///
/// `fd` should be opened `O_NONBLOCK`. With a blocking fd, the kernel's
/// readiness signal does not guarantee a non-blocking read — a signal
/// could interrupt, and the actor's syscall would then stall the
/// scheduler thread.
pub fn read(fd: std::os::fd::RawFd, buf: &mut [u8]) -> std::io::Result<usize> {
wait_readable(fd)?;
let n = unsafe {
libc::read(fd, buf.as_mut_ptr() as *mut _, buf.len())
};
if n < 0 {
Err(std::io::Error::last_os_error())
} else {
Ok(n as usize)
}
}
/// Wait until `fd` is writable, then run a single `write(2)`.
pub fn write(fd: std::os::fd::RawFd, buf: &[u8]) -> std::io::Result<usize> {
wait_writable(fd)?;
let n = unsafe {
libc::write(fd, buf.as_ptr() as *const _, buf.len())
};
if n < 0 {
Err(std::io::Error::last_os_error())
} else {
Ok(n as usize)
}
}
/// Wake a parked actor. If the actor isn't parked (already runnable or done)
/// this is a no-op — that's important; channel and join can both fire
/// spurious unparks under some orderings and we want them to be cheap.
@@ -488,41 +620,95 @@ pub const ROOT_PID: Pid = Pid::new(u32::MAX, u32::MAX);
fn schedule_loop() {
loop {
// 1. Drain due timers into the run queue.
// 1. Drain due timers and dispatch by reason.
//
// Sleep — unpark the actor (idempotently: only if still
// parked).
// WaitTimeout — call the target's on_timeout. The target decides
// whether the wait was still in progress (timer
// won the race) or had been fulfilled (the thing
// the actor was waiting for arrived first → no-op).
// The target is responsible for calling unpark()
// if appropriate.
let now = std::time::Instant::now();
let due = with_sched(|s| s.timers.pop_due(now));
for pid in due {
// Same idempotency as `unpark`: only re-queue if still parked.
with_sched(|s| {
if let Some(slot) = s.slot_mut(pid) {
if matches!(slot.state, State::Parked) {
slot.state = State::Runnable;
s.run_queue.push_back(pid);
}
for entry in due {
match entry.reason {
crate::timer::Reason::Sleep => {
with_sched(|s| {
if let Some(slot) = s.slot_mut(entry.pid) {
if matches!(slot.state, State::Parked) {
slot.state = State::Runnable;
s.run_queue.push_back(entry.pid);
}
}
});
}
});
crate::timer::Reason::WaitTimeout { target, wait_seq } => {
// Note: the target callback runs *outside* with_sched.
// It may call back into the scheduler (e.g. unpark), so
// we must not hold the SCHED borrow across it.
target.on_timeout(entry.pid, wait_seq);
}
}
}
// 2. Drain IO completions: route each result to its slot and
// unpark the actor. Drain even when we have other runnables —
// it's cheap (a try_lock of the completion queue) and keeps
// pending_io_result freshness bounded.
// 2. Drain IO completions: route each result by variant.
//
// Blocking — a `block_on_io` closure finished. Stash the result
// on the actor's slot and unpark.
// FdReady — an fd registered via `wait_readable`/`wait_writable`
// is ready. Look up the parked pid in the io thread's
// waiters map, deregister the fd, unpark.
//
// Drain even when we have other runnables — it's cheap and keeps
// `pending_io_result` / `waiters` freshness bounded.
let completions = with_sched(|s| {
s.io.as_mut().map(|io| io.drain_completions()).unwrap_or_default()
});
for (pid, result) in completions {
with_sched(|s| {
if let Some(io) = s.io.as_mut() {
io.outstanding = io.outstanding.saturating_sub(1);
for completion in completions {
match completion {
crate::io::Completion::Blocking { pid, result } => {
with_sched(|s| {
if let Some(io) = s.io.as_mut() {
io.outstanding = io.outstanding.saturating_sub(1);
}
if let Some(slot) = s.slot_mut(pid) {
slot.pending_io_result = Some(result);
if matches!(slot.state, State::Parked) {
slot.state = State::Runnable;
s.run_queue.push_back(pid);
}
}
});
}
if let Some(slot) = s.slot_mut(pid) {
slot.pending_io_result = Some(result);
if matches!(slot.state, State::Parked) {
slot.state = State::Runnable;
s.run_queue.push_back(pid);
}
crate::io::Completion::FdReady { fd, events: _ } => {
with_sched(|s| {
let parked_pid = s.io.as_mut()
.and_then(|io| {
let pid = io.waiters.remove(&fd);
// Deregister the fd from epollfd; the
// EPOLLONESHOT already disarmed it but the
// slot is still occupied until we DEL.
io.epoll_deregister(fd);
pid
});
if let Some(pid) = parked_pid {
if let Some(slot) = s.slot_mut(pid) {
if matches!(slot.state, State::Parked) {
slot.state = State::Runnable;
s.run_queue.push_back(pid);
}
}
// else: actor died between registering and the
// fd firing. Nothing to do; the registration
// has been cleaned up.
}
// else: fd not in waiters — probably a duplicate
// FdReady from a previous registration, ignore.
});
}
});
}
}
// 3. Pop a runnable actor. If none, decide whether to block on
@@ -536,11 +722,19 @@ fn schedule_loop() {
// trying to take the completions mutex, which is fine,
// but the scheduler thread itself mustn't hold SCHED
// borrowed across a blocking syscall.
//
// "Outstanding" here means *anything* the IO thread is
// expected to deliver a wakeup for: in-flight blocking
// calls AND parked fd waiters. If either is non-zero we
// must wait for the IO thread, not exit.
let (next_deadline, io_outstanding, wake_fd) = with_sched(|s| {
let next = s.timers.peek_deadline();
let (out, fd) = match s.io.as_ref() {
Some(io) => (io.outstanding, Some(io.wake_fd())),
None => (0, None),
Some(io) => (
io.outstanding + io.waiters.len() as u32,
Some(io.wake_fd()),
),
None => (0, None),
};
(next, out, fd)
});

View File

@@ -1,38 +1,86 @@
//! Sleep timers.
//! Sleep + wait-with-timeout timers.
//!
//! A min-heap of `(deadline, Pid)` entries lives on `SchedulerState`. When
//! an actor calls `sleep`, the runtime inserts the entry, marks the actor
//! parked, and yields. On every scheduler loop iteration the runtime pops
//! all entries whose deadline has passed and unparks them. When the run
//! queue is empty but the heap is not, the runtime sleeps the OS thread
//! until the soonest deadline, then re-checks.
//! A min-heap of `(deadline, seq, reason)` entries lives on `SchedulerState`.
//! When an actor sleeps or starts a bounded wait (e.g. `mutex.lock()` with a
//! timeout), the runtime inserts an entry, marks the actor parked, and yields.
//! On every scheduler loop iteration the runtime pops all entries whose
//! deadline has passed and dispatches each according to its `Reason`:
//!
//! `BinaryHeap` is a max-heap, so entries are stored with their deadline
//! wrapped in `Reverse` to get min-heap behaviour.
//! - `Sleep`: unpark the actor.
//! - `WaitTimeout`: call `on_timeout` on the registered target. The target
//! (e.g. a `Mutex`) decides whether the actor was actually still waiting
//! (timer fires first → unpark with error) or had already been granted
//! what it was waiting for (lock granted first → no-op).
//!
//! Stale pids (slot reused since the timer was inserted) are detected on
//! `due_pids` pop and silently dropped — same convention as the run queue.
//! `BinaryHeap` is a max-heap; entries are wrapped in `Reverse` to get
//! min-heap behaviour.
//!
//! No cancellation. When a non-timer wakeup happens (e.g. lock granted
//! before timeout), the timer entry is left in the heap. It will be popped
//! eventually and the dispatch will observe "actor is no longer parked /
//! wait_seq is stale" and no-op. Cost is ~32 bytes per stale entry plus a
//! few cycles on pop; acceptable given the upper bound is "one entry per
//! parked actor".
//!
//! Stale pids (slot reused since the timer was inserted) are filtered on
//! pop by the scheduler — same convention as the run queue.
use crate::pid::Pid;
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::rc::Rc;
use std::time::{Duration, Instant};
#[derive(PartialEq, Eq)]
/// What to do when a timer entry's deadline arrives.
///
/// Held inside `Entry`, dispatched by the scheduler in `pop_due`.
pub enum Reason {
/// `loom::sleep(d)`. Unpark `pid` unconditionally (modulo the usual
/// "still parked?" check the scheduler applies).
Sleep,
/// A bounded wait — currently only `Mutex::lock_timeout`. On expiry the
/// scheduler calls `target.on_timeout(pid, wait_seq)`. The target then
/// decides whether `pid` was actually still waiting, and if so unparks
/// it with whatever error the wait was bounded for. `wait_seq` lets the
/// target tell apart "this wait" from "a later wait by the same actor
/// on the same target".
WaitTimeout {
target: Rc<dyn TimerTarget>,
wait_seq: u64,
},
}
/// Callback the scheduler invokes when a `WaitTimeout` entry pops.
///
/// Implementors: do not touch `SchedulerState` other than via the public
/// `unpark` / channel APIs. The scheduler is mid-iteration when this fires.
pub trait TimerTarget {
fn on_timeout(&self, pid: Pid, wait_seq: u64);
}
pub struct Entry {
pub deadline: Instant,
/// Insertion order, used purely as a tiebreaker so `Entry: Ord` works
/// without having to compare the `Reason` payload (which contains an
/// `Rc<dyn TimerTarget>` and isn't `Ord`).
seq: u64,
pub pid: Pid,
pub reason: Reason,
}
impl PartialEq for Entry {
fn eq(&self, other: &Self) -> bool {
self.deadline == other.deadline && self.seq == other.seq
}
}
impl Eq for Entry {}
impl Ord for Entry {
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
// Only `deadline` matters for ordering; pid is a tiebreaker so the
// type is Ord, but the order among same-deadline entries is
// irrelevant.
self.deadline
.cmp(&other.deadline)
.then_with(|| self.pid.index().cmp(&other.pid.index()))
.then_with(|| self.pid.generation().cmp(&other.pid.generation()))
// Earlier deadline first; ties broken by insertion order so the
// ordering is total. `Reason` and `Pid` deliberately don't
// participate.
self.deadline.cmp(&other.deadline).then_with(|| self.seq.cmp(&other.seq))
}
}
@@ -46,15 +94,25 @@ impl PartialOrd for Entry {
pub struct Timers {
/// Reverse-wrapped so the smallest deadline is at the top.
heap: BinaryHeap<Reverse<Entry>>,
/// Monotonic counter for the tiebreaker `seq` field.
next_seq: u64,
}
impl Timers {
pub fn new() -> Self {
Self { heap: BinaryHeap::new() }
Self { heap: BinaryHeap::new(), next_seq: 0 }
}
pub fn insert(&mut self, deadline: Instant, pid: Pid) {
self.heap.push(Reverse(Entry { deadline, pid }));
/// Insert a `Sleep` timer. Convenience for the common case.
pub fn insert_sleep(&mut self, deadline: Instant, pid: Pid) {
self.insert(deadline, pid, Reason::Sleep);
}
/// Insert an arbitrary timer entry.
pub fn insert(&mut self, deadline: Instant, pid: Pid, reason: Reason) {
let seq = self.next_seq;
self.next_seq = self.next_seq.wrapping_add(1);
self.heap.push(Reverse(Entry { deadline, seq, pid, reason }));
}
pub fn is_empty(&self) -> bool {
@@ -66,13 +124,13 @@ impl Timers {
self.heap.peek().map(|r| r.0.deadline)
}
/// Pop and return every pid whose deadline is ≤ `now`.
pub fn pop_due(&mut self, now: Instant) -> Vec<Pid> {
/// Pop every entry whose deadline is ≤ `now`, in deadline order.
/// The scheduler dispatches each entry by inspecting `entry.reason`.
pub fn pop_due(&mut self, now: Instant) -> Vec<Entry> {
let mut out = Vec::new();
while let Some(r) = self.heap.peek() {
if r.0.deadline <= now {
let e = self.heap.pop().unwrap().0;
out.push(e.pid);
out.push(self.heap.pop().unwrap().0);
} else {
break;
}
@@ -81,7 +139,7 @@ impl Timers {
}
}
/// Wall-clock duration helper exposed for `sleep`.
/// Wall-clock duration helper exposed for `sleep` and `lock_timeout`.
pub fn deadline_from_now(duration: Duration) -> Instant {
Instant::now()
.checked_add(duration)