[Linux, macOS] Expose memory pressure metrics (PSI + macOS sysctl) #2725
Description
Extends #1932 — broadens the Linux-only PSI request to a cross-platform memory pressure API.
The problem
There is currently no way to detect memory pressure (thrashing, reclaim stalls) through psutil. The existing virtual_memory().available tells you how much free memory remains, but not whether the system is struggling to serve its current workload. These are different signals:
- A system with 500 MB available but zero pressure is fine (workload fits).
- A system with 2 GB available but high pressure is in trouble (active reclaim, page faults, swap storms).
Both Linux and macOS expose kernel-level memory pressure metrics, but today every Python project that needs them must reimplement platform-specific parsing from scratch.
Real-world use case
We maintain exec-sandbox, a QEMU-based VM scheduler. Its admission controller decides whether to launch a new VM based on four gates — one of them is memory pressure. We had to write ~80 lines of platform-specific code to get this signal:
import ctypes
import ctypes.util
from pathlib import Path

# Linux: parse /proc/pressure/memory for PSI "full avg10"
content = Path("/proc/pressure/memory").read_text()
for line in content.splitlines():
    if line.startswith("full "):
        for token in line.split():
            if token.startswith("avg10="):
                pressure_pct = float(token[6:])

# macOS: call sysctlbyname("kern.memorystatus_vm_pressure_level") via ctypes
libc = ctypes.CDLL(ctypes.util.find_library("c"))
val = ctypes.c_int32(0)
sz = ctypes.c_size_t(ctypes.sizeof(val))
libc.sysctlbyname(
    b"kern.memorystatus_vm_pressure_level",
    ctypes.byref(val), ctypes.byref(sz), None, ctypes.c_size_t(0),
)
level = val.value  # 1=NORMAL, 2=WARN, 4=CRITICAL

This pattern is common in process schedulers, autoscalers, monitoring agents, and OOM-avoidance systems. PSI is now mainstream: it is reported by sar, collected by Prometheus node_exporter and Netdata, and documented by Kubernetes.
Proposed API
psutil.pressure_memory() / psutil.pressure_cpu() / psutil.pressure_io()
Return resource pressure as a named tuple. Following psutil conventions (svmem, sswap, scputimes), the type could be spressure:
>>> import psutil
>>> psutil.pressure_memory()
spressure(some_avg10=0.50, some_avg60=0.30, some_avg300=0.15, some_total=123456,
          full_avg10=0.10, full_avg60=0.05, full_avg300=0.02, full_total=45678)
>>> psutil.pressure_cpu()
spressure(some_avg10=1.20, some_avg60=0.80, some_avg300=0.40, some_total=789012,
          full_avg10=0.00, full_avg60=0.00, full_avg300=0.00, full_total=0)
>>> psutil.pressure_io()
spressure(some_avg10=0.00, some_avg60=0.00, some_avg300=0.00, some_total=0,
          full_avg10=0.00, full_avg60=0.00, full_avg300=0.00, full_total=0)
Fields:
- some_avg10 / some_avg60 / some_avg300: % of time at least one task was stalled (10s / 60s / 300s exponentially weighted moving averages, range 0.00-100.00)
- full_avg10 / full_avg60 / full_avg300: % of time all non-idle tasks were stalled simultaneously. For CPU at the system level, "full" is reported as 0.00 (undefined at system level since kernel 5.13, meaningful at cgroup level only)
- some_total / full_total: cumulative stall time in microseconds (monotonic counter)
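Because some_total and full_total are monotonic counters, a caller could derive pressure over an arbitrary window by sampling twice and dividing the difference by the elapsed time. A minimal sketch of that arithmetic follows; stall_percent is a hypothetical helper used only for illustration, not part of psutil or of this proposal.

```python
def stall_percent(total_before_us, total_after_us, interval_s):
    """Share of a sampling window spent stalled, as a percentage (0-100).

    Hypothetical helper: both totals are cumulative stall times in
    microseconds, sampled interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    stalled_us = total_after_us - total_before_us
    window_us = interval_s * 1_000_000
    # Clamp to [0, 100] to absorb timer jitter between the two samples.
    return max(0.0, min(100.0, 100.0 * stalled_us / window_us))


# 250 ms of additional stall time observed across a 5 s window
print(stall_percent(43_135_045, 43_385_045, 5.0))  # 5.0
```

This would give callers a window length of their own choosing instead of the kernel's fixed 10s / 60s / 300s averages.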
Practical example (like getloadavg() in the docs)
>>> import psutil
>>> mem = psutil.pressure_memory()
>>> mem.full_avg10
0.10
>>> # Is the system thrashing right now? (>10% = serious pressure)
>>> mem.full_avg10 > 10.0
False
>>> # Compare short-term vs long-term to detect pressure spikes
>>> if mem.some_avg10 > mem.some_avg300 * 3:
...     print("Memory pressure spiking")
Platform details
Linux (kernel 4.20+, CONFIG_PSI=y)
Source: /proc/pressure/{cpu,memory,io} (kernel docs)
- Each file contains "some" and "full" lines with avg10, avg60, avg300 (percentage, 2 decimal places) and total (microseconds).
- CPU "full" line: absent before 5.13; present but always zeroed at system level since 5.13 (meaningful at cgroup level only; see commit 890d550d7dbac).
- Kernel 6.1 added /proc/pressure/irq (requires CONFIG_IRQ_TIME_ACCOUNTING), which only has a "full" line (no "some"). This could optionally be exposed as psutil.pressure_irq().
- Kernel 6.1 also added per-cgroup PSI on/off control via cgroup.pressure.
- Kernel 5.2 added the PSI trigger mechanism (poll/epoll threshold notifications; out of scope for this issue but worth noting).
CONFIG_PSI=y is enabled by default on all major distros: Ubuntu 20.04+, Fedora 34+ (required by systemd-oomd), Debian Bookworm, Alpine ~3.15+.
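Since availability depends on kernel version and build options, an implementation would probably need to probe which files are readable before exposing anything. The sketch below shows one way to do that; the function name, the root parameter and the return shape are assumptions made for illustration.

```python
import os


def available_psi_resources(root="/proc/pressure"):
    """Return the PSI resource names whose files can actually be read.

    Sketch only: the file names come from the list above, everything
    else (name, signature, return type) is hypothetical.
    """
    found = []
    for name in ("cpu", "memory", "io", "irq"):
        path = os.path.join(root, name)
        try:
            with open(path) as f:
                f.read()
        except OSError:
            # File missing (older kernel, PSI not built in) or unreadable.
            continue
        found.append(name)
    return found


print(available_psi_resources())
```

Attempting a read, instead of only checking that the path exists, also covers the case where the file is present but the kernel refuses to serve it.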
Kernel format string (from kernel/sched/psi.c):
seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
           full ? "full" : "some", ...);
Example /proc/pressure/memory under load:
some avg10=29.97 avg60=22.82 avg300=11.92 total=92159505
full avg10=8.76 avg60=5.17 avg300=3.38 total=43135045
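To make the parsing concrete, here is a rough sketch of how the sample above could be turned into the proposed spressure tuple. The field names come from the proposal; parse_psi is a hypothetical helper, and leaving the fields of an absent line (for example CPU "full" before 5.13) as None is an assumption, not a settled design decision.

```python
from collections import namedtuple

# Field layout taken from the proposal; the parser itself is only a sketch.
spressure = namedtuple(
    "spressure",
    ["some_avg10", "some_avg60", "some_avg300", "some_total",
     "full_avg10", "full_avg60", "full_avg300", "full_total"],
)


def parse_psi(text):
    """Parse the contents of a /proc/pressure/* file into an spressure tuple."""
    values = dict.fromkeys(spressure._fields)  # absent lines stay None
    for line in text.splitlines():
        kind, _, rest = line.partition(" ")
        if kind not in ("some", "full"):
            continue  # skip line types this sketch does not know about
        for token in rest.split():
            key, _, raw = token.partition("=")
            field = f"{kind}_{key}"
            if field in values:  # ignore keys added by newer kernels
                values[field] = int(raw) if key == "total" else float(raw)
    return spressure(**values)


sample = (
    "some avg10=29.97 avg60=22.82 avg300=11.92 total=92159505\n"
    "full avg10=8.76 avg60=5.17 avg300=3.38 total=43135045\n"
)
print(parse_psi(sample).full_avg10)  # 8.76
```

Ignoring unknown line types and keys keeps the parser tolerant of additive changes to the file format.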
macOS (10.9+)
Source: kern.memorystatus_vm_pressure_level sysctl (XNU source: kern_memorystatus_notify.c)
- Returns discrete NOTE_MEMORYSTATUS_PRESSURE_* constants: 1 (NORMAL), 2 (WARN), 4 (CRITICAL).
- These are NOT the internal kernel enum values: XNU's convert_internal_pressure_level_to_dispatch_level() maps 5 internal levels (kVMPressureNormal/Warning/Urgent/Critical/Jetsam) down to 3 userspace values (Warning+Urgent both → WARN=2).
- Readable without root on macOS (the PRIV_VM_PRESSURE check is skipped via an #if !XNU_TARGET_OS_OSX guard).
- The sysctl is CTLFLAG_MASKED on production builds: hidden from sysctl -a but accessible by explicit name via sysctlbyname(). This is an undocumented/private Apple API.
- Stable since macOS 10.9 through at least macOS 15 Sequoia (no API changes across versions).
- macOS has no equivalent to Linux PSI's time-windowed averages. The pressure level is strictly a discrete instantaneous state.
Note: vm.memory_pressure sysctl also exists but returns a raw page count (vm_page_free_wanted), not a 0-100 percentage as some sources incorrectly claim. It is not suitable for a pressure "level" API.
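Because the sysctl is private, a reader would probably want to fail soft when it is missing or returns an error. The following is a sketch of such a wrapper around the same sysctlbyname() call used in the snippet earlier in this issue; the function name and the choice to return None on failure are assumptions.

```python
import ctypes
import ctypes.util
import sys

# Userspace values described above (NOTE_MEMORYSTATUS_PRESSURE_*).
PRESSURE_LEVELS = {1: "NORMAL", 2: "WARN", 4: "CRITICAL"}


def macos_pressure_level():
    """Return the raw pressure level (1, 2 or 4), or None when unavailable."""
    if sys.platform != "darwin":
        return None
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    val = ctypes.c_int32(0)
    size = ctypes.c_size_t(ctypes.sizeof(val))
    rc = libc.sysctlbyname(
        b"kern.memorystatus_vm_pressure_level",
        ctypes.byref(val), ctypes.byref(size), None, ctypes.c_size_t(0),
    )
    if rc != 0:
        # Private sysctl: treat any failure as "no data" instead of raising.
        return None
    return val.value


level = macos_pressure_level()
print(PRESSURE_LEVELS.get(level, "unavailable"))
```

On anything other than macOS this prints "unavailable", which makes the snippet safe to run on any platform.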
macOS mapping
Since macOS only provides a discrete level (no time windows), a reasonable mapping for pressure_memory() would be:
| Level | some_avg10 | Interpretation |
|---|---|---|
| NORMAL (1) | 0.0 | No pressure |
| WARN (2) | 50.0 | Moderate pressure |
| CRITICAL (4) | 100.0 | Severe pressure |
All other fields (avg60, avg300, total, full_*) would be None on macOS.
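As an illustration of the mapping in the table, the tuple could be built along these lines. This is a sketch: spressure_from_macos_level is a hypothetical name, and raising ValueError for an unexpected level is an assumption the proposal does not specify.

```python
from collections import namedtuple

spressure = namedtuple(
    "spressure",
    ["some_avg10", "some_avg60", "some_avg300", "some_total",
     "full_avg10", "full_avg60", "full_avg300", "full_total"],
)

# Level -> some_avg10, copied from the table above.
_LEVEL_TO_SOME_AVG10 = {1: 0.0, 2: 50.0, 4: 100.0}


def spressure_from_macos_level(level):
    """Build an spressure tuple from a macOS pressure level (1, 2 or 4)."""
    try:
        some_avg10 = _LEVEL_TO_SOME_AVG10[level]
    except KeyError:
        raise ValueError(f"unexpected pressure level: {level!r}") from None
    fields = dict.fromkeys(spressure._fields)  # every other field is None
    fields["some_avg10"] = some_avg10
    return spressure(**fields)


print(spressure_from_macos_level(2).some_avg10)  # 50.0
```

Callers comparing fields across platforms would need to handle None, for example before evaluating an expression such as mem.full_avg10 > 10.0.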
On unsupported platforms, these functions would raise AttributeError (consistent with how psutil handles platform-specific APIs like psutil.sensors_temperatures()).
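For callers, that behaviour would mean guarding access to the function, much as is done today for other platform-specific psutil APIs. A sketch of the pattern against the proposed (not yet existing) pressure_memory() follows; the module is passed in as an argument purely so the example runs without psutil installed.

```python
import types


def read_memory_pressure(psutil_module):
    """Return the proposed pressure tuple, or None where the API is absent."""
    pressure_memory = getattr(psutil_module, "pressure_memory", None)
    if pressure_memory is None:
        return None  # unsupported platform, or a psutil without this API
    return pressure_memory()


# Stand-in for a psutil module that lacks the proposed function.
unsupported = types.SimpleNamespace()
print(read_memory_pressure(unsupported))  # None
```

In real code the argument would simply be the imported psutil module.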
Prior art
- Prometheus node_exporter: PSI collector (Go)
- Cloudflare psi_exporter: dedicated PSI exporter (Go)
- Netdata: built-in PSI collection (C)
- sar / sysstat: reports PSI on all major Linux distros
- OpenTelemetry: semantic-conventions#2995 — defining PSI metric names
- Kernel docs: https://docs.kernel.org/accounting/psi.html
- XNU source: https://github.com/apple-oss-distributions/xnu/blob/xnu-10063.101.15/bsd/kern/kern_memorystatus_notify.c
Implementation notes
The Linux implementation is straightforward — parse 2-3 lines of text from /proc/pressure/* (the format has been stable and additive-only since kernel 4.20). The macOS implementation would use sysctlbyname() (already used by psutil for other macOS metrics). Happy to submit a PR if the API design looks reasonable.