[Linux, macOS] Expose memory pressure metrics (PSI + macOS sysctl) #2725

@clemlesne

Extends #1932 — broadens the Linux-only PSI request to a cross-platform memory pressure API.

The problem

There is currently no way to detect memory pressure (thrashing, reclaim stalls) through psutil. The existing virtual_memory().available tells you how much memory can still be handed to processes without swapping, but not whether the system is struggling to serve its current workload. These are different signals:

  • A system with 500 MB available but zero pressure is fine (workload fits).
  • A system with 2 GB available but high pressure is in trouble (active reclaim, page faults, swap storms).

Both Linux and macOS expose kernel-level memory pressure metrics, but today every Python project that needs them must reimplement platform-specific parsing from scratch.

Real-world use case

We maintain exec-sandbox, a QEMU-based VM scheduler. Its admission controller decides whether to launch a new VM based on four gates — one of them is memory pressure. We had to write ~80 lines of platform-specific code to get this signal:

# Linux: parse /proc/pressure/memory for PSI "full avg10"
from pathlib import Path

pressure_pct = None  # stays None if the "full" line or avg10 token is missing
content = Path("/proc/pressure/memory").read_text()
for line in content.splitlines():
    if line.startswith("full "):
        for token in line.split():
            if token.startswith("avg10="):
                pressure_pct = float(token[len("avg10="):])

# macOS: call sysctlbyname("kern.memorystatus_vm_pressure_level") via ctypes
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
val = ctypes.c_int32(0)
sz = ctypes.c_size_t(ctypes.sizeof(val))
rc = libc.sysctlbyname(
    b"kern.memorystatus_vm_pressure_level",
    ctypes.byref(val), ctypes.byref(sz), None, ctypes.c_size_t(0),
)
if rc != 0:  # sysctlbyname returns -1 and sets errno on failure
    raise OSError(ctypes.get_errno(), "sysctlbyname failed")
level = val.value  # 1=NORMAL, 2=WARN, 4=CRITICAL

This pattern is common in process schedulers, autoscalers, monitoring agents, and OOM-avoidance systems. PSI is now mainstream — reported by sar, collected by Prometheus node_exporter, Netdata, and documented by Kubernetes.

Proposed API

psutil.pressure_memory() / psutil.pressure_cpu() / psutil.pressure_io()

Return resource pressure as a named tuple. Following psutil conventions (svmem, sswap, scputimes), the type could be spressure:

>>> import psutil

>>> psutil.pressure_memory()
spressure(some_avg10=0.50, some_avg60=0.30, some_avg300=0.15, some_total=123456,
          full_avg10=0.10, full_avg60=0.05, full_avg300=0.02, full_total=45678)

>>> psutil.pressure_cpu()
spressure(some_avg10=1.20, some_avg60=0.80, some_avg300=0.40, some_total=789012,
          full_avg10=0.00, full_avg60=0.00, full_avg300=0.00, full_total=0)

>>> psutil.pressure_io()
spressure(some_avg10=0.00, some_avg60=0.00, some_avg300=0.00, some_total=0,
          full_avg10=0.00, full_avg60=0.00, full_avg300=0.00, full_total=0)

Fields:

  • some_avg10 / some_avg60 / some_avg300: % of time at least one task was stalled (10s / 60s / 300s exponentially weighted moving averages, range 0.00-100.00)
  • full_avg10 / full_avg60 / full_avg300: % of time all non-idle tasks were stalled simultaneously. For CPU at the system level, full is reported as 0.00 (undefined at system level since kernel 5.13, meaningful at cgroup level only)
  • some_total / full_total: cumulative stall time in microseconds (monotonic counter)
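Because the total fields are cumulative counters, a caller could derive pressure over any window it likes instead of relying on the fixed 10s / 60s / 300s averages. The following is a minimal sketch of that arithmetic; the function name stall_percent and its parameters are hypothetical and not part of the proposed API.

```python
def stall_percent(total_before_us, total_after_us, interval_s):
    """Percentage of wall-clock time spent stalled between two samples.

    total_before_us / total_after_us are two readings of some_total or
    full_total (microseconds); interval_s is the time between the readings.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    stalled_us = total_after_us - total_before_us
    if stalled_us < 0:
        # A counter that goes backwards is not expected (e.g. mixed samples).
        raise ValueError("total counter went backwards")
    pct = 100.0 * stalled_us / (interval_s * 1_000_000)
    # Clamp to 100 to absorb small timing skew between the two reads.
    return min(pct, 100.0)


# 0.5 s of stall time accumulated over a 5 s window.
window_pct = stall_percent(1_000_000, 1_500_000, 5.0)
```

In this example window_pct is 10.0, i.e. tasks were stalled for 10% of the 5-second window.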

Practical example (like getloadavg() in the docs)

>>> import psutil

>>> mem = psutil.pressure_memory()
>>> mem.full_avg10
0.10

>>> # Is the system thrashing right now? (>10% = serious pressure)
>>> mem.full_avg10 > 10.0
False

>>> # Compare short-term vs long-term to detect pressure spikes
>>> if mem.some_avg10 > mem.some_avg300 * 3:
...     print("Memory pressure spiking")

Platform details

Linux (kernel 4.20+, CONFIG_PSI=y)

Source: /proc/pressure/{cpu,memory,io} (kernel docs)

  • Each file contains some and full lines with avg10, avg60, avg300 (percentage, 2 decimal places) and total (microseconds).
  • CPU full line: absent before 5.13; present but always zeroed at system level since 5.13 (meaningful at cgroup level only — commit 890d550d7dbac).
  • Kernel 6.1 added /proc/pressure/irq (requires CONFIG_IRQ_TIME_ACCOUNTING) — only has a full line (no some). This could be exposed as psutil.pressure_irq() optionally.
  • Kernel 6.1 also added per-cgroup PSI on/off control via cgroup.pressure.
  • Kernel 5.2 added PSI trigger mechanism (poll/epoll threshold notifications — out of scope for this issue but worth noting).
  • CONFIG_PSI=y is enabled by default on all major distros: Ubuntu 20.04+, Fedora 34+ (required by systemd-oomd), Debian Bookworm, Alpine ~3.15+.

Kernel format string (from kernel/sched/psi.c):

seq_printf(m, "%s avg10=%lu.%02lu avg60=%lu.%02lu avg300=%lu.%02lu total=%llu\n",
           full ? "full" : "some", ...);

Example /proc/pressure/memory under load:

some avg10=29.97 avg60=22.82 avg300=11.92 total=92159505
full avg10=8.76 avg60=5.17 avg300=3.38 total=43135045
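As a rough illustration of how little parsing this format needs, here is a minimal sketch that turns the text of a /proc/pressure/* file into the proposed named tuple. The spressure type mirrors the proposal above and does not exist in psutil today; parse_pressure is a hypothetical helper name, and filling missing lines with None is an assumption made for this sketch.

```python
from collections import namedtuple

# Hypothetical type from the proposal above; not an existing psutil type.
spressure = namedtuple(
    "spressure",
    ["some_avg10", "some_avg60", "some_avg300", "some_total",
     "full_avg10", "full_avg60", "full_avg300", "full_total"],
)


def parse_pressure(text):
    """Parse the contents of a /proc/pressure/* file into an spressure tuple.

    Fields from a line that is absent (e.g. no "some" line in the irq file)
    are left as None. That choice is an assumption of this sketch.
    """
    values = dict.fromkeys(spressure._fields)  # every field defaults to None
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0] not in ("some", "full"):
            continue
        kind = parts[0]
        for token in parts[1:]:
            key, sep, raw = token.partition("=")
            field = f"{kind}_{key}"
            if sep and field in values:
                # total is an integer microsecond counter; averages are floats.
                values[field] = int(raw) if key == "total" else float(raw)
    return spressure(**values)


sample = """\
some avg10=29.97 avg60=22.82 avg300=11.92 total=92159505
full avg10=8.76 avg60=5.17 avg300=3.38 total=43135045
"""
mem = parse_pressure(sample)
```

Unknown key=value tokens are ignored rather than rejected, so the sketch would keep working if the kernel appends new fields to these lines.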

macOS (10.9+)

Source: kern.memorystatus_vm_pressure_level sysctl (XNU source: kern_memorystatus_notify.c)

  • Returns discrete NOTE_MEMORYSTATUS_PRESSURE_* constants: 1 (NORMAL), 2 (WARN), 4 (CRITICAL).
  • These are NOT the internal kernel enum values — XNU's convert_internal_pressure_level_to_dispatch_level() maps 5 internal levels (kVMPressureNormal/Warning/Urgent/Critical/Jetsam) down to 3 userspace values (Warning+Urgent both → WARN=2).
  • Readable without root on macOS (the PRIV_VM_PRESSURE check is skipped via #if !XNU_TARGET_OS_OSX guard).
  • The sysctl is CTLFLAG_MASKED on production builds — hidden from sysctl -a but accessible by explicit name via sysctlbyname(). This is an undocumented/private Apple API.
  • Stable since macOS 10.9 through at least macOS 15 Sequoia (no API changes across versions).
  • macOS has no equivalent to Linux PSI's time-windowed averages. The pressure level is strictly a discrete instantaneous state.

Note: vm.memory_pressure sysctl also exists but returns a raw page count (vm_page_free_wanted), not a 0-100 percentage as some sources incorrectly claim. It is not suitable for a pressure "level" API.

macOS mapping

Since macOS only provides a discrete level (no time windows), a reasonable mapping for pressure_memory():

| Level | some_avg10 | Interpretation |
| --- | --- | --- |
| NORMAL (1) | 0.0 | No pressure |
| WARN (2) | 50.0 | Moderate pressure |
| CRITICAL (4) | 100.0 | Severe pressure |

All other fields (avg60, avg300, total, full_*) would be None on macOS.
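The mapping could be sketched as follows. This is only an illustration of the table above: spressure is the type proposed in this issue (not an existing psutil type), spressure_from_macos_level is a hypothetical helper name, and raising ValueError on an unexpected level is an assumption.

```python
from collections import namedtuple

# Hypothetical type from the proposal above; not an existing psutil type.
spressure = namedtuple(
    "spressure",
    ["some_avg10", "some_avg60", "some_avg300", "some_total",
     "full_avg10", "full_avg60", "full_avg300", "full_total"],
)

# Discrete sysctl level -> some_avg10, as in the mapping table.
_LEVEL_TO_SOME_AVG10 = {1: 0.0, 2: 50.0, 4: 100.0}


def spressure_from_macos_level(level):
    """Build an spressure tuple from kern.memorystatus_vm_pressure_level."""
    try:
        some_avg10 = _LEVEL_TO_SOME_AVG10[level]
    except KeyError:
        # Assumption: unknown levels are an error rather than silently mapped.
        raise ValueError(f"unexpected pressure level: {level!r}") from None
    fields = dict.fromkeys(spressure._fields)  # all other fields stay None
    fields["some_avg10"] = some_avg10
    return spressure(**fields)


warn = spressure_from_macos_level(2)
```

Callers that only look at some_avg10 would then work on both platforms, while code that needs the longer windows can check for None.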

On unsupported platforms, these functions would not be defined, so accessing them raises AttributeError (consistent with how psutil handles platform-specific APIs like psutil.sensors_temperatures()).
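Portable callers would then feature-detect the function, as they do today for other platform-specific psutil APIs. A minimal sketch, assuming the proposed name pressure_memory; read_memory_pressure is a hypothetical helper, and a SimpleNamespace stands in for the psutil module so the snippet runs without it.

```python
from types import SimpleNamespace


def read_memory_pressure(module):
    """Return module.pressure_memory(), or None if the platform lacks it."""
    func = getattr(module, "pressure_memory", None)
    if func is None:
        return None
    return func()


# Stand-ins for the psutil module on a supported and an unsupported platform.
supported = SimpleNamespace(pressure_memory=lambda: "spressure(...)")
unsupported = SimpleNamespace()

on_supported = read_memory_pressure(supported)
on_unsupported = read_memory_pressure(unsupported)
```

With the real library the call would be read_memory_pressure(psutil), and a None result tells the caller to skip the pressure check.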

Prior art

Implementation notes

The Linux implementation is straightforward — parse 2-3 lines of text from /proc/pressure/* (the format has been stable and additive-only since kernel 4.20). The macOS implementation would use sysctlbyname() (already used by psutil for other macOS metrics). Happy to submit a PR if the API design looks reasonable.
