pg_pidstat: Real-Time Per-Backend CPU, Memory, and I/O Monitoring for PostgreSQL
When a PostgreSQL server slows down, the first question is usually: which backend is consuming resources? pg_stat_activity tells you what each backend is doing, but not how much CPU, memory, or disk I/O it uses. You end up jumping between top, pidstat, and pg_stat_activity, manually correlating PIDs.
pg_pidstat solves this by embedding OS-level process metrics directly into PostgreSQL. A single SQL view gives you per-backend CPU percentage, memory consumption, I/O throughput, and context switch rates — all joined with the connection context you already know from pg_stat_activity.
The Problem: Blind Spots in PostgreSQL Monitoring
PostgreSQL’s built-in statistics views (pg_stat_activity, pg_stat_user_tables, pg_stat_io) provide query-level and table-level insights, but they lack process-level OS metrics. When you need to answer questions like:
- Which backend is burning 100% CPU?
- Is that long-running query memory-hungry or I/O-bound?
- Are context switches causing latency spikes?
You typically resort to external tools — pidstat, htop, or custom /proc scrapers — and then manually join their output with pg_stat_activity by PID. This workflow is tedious, error-prone, and impossible to automate in pure SQL.
pg_pidstat eliminates that gap entirely.
Key Features
- Per-backend CPU usage as a percentage of total system capacity
- Memory tracking — both percentage and absolute MB (resident set size)
- I/O throughput — read/write bytes per second and IOPS
- I/O wait detection — flags backends blocked on disk
- Context switch rates — voluntary and non-voluntary, per second
- Full
pg_stat_activityintegration — all standard columns (pid, datname, usename, state, query, etc.) included - Background worker sampling — metrics updated every 1 second automatically
- Lock-free reads — minimal contention with brief exclusive writes only during sampling
Architecture
pg_pidstat is built with pgrx (Rust) and consists of four modules:
1 | ┌─────────────────────────────────────────────────────┐ |
How it works:
- A background worker wakes every second and iterates over all PostgreSQL backend PIDs.
- For each PID, it reads
/proc/[pid]/stat,/proc/[pid]/statm,/proc/[pid]/io, and/proc/[pid]/statusto collect raw CPU ticks, memory pages, I/O byte counters, and context switch counts. - The raw readings are stored in a double-buffered shared memory region. The double-buffer design lets SQL queries read the previous-second snapshot without blocking the worker’s current write.
- When you query the
pg_pidstatview, theactivitymodule computes per-second deltas (rates) from the raw counters and joins them withpg_stat_activitycolumns.
This design ensures zero lock contention on reads — the only exclusive lock is a brief window when the background worker swaps the active buffer.
Metrics Reference
The pg_pidstat view includes all standard pg_stat_activity columns plus these monitoring columns:
| Column | Type | Description |
|---|---|---|
cpu_percent |
float8 |
CPU usage as % of total system capacity |
memory_percent |
float8 |
Resident memory as % of total system RAM |
memory_usage_mb |
float8 |
Resident memory in megabytes |
io_read_bytes_per_sec |
float8 |
Disk read throughput (bytes/sec) |
io_write_bytes_per_sec |
float8 |
Disk write throughput (bytes/sec) |
io_read_ops_per_sec |
float8 |
Read IOPS |
io_write_ops_per_sec |
float8 |
Write IOPS |
io_wait |
bool |
true if the process is in disk sleep state |
voluntary_ctxt_switches_per_sec |
float8 |
Voluntary context switches per second |
nonvoluntary_ctxt_switches_per_sec |
float8 |
Non-voluntary context switches per second |
Usage Examples
Find CPU-Heavy Backends
1 | SELECT pid, datname, usename, state, query, |
Identify I/O-Intensive Queries
1 | SELECT pid, usename, state, query, |
Detect Context Switch Hotspots
High non-voluntary context switches indicate CPU contention — too many backends fighting for CPU time:
1 | SELECT pid, usename, query, |
Full Troubleshooting Dashboard
Combine all metrics for a comprehensive view of backend resource usage:
1 | SELECT pid, datname, usename, state, |
Monitoring Over Time with pg_cron
Pair pg_pidstat with pg_cron to build a historical resource usage table:
1 | CREATE TABLE backend_metrics_history ( |
Why pg_pidstat Over External Monitoring?
| Aspect | External tools (pidstat, top) | pg_pidstat |
|---|---|---|
| PID-to-query mapping | Manual correlation | Automatic (joined with pg_stat_activity) |
| Query interface | Shell parsing / custom scripts | Standard SQL |
| Alerting integration | Requires glue code | Use any SQL-based alerting |
| Historical storage | External time-series DB | INSERT INTO ... SELECT from the view |
| Granularity | Per-process only | Per-backend with database context |
| Deployment | Agent on every host | PostgreSQL extension — no extra process |
The key advantage is context: knowing that PID 12345 uses 90% CPU is useful, but knowing it’s the analytics user running a sequential scan on the orders table in the production database is actionable.
Performance Design
pg_pidstat is designed for minimal overhead in production:
- Release-mode optimization:
opt-level = 3, fat LTO, single codegen unit - Lock-free reads: Double-buffered shared memory means
SELECTfrom the view never blocks - Brief exclusive writes: Only during the buffer swap (microseconds)
- Bounded memory: Fixed-size shared memory for up to 1024 backends
- 1-second sampling: Background worker reads
/proconce per second — negligible system load
Conclusion
pg_pidstat brings OS-level process monitoring into PostgreSQL itself. Instead of correlating PIDs across top, pidstat, and pg_stat_activity, you get CPU, memory, I/O, and context switch metrics in a single SQL view — with full connection context attached.
For anyone running PostgreSQL on Linux and needing to quickly identify resource-heavy backends, pg_pidstat turns a multi-tool investigation into one query.
References
- pg_pidstat GitHub Repository
- pgrx — Rust Framework for PostgreSQL Extensions
- PostgreSQL pg_stat_activity Documentation
此文作者:Daniel Shih(石頭)
此文地址: https://isdaniel.github.io/pg-pidstat-real-time-postgresql-backend-monitoring/
版權聲明:本博客所有文章除特別聲明外,均採用 CC BY-NC-SA 3.0 TW 許可協議。轉載請註明出處!