How to programmatically obtain and use perf list events on Linux via perf_event_open(2)
perf_event_open() Without Third-Party Libraries
Absolutely. You can replicate what the `perf list` command does using the kernel's sysfs interfaces and the `perf_event_open()` syscall, with no external libraries required. The key is to understand how perf itself gathers event information, then mirror that logic in your code.
Core Approach: Mirror perf's Internal Logic
The perf tool does not need an external event database at run time. It reads the events and PMU (Performance Monitoring Unit) descriptions that the kernel exposes through sysfs, and takes its model-specific event names from tables compiled into the perf binary. Here's how to break it down by event category:
1. Enumerate Standard Software/Hardware Events
These are the universal events (like `cycles` and `instructions`) defined in the kernel's `<linux/perf_event.h>` header. Their names map to `perf_event_attr` fields through the enum constants in that header:
- Software Events: Use `PERF_TYPE_SOFTWARE` as the `type` field, with `config` values like `PERF_COUNT_SW_CPU_CLOCK` or `PERF_COUNT_SW_PAGE_FAULTS`.
- Hardware Events: Use `PERF_TYPE_HARDWARE` as the `type` field, with `config` values like `PERF_COUNT_HW_CPU_CYCLES` or `PERF_COUNT_HW_INSTRUCTIONS`.
- Hardware Cache Events: Use `PERF_TYPE_HW_CACHE`, and combine cache type, operation, and result into the `config` field using the formula `attr.config = cache_type | (op << 8) | (result << 16);`. Example: for L1D cache misses on reads, use `PERF_COUNT_HW_CACHE_L1D` (cache type), `PERF_COUNT_HW_CACHE_OP_READ` (operation), and `PERF_COUNT_HW_CACHE_RESULT_MISS` (result).
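As a minimal sketch of the cache-event encoding, the helpers below pack the three enum values in the layout documented in perf_event_open(2): cache id in the low byte, operation shifted left by 8, result shifted left by 16. The names `hw_cache_config` and `make_cache_attr` are illustrative only and not part of any kernel API.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/* Pack a hardware cache event into perf_event_attr.config.
 * Layout per perf_event_open(2): id | (op << 8) | (result << 16). */
static uint64_t hw_cache_config(unsigned cache_id, unsigned op, unsigned result)
{
    return (uint64_t)cache_id | ((uint64_t)op << 8) | ((uint64_t)result << 16);
}

/* Hypothetical helper: fill an attr for a cache event, initially disabled. */
static void make_cache_attr(struct perf_event_attr *attr,
                            unsigned cache_id, unsigned op, unsigned result)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HW_CACHE;
    attr->config = hw_cache_config(cache_id, op, result);
    attr->disabled = 1; /* enable later with PERF_EVENT_IOC_ENABLE */
}
```

For L1D read misses this gives a `config` of `0x10000`, because `PERF_COUNT_HW_CACHE_L1D` and `PERF_COUNT_HW_CACHE_OP_READ` are both 0 and `PERF_COUNT_HW_CACHE_RESULT_MISS` is 1.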
2. Enumerate Raw PMU Events (Model-Specific)
These are the CPU-specific events (like `inst_retired.any` on Intel). The kernel exposes a subset in `/sys/devices/cpu/events/`, but to get all events like `perf list` does:
- Check the PMU name via `/sys/devices/cpu/caps/pmu_name` (e.g., `skylake` on Intel Skylake-family CPUs).
- For Intel CPUs, take event codes from the CPU's SDM (Software Developer's Manual), then encode them into `attr.config` as `(umask << 8) | event`. The files in `/sys/devices/cpu/format/` give the exact bit range of each field.
- For AMD CPUs, use the event encoding format from their Architecture Programmer's Manual.
- Uncore events (like QPI, UPI, or Infinity Fabric) are in `/sys/devices/uncore_*/events/` directories. Parse these like the CPU PMU events, but set `attr.type` to the number found in that PMU's `type` file rather than `PERF_TYPE_RAW`.
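Each PMU directory in sysfs also carries a `format/` subdirectory that says where a field lives in the config word, typically `config:0-7` for `event` and `config:8-15` for `umask` on Intel core PMUs, plus a `type` file holding the value for `attr.type`. The following is a rough sketch of using those files instead of hard-coded shifts. `place_field` and `read_pmu_type` are hypothetical helper names, and the parser is deliberately limited: it handles a single `config:lo-hi` or `config:bit` range and rejects multi-range specs and `config1`/`config2` fields.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Place `value` into *config at the bit range given by a sysfs format
 * spec such as "config:8-15" or "config:18". Returns 0 on success,
 * -1 if the spec is not a single range on the main config word. */
static int place_field(const char *spec, uint64_t value, uint64_t *config)
{
    unsigned lo = 0, hi = 0;

    if (strchr(spec, ','))
        return -1; /* multi-range specs are not handled in this sketch */

    int n = sscanf(spec, "config:%u-%u", &lo, &hi);
    if (n < 1)
        return -1; /* e.g. "config1:..." or malformed input */
    if (n == 1)
        hi = lo;   /* single-bit field */
    if (lo > hi || hi > 63)
        return -1;

    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    *config |= (value & mask) << lo;
    return 0;
}

/* Read a PMU's dynamic type id from its sysfs "type" file,
 * e.g. /sys/devices/<pmu>/type. Returns -1 if it cannot be read. */
static int read_pmu_type(const char *type_path)
{
    FILE *f = fopen(type_path, "r");
    int type = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &type) != 1)
        type = -1;
    fclose(f);
    return type;
}
```

With specs `config:0-7` and `config:8-15`, placing `event=0x2e` and `umask=0x41` produces `0x412e`, the same value as the manual `(umask << 8) | event` encoding.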
3. Enumerate Tracepoint Events
Tracepoints (like `block:block_rq_issue`) are exposed in the debug filesystem:
- Ensure `/sys/kernel/debug/` is mounted (most systems have this by default).
- Walk the `/sys/kernel/debug/tracing/events/` directory tree (on newer kernels the same tree is also available under `/sys/kernel/tracing/events/`).
- For each event directory:
  - Read the `id` file to get the tracepoint ID.
  - Set `attr.type = PERF_TYPE_TRACEPOINT` and `attr.config` to that tracepoint ID.
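The tracepoint steps can be sketched roughly as follows. Both function names are made up for illustration, and the events root is passed in as a parameter because its location depends on where debugfs or tracefs is mounted on your system.

```c
#include <linux/perf_event.h>
#include <stdio.h>
#include <string.h>

/* Read the numeric id of tracepoint <system>:<name> under events_root,
 * e.g. "/sys/kernel/debug/tracing/events". Returns -1 on failure. */
static long read_tracepoint_id(const char *events_root,
                               const char *system, const char *name)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/%s/%s/id",
                     events_root, system, name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1; /* path would be truncated */

    FILE *f = fopen(path, "r");
    if (!f)
        return -1; /* not mounted, no such event, or no permission */

    long id = -1;
    if (fscanf(f, "%ld", &id) != 1)
        id = -1;
    fclose(f);
    return id;
}

/* Fill attr for a tracepoint id. Returns 0, or -1 if the id is invalid. */
static int make_tracepoint_attr(struct perf_event_attr *attr, long id)
{
    if (id < 0)
        return -1;
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_TRACEPOINT;
    attr->config = (unsigned long long)id;
    attr->disabled = 1;
    return 0;
}
```

A caller would typically do `read_tracepoint_id("/sys/kernel/debug/tracing/events", "block", "block_rq_issue")` and pass the result to `make_tracepoint_attr()`. Reading the `id` files usually requires elevated privileges.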
4. Example Code: List Raw PMU Events and Populate perf_event_attr
Here’s a quick snippet to read raw PMU events from sysfs and prepare the attribute structure:
```c
#define _DEFAULT_SOURCE /* for DT_REG / DT_UNKNOWN in <dirent.h> */
#include <stdio.h>
#include <dirent.h>
#include <string.h>
#include <linux/perf_event.h>

void list_raw_cpu_events(void) {
    DIR *dir = opendir("/sys/devices/cpu/events/");
    if (!dir) {
        perror("Failed to open events directory");
        return;
    }

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Keep regular files; some filesystems report DT_UNKNOWN. */
        if (entry->d_type != DT_REG && entry->d_type != DT_UNKNOWN)
            continue;

        char path[512];
        snprintf(path, sizeof(path), "/sys/devices/cpu/events/%s", entry->d_name);
        FILE *f = fopen(path, "r");
        if (!f)
            continue;

        char buf[256];
        if (fgets(buf, sizeof(buf), f)) {
            unsigned int event = 0, umask = 0;
            /* Parse the event string, e.g. "event=0x2e,umask=0x41".
             * The umask term is optional ("event=0x3c"), so one match is
             * enough. Other terms (cmask, inv, ...) are ignored here. */
            if (sscanf(buf, "event=0x%x,umask=0x%x", &event, &umask) >= 1) {
                struct perf_event_attr attr;
                memset(&attr, 0, sizeof(attr));
                attr.type = PERF_TYPE_RAW;
                attr.config = ((unsigned long long)umask << 8) | event;
                attr.size = sizeof(attr);
                attr.disabled = 1; /* Start paused */
                attr.inherit = 1;  /* Monitor child processes too */
                /* config is 64-bit and type is unsigned: print accordingly. */
                printf("Event: %s | Type: %u | Config: 0x%llx\n",
                       entry->d_name, attr.type,
                       (unsigned long long)attr.config);
                /* Use perf_event_open() with this attr and a target PID
                 * to monitor another process. */
            }
        }
        fclose(f);
    }
    closedir(dir);
}

int main(void) {
    list_raw_cpu_events();
    return 0;
}
```
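Once an attribute structure is filled in, it still has to be handed to the kernel. glibc provides no wrapper for `perf_event_open()`, so the call goes through `syscall(2)`. Below is a minimal sketch of counting one event on the calling thread. `perf_open`, `count_events`, and `busy_loop` are illustrative names, and excluding kernel and hypervisor activity is an assumption made here so that the example is more likely to run without extra privileges.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper: glibc has no perf_event_open() function. */
static long perf_open(struct perf_event_attr *attr, pid_t pid, int cpu,
                      int group_fd, unsigned long flags)
{
    return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Tiny demo workload. */
static void busy_loop(void)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 1000000; i++)
        x += i;
}

/* Count type/config events while work() runs on the calling thread.
 * Returns 0 and stores the count, or -1 if the counter cannot be opened
 * or read (unsupported event, perf_event_paranoid, sandboxing, ...). */
static int count_events(uint32_t type, uint64_t config,
                        void (*work)(void), uint64_t *count)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;
    attr.config = config;
    attr.disabled = 1;       /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1; /* user space only */
    attr.exclude_hv = 1;

    /* pid 0 = calling thread, cpu -1 = any CPU, no group, no flags. */
    int fd = (int)perf_open(&attr, 0, -1, -1, 0);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    ssize_t n = read(fd, count, sizeof(*count));
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

For example, `count_events(PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, busy_loop, &n)` attempts to count the instructions retired by `busy_loop()`. To watch another process instead, pass its PID as the second argument of `perf_open()`.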
Key Notes
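- Privileges: Access is governed by `/proc/sys/kernel/perf_event_paranoid` and by capabilities, not by root alone. Counting user-space events in your own processes usually works unprivileged. System-wide or kernel-level monitoring, and attaching to other users' processes, typically needs root, `CAP_PERFMON` (Linux 5.8+), or `CAP_SYS_ADMIN`.
- CPU Model Checks: Some raw events are only valid for specific CPU models. Use `/proc/cpuinfo` to filter events accordingly.
- Debug Filesystem: Tracepoint enumeration requires `/sys/kernel/debug/` to be mounted.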
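Before opening counters it can help to check the system's paranoid level, so that a failing `perf_event_open()` call can be explained to the user. This is a small sketch under assumptions: the helper names are hypothetical, the path is passed in by the caller (normally `/proc/sys/kernel/perf_event_paranoid`), and the descriptions paraphrase the kernel's sysctl documentation for unprivileged users. Some distributions define additional, stricter levels that are not distinguished here.

```c
#include <stdio.h>

/* Read the paranoid level from `path`.
 * Returns 0 and stores the level, or -1 if the file is unreadable. */
static int read_paranoid(const char *path, int *level)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int ok = (fscanf(f, "%d", level) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Rough summary of what an unprivileged user may do at each level. */
static const char *describe_paranoid(int level)
{
    if (level >= 2)
        return "kernel profiling disallowed; user-space counting only";
    if (level == 1)
        return "CPU-wide event access disallowed";
    if (level == 0)
        return "raw and ftrace function tracepoint access disallowed";
    return "almost all events allowed";
}
```

A program could call `read_paranoid()` once at startup and print `describe_paranoid(level)` when a counter fails to open with a permission error.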
This question originates from Stack Exchange; it was asked by BeeOnRope.




