Tracing the Source of High system_call_after_swapgs Time in Intel VTune Analysis
system_call_after_swapgs and Tracking Its Call Source in Intel VTune
Hey there, let's break this down step by step. This is a common scenario when profiling code that leans heavily on system calls, so you're not alone here.
What is system_call_after_swapgs?
system_call_after_swapgs is a low-level entry point in the x86_64 Linux kernel, core to the system call handling pipeline. Here's the quick flow of what happens:
- When your user-space code invokes a system call (like read(), write(), or open()), it triggers a switch from user mode to kernel mode (via the syscall instruction on modern x86_64 systems).
- The kernel first runs swapgs, a special instruction that exchanges the current GS base with the value stored in the IA32_KERNEL_GS_BASE MSR, so the kernel can reach its own per-CPU data structures.
- Right after swapgs, execution continues at system_call_after_swapgs, which sets up the kernel environment (kernel stack, saved user registers), checks the system call number, and dispatches to the actual kernel service routine (e.g., sys_read for a read() call).
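The dispatch step in the last bullet can be pictured with a toy model. The sketch below is not kernel code; it only mimics how the entry path uses the system call number (passed in rax on x86_64) to pick a handler from a table. The numbers 0, 1, and 2 are the x86_64 numbers for read, write, and open; the handler bodies and the dispatch function are made up for illustration.

```python
# Toy model of system call dispatch; illustrative only, not kernel code.

def sys_read(fd, count):
    return f"sys_read(fd={fd}, count={count})"

def sys_write(fd, count):
    return f"sys_write(fd={fd}, count={count})"

def sys_open(path, flags):
    return f"sys_open(path={path!r}, flags={flags})"

# x86_64 system call numbers: read=0, write=1, open=2.
SYS_CALL_TABLE = {0: sys_read, 1: sys_write, 2: sys_open}

ENOSYS = 38  # Linux errno for "function not implemented"

def dispatch(syscall_nr, *args):
    """Mimic the entry path: look up the handler by number and call it."""
    handler = SYS_CALL_TABLE.get(syscall_nr)
    if handler is None:
        return -ENOSYS  # unknown system call numbers yield -ENOSYS
    return handler(*args)

if __name__ == "__main__":
    print(dispatch(0, 3, 4096))
```

Because every system call funnels through this one entry path, a profiler that attributes time to system_call_after_swapgs is really reporting the combined cost of all the handlers behind it.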
Why is the stack information missing? This is a very early kernel entry point, and VTune often doesn't capture full stack traces for such low-level kernel functions by default, especially if kernel debug symbols aren't installed.
How to Find Its Call Source
To track down what's driving so much time spent here, try these practical approaches:
1. Install Kernel Debug Symbols
First, make sure VTune can resolve kernel function names properly. Install the debug package for your running kernel:
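- On Debian/Ubuntu: sudo apt install linux-image-$(uname -r)-dbg (on Ubuntu the package is usually named linux-image-$(uname -r)-dbgsym and comes from the ddebs repository)
- On RHEL/CentOS: sudo dnf install kernel-debuginfo-$(uname -r) kernel-debuginfo-common-$(uname -m)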
Once installed, re-run your VTune analysis. You may start seeing detailed kernel stack traces linking system_call_after_swapgs to actual system call handlers (like sys_read).
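Debug symbols are only part of the picture: kernel addresses can also be hidden from profilers by system settings. The sketch below reads two standard Linux sysctl files, /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid. That these settings matter for your setup is an assumption to verify against the VTune documentation for your version, and the helper name is hypothetical.

```python
from pathlib import Path

# Standard Linux sysctl files that influence kernel-side profiling.
SETTINGS = {
    "kptr_restrict": Path("/proc/sys/kernel/kptr_restrict"),
    "perf_event_paranoid": Path("/proc/sys/kernel/perf_event_paranoid"),
}

def read_kernel_profiling_settings(settings=SETTINGS):
    """Return {name: int or None}; None means the file could not be read."""
    result = {}
    for name, path in settings.items():
        try:
            result[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            result[name] = None
    return result

if __name__ == "__main__":
    for name, value in read_kernel_profiling_settings().items():
        print(f"{name} = {value}")
```

A non-zero kptr_restrict hides kernel addresses in /proc/kallsyms from unprivileged users, which can leave kernel frames unresolved. A perf_event_paranoid value of 2 or higher prevents unprivileged users from profiling the kernel at all.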
2. Use VTune's Advanced Analysis Modes
Switch from the default "Hotspots" mode to Advanced Hotspots or System Analysis mode (the exact names depend on your VTune version). These modes are built to capture both user-space and kernel-space stack traces more thoroughly. In the results, check the "System Calls" tab, which shows which system calls are invoked most frequently and how much time each consumes.
3. Correlate with User-Space Hotspots
Even if you can't get a full stack from system_call_after_swapgs, its origin is always user-space code that triggered the system call. Look at the user-space hotspot functions in VTune: any function making repeated calls to POSIX APIs (like fread(), write(), or socket()) is likely the source. For example, if you see a user-space function process_large_file() looping on read(), that's the function driving the time spent in system_call_after_swapgs.
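To make the process_large_file() example concrete, here is a minimal sketch of such a loop, with a counter showing how the buffer size changes the number of read() system calls. The function body, the demo helper, and the buffer sizes are illustrative assumptions, not code from the question.

```python
import os
import tempfile

def process_large_file(path, buffer_size):
    """Read the whole file with os.read; return (bytes_read, read_calls)."""
    fd = os.open(path, os.O_RDONLY)
    total = 0
    calls = 0
    try:
        while True:
            chunk = os.read(fd, buffer_size)  # each call is one read() system call
            calls += 1
            if not chunk:  # empty result signals end of file
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total, calls

def demo(size=64 * 1024):
    """Read the same temporary file with a tiny buffer and a large buffer."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * size)
        path = tmp.name
    try:
        small = process_large_file(path, 64)
        large = process_large_file(path, 64 * 1024)
    finally:
        os.unlink(path)
    return small, large

if __name__ == "__main__":
    small, large = demo()
    print("64-byte buffer:", small)
    print("64 KiB buffer:", large)
```

Both runs read the same number of bytes, but the small buffer crosses into the kernel far more often. A function with this shape in the user-space hotspot list is a strong candidate for the time attributed to the system call entry path.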
4. Supplement with perf for Kernel Stack Traces
If VTune still isn't giving enough detail, use the Linux perf tool to cross-verify:
- Run perf record -g -p <your_process_pid> while your program runs (the -g flag enables stack trace collection).
- Then run perf report to view results. perf often captures the full stack from the user-space system call trigger up through system_call_after_swapgs and the kernel service routine, giving you a clear call chain.
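The two perf steps above can be wrapped in a small script for repeated runs. This is only a sketch: the function names, the 10-second default window, and the perf.data output name are assumptions. It uses perf's own -o, -i, and --stdio options, plus a trailing sleep command to bound the sampling window.

```python
import shutil
import subprocess

def perf_record_cmd(pid, seconds=10, output="perf.data"):
    """Build a 'perf record' command that samples one process with call graphs."""
    return ["perf", "record", "-g", "-p", str(pid), "-o", output,
            "--", "sleep", str(seconds)]

def perf_report_cmd(output="perf.data"):
    """Build a 'perf report' command that prints a plain-text report."""
    return ["perf", "report", "-i", output, "--stdio"]

def profile(pid, seconds=10, output="perf.data"):
    """Record call graphs for a running process and return the text report."""
    if shutil.which("perf") is None:
        raise RuntimeError("perf not found on PATH")
    subprocess.run(perf_record_cmd(pid, seconds, output), check=True)
    report = subprocess.run(perf_report_cmd(output), check=True,
                            capture_output=True, text=True)
    return report.stdout
```

Calling profile(<your_process_pid>) typically needs root or a relaxed perf_event_paranoid setting before kernel frames appear in the report.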
This question comes from Stack Exchange; it was asked by Manfredo.




