The Effect of Tuning sched_rr_timeslice_ms on Program Performance, and Questions About the Test

Ahua AIGC Lab (阿华AIGC实验室)

2026-4-28

Why Adjusting kernel.sched_rr_timeslice_ms Didn't Change Your Results (And What It Actually Does)

You ran this CPU-bound test process configured for SCHED_RR scheduling:

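#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = getpid();
    /* 99 is the highest real-time priority on Linux. */
    struct sched_param sp = { .sched_priority = 99 };

    /* Needs root or CAP_SYS_NICE. On failure the old policy stays in
       effect, so report the error instead of ignoring it. */
    if (sched_setscheduler(pid, SCHED_RR, &sp) == -1)
        perror("sched_setscheduler");

    /* Print the policy that is actually in effect. */
    switch (sched_getscheduler(pid)) {
        case SCHED_OTHER:
            printf("SCHED_OTHER\n");
            break;
        case SCHED_RR:
            printf("SCHED_RR\n");
            break;
        case SCHED_FIFO:
            printf("SCHED_FIFO\n");
            break;
        default:
            printf("Unknown...\n");
    }

    /* CPU-bound loop. The sum wraps modulo 2^64, which is well defined
       for unsigned types; only the work matters here, not the value. */
    unsigned long long sum = 0;
    for (unsigned long long i = 0; i < 30000000000ULL; i++)
        sum += i;
    printf("%llu\n", sum);
    return 0;
}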

Then you adjusted kernel.sched_rr_timeslice_ms from 1ms to 1000ms, ran perf stat, and saw almost no change in runtime, context switches, or cycle counts. Let's break down why this happened, and what this kernel parameter actually controls.

The Core Problem: Your Test Scenario Doesn't Trigger RR Time Slicing

SCHED_RR's time slice parameter only matters when several runnable tasks share the same real-time priority on the same CPU. Here's what went on in your test:

- You set your test process to priority 99, the highest real-time priority.
- No other SCHED_RR or SCHED_FIFO process at priority 99 was competing with it for CPU time.
- Real-time tasks take precedence over normal (SCHED_OTHER) tasks. Because your test process was entirely CPU-bound (no sleeps, I/O, or other yield points), it kept the CPU for nearly the whole run. Each time its slice expired, the scheduler found no other task queued at the same priority, so it refilled the slice and let the process keep running.

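The ~1700 context switches you saw therefore did not come from your test process hitting a time slice limit. More likely they came from other tasks briefly getting the CPU, for example background daemons or perf itself during the window that real-time throttling reserves for normal tasks. By default the kernel lets real-time tasks use at most 950 ms of every second (kernel.sched_rt_runtime_us = 950000, kernel.sched_rt_period_us = 1000000), so that a runaway real-time loop cannot lock up the machine. None of this depends on sched_rr_timeslice_ms, which is why the count barely changed between your two test runs.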
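One way to check where the switches come from is to have the process report its own counters. The sketch below uses getrusage(), which separates voluntary switches (the process blocked) from involuntary ones (the process was preempted). The helper names get_ctx_switches and print_ctx_switches are made up for this illustration and were not part of the original test.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: stores the voluntary and involuntary context-switch
 * counts of the calling process. Returns 0 on success, -1 on failure. */
int get_ctx_switches(long *voluntary, long *involuntary) {
    assert(voluntary != NULL && involuntary != NULL);
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == -1)
        return -1;
    *voluntary = ru.ru_nvcsw;     /* process gave up the CPU (sleep, I/O wait) */
    *involuntary = ru.ru_nivcsw;  /* process was preempted by the scheduler */
    return 0;
}

/* Hypothetical helper: prints both counters, e.g. just before main() returns. */
void print_ctx_switches(void) {
    long v, iv;
    if (get_ctx_switches(&v, &iv) == 0)
        printf("voluntary=%ld involuntary=%ld\n", v, iv);
    else
        perror("getrusage");
}
```

For a pure CPU-bound loop the voluntary count should stay near zero, so almost everything perf reports would show up as involuntary switches.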

What `kernel.sched_rr_timeslice_ms` Actually Does

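This parameter defines the maximum CPU time a single SCHED_RR process can hold before the scheduler switches to the next same-priority real-time process in the run queue. Its purpose is to enforce fairness among equal-priority real-time tasks. The default is 100 ms. The kernel accounts the slice in timer ticks, so a value shorter than one tick (4 ms when CONFIG_HZ=250, for example) is effectively rounded up to one tick.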
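To confirm that a new sysctl value has actually taken effect, a process can ask the kernel for its round-robin interval with sched_rr_get_interval(). The following is a minimal sketch; rr_interval_ms is a hypothetical helper name, and the value it returns is only meaningful as an RR time slice when the queried process runs under SCHED_RR.

```c
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: returns the round-robin interval the kernel reports
 * for process `pid` (0 means the calling process) in milliseconds,
 * or -1.0 if the call fails. */
double rr_interval_ms(pid_t pid) {
    assert(pid >= 0);
    struct timespec ts;
    if (sched_rr_get_interval(pid, &ts) == -1)
        return -1.0;
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}
```

Calling rr_interval_ms(0) in the test program right after sched_setscheduler() succeeds, and printing the result, would show whether the setting written to kernel.sched_rr_timeslice_ms is the one the scheduler is using.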

For example:

If you spawn 4 identical SCHED_RR processes, all set to priority 50, all running CPU-bound loops, and all pinned to the same CPU core (on a multi-core machine the scheduler would otherwise spread them across cores and no slicing would take place):
- With kernel.sched_rr_timeslice_ms=1, the scheduler rotates among these 4 processes roughly every millisecond, which produces a high number of context switches and measurable overhead from saving and restoring process state.
- With kernel.sched_rr_timeslice_ms=1000, each process runs for a full second before the next one gets the CPU. Context switches drop by roughly three orders of magnitude, and the reduced overhead should show up as a somewhat shorter total runtime in your perf stats.
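The difference between the two settings can be put into rough numbers. The sketch below is a back-of-the-envelope estimate, not a model of the real scheduler: it assumes all tasks share one CPU, ignores tick rounding and any other preemption, and the function name estimate_rr_rotations is invented for this example.

```c
#include <assert.h>

/* Hypothetical estimate: number of time-slice rotations when `ntasks`
 * equal-priority SCHED_RR tasks share one CPU and together need
 * `total_cpu_ms` milliseconds of CPU time. */
unsigned long long estimate_rr_rotations(unsigned long long total_cpu_ms,
                                         unsigned long long timeslice_ms,
                                         unsigned int ntasks) {
    assert(timeslice_ms > 0);
    if (ntasks < 2)
        return 0;  /* a lone task at its priority is never rotated out */
    return total_cpu_ms / timeslice_ms;
}
```

Assuming, purely for illustration, that each of the 4 processes needs 10 s of CPU time (40000 ms in total), the estimate is 40000 rotations with a 1 ms slice and 40 with a 1000 ms slice. With a single process it is 0 for any slice length, which matches the original test.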

How to See the Parameter's Effect

To observe the difference, modify your test to spawn multiple SCHED_RR processes running the same CPU-bound loop, all at the same priority (what matters is that the priorities are equal; the value need not be 99), and pin them to one CPU core, for example with taskset -c 0. Then re-run your perf tests with the two time slice settings. You should see:

- Smaller time slices: more context switches and a slightly longer total runtime (due to switch overhead).
- Larger time slices: fewer context switches, lower overhead, and a slightly shorter total runtime.
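A multi-process version of the test could look like the sketch below. It is one possible harness under stated assumptions, not the original author's code: run_rr_workers and burn_cpu are hypothetical names, the parent switches itself to SCHED_RR so that the children inherit the policy across fork(), and CPU pinning is left to taskset on the command line.

```c
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* CPU-bound worker; volatile keeps the compiler from removing the loop. */
static void burn_cpu(unsigned long long iterations) {
    volatile unsigned long long sum = 0;
    for (unsigned long long i = 0; i < iterations; i++)
        sum += i;
}

/* Hypothetical harness: runs `nproc` CPU-bound children under SCHED_RR at
 * `priority` and returns how many of them exited cleanly. */
int run_rr_workers(int nproc, int priority, unsigned long long iterations) {
    assert(nproc > 0);

    /* Remember the current policy so it can be restored afterwards. */
    int old_policy = sched_getscheduler(0);
    struct sched_param old_sp = { 0 };
    sched_getparam(0, &old_sp);

    /* Switch the parent first: children inherit policy and priority across
     * fork(), so all workers start out as equal-priority peers. */
    struct sched_param sp = { .sched_priority = priority };
    int changed = (sched_setscheduler(0, SCHED_RR, &sp) == 0);
    if (!changed)
        perror("sched_setscheduler (continuing with the current policy)");

    int started = 0;
    for (; started < nproc; started++) {
        pid_t pid = fork();
        if (pid == -1) { perror("fork"); break; }
        if (pid == 0) { burn_cpu(iterations); _exit(0); }
    }

    /* The parent blocks in wait(), leaving the CPU to the workers. */
    int ok = 0;
    for (int n = 0; n < started; n++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }

    if (changed && old_policy != -1)
        sched_setscheduler(0, old_policy, &old_sp);
    return ok;
}
```

A main() that calls, say, run_rr_workers(4, 50, 3000000000ULL) could then be run as root under taskset -c 0 and perf stat, once after sysctl -w kernel.sched_rr_timeslice_ms=1 and once after setting it to 1000. The worker count, priority, and iteration count here are placeholder values chosen for illustration.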

Quick Context on Your Test Environment

For reference, you ran this on:

Kernel: 5.13.0-27-generic
OS: Ubuntu 20.04.3 LTS

Nothing in this behavior is specific to that kernel version: a lone SCHED_RR task at its priority level is never rotated out when its time slice expires, so the results you saw are consistent with standard Linux real-time scheduling rules.

The question discussed here comes from Stack Exchange; it was asked by mahmood.