Learning outcomes
- Discuss the challenges and strategies for scheduling on multiprocessor systems.
In a single-processor system, CPU scheduling mainly answers one question: which ready process should run next? In a multiprocessor system, the scheduler must answer two questions: which process should run next, and on which processor should it run? This makes scheduling more complex because several processors or cores may be available at the same time.
Multiprocessor scheduling is therefore concerned with assigning processes or threads to multiple CPUs in a way that improves CPU utilization, throughput, response time, and fairness while keeping scheduling overhead under control. In modern systems, this topic includes both traditional multiple-CPU machines and multicore processors.
A multiprocessor scheduler must do more than choose the next process. It must also decide how to distribute work across processors. If too many runnable tasks remain on one processor while others are idle, system performance suffers. On the other hand, moving tasks too frequently between processors can also reduce performance because useful cache contents may be lost.
Thus, multiprocessor scheduling is a balancing act between keeping all processors busy and avoiding the overhead and loss of cache contents caused by moving tasks between processors too often.
Two broad approaches are commonly discussed: asymmetric multiprocessing and symmetric multiprocessing.
In asymmetric multiprocessing (AMP), one processor acts as a master. It handles scheduling and system control, while the other processors execute assigned tasks. This design is easier to manage, but the master processor may become a bottleneck.
In symmetric multiprocessing (SMP), all processors are treated as equals. Each processor may schedule its own work and run both user and kernel activities. This approach offers better performance and scalability, but it is harder to implement because processors may compete for shared scheduling data structures.
Modern operating systems generally prefer SMP because it avoids dependence on a single master CPU and uses processor resources more effectively.
A simple SMP design may place all ready processes in one shared ready queue. Any available processor can pick the next task from that queue. This approach supports natural work sharing, but as the number of processors grows, contention for the queue can become expensive.
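As a sketch of this shared-queue design (illustrative only; real kernels use far more elaborate structures), every processor must take the same lock before picking work, which is exactly where contention arises:

```python
import threading
from collections import deque

# One ready queue shared by all processors.
ready_queue = deque(["T1", "T2", "T3"])
queue_lock = threading.Lock()

def pick_next():
    """Any processor calls this to get its next task; all processors
    serialize on a single lock, so contention grows with CPU count."""
    with queue_lock:
        return ready_queue.popleft() if ready_queue else None
```

With two CPUs the lock is rarely contested; with dozens, it can dominate scheduling cost.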
Another design gives each processor its own local ready queue. This reduces contention and often improves scalability. However, it can create imbalance if one queue becomes overloaded and another becomes nearly empty. That is why a multiprocessor scheduler also needs a method for load balancing.
Load balancing means distributing runnable tasks so that processors are used more evenly. If one processor is overloaded while another is idle, the system should move work from the busy processor to the idle or less loaded one.
Two standard mechanisms are commonly described:
- Push migration: a busy processor (or a periodic kernel activity) pushes waiting tasks to less loaded processors.
- Pull migration: an idle processor pulls waiting tasks from a busy processor's queue.
Both techniques aim to improve utilization, but they must be used carefully. Excessive migration can undo the benefits of cache reuse and increase scheduling overhead.
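A minimal pull-migration sketch over per-processor queues; the "move half the backlog" rule is an illustrative heuristic, not a policy taken from any real kernel:

```python
from collections import deque

def pull_migration(queues, idle_id):
    """The idle CPU pulls work from the busiest queue. Moving half the
    backlog is an invented heuristic for illustration."""
    busiest = max(range(len(queues)), key=lambda i: len(queues[i]))
    if busiest == idle_id or len(queues[busiest]) < 2:
        return                        # nothing worth migrating
    for _ in range(len(queues[busiest]) // 2):
        queues[idle_id].append(queues[busiest].pop())  # steal from the tail

queues = [deque(["T1", "T2", "T3", "T4"]), deque()]    # CPU 1 is idle
pull_migration(queues, idle_id=1)
# Both queues now hold two tasks each.
```

Stealing from the tail of the busy queue is a common design choice: tasks at the head are more likely to run soon on their current CPU and so benefit more from staying there.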
Processor affinity means that a process is preferably scheduled on the processor where it ran before. This matters because data recently used by the process may still be available in that processor’s cache. If the process is moved to another CPU, that cache advantage may be lost.
Two forms are usually discussed:
- Soft affinity: the scheduler prefers to keep a process on the same processor but may move it when load balancing requires it.
- Hard affinity: the process is strictly bound to a specific processor or set of processors and will not be moved elsewhere.
Affinity improves locality and often improves performance, but strict affinity may interfere with load balancing. So practical schedulers balance affinity and fair distribution of work.
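One way to sketch this compromise between soft affinity and load balancing; the `imbalance_limit` threshold is an invented tuning knob, not a real kernel parameter:

```python
def choose_cpu(last_cpu, loads, imbalance_limit=2):
    """Soft affinity: keep the task on its previous CPU unless that CPU
    is much busier than the least-loaded one."""
    least = min(range(len(loads)), key=lambda i: loads[i])
    if loads[last_cpu] - loads[least] > imbalance_limit:
        return least        # affinity gives way to load balancing
    return last_cpu         # stay put and reuse the warm cache

# Slight imbalance: stay on CPU 0.    choose_cpu(0, [3, 2])
# Severe imbalance: move to CPU 1.    choose_cpu(0, [6, 1])
```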
Modern systems usually contain multicore processors, where several processing cores exist on the same chip. From the operating system’s perspective, each core may appear as a separate logical CPU. However, cores may share caches, memory paths, and other hardware resources.
Because of this, scheduling on multicore systems is not just a repetition of single-CPU scheduling. The operating system must consider:
- which cores share caches and memory paths, since tasks placed on such cores compete for those resources;
- whether logical CPUs are SMT siblings that share the execution resources of one physical core.
Some processors also support simultaneous multithreading (SMT), where one physical core exposes more than one logical CPU. This improves utilization, but the logical CPUs still share hardware resources. Therefore, assigning two heavy threads to sibling logical CPUs may not yield ideal performance.
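A toy placement routine for a hypothetical two-core, four-logical-CPU topology illustrates the idea; the topology table and the routine are invented for illustration:

```python
# Hypothetical topology: two physical cores, each exposing two SMT siblings.
SIBLINGS = {0: [0, 1], 1: [2, 3]}   # physical core -> its logical CPUs

def place_heavy_thread(busy):
    """Prefer a logical CPU whose whole physical core is idle, so two
    heavy threads do not compete for one core's execution resources."""
    for lcpus in SIBLINGS.values():
        if not any(l in busy for l in lcpus):
            return lcpus[0]          # entire core idle: best placement
    for lcpus in SIBLINGS.values():
        for l in lcpus:
            if l not in busy:
                return l             # fall back to any idle logical CPU
    return None                      # no idle logical CPU at all

# With logical CPU 0 busy, CPU 2 (on a fully idle physical core) is
# preferred over CPU 1, which would share core 0 with the busy thread.
```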
In some multiprocessor systems, especially larger servers, the architecture is NUMA (Non-Uniform Memory Access). In such systems, memory access time depends on where the memory is located relative to the processor.
A CPU can access its local memory faster than memory attached to another processor. Therefore, the scheduler should try to keep a process close to the memory it frequently uses. This is called NUMA-aware scheduling.
A processor may be free, but that does not always mean it is the best place to run a task. If the task’s memory is far away, performance may still suffer.
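A rough sketch of a NUMA-aware choice, scoring each CPU by its load plus a penalty for remote memory; the `remote_penalty` value is purely illustrative:

```python
def numa_choose(loads, mem_node, cpu_node, remote_penalty=4):
    """Score each CPU as its current load plus a penalty if it sits on a
    different NUMA node than the task's memory; lower score wins."""
    def score(cpu):
        local = cpu_node[cpu] == mem_node
        return loads[cpu] + (0 if local else remote_penalty)
    return min(range(len(loads)), key=score)

# CPUs 0-1 sit on node 0, CPUs 2-3 on node 1; the task's memory is on node 0.
# CPU 2 is completely idle, yet the lightly loaded *local* CPU 1 wins,
# because running far from the task's memory would still be slower.
best = numa_choose(loads=[5, 2, 0, 3], mem_node=0, cpu_node=[0, 0, 1, 1])
```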
Not all processors in a system are necessarily identical. In heterogeneous multiprocessing, some cores may be faster and more power-hungry, while others may be slower and more energy efficient. A common example is a big.LITTLE design.
In such systems, the scheduler may place:
- demanding, performance-critical tasks on the fast (big) cores;
- light or background tasks on the energy-efficient (LITTLE) cores.
This means the scheduler is no longer concerned only with speed and fairness. It must also consider power consumption and thermal limits.
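A toy placement rule for a hypothetical big.LITTLE layout; the core numbering and the demand `threshold` are invented for illustration:

```python
# Hypothetical big.LITTLE layout.
BIG_CORES = [0, 1]        # fast, power-hungry
LITTLE_CORES = [2, 3]     # slower, energy-efficient

def place(task_demand, threshold=0.5):
    """Send demanding tasks to big cores and light or background tasks
    to LITTLE cores; the threshold is an illustrative tuning knob."""
    cores = BIG_CORES if task_demand > threshold else LITTLE_CORES
    return cores[0]       # sketch only: per-core load is ignored here

# A CPU-hungry interactive task lands on a big core; a background
# indexing job lands on a LITTLE core.
```

A real heterogeneous scheduler would also track per-core load, temperature, and energy budgets rather than a single demand number.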
Multiprocessor scheduling must deal with several practical difficulties:
- contention for shared scheduling data structures, which grows with the number of processors;
- migration overhead and the loss of cache locality when tasks move between processors;
- workload imbalance when per-processor queues fill unevenly;
- hardware complications such as SMT, NUMA, and heterogeneous cores.
A good scheduler must therefore avoid two extremes: it should neither leave processors idle unnecessarily nor move tasks around so aggressively that overhead destroys the gain.
Multiprocessor scheduling extends basic CPU scheduling to systems with multiple processors or cores. Its core concern is not only choosing the next runnable task, but also selecting the most suitable processor for that task. Modern operating systems address this problem using SMP-based designs, load balancing, processor affinity, multicore awareness, and sometimes NUMA-aware and heterogeneous scheduling policies.
In simple terms, multiprocessor scheduling is about using parallel hardware intelligently: keep processors busy, keep overhead low, and keep work close to the hardware resources that help it run efficiently.
Consider a system with two processors: P0 and P1. Each processor has its own ready queue.
Initial state: P0's ready queue holds several runnable tasks, including T3 and T4, while P1's ready queue is empty.
If the scheduler does nothing, P0 will remain busy and P1 will remain idle. This is poor load distribution.
The operating system observes that one processor has several waiting tasks while the other has none.
The scheduler migrates T3 and T4 from P0 to P1.
After migration: P0 keeps its remaining tasks, and P1's queue now holds T3 and T4.
Now both processors can execute tasks in parallel, improving CPU utilization and overall throughput.
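The migration above can be simulated in a few lines (push-style, moving tasks from the tail of P0's queue; the task names T1 and T2 are invented placeholders for the other runnable tasks on P0):

```python
from collections import deque

# P0 is overloaded, P1 is idle.
queues = {"P0": deque(["T1", "T2", "T3", "T4"]), "P1": deque()}

def balance(queues):
    """Push migration: move tasks from P0 until the queues are even."""
    while len(queues["P0"]) - len(queues["P1"]) > 1:
        queues["P1"].append(queues["P0"].pop())   # moves T4, then T3

balance(queues)
# Both processors now hold two tasks and can execute in parallel.
```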
Suppose T3 had previously run on P0 and its working data is still present in P0’s cache. Moving it to P1 may reduce cache benefit and slightly slow execution.
Therefore, the scheduler must decide whether the parallelism gained by giving P1 work outweighs the cache locality lost by moving T3 away from P0.
A practical scheduler usually tries to balance both goals: do enough migration to prevent idleness, but avoid unnecessary movement that destroys cache locality.
This example shows the core tension in multiprocessor scheduling: load balancing improves parallelism, while processor affinity improves locality. A good scheduler must manage both together.
Multiprocessor scheduling is the process of assigning runnable processes or threads to multiple CPUs or cores in a computer system. In a single-processor system, the scheduler only decides which process should run next. In a multiprocessor system, it must decide both which process should run and on which processor it should run.
The main objectives of multiprocessor scheduling are to keep all processors busy, distribute workload fairly, reduce scheduling overhead, preserve cache locality, and improve system performance. The two broad design styles are asymmetric multiprocessing (AMP), where one master processor makes scheduling decisions, and symmetric multiprocessing (SMP), where all processors are treated equally and each can schedule work. Modern operating systems mostly use SMP.
A scheduler may use either a common ready queue shared by all processors or separate ready queues for individual processors. A common queue simplifies work sharing but may suffer from contention. Separate queues scale better but may lead to workload imbalance.
To solve imbalance, operating systems use load balancing. In push migration, busy processors push tasks to less loaded processors. In pull migration, idle processors pull work from busy ones. This improves utilization but may increase migration overhead.
Another major concept is processor affinity. A process tends to run better on the processor where it executed previously because useful data may still be available in that processor’s cache. Soft affinity is a preference to stay on the same CPU, while hard affinity is a strict binding to a specific CPU or set of CPUs.
In modern systems, multiprocessor scheduling also includes multicore processors, simultaneous multithreading (SMT), NUMA-aware scheduling, and heterogeneous multiprocessing. NUMA-aware scheduling tries to keep a process near the memory it uses, while heterogeneous scheduling places tasks on fast or power-efficient cores depending on system goals.
The central challenge of multiprocessor scheduling is balance: too much migration improves load distribution but hurts locality; too little migration preserves locality but may leave processors idle. Therefore, an effective multiprocessor scheduler tries to achieve both high parallel utilization and efficient resource usage.