BCS401 Operating System Semester IV · AY 2025-26 onward
Unit 2 · Concurrent Processes

Lecture 2: Principle of concurrency & race conditions

Learning outcomes

  • Explain the principle of concurrency and distinguish it from parallelism.
  • Define a race condition and explain why it occurs in concurrent execution.
  • Identify critical sections and justify the need for synchronization.
  • State the three requirements of a correct critical-section solution.

Prerequisites

  • Process basics: process states (Ready/Running/Waiting) and the idea of a context switch.
  • Shared resources: a shared variable or buffer accessed by two processes.
  • Helpful: why the CPU can interrupt a running process (timer interrupt / scheduling).

Lecture Notes

Main content

1. Principle of Concurrency

Modern operating systems support multiple processes and multiple users. Many tasks appear to run “together” because the OS switches the CPU among tasks and overlaps CPU work with I/O work. This is the principle of concurrency: multiple activities make progress during the same time interval.

1.1 Concurrency vs Parallelism

Concurrency

  • Multiple tasks make progress (not necessarily at the exact same instant).
  • On a single CPU core, concurrency happens by interleaving (fast context switching).
  • Creates the illusion of simultaneous execution through time-slicing.

Parallelism

  • Multiple tasks execute physically at the same instant.
  • Requires multiple cores or multiple CPUs (true simultaneous execution).
  • Common in modern machines: different threads run on separate hardware cores.
Fig 2.1: Concurrency (interleaving) vs Parallelism (simultaneous). Single core: tasks interleave over time (T1, T2, T1, T3, T2, T1). Multi core: T1, T2, T3 run on Cores 1–3 at the same instant.
Concurrency = progress by interleaving; Parallelism = true simultaneous execution on multiple cores.
Why Concurrency is Essential:
  • Improves CPU utilization: While one process waits for I/O, another can compute.
  • Improves responsiveness: Interactive applications feel much faster.
  • Enables multiprogramming: Supports multiuser and multitasking environments seamlessly.

2. Race Conditions

A race condition occurs when two or more processes access and manipulate shared data concurrently, and the final outcome depends on the strict order of execution (timing/interleaving). Because CPU scheduling and hardware interrupts are unpredictable, the resulting data state can become corrupt.

2.1 Why "Simple" Statements Break

Statements that look atomic (single-step) in high-level code like C or Java are translated into multiple low-level machine instructions. If a context switch happens in the middle of these steps, the shared data becomes inconsistent.

High-Level Code

// Shared memory variable
int count = 5;

// Producer Process
count++;

// Consumer Process
count--;

Conceptual Machine Code

// count++ (Requires 3 steps)
r1 = count;
r1 = r1 + 1;
count = r1;

// count-- (Requires 3 steps)
r2 = count;
r2 = r2 - 1;
count = r2;

2.2 A Classic Failing Interleaving

Fig 2.2: Context switching causing data corruption (initial state: count = 5)
  Step 1: Producer reads r1 = count (r1 = 5)
  Step 2: Producer adds r1 = r1 + 1 (r1 = 6)
      --- CONTEXT SWITCH occurs here ---
  Step 3: Consumer reads r2 = count (r2 = 5)
  Step 4: Consumer subtracts r2 = r2 - 1 (r2 = 4)
  Step 5: Producer writes count = r1 (count = 6)
  Step 6: Consumer writes count = r2 (count = 4)
  Result: final value is WRONG ❌ (lost update)
The final count becomes 4 even though one increment and one decrement should keep it at 5. This lost update is the core danger of a race condition.

Core Concept: if correctness depends on ordering — "Process A must finish its update before Process B starts" — then concurrent execution cannot guarantee that order by itself; you must apply synchronization to control the interleaving.


3. Critical Section & Synchronization

In concurrent programming, a critical section is the specific segment of code where a process accesses and modifies shared variables, tables, or files. The OS must guarantee that while one process is executing inside its critical section, no other process may execute inside its own critical section for the same shared data.

3.1 Standard Critical-Section Architecture

do {
  // Entry Section (Request permission / Lock)
  
  /* --- CRITICAL SECTION --- */
  /* Access shared memory     */
  
  // Exit Section (Release permission / Unlock)
  
  // Remainder Section (Independent code)
} while (true);
Fig 2.3: Entry–Critical–Exit–Remainder structure: Entry Section → Critical Section (shared data update) → Exit Section → Remainder Section (no shared data).
Synchronization logic is strictly implemented in the entry and exit sections to shield the critical section.

3.2 The Three Requirements of a Correct Solution

  • Mutual Exclusion — If process P_i is executing in its critical section, no other process P_j may execute in its critical section. Prevents: data corruption / race conditions.
  • Progress — If no process is in its critical section and some processes wish to enter, the choice of which enters next cannot be postponed indefinitely, and only processes not in their remainder sections take part in the decision. Prevents: the system freezing with every process locked out.
  • Bounded Waiting — After a process requests entry, there is a bound on how many times other processes may enter their critical sections before that request is granted. Prevents: process starvation (waiting forever).

3.3 Achieving Synchronization (Visual Model)

Fig 2.4: Synchronization enforces mutual exclusion — process P0 is inside its critical section updating the shared data (counter/array) while process P1 is blocked, waiting.
Only one process is granted the lock to update shared data at a time; others block and wait.

3.4 Why "Disabling Interrupts" Fails on Modern Systems

On old, single-core systems, developers could simply disable hardware interrupts before modifying a shared variable, preventing context switches. On modern multiprocessor architectures this no longer works: disabling interrupts on one core does not stop processes running on the other cores, and broadcasting a "disable interrupts" message to every core is slow, delays system clock updates, and drastically reduces overall efficiency. Modern operating systems therefore rely on software and hardware synchronization tools instead.

Primary Synchronization Tools:
  • Atomic Hardware Instructions: Non-interruptible machine instructions (e.g., TestAndSet or CompareAndSwap).
  • Mutex Locks: A boolean lock ensuring only one thread enters a critical section.
  • Semaphores: Integer variables used for more complex coordination and counting available resources.
  • Software Solutions: Algorithms like Peterson's Solution (mostly theoretical today but foundational for learning).

3.5 The Risk of Synchronization: Deadlocks

While synchronization solves race conditions, poorly implemented locks introduce a new danger: Deadlock. If Process A locks Resource 1 and waits for Resource 2, while Process B locks Resource 2 and waits for Resource 1, both processes will freeze indefinitely.

Worked Example

Worked Example: Race Condition in PID Assignment

Consider the OS maintaining a shared variable next_available_pid. If two processes request to create children at nearly the exact same time, both may read the same PID before either writes back the incremented value. This leads to duplicate PID assignment — a catastrophically incorrect state for an OS kernel.

Unsafe Pseudo-Logic (Conceptual)

// Shared kernel variable
next_available_pid = 2615;

// Inside the fork() system call:
pid = next_available_pid;
next_available_pid = next_available_pid + 1;
return pid;

Why It Fails

  • If two CPUs or threads execute this concurrently, both can read 2615 before the increment step occurs.
  • Both calls return the exact same PID, causing duplicate identifiers and corrupting kernel tracking tables.
  • Fix: The OS must protect the PID allocation code with a lock or atomic update (synchronization) to ensure mutual exclusion.
Fig 2.5: Duplicate PID due to race condition (shared kernel state: next_available_pid = 2615)
  T0: Request A reads next_available_pid (gets 2615)
  T1: Request B reads next_available_pid (also gets 2615)
  T2: Request A increments next_available_pid → 2616
  T3: Request B increments next_available_pid → 2616 (lost update)
  Result: both A and B return PID 2615 ❌ (duplicate PID assignment)
Without synchronization, shared kernel variables like PID counters can become corrupted, leading to system failure.

One-Page Summary

  • Concurrency: multiple tasks make progress in the same time interval (often by interleaving on a single core).
  • Parallelism: multiple tasks execute at the same instant (requires multiple cores/CPUs).
  • Race condition: concurrent access to shared data where the final result depends on execution order (timing).
  • Main cause: “single” statements (like count++) are multiple steps and can interleave.
  • Critical section: code segment where shared data is accessed/updated; must be protected.
  • Critical-section requirements: Mutual Exclusion, Progress, Bounded Waiting.
  • Need for synchronization: ensure only one process at a time manipulates shared data; otherwise OS/app correctness breaks.
  • Common tools: locks/mutex, semaphores, atomic instructions, monitors (details in upcoming lectures).