Unit 3 · CPU Scheduling, Threads & Deadlocks

Lecture 4: Process address space & identification

Learning outcomes

Describe a process’s address space layout and the information used to identify processes.

Prerequisites

Process concept
Basic memory idea: code, data, stack
Process creation basics (`fork()` / `exec()` concept)
PCB basics and process state model

Lecture Notes

Main content

Process Address Space & Identification

Introduction

A process is a program in execution. To run correctly, every process must have its own logical memory arrangement and its own identity within the operating system. The memory arrangement is called the process address space, while the identity is defined through process-related identifiers such as the process ID, parent process ID, and related execution information.

The operating system uses the address space to manage execution and memory access, and it uses process identifiers to distinguish one process from another. These two ideas are closely related to process creation, execution, protection, and scheduling.

Key Idea

The address space tells us how a process is laid out in memory.
The identification information tells the OS exactly which process it is managing.

What is a Process Address Space?

A process address space is the range of logical addresses that a process can use during execution. It contains the program instructions, data, dynamically allocated memory, and function call information needed by the process.

In simple terms, it is the memory view that belongs to one isolated process. Even if two processes run the exact same program, each process normally has its own separate address space. This provides isolation and prevents one process from directly interfering with another.

Typical Layout of a Process Address Space

A standard process address space is commonly explained using four major regions: Text (Code), Data, Heap, and Stack.

Text / Code Segment

Contains the executable instructions of the program. It stores the compiled machine code run by the CPU. It is usually treated as read-only to protect the code from accidental modification.

Data Segment

Stores global and static variables. It is divided into initialized data (variables given an explicit initial value) and uninitialized data, also called BSS (variables declared without an explicit value, which are zero-filled when the program starts).

Heap

Used for dynamic memory allocation during execution (e.g., malloc() in C or new in C++). In the textbook layout it grows upward, toward higher addresses, as more memory is requested. It is useful when the required size isn't known in advance.

Stack

Stores function call frames. Includes local variables, parameters, and return addresses. A new frame is created on function call and removed on return. Grows downward.

Space Between Heap and Stack

In the usual textbook view, the heap grows upward and the stack grows downward. The free space between them allows the process to expand dynamically during execution. If either region grows beyond the space available to it, a memory-related error results: a stack that exceeds its limit causes a stack overflow, and a heap that cannot grow any further causes allocation requests such as malloc() to fail.
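The four regions can be observed from inside a running C program. The following is a minimal sketch, assuming a hosted C environment where a function pointer can be cast to void * for printing (common on POSIX systems, though not strict ISO C). The names initialized_global, uninitialized_global, and print_region_samples are hypothetical and chosen only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 10;   /* initialized data segment */
int uninitialized_global;      /* BSS: zero-filled when the program starts */

/* Prints one sample address from each region.
 * Returns 0 on success, -1 if the heap allocation fails. */
int print_region_samples(void)
{
    int local = 0;                           /* lives in this call's stack frame */
    int *dynamic = malloc(sizeof *dynamic);  /* lives on the heap */

    if (dynamic == NULL)
        return -1;

    /* Assumption: casting a function pointer to void * works on this platform. */
    printf("text  (function):      %p\n", (void *)print_region_samples);
    printf("data  (initialized):   %p\n", (void *)&initialized_global);
    printf("bss   (uninitialized): %p\n", (void *)&uninitialized_global);
    printf("heap  (malloc):        %p\n", (void *)dynamic);
    printf("stack (local):         %p\n", (void *)&local);

    free(dynamic);
    return 0;
}
```

The exact addresses, and even their relative order, can differ between systems and between runs (for example, because of address space layout randomization), so the textbook picture is best read as a model rather than a guarantee.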

Process vs. Thread Memory View

A process has its own isolated address space, so a crash in one process does not corrupt the memory of other processes.
A thread shares the Code, Data, and Heap with other threads of the same process, but maintains its own separate Stack and CPU registers.
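The difference between shared and per-thread memory can be sketched with POSIX threads. This is an illustrative example under stated assumptions, not a definitive implementation: it assumes a system that provides pthreads (often compiled with the -pthread flag), and the names shared_total, worker, and run_two_workers are hypothetical.

```c
#include <pthread.h>

#define INCREMENTS 1000

long shared_total = 0;  /* data segment: one copy, visible to every thread */
pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    long *private_total_out = arg;
    long private_total = 0;  /* on this thread's own stack */

    for (int i = 0; i < INCREMENTS; i++) {
        private_total++;                  /* private: no locking needed */
        pthread_mutex_lock(&total_lock);  /* shared: must be protected */
        shared_total++;
        pthread_mutex_unlock(&total_lock);
    }
    *private_total_out = private_total;
    return NULL;
}

/* Runs two threads. Returns 0 on success, -1 if a thread cannot be created. */
int run_two_workers(long private_totals[2], long *shared_out)
{
    pthread_t threads[2];

    shared_total = 0;
    if (pthread_create(&threads[0], NULL, worker, &private_totals[0]) != 0)
        return -1;
    if (pthread_create(&threads[1], NULL, worker, &private_totals[1]) != 0) {
        pthread_join(threads[0], NULL);
        return -1;
    }
    pthread_join(threads[0], NULL);
    pthread_join(threads[1], NULL);
    *shared_out = shared_total;
    return 0;
}
```

Each thread counts to 1000 in its own stack variable, while the single shared variable receives the increments of both threads and ends at 2000.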

Process Identification

The operating system must uniquely identify each process to manage it. It does so using identifiers, most of which are stored in the Process Control Block (PCB).

Process ID (PID): A unique numeric identifier assigned by the OS. It is used in scheduling, signaling, termination, resource management, and monitoring (e.g., the top or ps commands).
Parent Process ID (PPID): The identifier of the process that created this process. It lets the OS maintain the parent-child hierarchy (the process tree).
User ID (UID) & Group ID (GID): Identifies the user who owns the process and their group, which dictates permissions and resource access.
Thread ID (TID): In multithreaded systems, threads within a process are identified separately since several threads share one PID.
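On a POSIX system, a process can query most of these identifiers about itself. The sketch below uses the standard calls getpid(), getppid(), getuid(), and getgid(). The struct process_ids type and the two helper functions are hypothetical names introduced only for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the identifiers discussed above. */
struct process_ids {
    pid_t pid;   /* this process */
    pid_t ppid;  /* the process that created it */
    uid_t uid;   /* owning user */
    gid_t gid;   /* owning group */
};

struct process_ids read_process_ids(void)
{
    struct process_ids ids;

    ids.pid = getpid();
    ids.ppid = getppid();
    ids.uid = getuid();
    ids.gid = getgid();
    return ids;
}

void print_process_ids(const struct process_ids *ids)
{
    /* Cast to fixed types because pid_t, uid_t, and gid_t vary by platform. */
    printf("PID=%ld PPID=%ld UID=%lu GID=%lu\n",
           (long)ids->pid, (long)ids->ppid,
           (unsigned long)ids->uid, (unsigned long)ids->gid);
}
```

The printed PID and PPID should match what ps reports for the same process while it is running.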

Identification During Process Creation (fork & exec)

During process creation, a new process receives its own unique PID. In a typical UNIX/Linux fork()-based model:

The child gets a new PID, while the parent keeps its original PID.
If the child later executes a new program using exec(), the entire address space is replaced by the new program, but the PID remains the same.
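The fork() behaviour described above can be checked with a short sketch, assuming a POSIX system. The function name check_fork_identity is hypothetical. The child reports its findings through its exit status, which the parent collects with waitpid().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the child saw a new PID and the expected PPID,
 * 1 if either check failed, and -1 on a system call error. */
int check_fork_identity(void)
{
    pid_t parent_pid = getpid();
    pid_t child_pid = fork();
    int status;

    if (child_pid < 0)
        return -1;

    if (child_pid == 0) {
        /* Child: new PID, and its PPID is the parent's PID. */
        int ok = (getpid() != parent_pid) && (getppid() == parent_pid);
        _exit(ok ? 0 : 1);
    }

    /* Parent: fork() returned the child's PID; its own PID is unchanged. */
    if (waitpid(child_pid, &status, 0) < 0)
        return -1;
    if (getpid() != parent_pid)
        return 1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child calls _exit() rather than exit() so that it does not flush any buffered output it inherited from the parent.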

Why Address Space and Identification Matter

The address space gives structure to the memory used by a process. Process identification gives structure to the operating system’s control over that process. Together, they support execution, protection, memory allocation, scheduling, and process hierarchy.

In short, a process is not just “a running program.” It is a managed execution entity with a defined memory layout and a unique identity in the OS.

Worked Example: Understanding Memory & Identification

Suppose process P1 is executing a standard C program containing:

Compiled logic instructions.
A global variable: int count = 10;
A dynamically allocated array of 100 integers: int *arr = malloc(100 * sizeof(int));
A function call calculate() with local variables.

Step 1: Where do these items go in the Address Space?

The compiled logic instructions are stored in the Text / Code Segment.
The initialized global variable count is stored in the Data Segment.
The memory returned by malloc() is taken from the Heap.
The local variables inside calculate() are pushed onto the Stack.

Step 2: How is the process identified?

The OS assigns a unique PID to P1 (e.g., PID 4050).
P1 was launched via the terminal, so its PPID is the PID of the shell process.
Because a user ran it, P1 runs with that user's UID (User ID).

Step 3: What happens if P1 creates a child?

The child gets a new PID (e.g., PID 4051) and its PPID is 4050.
The child gets its own address space, which starts as a separate copy of the parent's.
If the child uses exec() to run a completely different program, its memory is overwritten with the new program, but its PID remains 4051.
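That the child works on a separate copy can be sketched as follows, assuming a POSIX system. The function name check_child_has_private_copy is hypothetical, and count mirrors the global variable from this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int count = 10;  /* global variable in the data segment */

/* Returns the parent's value of count after the child has changed
 * its own copy, or -1 on error. */
int check_child_has_private_copy(void)
{
    pid_t child_pid = fork();
    int status;

    if (child_pid < 0)
        return -1;

    if (child_pid == 0) {
        count = 99;  /* modifies only the child's copy */
        _exit(count == 99 ? 0 : 1);
    }

    if (waitpid(child_pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return count;  /* the parent's copy is untouched */
}
```

The parent still sees count equal to 10 after the child has exited, because the assignment happened in the child's address space.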

One-Page Summary: Address Space & Identification

A process address space is the logical memory space available to an isolated process during execution. It is composed of four main regions:

Text (Code): Stores compiled program instructions (read-only).
Data: Stores global and static variables (Initialized and BSS).
Heap: Stores dynamically allocated memory (grows upward).
Stack: Stores function call frames, parameters, and local variables (grows downward).

The operating system identifies a process using identifiers kept in its Process Control Block (PCB). The most important identifier is the Process ID (PID), which uniquely identifies the process for scheduling and signaling. The OS also tracks the Parent Process ID (PPID) to maintain the process tree.

During process creation, a child process gets a brand new PID. If that process later loads a new program with the exec() system call, its memory layout is replaced with the new code and data, but the PID remains unchanged.

Worked Example: Mapping a Running Program into Process Address Space

Consider a process P1 running a simple program that contains:

program instructions,
a global variable int total = 100;,
a function compute() with local variables, and
a dynamically created integer array using malloc().

Step 1: Program starts execution

When the program is loaded, the executable instructions are placed in the text (code) segment. This region contains the machine instructions that the CPU executes.

Step 2: Global data is placed

The variable total is stored in the initialized part of the data segment because it is a global variable with an explicit initial value. It exists for the entire run of the program.

Step 3: Function call occurs

When compute() is called, a new stack frame is created in the stack. This frame stores local variables, parameters, and the return address. When the function finishes, this stack frame is removed.
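The idea that every active call has its own frame can be sketched with a small recursive function. This is an illustration only: the names MAX_DEPTH, frame_addresses, and descend are hypothetical, and the sketch records where each call's local variable lives without assuming anything about the direction of stack growth.

```c
#include <stdint.h>

#define MAX_DEPTH 3

/* Address of the local variable seen at each call depth. */
uintptr_t frame_addresses[MAX_DEPTH];

/* Each active call has its own frame, so local has a distinct
 * address at every depth while the calls are nested.
 * Returns the sum of the depths visited. */
int descend(int depth)
{
    int local = depth;  /* stored in this call's stack frame */

    frame_addresses[depth] = (uintptr_t)&local;
    if (depth + 1 < MAX_DEPTH)
        return local + descend(depth + 1);  /* pushes a new frame */
    return local;                           /* frame is removed on return */
}
```

Calling descend(0) nests three calls. The three recorded addresses differ because all three frames exist at the same time.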

Step 4: Dynamic memory is requested

Suppose the program executes:

int *arr = (int *) malloc(50 * sizeof(int));

The required memory block is allocated from the heap. Unlike stack memory, this memory remains allocated until it is explicitly released.
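The difference in lifetime can be sketched with a helper that allocates an array and returns it to its caller. The name make_sequence is hypothetical. The point is that the heap block stays valid after the function's own stack frame is gone.

```c
#include <stdlib.h>

/* Allocates count integers on the heap and fills them with 0..count-1.
 * Returns NULL if the allocation fails. The caller must call free(). */
int *make_sequence(size_t count)
{
    int *values = malloc(count * sizeof *values);

    if (values == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        values[i] = (int)i;

    return values;  /* the block outlives this function's stack frame */
}
```

A caller would use it as int *arr = make_sequence(50); and release the memory with free(arr) when it is no longer needed.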

Step 5: Process identification

The operating system assigns a unique PID to this process, for example PID = 1200. If this process was created by another process, it also has a PPID. These identifiers allow the operating system to manage, monitor, and control the process correctly.

Step 6: If the process creates a child

If P1 creates a child process using fork(), the child receives a new PID. The child gets its own process image and address space. If the child then calls exec(), its program image changes, but its PID remains unchanged.
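That exec() keeps the PID can be checked with a short sketch. It assumes a POSIX system with a shell at /bin/sh, and the function name check_exec_keeps_pid is hypothetical. The child notes its PID, replaces itself with the shell, and the shell compares its own PID ($$) with the value noted before the exec.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the PID is unchanged across exec, 1 if it differs,
 * 127 if the exec itself failed, and -1 on other errors. */
int check_exec_keeps_pid(void)
{
    pid_t child_pid = fork();
    int status;

    if (child_pid < 0)
        return -1;

    if (child_pid == 0) {
        char command[64];

        /* Build a shell test comparing the shell's PID with ours. */
        snprintf(command, sizeof command,
                 "test \"$$\" -eq %ld", (long)getpid());

        /* Assumption: /bin/sh exists. The program image is replaced here. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);  /* reached only if execl() failed */
    }

    if (waitpid(child_pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The shell is an entirely different program from the one that called fork(), yet it runs under the PID the child was given at creation.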

Conclusion

This example shows that a process is characterized not only by its PID but also by its memory organization. The code segment stores instructions, the data segment stores global/static data, the heap stores dynamic memory, and the stack stores function call information.

One-Page Summary: Process Address Space & Identification

A process address space is the logical memory layout available to a process during execution. It is commonly divided into four main regions: text/code, data, heap, and stack. The text segment stores executable instructions, the data segment stores global and static variables, the heap stores dynamically allocated memory, and the stack stores local variables, parameters, and return addresses used during function calls.

In the usual process layout, the heap grows upward as dynamic memory is allocated, while the stack grows downward as function calls are made. Each process normally has its own address space, which provides protection and isolation from other processes. Threads within the same process share code, data, and heap, but each thread has its own stack and register context.

The operating system identifies a process mainly by its Process ID (PID), which is a unique number assigned to that process. A process may also have a Parent Process ID (PPID) showing which process created it. In multiuser systems, user and group ownership information may also be associated with the process.

During process creation, a new child process receives a new PID. If that process later loads a different program using exec(), the program image changes but the PID remains the same because it is still the same process from the operating system’s point of view.

Therefore, process address space explains how a process is organized in memory, while process identification explains how the operating system uniquely recognizes and manages it. Both are fundamental to process execution, protection, and process management.