DATA STRUCTURE METRICS

Data structure metrics assess how data is organised, used, and shared in a software system. They help identify:

  • Data structures that may cause performance problems.
  • Modules that are too tightly coupled through shared data.
  • Areas where maintenance effort may be high because of obscure data usage.

Broadly, we can distinguish two groups of data-related metrics:

  • Object-oriented data structure metrics – focus on relationships and dependencies in class hierarchies. Examples include: Depth of Inheritance Tree (DIT), Number of Children (NOC), Weighted Methods per Class (WMC).
  • Procedural data flow and usage metrics – focus on how data is used in procedural modules, including: amount of data, usage of live variables, program weakness, and data sharing among modules.

We briefly outline some of the commonly discussed metrics below.

1. Amount of Data

One way to approximate the amount of data in a program is to count the entries in its cross-reference list: variables, constants, and labels. In Halstead’s terms, this count approximates the number of distinct operands, n₂.

In simple form:

$$ n_{2} \approx \text{VARS} + \text{Constants} + \text{Labels} $$

A larger operand vocabulary generally indicates that more data items are being manipulated, which can increase the cognitive load on developers.
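
As a rough illustration, the count above can be sketched for a small Python snippet using the standard `ast` module. This is only a sketch under stated assumptions: the helper name `count_operands` and the sample program are hypothetical, distinct names stand in for VARS, distinct literals stand in for constants, and the label term is zero because Python has no statement labels.

```python
import ast


def count_operands(source: str) -> dict:
    """Approximate n2 as distinct names + distinct constants + labels."""
    tree = ast.parse(source)
    variables = set()
    constants = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            variables.add(node.id)              # each distinct identifier counts once
        elif isinstance(node, ast.Constant):
            constants.add(repr(node.value))     # each distinct literal counts once
    labels = 0  # assumption: Python has no labels, so this term is always zero
    return {
        "vars": len(variables),
        "constants": len(constants),
        "labels": labels,
        "n2": len(variables) + len(constants) + labels,
    }


# Hypothetical sample program used only for illustration.
SAMPLE = """
total = 0
i = 1
while i < 10:
    total = total + i * 2
    i = i + 1
"""

result = count_operands(SAMPLE)
print(result)
```

For this sample there are two variables (`total`, `i`) and four constants (0, 1, 10, 2), so the estimate is n₂ ≈ 6. A real tool would need to decide how to treat function names, attributes, and string literals, which this sketch does not attempt.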

2. Usage of Data within a Module (Live Variable Metrics)

A variable is live at a particular statement if its current value may be used in the future before being overwritten.

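  • Within a procedure, a variable is treated as live from its first reference to its last reference, which is a simpler approximation of the definition above.
  • The average number of live variables in a module (the sum of the live-variable counts at each executable statement, divided by the number of executable statements) indicates how much data is “in play” at any given time.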

Let LVᵢ be the average number of live variables in the i-th module. For a program with m modules, the average live-variable metric is:

$$ LV_{\text{program}} = \frac{1}{m} \sum_{i=1}^{m} LV_{i} $$
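
The calculation can be sketched in a few lines of Python. The representation is an assumption made for illustration: each module is a list of statements, and each statement is the set of variable names it references. The function names and the two sample modules are hypothetical, and liveness uses the first-reference-to-last-reference rule stated above.

```python
def average_live_variables(statements: list[set[str]]) -> float:
    """Average number of live variables per statement in one module (LV_i)."""
    if not statements:
        raise ValueError("a module needs at least one statement")
    first: dict[str, int] = {}
    last: dict[str, int] = {}
    for index, referenced in enumerate(statements):
        for name in referenced:
            first.setdefault(name, index)   # remember the first reference only
            last[name] = index              # keep updating the last reference
    # A variable is live on every statement from its first to its last reference,
    # so summing these lifetimes equals summing the per-statement live counts.
    live_statement_count = sum(last[name] - first[name] + 1 for name in first)
    return live_statement_count / len(statements)


def program_live_variables(modules: list[list[set[str]]]) -> float:
    """Mean of the per-module averages, as in the formula above."""
    per_module = [average_live_variables(module) for module in modules]
    return sum(per_module) / len(per_module)


# Hypothetical modules: one set of referenced variables per statement.
MODULE_A = [{"a"}, {"b"}, {"a", "c"}, {"b", "c"}]
MODULE_B = [{"x"}, {"x", "y"}]

lv_a = average_live_variables(MODULE_A)
lv_b = average_live_variables(MODULE_B)
lv_program = program_live_variables([MODULE_A, MODULE_B])
print(lv_a, lv_b, lv_program)
```

In `MODULE_A`, `a` is live for three statements, `b` for three, and `c` for two, giving 8 / 4 = 2.0. `MODULE_B` gives 3 / 2 = 1.5, so the program-level value is 1.75.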

Similarly, we can define span size, which counts the number of statements between successive uses of a variable. Let SPᵢ be the average span size in the i-th module. For a program with m modules:

$$ SP_{\text{program}} = \frac{1}{m} \sum_{i=1}^{m} SP_{i} $$

A large span suggests that a variable remains “alive” across many statements, which can make it harder to reason about the code and increases the chance of unintended side effects.
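
A minimal sketch of the span calculation for one module follows. It assumes that a span is the number of statements strictly between two successive references to the same variable; other texts may count the distance differently. The function name and the sample module are hypothetical.

```python
def average_span(statements: list[set[str]]) -> float:
    """Average span size in one module, under the convention stated above."""
    last_seen: dict[str, int] = {}
    spans: list[int] = []
    for index, referenced in enumerate(statements):
        for name in referenced:
            if name in last_seen:
                # Statements strictly between the previous reference and this one.
                spans.append(index - last_seen[name] - 1)
            last_seen[name] = index
    # A variable referenced only once contributes no span.
    return sum(spans) / len(spans) if spans else 0.0


# Hypothetical module: one set of referenced variables per statement.
MODULE = [{"a"}, {"b"}, {"c"}, {"a"}, {"b"}]

sp = average_span(MODULE)
print(sp)
```

Here `a` is referenced at statements 0 and 3 and `b` at statements 1 and 4, so each has one span of 2, while `c` has none. The average span is 2.0.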

3. Program Weakness

A program is typically composed of several modules. If these modules have low cohesion, many live variables, and long spans, the effort required for development and maintenance tends to increase. One way to capture this is via a weakness metric.

For a single module, let:

  • LV – average number of live variables in the module.
  • γ – average life (span) of those variables.

The module weakness can be defined as:

$$ W_{M} = LV \times \gamma $$

For a program with m modules, the overall program weakness is approximated by:

$$ W_{P} = \frac{1}{m} \sum_{i=1}^{m} W_{M_{i}} $$

Higher weakness values indicate that the program is harder to understand, test, and maintain, and may benefit from refactoring or redesign.
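
The two formulas can be illustrated with a short Python sketch. The class name `ModuleStats`, the module names, and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class ModuleStats:
    name: str
    live_variables: float   # LV: average number of live variables
    average_life: float     # gamma: average life (span) of those variables

    @property
    def weakness(self) -> float:
        """Module weakness W_M = LV * gamma."""
        return self.live_variables * self.average_life


def program_weakness(modules: list[ModuleStats]) -> float:
    """Program weakness W_P: the mean of the module weaknesses."""
    if not modules:
        raise ValueError("a program needs at least one module")
    return sum(module.weakness for module in modules) / len(modules)


# Hypothetical measurements for two modules.
modules = [
    ModuleStats("parser", live_variables=4.0, average_life=5.0),
    ModuleStats("report", live_variables=2.0, average_life=3.0),
]

wp = program_weakness(modules)
for module in modules:
    print(module.name, module.weakness)
print("program weakness:", wp)
```

With these inputs the module weaknesses are 20.0 and 6.0, so the program weakness is 13.0. The `parser` module contributes most of that total, which would make it the first candidate for refactoring.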

4. Sharing of Data among Modules

Modules often share data through global variables, shared data structures, or common database tables. Excessive data sharing:

  • Increases coupling between modules.
  • Makes it harder to reason about side effects and interactions.
  • Can lead to subtle bugs when one module changes shared data in a way that other modules do not expect.

Data-sharing metrics help designers:

  • Identify modules that share too much data.
  • Restructure the system to reduce unnecessary sharing.
  • Improve modularity, encapsulation, and information hiding.

Figure 16: “Pipes” of data shared among the modules. Modules A, B, and C access a shared data pool of global variables, common data structures, and database tables.
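
The text does not fix a specific data-sharing formula, so the following is only one possible sketch: for each pair of modules, list the shared data items that both of them access. The function name, the module names, and the data item names are hypothetical.

```python
from itertools import combinations


def shared_data_pairs(access: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of modules to the shared data items both of them access."""
    pairs: dict[tuple[str, str], set[str]] = {}
    for first, second in combinations(sorted(access), 2):
        common = access[first] & access[second]
        if common:  # record only pairs that actually share something
            pairs[(first, second)] = common
    return pairs


# Hypothetical access map: module name -> shared data items it reads or writes.
ACCESS = {
    "A": {"config", "orders_table"},
    "B": {"config", "cache"},
    "C": {"config", "orders_table"},
}

pairs = shared_data_pairs(ACCESS)
for (first, second), items in pairs.items():
    print(first, second, sorted(items))
```

In this example every pair shares `config`, and modules A and C also share `orders_table`. Pairs with the largest shared sets are natural starting points when looking for unnecessary coupling.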