Short description: Software maintainability index

Halstead complexity measures are software metrics introduced by Maurice Howard Halstead in 1977[1] as part of his treatise on establishing an empirical science of software development. Halstead made the observation that metrics of the software should reflect the implementation or expression of algorithms in different languages, but be independent of their execution on a specific platform. These metrics are therefore computed statically from the code.

Halstead's goal was to identify measurable properties of software, and the relations between them. This is similar to the identification of measurable properties of matter (like the volume, mass, and pressure of a gas) and the relationships between them (analogous to the gas equation). Thus his metrics are actually not just complexity metrics.

## Calculation

For a given problem, let:

• $\displaystyle{ \,\eta_1 }$ = the number of distinct operators
• $\displaystyle{ \,\eta_2 }$ = the number of distinct operands
• $\displaystyle{ \,N_1 }$ = the total number of operators
• $\displaystyle{ \,N_2 }$ = the total number of operands

From these numbers, several measures can be calculated:

• Program vocabulary: $\displaystyle{ \eta = \eta_1 + \eta_2 \, }$
• Program length: $\displaystyle{ N = N_1 + N_2 \, }$
• Calculated estimated program length: $\displaystyle{ \hat{N} = \eta_1 \log_2 \eta_1 + \eta_2 \log_2 \eta_2 }$
• Volume: $\displaystyle{ V = N \times \log_2 \eta }$
• Difficulty : $\displaystyle{ D = { \eta_1 \over 2 } \times { N_2 \over \eta_2 } }$
• Effort: $\displaystyle{ E = D \times V }$

The difficulty measure is related to the difficulty of the program to write or understand, e.g. when doing code review.

The effort measure translates into actual coding time using the following relation,

• Time required to program: $\displaystyle{ T = {E \over 18} }$ seconds

Halstead's delivered bugs (B) is an estimate for the number of errors in the implementation.

• Number of delivered bugs : $\displaystyle{ B = {E^{2 \over 3} \over 3000} }$ or, more recently, $\displaystyle{ B = {V \over 3000} }$ is accepted.[1]

## Example

Consider the following C program:

main()
{
int a, b, c, avg;
scanf("%d %d %d", &a, &b, &c);
avg = (a+b+c)/3;
printf("avg = %d", avg);
}

The distinct operators ($\displaystyle{ \,\eta_1 }$) are: main, (), {}, int, scanf, &, =, +, /, printf, ,, ;

The distinct operands ($\displaystyle{ \,\eta_2 }$) are: a, b, c, avg, "%d %d %d", 3, "avg = %d"

• $\displaystyle{ \eta_1 = 12 }$, $\displaystyle{ \eta_2 = 7 }$, $\displaystyle{ \eta = 19 }$
• $\displaystyle{ N_1 = 27 }$, $\displaystyle{ N_2 = 15 }$, $\displaystyle{ N = 42 }$
• Calculated Estimated Program Length: $\displaystyle{ \hat{N} = 12 \times log_2 12 + 7 \times log_2 7 = 62.67 }$
• Volume: $\displaystyle{ V = 42 \times log_2 19 = 178.4 }$
• Difficulty: $\displaystyle{ D = { 12 \over 2 } \times { 15 \over 7 } = 12.85 }$
• Effort: $\displaystyle{ E = 12.85 \times 178.4 = 2292.44 }$
• Time required to program: $\displaystyle{ T = { 2292.44 \over 18 } = 127.357 }$ seconds
• Number of delivered bugs: $\displaystyle{ B = { 2292.44 ^ { 2 \over 3 } \over 3000 } = 0.05 }$