Constant folding

From HandWiki
Short description: Type of compiler optimization

Constant folding and constant propagation are related compiler optimizations used by many modern compilers.[1] An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.

Constant folding

Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:

i = 320 * 200 * 32;

Most compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000).
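The folding step described above can be sketched in Python over a small expression tree. This is a minimal illustration, not how any particular compiler is implemented: the tuple encoding `(op, left, right)`, the `OPS` table, and the function name `fold` are all assumptions made for this example.

```python
import operator

# Hypothetical mini-AST: an int literal, a variable name (str),
# or a tuple (op, left, right).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Return expr with every all-literal subexpression evaluated."""
    if not isinstance(expr, tuple):
        return expr  # literal or variable: nothing to fold
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both operands known: evaluate now
    return (op, left, right)         # otherwise keep the operation

print(fold(("*", ("*", 320, 200), 32)))  # 2048000
print(fold(("+", "n", ("*", 2, 3))))     # ('+', 'n', 6)
```

The second call shows that folding is partial: the subexpression `2 * 3` is evaluated even though `n` remains unknown.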

Constant folding can make use of arithmetic identities. If x is numeric, the value of 0 * x is zero even if the compiler does not know the value of x. This identity is not valid for IEEE floating-point values, however, since x could be Infinity or NaN, for which 0 * x is NaN. Still, some environments that favor performance, such as GLSL shaders, allow this for constants, which can occasionally cause bugs.
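The floating-point caveat can be checked directly. The short Python sketch below (the helper name `times_zero` is an assumption for this example) computes 0 * x with IEEE 754 arithmetic rather than folding it to zero. It also shows one further case: for a negative x the product is negative zero, which compares equal to zero but carries a different sign bit.

```python
import math

def times_zero(x: float) -> float:
    """Compute 0 * x with IEEE 754 arithmetic instead of folding to 0."""
    return 0.0 * x

print(times_zero(5.0))       # 0.0
print(times_zero(math.inf))  # nan
print(times_zero(math.nan))  # nan
# The product for a negative operand is -0.0, so its sign is negative.
print(math.copysign(1.0, times_zero(-5.0)))  # -1.0
```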

Constant folding may apply to more than just numbers. Concatenation of string literals and constant strings can be constant folded. Code such as "abc" + "def" may be replaced with "abcdef".

Constant folding and cross compilation

In implementing a cross compiler, care must be taken to ensure that the behaviour of the arithmetic operations on the host architecture matches that on the target architecture, as otherwise enabling constant folding will change the behaviour of the program. This is of particular importance in the case of floating point operations, whose precise implementation may vary widely.
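As an illustration of how host and target arithmetic can disagree, the following Python sketch folds the same addition twice: once in binary64 arithmetic, standing in for the host, and once rounded to binary32, standing in for a hypothetical target whose floating-point type is single precision. The helper names and the operands are assumptions chosen for this example.

```python
import struct

def to_float32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def fold_add_host(a: float, b: float) -> float:
    """Fold a + b using the host's binary64 arithmetic."""
    return a + b

def fold_add_target(a: float, b: float) -> float:
    """Fold a + b as a binary32 target would compute it."""
    return to_float32(to_float32(a) + to_float32(b))

host = fold_add_host(1e8, 1.0)
target = fold_add_target(1e8, 1.0)
print(host, target)  # 100000001.0 100000000.0
```

Near 1e8 adjacent binary32 values are 8 apart, so the added 1.0 is lost on the simulated target. A cross compiler that folded this expression with host arithmetic would embed a constant the target program could never have computed.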

Constant propagation

Constant propagation is the process of substituting the values of known constants in expressions at compile time. Such constants include those defined above, as well as intrinsic functions applied to constant values. Consider the following pseudocode:

int x = 14;
int y = 7 - x / 2;
return y * (28 / x + 2);

Propagating x yields:

int x = 14;
int y = 7 - 14 / 2;
return y * (28 / 14 + 2);

Continuing to propagate and fold yields the following, which would likely be further optimized by dead-code elimination of both x and y:

int x = 14;
int y = 0;
return 0;

Constant propagation is implemented in compilers using the results of reaching-definition analysis. If all of the definitions of a variable that reach a given use assign the same constant to it, then the variable has a constant value at that use and can be replaced with the constant.
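The test applied to a set of reaching definitions can be sketched as follows. The example assumes the reaching-definition analysis has already run and simply hands over the right-hand sides of the definitions that reach one use; the function name and the encoding (an int for a literal, anything else for a non-constant expression) are hypothetical.

```python
def constant_from_reaching_defs(reaching_defs):
    """Return the constant a variable holds at a use, or None.

    reaching_defs lists the right-hand sides of the definitions that
    reach the use. An int models a literal; any other object models an
    expression whose value is not known at compile time.
    """
    if not reaching_defs:
        return None  # no definition reaches the use
    values = set()
    for rhs in reaching_defs:
        if not isinstance(rhs, int):
            return None  # at least one definition is not a constant
        values.add(rhs)
    # Replaceable only if every definition assigns the same constant.
    return values.pop() if len(values) == 1 else None

print(constant_from_reaching_defs([14]))          # 14
print(constant_from_reaching_defs([5, 5]))        # 5
print(constant_from_reaching_defs([12, 2]))       # None
print(constant_from_reaching_defs([3, "f(n)"]))   # None
```

The third call models a use reached by two different constant assignments, for example one before and one inside a conditional: neither value can be substituted safely.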

Constant propagation can also cause conditional branches to simplify to one or more unconditional statements, when the conditional expression can be evaluated to true or false at compile time to determine the only possible outcome.
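A sketch of this branch simplification, assuming a hypothetical statement encoding `("if", cond, then_body, else_body)` in which `cond` has already been folded to `True` or `False` where possible and is left as a string otherwise:

```python
def simplify_if(stmt):
    """Return the statements that should replace an if statement.

    A condition folded to True keeps only the then-branch, a condition
    folded to False keeps only the else-branch, and an undecided
    condition leaves the statement unchanged.
    """
    _, cond, then_body, else_body = stmt
    if cond is True:
        return then_body
    if cond is False:
        return else_body
    return [stmt]

print(simplify_if(("if", True, ["c = 2"], [])))          # ['c = 2']
print(simplify_if(("if", False, ["c = 2"], ["c = 0"])))  # ['c = 0']
print(simplify_if(("if", "c > 10", ["c = 2"], [])))
# [('if', 'c > 10', ['c = 2'], [])]
```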

The optimizations in action

Constant folding and propagation are typically used together to achieve many simplifications and reductions, by interleaving them iteratively until no more changes occur. Consider this unoptimized pseudocode, whose return value is not apparent without further analysis:

int a = 30;
int b = 9 - (a / 5);
int c;

c = b * 4;
if (c > 10) {
    c = c - 10;
}
return c * (60 / a);

Applying constant propagation once, followed by constant folding, yields:

int a = 30;
int b = 3;
int c;

c = b * 4;
if (c > 10) {
    c = c - 10;
}
return c * 2;

Repeating both steps twice results in:

int a = 30;
int b = 3;
int c;

c = 12;
if (true) {
    c = 2;
}
return c * 2;

As a and b have been simplified to constants and their values substituted everywhere they occurred, the compiler now applies dead-code elimination to discard them, reducing the code further:

int c;
c = 12;

if (true) {
    c = 2;
}
return c * 2;

In the code above, true could instead be 1 or any other Boolean construct, depending on the compiler framework. Traditional constant propagation achieves only this much, because it cannot change the structure of the program.
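The alternation of propagation and folding until nothing changes can be sketched as a small driver loop. The example below is restricted to straight-line assignments, so it covers only the branch-free part of the running example. The statement encoding `(name, expr)`, the variable `d` standing in for `c - 10`, and all function names are assumptions made for illustration.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}

def fold(expr):
    """Evaluate subexpressions whose operands are all integer literals."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = fold(left), fold(right)
    if (isinstance(left, int) and isinstance(right, int)
            and not (op == "/" and right == 0)):
        return OPS[op](left, right)
    return (op, left, right)

def substitute(expr, consts):
    """Replace variables that map to a known literal; no evaluation."""
    if isinstance(expr, str):
        return consts.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, consts), substitute(right, consts))
    return expr

def propagate_pass(stmts):
    """One propagation pass over straight-line (name, expr) assignments."""
    consts, out = {}, []
    for name, expr in stmts:
        expr = substitute(expr, consts)
        out.append((name, expr))
        if isinstance(expr, int):
            consts[name] = expr     # name now holds a known literal
        else:
            consts.pop(name, None)  # name redefined as a non-constant
    return out

def fold_pass(stmts):
    return [(name, fold(expr)) for name, expr in stmts]

def optimize(stmts):
    """Alternate both passes until a full round changes nothing."""
    rounds = 0
    while True:
        new = fold_pass(propagate_pass(stmts))
        if new == stmts:
            return stmts, rounds
        stmts, rounds = new, rounds + 1

program = [
    ("a", 30),
    ("b", ("-", 9, ("/", "a", 5))),
    ("c", ("*", "b", 4)),
    ("d", ("-", "c", 10)),
]
print(optimize(program))
# ([('a', 30), ('b', 3), ('c', 12), ('d', 2)], 3)
```

Three rounds are needed because each round can only propagate variables whose definitions were already reduced to literals by the previous one. Integer division here uses Python's floor division, which agrees with C-style division only for non-negative operands.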

A related optimization, sparse conditional constant propagation, selects the appropriate branch on the basis of the if condition.[2] The compiler can now detect that the if statement always evaluates to true, so c itself can be eliminated, shrinking the code even further:

return 4;

If this pseudocode constituted the body of a function, the compiler could further take advantage of the knowledge that it evaluates to the constant integer 4 to eliminate unnecessary calls to the function, producing further performance gains.
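The effect of taking branch conditions into account can be sketched with a simplified, branch-aware analysis over structured code. This is not the worklist algorithm of the cited paper: it is a recursive walk over a hypothetical statement encoding (`"assign"`, `"if"` without an else branch, and a top-level `"return"`), written only to show why following a decided branch lets the running example collapse to a single constant.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul,
       "/": operator.floordiv, ">": operator.gt}

UNKNOWN = object()  # marker for "not a compile-time constant"

def evaluate(expr, env):
    """Return the constant value of expr under env, or UNKNOWN."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):  # variable reference
        return env.get(expr, UNKNOWN)
    op, left, right = expr
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    if lhs is UNKNOWN or rhs is UNKNOWN or (op == "/" and rhs == 0):
        return UNKNOWN
    return OPS[op](lhs, rhs)

def merge(env_a, env_b):
    """Keep only the facts that hold on both incoming paths."""
    return {name: value for name, value in env_a.items()
            if name in env_b and env_b[name] == value}

def analyze(stmts, env):
    """Return (env after stmts, value of a top-level return or None).

    Simplification: a return nested inside an if body is ignored.
    """
    for stmt in stmts:
        if stmt[0] == "assign":
            _, name, expr = stmt
            env = {**env, name: evaluate(expr, env)}
        elif stmt[0] == "if":
            _, cond, body = stmt
            decided = evaluate(cond, env)
            if decided is UNKNOWN:
                # Either path may run: analyze the body, then merge.
                body_env, _ = analyze(body, env)
                env = merge(env, body_env)
            elif decided:
                # Condition is always true: only the body path exists.
                env, _ = analyze(body, env)
            # Condition always false: the body is dead and is skipped.
        elif stmt[0] == "return":
            return env, evaluate(stmt[1], env)
    return env, None

program = [
    ("assign", "a", 30),
    ("assign", "b", ("-", 9, ("/", "a", 5))),
    ("assign", "c", ("*", "b", 4)),
    ("if", (">", "c", 10), [("assign", "c", ("-", "c", 10))]),
    ("return", ("*", "c", ("/", 60, "a"))),
]
_, result = analyze(program, {})
print(result)  # 4
```

Because the condition `c > 10` is decided during the walk, the value of `c` after the if statement is the single constant 2 rather than a merge of 12 and 2, which is what allows the return expression to be evaluated.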


References

  1. Muchnick, Steven (15 August 1997). Advanced Compiler Design and Implementation. Morgan Kaufmann. ISBN 978-1-55860-320-2. https://archive.org/details/advancedcompiler00much.
  2. Wegman, Mark N; Zadeck, F. Kenneth (April 1991), "Constant Propagation with Conditional Branches", ACM Transactions on Programming Languages and Systems 13 (2): 181–210, doi:10.1145/103135.103136 
