Peephole optimization


Peephole optimization is an optimization technique performed on a small set of compiler-generated instructions; the small set is known as the peephole or window.^[1]

Peephole optimization replaces the instructions in the window with an equivalent sequence that performs better, for example one that is shorter or executes faster.

For example:

instead of pushing register A onto the stack and then immediately popping the value back into register A, a peephole optimization would remove both instructions;
instead of adding A to A, a peephole optimization might do an arithmetic shift left;
instead of multiplying a floating point register by 8, a peephole optimization might add 3 to the floating point register's exponent; and
instead of multiplying an index by 4, adding the result to a base address to get a pointer value, and then dereferencing the pointer, a peephole optimization might use a hardware addressing mode that accomplishes the same result with one instruction.
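The two arithmetic rewrites in the list above can be checked with a small sketch. It assumes Python integers and floats stand in for register contents; the function names are hypothetical. Real hardware adds constraints this sketch ignores, such as fixed register widths and special floating-point values (zero, infinities, NaN) whose exponent field cannot simply be incremented.

```python
import math

def double_by_shift(a: int) -> int:
    # "A + A" rewritten as an arithmetic shift left by one bit.
    return a << 1

def times_eight_by_exponent(x: float) -> float:
    # "x * 8" rewritten as adding 3 to the exponent:
    # math.ldexp(x, 3) computes x * 2**3.
    return math.ldexp(x, 3)

# Both rewrites agree with the original operation on these samples.
for a in (-7, 0, 1, 12345):
    assert double_by_shift(a) == a + a

for x in (-2.5, 0.0, 0.1, 3.0):
    assert times_eight_by_exponent(x) == x * 8.0
```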

The term peephole optimization was introduced by William Marshall McKeeman in 1965.^[2]

Replacement rules

Common techniques applied in peephole optimization:^[3]

Null sequences – Delete useless operations.
Combine operations – Replace several operations with one equivalent.
Algebraic laws – Use algebraic laws to simplify or reorder instructions.
Special case instructions – Use instructions designed for special operand cases.
Address mode operations – Use address modes to simplify code.

This list is not exhaustive; other kinds of peephole optimizations exist.
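As a rough illustration, four of the categories above can be written as hand-coded rules over a toy instruction set. Everything here is an assumption made for the sketch: the tuple encoding, the mnemonics (mov, addi, muli, li, inc), and the premise that these instructions have no side effects such as setting flags. An address-mode rule is left out because it needs to know whether the index register is still used afterwards.

```python
def rewrite(window):
    """Return (instructions_consumed, replacement), or None if no rule matches."""
    a = window[0]
    b = window[1] if len(window) > 1 else None
    # Null sequence: copying a register onto itself does nothing.
    if a[0] == "mov" and a[1] == a[2]:
        return 1, []
    # Combine operations: two immediate adds to one register become one.
    if b and a[0] == b[0] == "addi" and a[1] == b[1]:
        return 2, [("addi", a[1], a[2] + b[2])]
    # Algebraic law: r * 0 == 0 for integers, so load the constant directly.
    if a[0] == "muli" and a[2] == 0:
        return 1, [("li", a[1], 0)]
    # Special case instruction: adding 1 becomes an increment.
    if a[0] == "addi" and a[2] == 1:
        return 1, [("inc", a[1])]
    return None

def peephole(code):
    """Slide a two-instruction window over the code until nothing changes."""
    code = list(code)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(code):
            hit = rewrite(code[i:i + 2])
            if hit:
                consumed, replacement = hit
                code[i:i + consumed] = replacement
                changed = True  # re-examine the same position next
            else:
                i += 1
    return code

program = [("mov", "r1", "r1"), ("addi", "r0", 3),
           ("addi", "r0", -2), ("muli", "r2", 0)]
print(peephole(program))  # [('inc', 'r0'), ('li', 'r2', 0)]
```

Note how the rules feed each other: combining the two adds yields an add of 1, which the special-case rule then turns into an increment.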

Examples

Replacing slow instructions with faster ones

The following Java bytecode

...
iload 1
iload 1
imul
...

can be replaced by

...
iload 1
dup
imul
...

This kind of optimization, like most peephole optimizations, makes certain assumptions about the efficiency of instructions. In this case, it assumes that the dup operation (which duplicates the value on top of the stack and pushes the copy) is more efficient than the iload X operation (which loads the integer local variable identified as X and pushes it on the stack).
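A minimal sketch of this rewrite, assuming instructions are plain strings and the code is straight-line (no branch target between the two loads). The function name and the set of mnemonics are choices made for this sketch. It is limited to loads of single-slot values, since JVM long and double values occupy two stack slots and would need dup2 rather than dup.

```python
# Loads that push a single-slot value, so "dup" can copy the result.
SINGLE_SLOT_LOADS = {"aload", "iload", "fload"}

def use_dup(code):
    """Replace a load that repeats the immediately preceding load with 'dup'."""
    out = []
    prev_load = None  # the load whose value is currently on top of the stack
    for ins in code:
        if ins == prev_load:
            # The same value is already on top of the stack: copy it.
            out.append("dup")
        else:
            out.append(ins)
            is_load = ins.split()[0] in SINGLE_SLOT_LOADS
            prev_load = ins if is_load else None
    return out

print(use_dup(["iload 1", "iload 1", "imul"]))  # ['iload 1', 'dup', 'imul']
```

Whether the rewritten sequence is actually faster depends on the virtual machine, which is the efficiency assumption described above.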

Removing redundant code

Another example is eliminating a redundant load that immediately follows a store of the same value.

 a = b + c;
 d = a + e;

is straightforwardly implemented as

MOV b, R0  ; Copy b to the register
ADD c, R0  ; Add  c to the register, the register is now b+c
MOV R0, a  ; Copy the register to a
MOV a, R0  ; Copy a to the register
ADD e, R0  ; Add  e to the register, the register is now a+e [(b+c)+e]
MOV R0, d  ; Copy the register to d

but, because MOV a, R0 reloads a value that the register already holds, can be optimized to

MOV b, R0  ; Copy b to the register
ADD c, R0  ; Add c to the register, which is now b+c (a)
MOV R0, a  ; Copy the register to a
ADD e, R0  ; Add e to the register, which is now b+c+e [(a)+e]
MOV R0, d  ; Copy the register to d
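The rule behind this rewrite can be sketched as follows, with instructions encoded as (opcode, source, destination) tuples to mirror the listing. The encoding and the function name are assumptions of this sketch, and it presumes that no label falls between the two instructions and that the memory location is not volatile.

```python
def drop_copy_back(code):
    """Remove a MOV that copies a value straight back to where it came from."""
    out = []
    for ins in code:
        if out:
            prev = out[-1]
            # prev copied x to y; ins copies y back to x, which still holds
            # that same value, so ins has no effect.
            if (ins[0] == prev[0] == "MOV"
                    and ins[1] == prev[2] and ins[2] == prev[1]):
                continue
        out.append(ins)
    return out

naive = [
    ("MOV", "b", "R0"), ("ADD", "c", "R0"), ("MOV", "R0", "a"),
    ("MOV", "a", "R0"), ("ADD", "e", "R0"), ("MOV", "R0", "d"),
]
for ins in drop_copy_back(naive):
    print(ins)  # five instructions; the reload of a is gone
```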

Removing redundant stack instructions

If the compiler saves registers on the stack before calling a subroutine and restores them when returning, consecutive calls to subroutines may have redundant stack instructions.

Suppose the compiler generates the following Z80 instructions for each procedure call:

PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR
POP HL
POP DE
POP BC
POP AF

If there were two consecutive subroutine calls, they would look like this:

PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF

A POP of a register followed immediately by a PUSH of the same register is generally redundant: removing the pair leaves the stack unchanged, although the register then keeps whatever value the first subroutine left in it. Where the pair is redundant, a peephole optimization removes both instructions. In the example, removing POP AF and PUSH AF brings POP BC and PUSH BC together in the peephole, and each removal exposes the next pair in turn. Assuming that subroutine _ADDR2 does not depend on the registers having been restored to their previous values, removing all of the redundant code in the example above eventually leaves the following code:

PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
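The cascading removal can be sketched with an output list that is treated as a stack: whenever the next instruction is a PUSH of the register that was just popped, both are dropped, which automatically exposes the pair behind them. The function name and the string encoding are assumptions of this sketch. It does not check whether the second subroutine relies on the restored register values, so it is only valid under the assumption stated above.

```python
def cancel_pop_push(code):
    """Drop every 'POP r' that is immediately followed by 'PUSH r'."""
    out = []
    for ins in code:
        op, _, reg = ins.partition(" ")
        if op == "PUSH" and out and out[-1] == f"POP {reg}":
            out.pop()  # remove the POP and skip the PUSH
        else:
            out.append(ins)
    return out

two_calls = [
    "PUSH AF", "PUSH BC", "PUSH DE", "PUSH HL", "CALL _ADDR1",
    "POP HL", "POP DE", "POP BC", "POP AF",
    "PUSH AF", "PUSH BC", "PUSH DE", "PUSH HL", "CALL _ADDR2",
    "POP HL", "POP DE", "POP BC", "POP AF",
]
print("\n".join(cancel_pop_push(two_calls)))
```

Run on the two-call listing, this prints the ten-instruction sequence shown above.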

Implementation

Modern compilers often implement peephole optimizations with a pattern matching algorithm.^[4]
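One way such a pattern matcher might look is sketched below. It is not taken from any particular compiler: the tuple encoding, the "?name" convention for pattern variables, and the two rules in the table are all assumptions chosen to mirror the examples from the introduction.

```python
def match(pattern, window):
    """Bind pattern variables ('?x') to operands, or return None on mismatch."""
    env = {}
    for pat, ins in zip(pattern, window):
        if len(pat) != len(ins):
            return None
        for p, v in zip(pat, ins):
            if isinstance(p, str) and p.startswith("?"):
                # A variable must bind to the same operand everywhere.
                if env.setdefault(p, v) != v:
                    return None
            elif p != v:
                return None
    return env

def substitute(template, env):
    """Fill pattern variables in the replacement template."""
    return [tuple(env.get(t, t) for t in ins) for ins in template]

RULES = [
    # push r; pop r  ->  (nothing)
    ([("push", "?r"), ("pop", "?r")], []),
    # add r, r  ->  shl r, 1
    ([("add", "?r", "?r")], [("shl", "?r", 1)]),
]

def optimize(code):
    code = list(code)
    i = 0
    while i < len(code):
        for pattern, template in RULES:
            window = code[i:i + len(pattern)]
            env = match(pattern, window) if len(window) == len(pattern) else None
            if env is not None:
                code[i:i + len(pattern)] = substitute(template, env)
                # Back up one step: the rewrite may expose a new match.
                # One step suffices because no rule spans more than two instructions.
                i = max(i - 1, 0)
                break
        else:
            i += 1
    return code

code = [("push", "a"), ("push", "b"), ("pop", "b"), ("pop", "a"), ("add", "x", "x")]
print(optimize(code))  # [('shl', 'x', 1)]
```

Keeping the rules in a table separates what to rewrite from how the window is scanned, so new rules can be added without touching the driver.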

References

1. Muchnick, Steven S. Advanced Compiler Design and Implementation. Academic Press / Morgan Kaufmann. 1997-08-15. ISBN 978-1-55860-320-2. https://books.google.com/books?id=Pq7pHwG1_OkC&q=peephole.
2. McKeeman, William Marshall. "Peephole optimization". Communications of the ACM 8 (7): 443–444. July 1965. doi:10.1145/364995.365000.
3. Fischer, Charles N.; Cytron, Ron K.; LeBlanc, Richard J. Crafting a Compiler. Addison-Wesley. 2010. ISBN 978-0-13-606705-4. http://bank.engzenon.com/download/560e7301-482c-43fd-9f80-16a9c0feb99b/Crafting_a_Compiler_by_Fischer_Cytron_and_LeBlanc.pdf. Retrieved 2018-07-02.
4. Aho, Alfred V.; Lam, Monica S.; Sethi, Ravi; Ullman, Jeffrey D. "Chapter 8.9.2 Code Generation by Tiling an Input Tree". Compilers – Principles, Techniques, & Tools (2 ed.). Pearson Education. 2007. p. 540. http://www.informatik.uni-bremen.de/agbkb/lehre/ccfl/Material/ALSUdragonbook.pdf. Retrieved 2018-07-02.

Original source: https://en.wikipedia.org/wiki/Peephole optimization.


