Illustration of intermediate code optimization process with LLVM


In practice, the code that programmers write does not always translate into optimal code. Anticipating this, compilers include an intermediate code optimization step that removes unnecessary parts of the code in order to use resources more efficiently while still guaranteeing the program's correctness. This article covers the basics of this process and provides some examples of how LLVM IR code is optimized with methods such as constant folding, function inlining, and loop unrolling.

Intermediate code optimization

Essentially, intermediate code optimization is a program transformation technique that tries to improve the intermediate code so that it consumes fewer resources (CPU, memory), allowing the resulting machine code to run faster.

Image from https://www.guru99.com/compiler-design-phases-of-compiler.html

After the source code has been analyzed by the earlier phases and intermediate code has been generated in the Intermediate Code Generator step shown in the chart above, the optimization step takes as input this platform-independent code, which nevertheless has a structure close to machine code (for example three-address code or LLVM IR) and is collectively called intermediate code, and improves it.

A good code optimization process should have the following characteristics:

  • The optimization must be correct: it must not change the meaning of the program in any way.
  • The optimization should increase the speed and performance of the program.
  • The optimization should not slow down the overall compilation process unreasonably, for example by spending too much time and resources optimizing a piece of code that yields little improvement in actual resource usage at runtime.

Several techniques are used to optimize intermediate code

Optimization techniques can be classified into two categories:

  • Machine-independent optimizations: this optimization phase tries to improve the intermediate code so that better target code is produced. The intermediate code transformed here does not refer to any CPU registers or absolute memory locations.
  • Machine-dependent optimizations: these are performed after the target code has been generated, when the code is transformed to fit the target machine architecture. They involve CPU registers and may use absolute memory references rather than relative ones. Machine-dependent optimizers also try to make the most of the memory hierarchy.

In this article, a few machine-independent methods are illustrated in detail:

Constant folding

Constant folding and constant propagation are related optimizations used by many modern compilers. Constant folding is the process of recognizing and evaluating expressions that are constant at compile time instead of computing them at runtime.

To visualize more clearly consider the following example:
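The original listing is not reproduced in this copy of the article, so here is a minimal C sketch consistent with the expressions discussed below. The variable name x comes from the text, while its value (14, chosen so that 28 / x + 2 evaluates to 4) and the use of main are assumptions.

    /* Minimal sketch: x is a compile-time constant, so the whole
       expression 28 / x + 2 can be folded to 4 at compile time. */
    int main(void) {
        int x = 14;          /* assumed value so that 28 / x + 2 == 4 */
        return 28 / x + 2;
    }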

In this case the expression 28 / x + 2 has a constant value, so we can generate the intermediate code from this source and then optimize it, for example with:
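The exact command is not preserved in this copy; a typical workflow (an assumption, and the available pass names vary between LLVM versions) is to emit unoptimized IR with clang and then run a few passes with opt:

    clang -S -emit-llvm -O0 example.c -o example.ll
    opt -S -passes='mem2reg,sccp,instcombine' example.ll -o example_opt.ll

Here mem2reg promotes stack slots to SSA values so that sccp and instcombine can propagate and fold the constants.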

The resulting optimized code then looks like this:
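Below is a hand-simplified sketch of what the optimized IR looks like; the real output from clang and opt carries extra attributes and metadata.

    ; the whole expression has been folded to the constant 4
    define i32 @main() {
    entry:
      ret i32 4
    }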

As can be seen above, the variable x from the source code is not created at all, and the expression 28 / x + 2 has been replaced by the value 4.

Dead code elimination

Dead code elimination is an optimization method that removes what is commonly referred to as dead code: code that can never be executed (unreachable code) and code that only affects dead variables (variables that are written to but never read again), i.e. code that is irrelevant to the program's result.

Removing parts of the code that do not affect the program's result reduces the size of the program and lets it avoid doing irrelevant work, which shortens execution time and saves system resources.

Looking at the example sketched below, we can see that the variable b is not used after it is assigned, and the lines of code after the value of c is returned are never executed. They are dead code as described above and will be removed during optimization.
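Neither the original listing nor the exact command survives in this copy, so the following is a sketch under the assumptions that b is a dead store, c holds the returned value, and the statements after the first return are unreachable; the function name and the arithmetic are made up for illustration.

    /* Sketch of a function containing dead code */
    int compute(int a) {
        int b = a * 2;   /* dead store: b is never read again */
        int c = a + 1;
        return c;
        c = c * 10;      /* unreachable: placed after the return */
        return c;
    }

An assumed opt invocation that removes the dead code (pass names depend on the LLVM version):

    clang -S -emit-llvm -O0 dead.c -o dead.ll
    opt -S -passes='mem2reg,dce,simplifycfg' dead.ll -o dead_opt.ll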

We obtained the new LLVM IR code as follows:
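Below is a hand-simplified sketch of the resulting IR; again, real output contains additional attributes.

    ; only the computation of c remains; b and the unreachable
    ; code after the first return have been removed
    define i32 @compute(i32 %a) {
    entry:
      %c = add i32 %a, 1
      ret i32 %c
    }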

You can see that the above IR code just calculates the value of c and returns it. The above dead code sections have been removed.

Canonicalize induction variables

Canonicalize induction variables is a method that analyzes and converts induction variables (and the computations derived from them) into simpler forms suitable for further analysis and transformation. Consider the following example:
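The original loop is not reproduced here; the sketch below only takes the exit condition i * i < 4 from the text, while the loop body and the function name are assumptions made for illustration.

    /* Sketch: the exit test i * i < 4 is what indvars will rewrite */
    int loop_sum(void) {
        int sum = 0;
        for (int i = 0; i * i < 4; i++)
            sum += i;
        return sum;
    }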

It can be seen that the condition i * i < 4 can be replaced with the simpler condition i < 2, which is exactly what induction variable canonicalization does when we run opt with the -indvars pass. The result looks like the following:
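Below is a hand-simplified sketch of what the loop might look like after running indvars on the IR of the sketch above (for example opt -S -indvars with the legacy pass manager, or opt -S -passes='mem2reg,indvars' on newer LLVM releases); exact value names and structure will differ.

    ; the multiplication i * i is gone from the exit test; the loop now
    ; compares the canonical induction variable with the trip count
    define i32 @loop_sum() {
    entry:
      br label %loop

    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %sum = phi i32 [ 0, %entry ], [ %sum.next, %loop ]
      %sum.next = add i32 %sum, %i
      %i.next = add nuw nsw i32 %i, 1
      %exitcond = icmp ne i32 %i.next, 2
      br i1 %exitcond, label %loop, label %exit

    exit:
      ret i32 %sum.next
    }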

Can't read LLVM IR as fluently as assembly? Me neither. Still, looking at the code above we can guess that %exitcond defines the stop condition of something. Combined with the fact that icmp performs an integer comparison and ne tests whether two values are not equal, we can guess that this is the condition i < 2 being used as the loop's stop condition. So indeed, the comparison i * i < 4 has been replaced with a simpler one, which benefits later optimization.

Summary

This article has covered the basics of the intermediate code optimization process and illustrated some of the simple methods used in it. The three methods outlined above are basic ones, and applying any of them in isolation yields only local improvements; it is difficult to significantly improve the performance of an entire program this way. Therefore, these methods and countless others are usually combined, for example by running copy propagation before dead code elimination, and repeated many times until overall optimal code is obtained. Thank you for taking the time to read this article.


Source: Viblo