ROSE Compiler Framework/Inliner
The ROSE Inliner inlines functions at function callsites.
Background
[edit | edit source]An inline function is one for which the compiler copies the code from the function definition directly into the code of the calling function rather than creating a separate set of instructions in memory. Instead of transferring control to and from the function code segment, a modified copy of the function body may be substituted directly for the function call. In this way, the performance overhead of a function call is avoided. The inline specifier is only a suggestion to the compiler that an inline expansion can be performed; the compiler is free to ignore the suggestion.
User Instructions
[edit | edit source]You must enable EDG 5.0 to inline C++11 code
- --enable-edg_version=5.0
install the tool
[edit | edit source]configure and build ROSE as instructed at https://github.com/rose-compiler/rose/wiki
inlineEverything -c [options] input.c
Usage
[edit | edit source]This is a program transformation tool to inline function calls in your C/C++ or Fortran code.
Usage: inlineEverything -c [options] input.c
The optional options include:
- -skip-postprocessing: Skip postprocessing which cleanups code
- -process-headers: Process calls within header files
- -verbose: Printout debugging information
- -limit N: Inline up to N functions, then stop
- -main-only: Inline only functions reachable from main()
Source code
[edit | edit source]API and implementation:
// main API bool doInline(SgFunctionCallExp*, bool)
The tool's source code
- https://github.com/rose-compiler/rose/tree/develop/tests/nonsmoke/functional/roseTests/astInliningTests/inlineEverything.C
- command line processing is handled in this source file
Algorithm
[edit | edit source]216 // Main inliner code. Accepts a function call as a parameter, and inlines 217 // only that single function call. Returns true if it succeeded, and false 218 // otherwise. The function call must be to a named function, static member 219 // function, or non-virtual non-static member function, and the function 220 // must be known (not through a function pointer or member function 221 // pointer). Also, the body of the function must already be visible. 222 // Recursive procedures are handled properly (when allowRecursion is set), by 223 // inlining one copy of the procedure into itself. Any other restrictions on 224 // what can be inlined are bugs in the inliner code. 225 bool 226 doInline(SgFunctionCallExp* funcall, bool allowRecursion)
Major steps
[edit | edit source]- eligibility check: skip things which cannot be inlined
- if a function call is used as an expression operand: e.g. a = func1() + func2();
- generate a temp variable to obtain the returned value: e.g. temp = func1();
- replace the function call expression with the temp variable. e.g. a = temp + temp;
- a slight optimization: if the function call is the only expression operand: e.g. a= func1(). No temp variable is needed (a can be directly used without another temp variable serving as intermediate variable.)
- obtain the list of actual argument
- make a copy of the body of the function to be inlined
- rename labels in an inlined function definition. goto statements to them will be updated.
- in the function body
- create local variables , one per formal argument, initialize each with the actual argument
- build a paramMap: mapping formal arguments (SgInitializedName) to new local variables (SgVariableSymbol)
- this pointer is handled similarly: create a local variable , initialized with the caller's this pointer
- replace variable references in the body with actual arguments // ReplaceParameterUseVisitor(paramMap).traverse(funbody_copy, postorder);
- insert a label to indicate the end of the inlined function body // rose_inline_end__
Limitations of this algorithm is not very clean:
- It generates new local variables and a label.
Eligibility check
[edit | edit source]What can be inlined
- a named function,
- static member function, or
- qualified name does not start with "::std:: " // skip std:: functions
- non-virtual non-static member function, // skip virtual functions, static member functions cannot access this->data (non-static data). That is why we check non-static for this ptr case.
- the function must be known (not through a function pointer or member function pointer). // empty function reference expression
- the body of the function must already be visible in the current AST. // skip functions with empty body
postprocessing
[edit | edit source]inlineEverything.C has a clean up step to make the outlined code better
- cleanupInlinedCode(sageProject);
- remove unused labels: SageInterface::removeUnusedLabels(top);
- remove jumps to next statement: SageInterface::removeJumpsToNextStatement(top);
- simpleCopyAndConstantPropagation() // In code with declarations such as "int foo = bar", where foo and bar are not modified, replace "foo" with "bar" and remove the declaration
- remove null statements: RemoveNullStatementsVisitor().traverse(top, postorder);
- move declaration to first use: MoveDeclarationsToFirstUseVisitor().traverse(top, postorder); e.g. int x__2 =7; w= x_2 +3; ==> w=7+3;
- doSubexpressionExpansionSmart () // Replaces all uses of a variable by its initialing expression. Requires that initname has an assign initializer Replaces all uses of initname in initname's scope by copy of its initializer expression. Then removes initname.
- changeAllMembersToPublic(sageProject);
This step does lots of things and can easily trigger some bugs.
- Even wrose, the cleanup step operates on the entire AST, including C++ headers.
// Post-inline AST normalizations // DQ (6/12/2015): These functions first renames all variable (a bit heavy handed for my tastes) // and then (second) removes the blocks that are otherwise added to support the inlining. The removal // of the blocks is the motivation for renaming the variables, but the variable renaming is // done evarywhere instead of just where the functions are inlined. I think the addition of // the blocks is a better solution than the overly agressive renaming of variables in the whole // program. So the best solution is to comment out both of these functions. All test codes // pass (including the token-based unparsing tests). // renameVariables(sageProject); // flattenBlocks(sageProject); cleanupInlinedCode(sageProject); // In code with declarations such as "int foo = bar", where foo and bar are // not modified, replace "foo" with "bar" and remove the declaration void simpleCopyAndConstantPropagation(SgNode* top) { FindReferenceVariablesVisitor().traverse(top, preorder); FindCopiesVisitor().traverse(top, preorder); FindUsedDeclarationsVisitor vis; vis.traverse(top, preorder); RemoveUnusedDeclarationsVisitor(vis.used_decls, set<SgFunctionDeclaration*>()).traverse(top, postorder); }
Testing
[edit | edit source]Test directory with an example translator and test input files
By looking into Makefile.am, the example translator's source code will generate an executable named "inlineEverything" in your buildtree.
translator: inlineEverything
[edit | edit source]This is the tool you can try to inline your sample code.
- inlineEverything
The same Makefile.am's make check rules contain sample command lines to use the tool.
To test one individual input file (such as template_functions.C), type
- make inlineEverything_template_functions.C.passed // TODO: udpate to new ways to trigger the tests
Command line options
[edit | edit source]inlineEverything --help
---------------------Tool-Specific Help----------------------------------- This is a program transformation tool to inline function calls in your C/C++ or Fortran code. Usage: inlineEverything -c [options] input.c The optional options include: -skip-postprocessing: Skip postprocessing which cleanups code -process-headers: Process calls within header files -verbose: Printout debugging information
postprocessing
[edit | edit source]--------------input---------------- bash-4.2$ cat specimen25_1.C template<typename T> void swap(T& x, T& y) { T tmp = x; x = y; y = tmp; } int foo (int a, int b) { swap(a,b); } int main() { } ----- command line ------------- bash-4.2$ inlineEverything -c specimen26_1.C -----------output: with postprocessing (cleanup) -------------- bash-4.2$ cat rose_specimen25_1.C template < typename T > void swap ( T & x, T & y ) { T tmp = x; x = y; y = tmp; } int foo(int a,int b) { { int tmp = a; a = b; b = tmp; } } int main() { } // output without postprocessing: cleanup template < typename T > void swap ( T & x, T & y ) { T tmp = x; x = y; y = tmp; } int foo(int a,int b) { { int &x__2 = a; int &y__3 = b; int tmp = x__2; x__2 = y__3; y__3 = tmp; rose_inline_end__4: ; } } int main() { }
Correctness checking
[edit | edit source]make check only check if the number of inlined functions is the same as expected.
# Note: must use the name convention of specimenXX_N.C , in which N is the number of function calls inlined. # The specimens are named so that the number between the "_" and next "." is the number of function calls that # we expect this specimen to inline. inlineEverything_specimens = \ specimen01_1.C .. inlineEverything_test_targets = $(addprefix inlineEverything_, $(addsuffix .passed, $(inlineEverything_specimens))) TEST_TARGETS += $(inlineEverything_test_targets) $(inlineEverything_test_targets): inlineEverything_%.passed: % inlineEverything inlineEverything.conf @$(RTH_RUN) \ TITLE="inlineEverything $< [$@]" \ SPECIMEN="$(abspath $<)" \ NINLINE="$$(echo $(notdir $<) |sed --regexp-extended 's/specimen[0-9]+_([0-9]+).*/\1/')" \ TRANSLATOR="$$(pwd)/inlineEverything" \ $(srcdir)/inlineEverything.conf $@ cat inlineEverything.conf # Test configuration file (see "scripts/rth_run.pl --help" for details) # Tests the inliner # Run the tests in subdirectories for ease of cleanup. subdir = yes # Run the test and then make sure the output contains a certain string cmd = ${VALGRIND} ${TRANSLATOR} -rose:verbose 0 ${SPECIMEN} -o a.out |tee ${TEMP_FILE_0} cmd = grep "Test inlined ${NINLINE} function" ${TEMP_FILE_0} cmd = cat -n rose_* cmd = ./a.out # Extra stuff that might be useful to specify in the makefile title = ${TITLE} disabled = ${DISABLED} timeout = ${TIMEOUT}
examples
[edit | edit source]We use a set of increasingly more complex examples to explain the inlining algorithm used.
More example input and output files are available at:
naked call: no parameters in nor return output
[edit | edit source]extern int x; void incrementX() { x++; } int main() { incrementX(); return x; } //----------output, without postprocessing -------- extern int x; void incrementX() { x++; } int main() { // the function body is copied here { x++; rose_inline_end__2: // a label for the end of a function is generated. ; } return x; } //-----------output, with postprocessing for clean up // unused label and empty statement are removed extern int x; void incrementX() { x++; } int main() { // the function body is copied here { x++; } return x; }
function with a return
[edit | edit source]// a function with a return extern int x; int incrementX() { x++; return x; } int main() { incrementX(); return x; } //---------- output without postprocessing // a function with a return extern int x; int incrementX() { x++; return x; } int main() { { x++; { // a return statement is translated into a block, go to the exit point in the end x; goto rose_inline_end__2; } rose_inline_end__2: // a label for the end of the function: the exit point ; } return x; } //-------- with postprocessing, the code look the same // a function with a return extern int x; int incrementX() { x++; return x; } int main() { { x++; { goto rose_inline_end__2; } rose_inline_end__2: ; } return x; }
function calls as two expressions
[edit | edit source]// input code ----------------------- int foo(int i) { return 5+i; } int main(int, char**) { int w; w = foo(1)+ foo(2); return w; } //--------------after inlining----------------- // You can see that a temparory variable is used to capture the returned value of a function call. // Then the temp variable is used to replace the original function call expression int foo(int i) { return 5 + i; } int main(int ,char **) { int w; int rose_temp__4; { int i__2 = 1; { rose_temp__4 = 5 + i__2; goto rose_inline_end__3; } rose_inline_end__3: ; } int rose_temp__8; { int i__6 = 2; { rose_temp__8 = 5 + i__6; goto rose_inline_end__7; } rose_inline_end__7: ; } w = rose_temp__4 + rose_temp__8; return w; } //----- postprocessing does not simplify the code any further int foo(int i) { return 5 + i; } int main(int ,char **) { int rose_temp__4; { { rose_temp__4 = 5 + 1; goto rose_inline_end__3; } rose_inline_end__3: ; } int rose_temp__8; { { rose_temp__8 = 5 + 2; goto rose_inline_end__7; } rose_inline_end__7: ; } int w = rose_temp__4 + rose_temp__8; return w; }
function call as a single expression
[edit | edit source]An optimized transformation:
- not blindly generate a temp variable to capture the value of a function call.
- instead directly reuse the original declaration of the lhs variable inside the function body
int foo(int i) { return 5+i; } int main(int, char**) { int w; w = foo(1); return w; } //-------------after inlining ---------- int foo(int i) { return 5 + i; } int main(int ,char **) { int w; { int i__2 = 1; { w = 5 + i__2; goto rose_inline_end__3; } rose_inline_end__3: ; } return w; } //postprocessing does not simplify the code further. int foo(int i) { return 5 + i; } int main(int ,char **) { int w; { { w = 5 + 1; goto rose_inline_end__3; } rose_inline_end__3: ; } return w; }
3 operand operations
[edit | edit source]the code is normalized.
#include <stdlib.h> int foo() { exit (1); return 0; } int main(int, char**) { int w, x = 7; w = x == 8 ? foo() : 0; return w; } //----------- after inlining --------------- #include <stdlib.h> int foo() { exit(1); return 0; } int main(int ,char **) { int w; int x = 7; if (x == 8) { int rose_temp__4; { exit(1); { rose_temp__4 = 0; goto rose_inline_end__2; } rose_inline_end__2: ; } w = rose_temp__4; } else { w = 0; } return w; }
data member access function
[edit | edit source]#include <vector> typedef int Index_t ; struct Domain { public: // non-reference type Index_t numNode() { return m_numNode ; } void AllocateNodeElemIndexes() { Index_t numNode = this->numNode() ; } #if 0 // the best inline result should look like the following void AllocateNodeElemIndexes_inlined() { Index_t numNode = m_numNode; // call site 1 inlined } #endif private: Index_t m_numNode ; } domain; //----------------------------after inlining ---------------- #include <vector> typedef int Index_t; struct Domain { // non-reference type inline Index_t numNode() { return (this) -> m_numNode; } inline void AllocateNodeElemIndexes() { //x. split declaration + initializer into two parts // a temporary variable to transfer value of initializer Index_t rose_temp__3; //x. a new code block to embed the function body { struct Domain *this__1 = this__1; { rose_temp__3 = this__1 -> m_numNode; //x. goto the label after function call goto rose_inline_end__2; } //x. label after the function call rose_inline_end__2: ; } Index_t numNode = rose_temp__3; } Index_t m_numNode; } domain;
C++ template functions
[edit | edit source]--------------input---------------- bash-4.2$ cat specimen25_1.C template<typename T> void swap(T& x, T& y) { T tmp = x; x = y; y = tmp; } int foo (int a, int b) { swap(a,b); } int main() { } ----- command line ------------- bash-4.2$ inlineEverything -c -skip-postprocessing specimen26_1.C -----------output: with postprocessing (cleanup) -------------- // output without postprocessing: cleanup template < typename T > void swap ( T & x, T & y ) { T tmp = x; x = y; y = tmp; } int foo(int a,int b) { { int &x__2 = a; // local variables for each formal arguments, initialized with actual arguments int &y__3 = b; int tmp = x__2; // variable references are replace with the local variables x__2 = y__3; y__3 = tmp; rose_inline_end__4: // a label to indicate the end of the outlined function body. ; } } int main() { }
multiple-level function calls
[edit | edit source]//------------input int foo(int x) { return x + 3; } int bar(int y) { return foo(y) + foo(2); } int main(int, char**) { int w; w = bar(1); return 0; } //--------------- output, no postprocessing ------------ int foo (int x) { return x + 3; } int bar (int y) { int rose_temp__4; { int x__2 = y; { rose_temp__4 = x__2 + 3; goto rose_inline_end__3; } rose_inline_end__3: ; } int rose_temp__8; { int x__6 = 2; { rose_temp__8 = x__6 + 3; goto rose_inline_end__7; } rose_inline_end__7: ; } return rose_temp__4 + rose_temp__8; } int main (int, char **) { int w; { int y__10 = 1; int rose_temp__4; { int x__2 = y__10; { rose_temp__4 = x__2 + 3; goto rose_inline_end__3__1; } rose_inline_end__3__1: ; } int rose_temp__8; { int x__6 = 2; { rose_temp__8 = x__6 + 3; goto rose_inline_end__7__2; } rose_inline_end__7__2: ; } { w = rose_temp__4 + rose_temp__8; goto rose_inline_end__11; } rose_inline_end__11: ; } return 0; }
Tutorial
[edit | edit source]Official documentation about how to call the inlining API is:
- Chapter 36 "Calling the Inliner of ROSE" tutorial: http://rosecompiler.org/ROSE_Tutorial/ROSE-Tutorial.pdf
Troubleshooting
[edit | edit source]TEST inlineEverything ../../../../../../sourcetree/tests/nonsmoke/functional/roseTests/astInliningTests/template_functions.C [inlineEverything_template_functions.C.passed]
inlineEverything_template_functions.C [out]: Note: C++11 input files to ROSE are NOT supported using EDG 4.9 configuration with GNU compilers 4.9 and greater (configure ROSE using EDG 4.12)