Jump to content

ROSE Compiler Framework/Pointer Analysis

From Wikibooks, open books for an open world

Aliasing Relations

[edit | edit source]

An access path: the l-value of an expression that is constructed from variables, pointer dereference operators, and structure field selection operators.

  • For C: "*" deference, "." field selection, "->" dereference and field selection

Aliasing occurs when there exists more than one access path to a storage location. An access path is the l-value of an expression constructed from variables, pointer indirection operators, and structure field selection operators. For e.g., in C it includes ‘*’ dereference operator, ‘.’, ‘->’ operator etc. Consider the following statement:

 p = &r;

After this statement, *p and r refer to the same storage location and thus become aliases of each other, which can be expressed as the relation <*p, r>.

Two access paths are

  • a must-alias at a statement S if they refer to the same storage locations in all execution instances of S.
  • a may-aliases at S if they refer to the same storage location at some execution instance of S.

Trivial alias <x, x> holds for all access paths x, provided x does not result in a dereference of the null pointer.

  • Semantics: x is self aliasing to itself
  • if x and y are non-NULL pointers, <x,y> implies <*x, *y>

Compact representation

[edit | edit source]

A compact representation is used for each alias relation

Algorithm

[edit | edit source]

The current implementation implements Michael Hind, Michael Burke, Paul Carini, and Jong-Deok Choi. 1999. Interprocedural pointer alias analysis. ACM Trans. Program. Lang. Syst. 21, 4 (July 1999), 848-894. link.

Features

  • Intra procedural analysis: summarizes the alias information of each function
    • Flow-sensitive: control flow paths are considered
  • Inter-procedural analysis: considering call graphs to propagates alias information across the call graph to determine the “type” of the object
    • Context in-sensitive: call stack information is ignored
  • Complexity: computing pointer aliases is considered an NP-Hard problem; hence this algorithm uses approximation methods to compute aliases.
  • Uses compact representation graphs to represent alias information.

Testing

[edit | edit source]

make check rules:

  • [/path/build_rose/tests/roseTests/programAnalysisTests/generalDataFlowAnalysisTests]make verify-pointerAliasAnalysis

We use a primitive way to test the correctness.

  • Input test code: we add pragmas to indicate expected results
  • Test translator: compare lattices generated by the analysis with the pragma information.

Example input code

[edit | edit source]
// test_ptr2.C
int a;

void foo(int* &x)
{
    int b  = 10;  
    x = &b;
   // x has an alias set {b}
    #pragma rose x:Aliases:{b}{}b:Aliases:{}{}  

}
    
void main()
{   
    int *p;
    a = 20;
    foo(p);
   // p has an alias set {b} due to function call. 
    #pragma rose a:Aliases:{}{}p:Aliases:{b}{}  
}

No exact string match

[edit | edit source]

Exact string match is not used, due to the following reason:

The reason I did that is because unlike other analysis whose lattices contain exact information such as constantPropagation 

(Ex: [VarsExprsProductLattice: level=uninitialized
d: ConstantPropagationLattice:[level: constantValue, val = 9]
] ) my analysis contains temporary memory addresses(since it is a per variable lattice) which may be specific only to that run.

For e.g.,: [VarsExprsProductLattice: level=uninitialized
__expression_0x107b80200-SgIntVal: Aliases:{ }{}
__expression_0x107b80268-SgIntVal: Aliases:{ }{}
__expression_0x107b802d0-SgIntVal: Aliases:{ }{}
__expression_0x107b99a00-SgAssignOp: Aliases:{ }{}
__expression_0x107b99a70-SgAssignOp: Aliases:{ }{("p",p,0) ("a",a,-1)}
__expression_0x107b99ae0-SgAssignOp: Aliases:{ }{("p",p,0) ("b",b,-1)}
__expression_0x107b99b50-SgAssignOp: Aliases:{ }{}
__expression_0x107b99bc0-SgAssignOp: Aliases:{ }{("q",q,0) ("p",p,0)}
__expression_0x107bca800-SgAddressOfOp: Aliases:{ }{}
__expression_0x107bca868-SgAddressOfOp: Aliases:{ }{}
__expression_0x107bca8d0-SgAddressOfOp: Aliases:{ }{}
__expression_0x107c20a00-SgCastExp: Aliases:{ }{}
a: Aliases:{ }{}
b: Aliases:{ }{}
c: Aliases:{ }{}
p: Aliases:{ b }{}
q: Aliases:{ b }{}
x: Aliases:{ }{}
]

In order to verify the correctness of this output, all we need to check is the pointer aliases for the pointer variables -->
a: Aliases:{ }{}
b: Aliases:{ }{}
c: Aliases:{ }{}
p: Aliases:{ b }{}
q: Aliases:{ b }{}
x: Aliases:{ }{}

Since this is only the substring of the complete lattice, I used substr find rather than the exact string match.

I hope this clarifies your question. I will put a note in the code too.

Thanks
Akshatha

List

  • move to rose/src.
  • creating a temp variable for each new expression may not be necessary since GDF supports temporary expressions as objects.

References

[edit | edit source]

List of related papers

  • David Bacon and Peter Sweeney, “Fast Static Analysis of C++ virtual function calls”, OOPSLA'96
  • Michael Burke, Paul Carini, Jong-Deok Choi and Michael Hind, “Interprocedural Pointer Alias Analysis”, TOPLAS ‘99
  • Paul Carini Harini Srinivasan, “Flow-Sensitive Interprocedural Type Analysis for C++”, TechReport ’95
  • David J. Pearce , Paul H.J. Kelly , Chris Hankin, Efficient field-sensitive pointer analysis of C, ACM Transactions on Programming Languages and Systems (TOPLAS), v.30 n.1, p.4-es, November 2007