Flow sensitive wpa misses alias of global pointer? #1449

Open
Qcloud1223 opened this issue Apr 29, 2024 · 7 comments

Comments

@Qcloud1223
Contributor

Hi,

I'm having unexpected results when analyzing a complicated program, and the problem boils down to a seemingly simple issue. Consider this very simple program:

int *a;

int main()
{
    int b = 0;
    a = &b;

    return 0;
}

And the (related part of) PAG is also straightforward:
image

Andersen's analysis shows *a is an alias of b:

$ wpa -stat=false -ander -print-aliases global-ptr.bc
...
MayAlias var5[a (base object)@] -- var12[@main]

while the flow-sensitive analysis says no:

$ wpa -stat=false -fspta -print-aliases global-ptr.bc
...
NoAlias var5[a (base object)@] -- var12[@main]

I'm having a hard time understanding this. Even though this is a whole-program analysis, a = &b will eventually be executed, so the flow-sensitive analysis should not overlook it. I believe such behavior could only happen when feeding DDA a context prior to a = &b. Is this indeed unexpected behavior, or am I just mixing concepts up?

Plus, is there a simple way to find all aliases of a value? SVF currently has a nice, clean interface for getting the pts and revPts of a value, but the only interface I can find for alias checking is alias(node1, node2), so I have to traverse all PAG nodes to find all aliases.

@yuleisui
Collaborator

It works for me with both analyses (-fspta and -ander). You could try the code below:

extern void MAYALIAS(void*,void*);
int *a;

int main()
{
    int b = 0;
    a = &b;

    MAYALIAS(a,&b);
    return 0;
}

clang -S -c -emit-llvm ex.c -o ex.ll
wpa -fspta ex.ll

[FlowSensitive] Checking MAYALIAS
	 SUCCESS :MAYALIAS check <id:18, id:12> at ()

@Qcloud1223
Contributor Author

Thanks for your reply! I'm able to reproduce this, and the resulting PAG is here:

image

WPA says:

[FlowSensitive] Checking _Z8MAYALIASPvS_
         SUCCESS :_Z8MAYALIASPvS_ check <id:19, id:20> at ()

I can see that nodes 19 and 20 are created as aliases of a and &b, and SVF says they are aliases.

I'm wondering why SVF needs this to work. Also, can I analyze this program without modifying its source code?

@Qcloud1223
Contributor Author

Here is another finding: using -ander makes pts{5} = {13}, even though node 5 is not a ValVar. Using -fspta gives an empty pts for node 5.

FYI, I'm interested in which object a points to, and I came up with two possible ways:

  1. Check the PTS of node 5. But FlowSensitive generates an empty PTS.
  2. Check the aliases of node 5. But FlowSensitive does not show any alias.

Even when I add a MAYALIAS query (any other function call would work too), I still have to traverse the PAG to find the new nodes created for the call arguments (nodes 19 and 20 in the example above) before I can check that they are aliases. And there is still no easy way to know that I should run alias(19, 20)...

@yuleisui
Collaborator

Here is another finding: using -ander makes pts{5} = {13}, even though node 5 is not a ValVar. Using -fspta gives an empty pts for node 5.

If node 5 is a top-level pointer, it is fine to query its points-to set using pts(5), but if it is an address-taken object, you should query it with a location id: pts(5, loc).

@yuleisui
Collaborator

I would suggest a simple way of always querying top-level pointers but not address-taken objects. You could do that when an object is loaded to a pointer so you could query that pointer. In fact, only top-level pointers/registers are used for aliases and queries in real code.

@Qcloud1223
Contributor Author

If node 5 is a top-level pointer, it is fine to query its points-to using pts(5)

Here node 5 is in the PAG above, and that does represent a top-level pointer, i.e., int *a in code.

but if it is an address taken object, you should query using a location id pts(5, loc)

Sorry, I did not really get what a "location id" is (I guess it's something like a context?). As far as I know, wpa does not take a context as an argument when checking pts, since there is only one final result.

I would suggest a simple way of always querying top-level pointers but not address-taken objects.

That is exactly what I did. However, using -fspta on top-level pointers gives an unexpected result:

# manual breakpoint set after PTA is done
$ gdb --args wpa -stat=false -ander global-ptr.bc
(gdb) p _pta->getPts(5).count()
$1 = 1
# top level pointer points to stack variable
(gdb) p *_pta->getPts(5).begin()
$2 = 13

$ gdb --args wpa -stat=false -fspta global-ptr.bc
# top level variable points to nothing
(gdb) p _pta->getPts(5).count()
$1 = 0

I'm expecting -ander and -fspta to give the same result on getPts(5), but they do not.

Sorry if I've mixed things up in previous posts. I hope now the question is a little clearer.

@yuleisui
Collaborator

Node 5 can't be queried using pts(5) because it is an object, which can be defined multiple times at different program points/locations.

You can only use the APIs below to get its points-to sets:
getDFInPtsSet
getDFOutPtsSet
