getCallsOfSelf does not return calls made through interface reference #1225

Open
jow290764 opened this issue Jan 5, 2024 · 4 comments

jow290764 commented Jan 5, 2024

Consider the following code base:

// 1. Interface
public interface MyInterface {
	void doSomething();
}
// 2. Two implementations of the interface
public class MyClassA implements MyInterface {

	@Override
	public void doSomething() {
		// arbitrary code
	}
}
public class MyClassB implements MyInterface {

	@Override
	public void doSomething() {
		// arbitrary code
	}
}
// Finally some other code calling doSomething through the interface
public class MyClassCaller {
	protected void callDoSomething() {

		MyInterface myInterface = new MyClassA();
		myInterface.doSomething();
	}
}

Calling getCallsOfSelf on the JavaMethod object for MyClassA.doSomething then finds nothing.
I would expect it to find MyClassCaller.callDoSomething.

Changing the type of myInterface in callDoSomething from MyInterface to MyClassA makes getCallsOfSelf find callDoSomething.

I was using the following ArchUnit dependency:

		<dependency>
			<groupId>com.tngtech.archunit</groupId>
			<artifactId>archunit-junit5-engine</artifactId>
			<version>1.2.1</version>
		</dependency>
Member

hankem commented Jan 7, 2024

This is actually expected behavior: ArchUnit does not (and in general cannot) track the type of all objects at runtime, and

	void callDoSomething() {
		MyInterface myInterface = new MyClassA();
		myInterface.doSomething();
	}

is compiled to the following byte code:

  void callDoSomething();
    Code:
       0: new           #7                  // class MyClassA
       3: dup
       4: invokespecial #9                  // Method MyClassA."<init>":()V
       7: astore_1
       8: aload_1
       9: invokeinterface #10,  1           // InterfaceMethod MyInterface.doSomething:()V
      14: return

This byte code doesn't know anything about a call to MyClassA.doSomething; that target is only resolved via runtime polymorphism.

Author

jow290764 commented Jan 8, 2024

That is unfortunate because then it is not reliably possible to determine whether a specific method is directly or transitively invoked by certain other methods.

I am not familiar with Java bytecode, and I accept your answer, but I have heard that there are analysis tools such as ASM, Byte Buddy, or BCEL that can extract information about the implementation of interfaces from bytecode.
So, theoretically, it seems possible. Perhaps there could be discussions about whether it might be made possible in the future.

Member

hankem commented Jan 8, 2024

The information about implementation of interfaces is available in ArchUnit, but your question is about resolving runtime/dynamic polymorphism.

While it may seem obvious that

		MyInterface myInterface = new MyClassA();
		myInterface.doSomething();

calls MyClassA's doSomething() method, would you also expect that

	void callDoSomething() {
		callDoSomething(new MyClassA());
	}

	void callDoSomething(MyInterface myInterface) {
		myInterface.doSomething();
	}

is recognized? You can see that this can become arbitrarily complicated for a static code analysis tool.

To me, this seems impossible (in general), but I'm of course open to suggestions.
We can also discuss whether a limited scope (e.g. recognize your scenario, but not mine) might be reasonable.
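
A small runnable sketch of the point above, assuming nothing beyond the JDK: the single call site in callDoSomething(MyInterface) is fixed at compile time, while the implementation that actually runs depends on the object passed in. The DispatchDemo class and its LOG field are illustrative additions and not part of the code in this issue.

```java
import java.util.ArrayList;
import java.util.List;

interface MyInterface {
    void doSomething();
}

class MyClassA implements MyInterface {
    @Override
    public void doSomething() {
        DispatchDemo.LOG.add("MyClassA.doSomething");
    }
}

class MyClassB implements MyInterface {
    @Override
    public void doSomething() {
        DispatchDemo.LOG.add("MyClassB.doSomething");
    }
}

class DispatchDemo {
    // Hypothetical helper: records which implementation actually ran.
    static final List<String> LOG = new ArrayList<>();

    // The only call site. Statically it refers to MyInterface.doSomething;
    // which implementation runs is decided by the runtime type of the argument.
    static void callDoSomething(MyInterface myInterface) {
        myInterface.doSomething();
    }

    public static void main(String[] args) {
        callDoSomething(new MyClassA());
        callDoSomething(new MyClassB());
        System.out.println(LOG);
    }
}
```

The same call site reaches both implementations, so a tool that only reads the call site cannot attribute it to one of them without tracking where the argument came from.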

@jow290764
Author

I concur that precisely tracking the paths taken by a program's control flow, in order to conclusively assert that only objects of types T1 to Tn implementing MyInterface can appear at a particular point (and not objects of types Tn+1 to Tn+m, which also implement MyInterface), would be a formidable task.

However, should one implement a mode positing that, when leveraging interfaces, any implementation of the interface could potentially manifest wherever the interface is employed, then it would become comparatively straightforward to compute all theoretically feasible transitive chains.

Or, could there be an aspect I've inadvertently overlooked?
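
The conservative mode proposed above could look roughly like the following sketch. It is not ArchUnit code and uses no ArchUnit API: methods are modelled as plain strings, and the ConservativeCallGraph class, its method names, and the Entry.run caller are hypothetical. The assumption is that every known implementation of an interface method is a possible target wherever that interface method is called; only direct implementations are considered, not deeper hierarchies.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class ConservativeCallGraph {
    // caller -> targets exactly as they are named at the call sites
    private final Map<String, Set<String>> staticCalls = new HashMap<>();
    // interface method -> methods implementing it
    private final Map<String, Set<String>> implementations = new HashMap<>();

    void addCall(String caller, String declaredTarget) {
        staticCalls.computeIfAbsent(caller, k -> new LinkedHashSet<>()).add(declaredTarget);
    }

    void addImplementation(String interfaceMethod, String implementingMethod) {
        implementations.computeIfAbsent(interfaceMethod, k -> new LinkedHashSet<>())
                .add(implementingMethod);
    }

    // A call site may reach its declared target or any known implementation of it.
    private Set<String> possibleTargets(String declaredTarget) {
        Set<String> targets = new LinkedHashSet<>();
        targets.add(declaredTarget);
        targets.addAll(implementations.getOrDefault(declaredTarget, Set.of()));
        return targets;
    }

    // All methods that might call the given method, directly or transitively.
    Set<String> possibleTransitiveCallers(String method) {
        // Reverse the (widened) call edges: target -> callers.
        Map<String, Set<String>> callersOf = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : staticCalls.entrySet()) {
            for (String declared : entry.getValue()) {
                for (String target : possibleTargets(declared)) {
                    callersOf.computeIfAbsent(target, k -> new LinkedHashSet<>())
                            .add(entry.getKey());
                }
            }
        }
        // Breadth-first walk over the reversed edges.
        Set<String> result = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.add(method);
        while (!work.isEmpty()) {
            String current = work.remove();
            for (String caller : callersOf.getOrDefault(current, Set.of())) {
                if (result.add(caller)) {
                    work.add(caller);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ConservativeCallGraph graph = new ConservativeCallGraph();
        graph.addImplementation("MyInterface.doSomething", "MyClassA.doSomething");
        graph.addImplementation("MyInterface.doSomething", "MyClassB.doSomething");
        graph.addCall("MyClassCaller.callDoSomething", "MyInterface.doSomething");
        graph.addCall("Entry.run", "MyClassCaller.callDoSomething"); // hypothetical caller
        System.out.println(graph.possibleTransitiveCallers("MyClassA.doSomething"));
    }
}
```

The trade-off is over-approximation: in this model MyClassCaller.callDoSomething is also reported as a possible caller of MyClassB.doSomething, even though the code in this issue never creates a MyClassB.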
