Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support directory extraction in modular OSGi runtimes (issue #506) #511

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

HannesWell
Copy link

In order to resolve issue #506 I implemented a solution based on @timothyjward suggestion(#506 (comment)) to support extraction of directories in case javacpp is a bundle in a OSGi runtime.

The indention of this solution is to be independent of a specific OSGi framework implementation.

In contrast to @timothyjward suggestion I used Bundle.findEntries() instead of BundleWiring.listResources() because it is a bit less code (no need to obtain the BundleWiring) and already gives the desired URLs of all files in the directory and not only the String paths within the bundle (avoids 'manual' URL construction).
Since the javacpp bundles don't have a Bundle-Classpath header specified in their Manifest it defaults to the dot ".", so the bundle-classpath is just the jar content. And since the search for resources is limited to local resources, I think the result should always be the same. In Equinox findEntries() returns a URL using the bundleentry schema instead of bundleresources, but that's not an issue.

I have tested it with Equinox and the cpython javacpp preset and it works quite well.

The only flaw is that the javacpp windows-dll's (I assume it is the same case for linux) are cached twice:

  1. in the folder for the javacpp bundle
  2. in the folder of the cpython bundle

The second location is obviously not wanted. I tried to investigate that and I think the reason is that the dll's are loaded a second time in the context of a class of the cpython bundle/preset in Loader.load(Class, Properties, boolean, String) when oldUrls == null || urls.length > 0 is true and urls is not empty.

I think this problem could be avoided if we were able to obtain the bundle from the resource-URLs directly. Then it would be possible to find the containing bundle. But unfortunately I don't know a proper official/API way to do this. If any of you knows one, it is more than welcome.

The only approach I found, that would hopefully work with different OSGi implementations, was to maintain a map of URL-hosts to the corresponding bundle/bundle-id that is filled when the Loader is initialized and also keeps track of installed/uninstalled bundles via a BundleListener.
At least in Equinox the resource-URL's host is always the same for the same bundle. Do you know if this is the case for other OSGi implementations as well?
This as the additional advantage that there is no need to pass a class-argument all the way down. I can submit this approach with separate commit to this PR.

@HannesWell HannesWell marked this pull request as draft August 17, 2021 20:41
@timothyjward
Copy link
Contributor

At least in Equinox the resource-URL's host is always the same for the same bundle. Do you know if this is the case for other OSGi implementations as well?

The guarantees about the URL scheme are described in the specification. You will need to be careful using the host as this sometimes includes information about the bundle revision which changes when the bundle is refreshed or updated.

The most relevant section of the specification is here:
http://docs.osgi.org/specification/osgi.core/7.0.0/framework.module.html#i3178562

@saudet
Copy link
Member

saudet commented Aug 18, 2021

Like I said before, I'd like it if we could do everything with reflection, but if this is not desirable, another way of going about it is to put everything related to OSGi in another class, just like stuff related to MXBeans isn't in Pointer, because it's not available on Android, but in tools/PointerBufferPoolMXBean. In this case, I'd see something like maybe tools/OsgiBundleResourceLoader as per Spring?
https://docs.spring.io/spring-osgi/docs/current/api/org/springframework/osgi/io/OsgiBundleResourceLoader.html

Also doing without having to pass around the Class object everywhere would be great, yes. Thanks for looking into this!

- fixes extraction paths and OSGi related code into own class
- prevents NoClassDefFoundError in non-OSGi environments
@HannesWell
Copy link
Author

I have move all code that directly interacts with OSGi APIs into tools/OSGiBundleResourceLoader and also extract the bundle from the URL.
There are still some TODOs and time measurements that I will resolve/remove once we agreed on the code.

At least in Equinox the resource-URL's host is always the same for the same bundle. Do you know if this is the case for other OSGi implementations as well?

The guarantees about the URL scheme are described in the specification. You will need to be careful using the host as this sometimes includes information about the bundle revision which changes when the bundle is refreshed or updated.

The most relevant section of the specification is here:
http://docs.osgi.org/specification/osgi.core/7.0.0/framework.module.html#i3178562

Thanks for your remark. As you said the OSGi spec does not specify any requirements on the URLs host and I did not find a better way to extract the bundle a URL refers to. I tried to work around a potentially changing host by attaching Bundle/FrameworkListeners but I had no immediate test case yet because our Eclipse application is quite static at the point cpython is used.
So @timothyjward what do you think about this approach? Could it work?

Of course it would be much simpler if there were a OSGi service that provides the bridge from the URL to the bundle or e.g. if the connections created for such URLs would implement BundleReference like the ClassLoaders do.

@HannesWell
Copy link
Author

@timothyjward: Furthermore, is there a deeper meaning why you suggested BundleWiring.listResources() over findEntries()? You have mentioned both.

@timothyjward
Copy link
Contributor

@timothyjward: Furthermore, is there a deeper meaning why you suggested BundleWiring.listResources() over findEntries()? You have mentioned both.

The two methods do different things.

findEntries() Is a simple way to access a path within the bundle, much like getting a JarEntry from a JarFile. findEntries() ignores any internal classpath in the bundle, and therefore does not require the bundle to be resolved.

listResources() is different in that it uses the Bundle Classloader to find the resource. This therefore honours the bundle classpath, and therefore may not be able to locate some things that can be found by findEntries(). On the other hand as listResources() uses the classloader it can also see things imported from other bundles.

The main reason that you might choose one over the other is that the native code is loaded using a Class as context. It’s therefore pretty clear to me that the classpath is supposed to be taken into account, and the library should be located relative to the classpath (e.g. in the same package as the class). This would point me to listResources as the better candidate. You would probably want to pass the flag which tells it not to return resources from outside the bundle though.

private static void initialize() {
BundleContext context = getBundleContext();
if (context != null) {
indexAllBundles(context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code could be simplified by using a BundleTracker. It would also fix the race condition where a bundle is added between the index call and adding the listener.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that hint. I applied it.

}

private static String getBundleURLHost(Bundle bundle) {
return bundle.getEntry("/").getHost();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works with getEntry, but not with getResource as there can be multiple entries on the bundle classpath.

Enumeration<URL> directoryEntries = OSGiBundleResourceLoader.getBundleDirectoryContent(resourceURL);
if (directoryEntries != null && directoryEntries.hasMoreElements()) { // a not empty directory
String directoryName = resourceURL.getPath();
while (directoryEntries.hasMoreElements()) {
Copy link
Member

@saudet saudet Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This while loop looks pretty similar to the one above. Is the idea that some OSGi implementations may support other things than JAR files?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentionally. Actually I would have liked to move the extraction part into an own method. But because in case of a JarURLConnection the URL is created just before extractResource is called again this is not possible without creating the URL before (and therefore also in cases of a directory.

And yes a bundle can be a jar but also can be extracted into a directory (which is for example very common if you develop Eclipse plug-ins and use them in a debugged application started from the IDE).

@HannesWell
Copy link
Author

@timothyjward: Furthermore, is there a deeper meaning why you suggested BundleWiring.listResources() over findEntries()? You have mentioned both.

The two methods do different things.

findEntries() Is a simple way to access a path within the bundle, much like getting a JarEntry from a JarFile. findEntries() ignores any internal classpath in the bundle, and therefore does not require the bundle to be resolved.

listResources() is different in that it uses the Bundle Classloader to find the resource. This therefore honours the bundle classpath, and therefore may not be able to locate some things that can be found by findEntries(). On the other hand as listResources() uses the classloader it can also see things imported from other bundles.

The main reason that you might choose one over the other is that the native code is loaded using a Class as context. It’s therefore pretty clear to me that the classpath is supposed to be taken into account, and the library should be located relative to the classpath (e.g. in the same package as the class). This would point me to listResources as the better candidate. You would probably want to pass the flag which tells it not to return resources from outside the bundle though.

Thanks for your elaboration. I understand your point regarding bundle native code, but I think in this case it is not applicable because we are just looking for children of a URL pointing to a directory. So we are just interested in the actual content of the directory and not of the 'path' with which they are referenced. When the directory can be found through the classloader I'm pretty sure that all its children can be found as well by concatenating a corresponding sub-path.
And since a URL store the location of a resource and not how it can be found, I think this is equivalent in terms of discoverability/accessibility.
The reason why I would prefer findEntries() is that is consistent to how a JarFile is searched in a non-OSGi environment, where it is just looked up what's in the jar and not what can be found on the classpath. And it gives me the URL directly which is more convenient.

@HannesWell
Copy link
Author

Because I'm unsure if the current approach using a BundleTracker and FrameworkListener catches all possible changes in an URLs host, @timothyjward mentioned that for example in Felix a changed BundleRevision/Wiring might reflect into a changed URL-Host and other implementations might create a completely different host, so there could be more pitfalls I don't observe in my Equinox case because the URL host is simply not specified, and because IMHO the current approach is already quite complicated, I'm thinking of a different approach to get the Bundle of a URL:

In case javacpp runs as a bundle in a OSGi environment the calls to Class.getResource() and ClassLoader.getResources() could be redirected to the OSGiBundleResourceLoader as well and then the Class(Loader)'s bundle could be saved together with the returned URL in a static WeakHashMap.
This way we don't have to relay on implementation specific behaviour that might or might not be consistent. And as fas as I can oversee it, Class.getResource() and ClassLoader.getResources() is only called in Loader.findResources(), so the redirection is not a big issue. Or are also URLs passed to the loader that are created at other locations?

What do you think?

@@ -22,6 +22,10 @@

package org.bytedeco.javacpp;

import static org.bytedeco.javacpp.tools.OSGiBundleResourceLoader.getOSGiClassLoaderResources;
import static org.bytedeco.javacpp.tools.OSGiBundleResourceLoader.getOSGiClassResource;
import static org.bytedeco.javacpp.tools.OSGiBundleResourceLoader.isOSGiRuntime;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to keep the imports clean of OSGi stuff. Android in particular has a tendency to trip on those things since it tries to load all the classes there, and fails. This also means that we can't put anything that loads those classes by default, so you'll need to revert those changes too.

@saudet
Copy link
Member

saudet commented Aug 30, 2021

What do you think?

Are you asking me? It sounds fine, yes, as long as by default it doesn't try to load anything, other than by reflection with fail safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants