OpenJ9 Shared Class Cache read feature #1501

Open
wants to merge 3 commits into
base: develop
from


knewbury01
Contributor

Hello! I am opening this PR to start a discussion about whether there is any interest in adopting this feature into the main Soot project, or to get advice on whether there is a more appropriate place to make it available.

This feature adds backend support for reading a very specific type of source: ROMClasses from a Shared Class Cache (SCC) of a JVM built with (a modified, custom version of) OpenJ9.

ROMClasses have a format similar to the classfile format, with a few differences, which means that the existing implementations for reading classfiles will not work on the SCC. Some projects may prefer to consume the ROMClass format over the regular classfile format, because the classes in the SCC are guaranteed to be those in an executing application, whereas in a distributed static analysis scenario one can only assume that the classes read from classfiles are the same variants. I understand that this is a very specific use case; however, there may be others, which is another reason I would like to present the feature, to see whether it is something that someone could use in another way.

I know the build currently appears to fail. As I mentioned, the OpenJ9 version required for this feature is custom and not an official release from the Eclipse OpenJ9 development cycle, but I am working on contributing the necessary components to OpenJ9 so that this feature is officially supported. Depending on whether and when that is adopted, I can alternatively provide a succinct, automated way to obtain the JVM build needed to use this feature.

Additionally, if there is interest, I could look into writing a custom GitHub workflow to include testing of this feature in the CI, if that is necessary.

Lastly, I am aware that the current contribution is lacking in test support, but that is something I could spend time on, depending on what the thoughts here are.

Again, I'd simply like to start a discussion about whether this feature is interesting from the perspective of the main Soot project.

Thank you for your time

Kristen Newbury added 3 commits December 3, 2020 21:59
@mbenz89
Contributor

mbenz89 commented Jan 2, 2021

Hi! Thanks for this very interesting feature! Could you please elaborate a bit more on how analyzing the SCC helps in the scenario of distributed static analysis (or other scenarios)? I'm especially curious how "class variations" are introduced in such a scenario when analyzing the usual .class format.

Also, I would be interested in how significant you would rate this feature. Is it only relevant for distributed analysis? Why was it necessary for you to implement it in the first place?

What would be the steps to run an analysis using your input format? I guess it would be something like: having the class files -> somehow process them with OpenJ9 to ROMClasses -> use those as input to Soot (in a distributed fashion).

@knewbury01
Contributor Author

So, the scenario this feature was developed for was an analysis-as-a-service style (i.e., we hoped the analysis could run on a different machine than the running application). There is currently no support for synchronizing shared class caches (SCCs) to remote SCCs. However, if anyone were to do research in that direction, this approach would probably be preferable to syncing environments by sending classes on one's own, since that approach would also require the researcher to handle custom classloading.

As it stands, we still hope that resolving classes from an SCC can simplify the analysis of variants. As I previously mentioned, the idea is that if there are multiple versions of some classfile in the application environment and we simply load (for analysis) from the SCC, we should be able to do so knowing only the classpath that the application was using, without needing to know the details of the classloading delegation hierarchy (which is probably usually straightforward, but there may be exceptions). We can simply rely on the fact that the JVM will manage the SCC for us. For example, if a class is updated on the filesystem, our analysis environment would also have to be updated. But when we use the SCC for analysis, the JVM will make sure (to the best of my knowledge) that the SCC has a good representation of which classes should be used for the application, and we don't need to manage as much of that ourselves.

I'm not sure how significant this feature is. To be honest, I can't think of a scenario where it is necessary; rather, it is hopefully an advantageous approach, and I'm hoping it can be an interesting step toward enabling other work (maybe relating to using static analysis in a runtime environment, with the goal of providing feedback to currently executing applications).

The steps to use the feature are similar to what you guessed. Unfortunately, these steps currently rely on a custom JVM version, but in the future it may be possible to contribute this to an official release:

  1. obtain the custom OpenJ9 JVM build
  2. run the application in the OpenJ9 JVM with an SCC attached, which will populate the SCC
  3. start up Soot (also using the custom OpenJ9 build, with the same SCC attached) with a src-prec option for scc, as you would for any other alternative source
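
To make steps 2 and 3 a bit more concrete, here is a rough sketch that only assembles and prints the two JVM command lines; it does not launch anything. `-Xshareclasses:name=...` is the usual OpenJ9 option for attaching a named cache, and `soot.Main`, `-src-prec` and `-cp` are existing Soot entry points and options. Everything else is an assumption made for illustration: the literal value `scc` for `-src-prec` (taken from the wording of step 3, not verified against the PR), and the hypothetical paths, cache name, and class names.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the two JVM launches from the steps above.
 * The commands are only built and printed, never executed.
 */
class SccWorkflowSketch {

    // Step 2: run the application on the custom OpenJ9 build with a named SCC attached.
    static List<String> appCommand(String javaBin, String cacheName,
                                   String classpath, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xshareclasses:name=" + cacheName); // attach (and populate) the cache
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(mainClass);
        return cmd;
    }

    // Step 3: start Soot on the same JVM build, with the same cache attached.
    static List<String> sootCommand(String javaBin, String cacheName, String sootJar,
                                    String appClasspath, String targetClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xshareclasses:name=" + cacheName); // same cache name as in step 2
        cmd.add("-cp");
        cmd.add(sootJar);
        cmd.add("soot.Main");
        cmd.add("-src-prec");
        cmd.add("scc"); // ASSUMPTION: option value as implied by step 3
        cmd.add("-cp"); // Soot's own classpath option, set to the application's classpath
        cmd.add(appClasspath);
        cmd.add(targetClass);
        return cmd;
    }

    public static void main(String[] args) {
        // All paths and names below are hypothetical placeholders.
        String java = "/opt/custom-openj9/bin/java";
        String cache = "demoCache";
        System.out.println(String.join(" ",
                appCommand(java, cache, "app.jar", "com.example.App")));
        System.out.println(String.join(" ",
                sootCommand(java, cache, "soot.jar", "app.jar", "com.example.App")));
    }
}
```

The point of passing the same cache name to both launches is that Soot then reads the classes the application actually loaded, rather than re-resolving them from the filesystem.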

Additionally, I can supply more formal documentation on how to obtain that custom JVM build, if necessary.

That's it, let me know if you have any further questions or if anything is unclear!

2 participants