F2F: Class initialization in qbicc and Leyden #523
-
Here are some pre-meeting notes on the heap serialization mechanism.

Overview

One step in producing a native executable for an application is evaluating selected application and library static initializers at compile time and embedding the resulting Java heap objects into the produced executable. Our strategy for doing this is to build up the relevant Java heap in the compiler's heap, using the host JVM's heap. We then traverse this object graph using reflection and serialize it to a compact binary representation that is emitted into the generated .ll files. The compiler also generates class-specific deserialization functions into the same .ll file that will be used at runtime. During an "early" runtime phase, this binary representation is processed to create an initial Java heap for the application.

An alternate approach, which we decided not to pursue, is emitting a byte[] containing a fully formed initial heap into a .ll file. This avoids the serialization/deserialization steps, but requires either (a) assuming a known heap start address or (b) performing a pass through the heap at runtime to adjust internal heap pointers once the start address is known. It also forces all initial heap objects into a single, non-GC-managed memory space.

Implementation

The compile-time side is in org.qbicc.plugin.serialization. The runtime side is in org.qbicc.runtime.deserialization.

Key Concepts:
What is implemented is pretty close to the initial design described in detail in #309, with some modifications to the "tagging scheme" to save some bytes here and there.

Current Status

There is a functional implementation, but it has some limitations. The process is driven by a

Each heap root is assumed to be a class instance (plus its reachable heap). There is no support for serializing top-level primitive data (the assumption was that this could be handled by emitting an initialized primitive value instead of using serialization).

Instances of java.lang.Class are not currently serialized (as we have not defined a runtime representation for java.lang.Class objects).

Class instances with non-trivial native resources "hidden" in primitive fields (threads, mutexes, file descriptors, etc.) will be naively serialized by just writing their Java-level values (int, long, etc.). This is almost always doomed to fail because the matching native resources will not exist at runtime.

We currently serialize in big-endian format. Once we have our build-time constant support working, we should instead serialize using target platform endianness. Using target platform endianness will enable bulk memory copy operations for deserializing primitive arrays.
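To make the "traverse the object graph using reflection and serialize it" step concrete, here is a minimal sketch. It is not the code in org.qbicc.plugin.serialization: the class name `HeapSerializationSketch`, the `Node` type, and the three one-byte tags are all hypothetical, and the real tagging scheme is the one described in #309. The sketch assumes only int, long, and reference fields on application classes, writes to a byte array rather than a .ll file, and relies on `DataOutputStream`, which writes multi-byte values big-endian.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical sketch only; not qbicc's serializer or its tagging scheme. */
final class HeapSerializationSketch {
    // Made-up one-byte tags, chosen purely for illustration.
    static final byte TAG_NULL = 0, TAG_OBJECT = 1, TAG_BACKREF = 2;

    /** Toy heap object with one primitive field and one reference field. */
    static final class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Identity-based map so that shared or cyclic references are written once.
    private final Map<Object, Integer> seen = new IdentityHashMap<>();
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    // DataOutputStream always writes big-endian (high byte first).
    private final DataOutputStream out = new DataOutputStream(bytes);

    /** Serializes the heap reachable from root into a byte array. */
    static byte[] serialize(Object root) {
        HeapSerializationSketch s = new HeapSerializationSketch();
        try {
            s.write(root);
            s.out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return s.bytes.toByteArray();
    }

    private void write(Object obj) throws IOException, IllegalAccessException {
        if (obj == null) {
            out.writeByte(TAG_NULL);
            return;
        }
        Integer id = seen.get(obj);
        if (id != null) {
            // Already written: emit a back-reference instead of recursing again.
            out.writeByte(TAG_BACKREF);
            out.writeInt(id);
            return;
        }
        if (obj.getClass().isArray()) {
            throw new UnsupportedOperationException("arrays are omitted from this sketch");
        }
        seen.put(obj, seen.size());
        out.writeByte(TAG_OBJECT);
        out.writeUTF(obj.getClass().getName());
        // Walk the class hierarchy and write every instance field.
        for (Class<?> c = obj.getClass(); c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                Class<?> t = f.getType();
                if (t == int.class) out.writeInt(f.getInt(obj));
                else if (t == long.class) out.writeLong(f.getLong(obj));
                else if (!t.isPrimitive()) write(f.get(obj));
                else throw new UnsupportedOperationException("sketch handles int/long only: " + f);
            }
        }
    }

    public static void main(String[] args) {
        Node a = new Node(1), b = new Node(2);
        a.next = b;
        b.next = a; // cycle: the second visit to 'a' becomes a back-reference
        System.out.println(serialize(a).length + " bytes");
    }
}
```

Note that the sketch writes a native-resource field such as a file descriptor as a plain int, which is exactly the naive behaviour listed as a limitation above. Switching from big-endian to target endianness would mean replacing `DataOutputStream` with something like a `ByteBuffer` whose order is set explicitly.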
-
And some pre-meeting notes on class initialization.

Overview

Broadly, there are two times at which a class can be initialized: either at build time (BTI) or at runtime (RTI). A dynamic JVM only allows RTI today and follows the process defined in the JVM spec to detect when a class, and its supers, must be initialized. A native image can move some, or possibly even all, class initialization to build time. How to successfully employ BTI in a way that respects the programmer's intent is an open question. SubstrateVM demonstrates one set of options, and its authors have highlighted some of the challenges BTI poses for users in https://github.com/vjovanov/taming-build-time-initalization.

Challenge

The challenge native images face is finding the right model for BTI that provides the most benefits (smaller image sizes and faster startup) without jettisoning the behaviour of the dynamic JVM. Our task is to explore this space in a way that can inform the efforts for OpenJDK's Project Leyden.

We focus on BTI while recognizing that there will always be classes that need to be initialized, either partly or completely, at runtime. This covers cases like random number generators, initializers that call into JNI, or initializers that take other actions that need to depend on the runtime environment rather than the build environment.

The work I've been doing so far has been adding support for class initialization that is conservatively correct with respect to the JVMS. This allows for RTI, with an eventual focus on "turning the dial" as far towards BTI as possible.

Principles

Find solutions to specifying BTI that respect the RTI environment so that there isn't a divergence between the dynamic JVM and native images. While we want the benefits of native image, we don't want to split the platform or break compatibility with the dynamic JVM.

Prefer solutions that let the programmer specify their intent in the source code.

Respect the author's intent.
While exploring the technical solutions for enabling BTI, we should also be thinking about the user model: how users will tell the runtime (dynamic or static) when they want their initialization to occur. This is similar to the approach taken with interface default methods, where the author of the API can add them, but others outside the interface cannot extend it with their own default methods.

Options

Some of the options to discuss and explore follow. These are intended to jump-start the conversation:
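Separately from the options themselves, and going back to the overview: the sketch below illustrates the RTI behaviour that a dynamic JVM implements today, which is the baseline any BTI model has to stay compatible with. All names (`ClassInitSketch`, `Base`, `Derived`, `LOG`) are hypothetical and only serve the illustration. It shows that reading a compile-time constant does not trigger initialization, that the first read of a non-constant static field does, and that the superclass is initialized first. It also shows an initializer (`SEED`) whose value depends on the runtime environment and therefore would not be an obvious candidate for BTI.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of runtime class initialization (RTI) on a dynamic JVM. */
final class ClassInitSketch {
    // Records the order in which static initializers actually run.
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        static { LOG.add("Base.<clinit>"); }
    }

    static class Derived extends Base {
        // Compile-time constant: javac inlines it, so reading it initializes nothing.
        static final int CONSTANT = 42;
        // Depends on the runtime environment, so its value differs from run to run.
        static final long SEED = System.nanoTime();
        static { LOG.add("Derived.<clinit>"); }
    }

    /** Performs two reads and returns a snapshot of LOG after each one. */
    static List<String> demo() {
        List<String> observed = new ArrayList<>();
        int constant = Derived.CONSTANT;   // no initialization of Derived
        observed.add("after CONSTANT read (" + constant + "): " + LOG);
        long seed = Derived.SEED;          // first active use: Base, then Derived
        observed.add("after SEED read (seed differs per run): " + LOG);
        return observed;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

On a fresh JVM the first snapshot should show an empty log and the second should show `Base.<clinit>` followed by `Derived.<clinit>`. If `Derived` were initialized at build time instead, `SEED` would be frozen to a build-time value, which is the kind of divergence from the dynamic JVM that the principles above aim to avoid.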
-
Links
Attendees
Agenda
Notes
Next steps
Background reading