Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JNI improvements #886

Draft
wants to merge 57 commits into
base: main
Choose a base branch
from
Draft

JNI improvements #886

wants to merge 57 commits into from

Conversation

javagl
Copy link
Contributor

@javagl javagl commented Mar 26, 2024

Addresses #880

I know, the title is too generic. It would be better to clearly lay out a work plan and create a clean sequence of self-contained, specific PRs. But there are just too many (unrelated) things to do. "You can't cross a chasm in two small jumps". The first pass has to be tackled somewhat holistically. The result may not be as 'good' as I'd like it to be, but I hope that it will be 'better' along many dimensions. If this is not considered to be a viable approach, then that's OK. Then I'll just create my own repo with JNI bindings for KTX, with blackjack and error handling.


The preliminary task list that I tried to create in the linked issue is now replicated here, and I'll try to update it and extend it as we move on:

  • Wrap up Properly convert Java char to C char in inputSwizzle #876
  • Wrap up Expose supercompression functions via JNI #879
  • Expose CreateFromMemory
  • Improve the use of constants
    • There should be documentation for these constants, and stringFor ... methods to convert them to strings
    • Updates:
      • Aligned the names of all constants with the native version

      • Added documentation and string functions for all constants, except for...

        • KtxInternalformat
        • VkFormat

        (See Synchronizing auto-generated files with bindings #887 )

      • NOTES:

        • The ktxTexture2.supercompressionScheme does not appear in the native documentation
        • The constants for KtxPackAstcEncoderMode are not documented on the native side
  • Fix the KtxTestLibraryLoader to also work on Windows
  • The JNI class- and field initializations should only be done once. Right now, things like env->GetFieldID(ktx_texture_class, "instance", "J"); are called in each function. The jfieldID should be obtained once initially.
  • (continuously) Error checks, error checks, error checks ... (no JVM crashes, period!)
    • Updates (I did a first pass for this, but will keep an eye on places where checks have to be added):
      • Passing null to any method should throw a NullPointerException.
      • Trying to use a texture after destroy() was called should throw an IllegalStateException.
      • When a JNI function causes an error (like an out-of-memory error), then the native functions have to return immediately, because there is a pending exception. This is not done in all places yet...
      • Java methods that do not return error codes (like the create... methods) now generally throw a KtxException when the implementation caused an error code that was not KTX_SUCCESS
      • TODO: I wanted to test this with createInfo.setBaseWidth(-123); // INVALID!, but ... this is implicitly converted into a ktx_uint32_t. We may have to think about how to handle the absence of unsigned in Java in some cases...
  • (continuously) Comments, comments, comments...
    • Updates:
      • I have added basic JavaDocs, and enabled the creation of the JavaDocs in the POM.
  • Consider consistent formatting. I know, it's a detail, but just auto-formatting everything with the Eclipse default could already be an improvement. I do have strong preferences for my own projects, but here, the main point is that the formatting should be consistent...

This PR currently builds on the state from #876 and #879 (so when these PRs are merged, then the first checkboxes above can be marked as 'done')


Not (yet) supposed to be addressed here:

  • Try to find a clean solution for JAR release to Maven Central #624

  • Update the documentation of deflateZstd and deflateZLIB
    EDIT: This was now done as part of Expose supercompression functions via JNI #879, but only for the native side and for the Java bindings (other bindings are still missing the doc update or the functions)

    • As pointed out in a comment, this documentation should include the information that 'the data pointer is updated'. This will go into the native documentation first, but it will also have to be passed through to the python bindings, the Java binding, and maybe to the JS binding. But... the deflateZLIB function is still missing in the Python binding (and the deflate* functions are just about to be added to the Java binding), so ... this may have to be tracked and addressed separately
  • Improve thoroughness of deflate* tests by comparing the deflated values

    • As pointed out in another comment, the unit test for the deflate* functions should compare the result of inflating the data again to the original input. The inflate* methods are not yet exposed (anywhere), so this may have to be deferred for now
  • Add test for input swizzling in ASTC

    • The PR for fixing the swizzling in JNI includes a test for the swizzling for Basis. It does not include one for ASTC. These are different code paths, and should both be tested. But right now, I'm not sure what could be checked for the ASTC result (see this PR comment for context).
  • The ktxHash... functionalities are not yet mapped through from the native side.

@javagl
Copy link
Contributor Author

javagl commented Mar 26, 2024

I locally started a few updates, including some basic error handling, so that things like KtxTexture2.create(null, ...) will no longer crash the JVM, but throw a NullPointerException instead. (Many more error checks to add...).

One thing that could/should be done next: The KtxTexture class has a destroy() method, documented like this:

    /**
     * Destroy the KTX texture and free memory image resources
     *
     * NOTE: If you try to use a {@link KTXTexture} after it's destroyed, that will cause a segmentation
     * fault (the texture pointer is set to NULL).
     */
    public native void destroy();

Now, the goal is: No segmentation faults. And we'll probably not be able to completely avoid this destroy() method. So the question is how to handle that.

I think that trying to use a KtxTexture after destroy() was called should also throw an exception - maybe an IllegalStateException. I'll probably draft this (with the type of the exception being easy to change). But it will affect basically each and every native function, which has to check whether the thiz parameter has been destroyed...

@javagl
Copy link
Contributor Author

javagl commented Mar 26, 2024

There currently are two functions in the KtxTexture... class that receive "data" in the form of byte arrays:

  • public native int setImageFromMemory(int level, int layer, int faceSlice, byte[] src);
  • public static native KtxTexture2 createFromMemory(ByteBuffer byteBuffer, int createFlags);

(The latter has been added as part of this PR)

Many libraries that are using JNI expect direct ByteBuffer instances as the input for all functions, for a variety of reasons. But I'm strongly in favor of additionally offering the conveniece of a byte[] array. This does come at a cost for the implementor. The different cases of arrays and direct (or non-direct...) buffer have to be handled, but ... this is doable (some utility functions to fetch and release the data are shown in the draft for createFromMemory).

On the Java side, the ByteBuffer-based signature is more versatile. Even if someone has a byte[] array, this can trivially be wrapped into a ByteBuffer at the call site:

byte array[] = ...;
t = KtxTexture2.createFromMemory(ByteBuffer.wrap(array), ...);

But as a middle-ground, I'd suggest to...

  • introduce an additional setImageFromMemory(..., ByteBuffer src) method (keeping the one that accepted the byte[] array, for backward compatibility and convenience)
  • Add a createFromMemory(byte array[]...) method, for consistency with the two flavors of setImageFromMemory

An aside: The documentation of the setImageFromMemory currently says

* Set the image data - the passed data should be not modified after passing it! The bytes
* will be kept in memory until {@link KTXTexture#destroy} is called.

This statement is not true, and will be updated accordingly.


Remotely related: There currently is a function public native byte[] writeToMemory(); which returns a byte[] array. This probably makes sense, but the function itself has to be reviewed and tested.

@MarkCallow
Copy link
Collaborator

* Set the image data - the passed data should be not modified after passing it! The bytes
* will be kept in memory until {@link KTXTexture#destroy} is called.

What does "passed data" refer to here? Is it the data at the pointer location that was passed to setImageFromMemory? Not that it matters since you say it isn't true. Indeed the data is copied into the ktxTexture object.

@javagl
Copy link
Contributor Author

javagl commented Mar 27, 2024

What does "passed data" refer to here? Is it the data at the pointer location that was passed to setImageFromMemory? Not that it matters since you say it isn't true. Indeed the data is copied into the ktxTexture object.

There are multiple "layers" here, and this revolves exactly about the question where data is copied. Imagine the following Java pseudo code, similar to what could appear in the KTX JNI bindings:

byte input[] = new byte[10];

// Create a texture from the input
KtxTexture2 texture = KtxTexture2.createFrom(input);

// XXX Will this affect the result?
input[5] = 123;

// Encode the input data that was given earlier
texture.encode();

The comment suggested that this input[5] = 123 would be "visible" for the 'encode' function that is called later. This would mean that the native library would store "a pointer" to the Java byte[] array. Something like this is actually possible, but with many, many, caveats - and here, it certainly is not the case.

@javagl
Copy link
Contributor Author

javagl commented Mar 27, 2024

A detail @ShukantPal : Currently, the bindings are set up for Java 11. I don't see a strong reason for that. Unless there is a reason, I'd suggest lowering the version requirement to "the lowest version that works" (and I think that 8 should do it).

@javagl
Copy link
Contributor Author

javagl commented Mar 29, 2024

Responding to my own comment from above: I decided to align the constant names with the names that they have in the native layer. The prefixes already are present, for example, in the (auto-generated) classes like VkFormat. Comparing and/or maintaining the other constants when the prefixes are removed is difficult.

I know, something like KtxTextureCreateStorage.KTX_TEXTURE_CREATE_NO_STORAGE is a mouthful, and looks redundant. But in doubt, the client/user could use static imports, turning this into KTX_TEXTURE_CREATE_NO_STORAGE (i.e. the exact same thing as in native code). Compared to the previous one, KtxTextureCreateStorage.NO, this seems reasonable. (This one would turn into NO with a static import - good luck doing a full-text search for that...).

I also started adding the documentation from the native side to these constant definitions and some of the related methods (i.e. the ones for using and setting them). These might be useful on a certain level - before and after:

Khronos KTX Docs

@javagl
Copy link
Contributor Author

javagl commented Apr 2, 2024

The last passes contained a few steps of that dull, repetitive, boring, repetitive, tedius, and repetitive handling of JavaDoc, based on the documentation from the native side.

And apparently, the JavaDocs have never been used or created before. Most of the (few) existing ones had been invalid. Now they are fixed.

I have added the creation of the JavaDoc to the POM. The current state is attached here:

libktx-4.0.0-SNAPSHOT-javadoc.zip

There are a few degrees of freedom for reducing (or handling) the redundancy of JavaDoc comments for cases like the ...Param classes, which correspond to the native ...Param structs, but introduce setters/getters for each field. The current approach is what I consider to be a reasonable middle-ground:

  • The field itself receives a very short comment (roughly: The first sentence from the native documentation)
  • The getter receives a similarly short comment, but links to the setter (with See [link])
  • The return value of the getter and the parameter of the setter receive a very short description (sometimes even resorting to "The setting" or "The value", just to keep things manageable...)
  • The actual documentation is put into the setter

The rationale here is that

  • users don't see the field itself, which is private
  • users who see the getter have the direct link to the setter
  • the setter is where the actual information is most relevant - mainly, the valid ranges, constraints, and the effect of the setting.

Here's an example for the uastcRDOMaxSmoothBlockErrorScale:

Khronos KTX JavaDoc

(Yeah, .. who's ever gonna change that anyhow? But I see complete JavaDoc as some sort of minimal baseline, to be at least on par with the native docs, even if the docs might have to be extended to be really useful for end-users...)

@javagl
Copy link
Contributor Author

javagl commented Apr 11, 2024

About the current build failures:

What's surprising is that I also don't see these errors locally, and they don't happen in the Windows build. Iff I understood that correctly, very roughly: Windows builds are done by GitHub actions. Linux/MacOs are done by Travis - and the error only appears in https://app.travis-ci.com/github/KhronosGroup/KTX-Software/jobs/620089455#L866 but not in https://github.com/KhronosGroup/KTX-Software/actions/runs/8543283946/job/23406762987#step:14:48

I'll probably have to add some quirky printf-logging in the JNI layer to even be able to guess why CreateFromMemory might fail on Linux/MacOS...

@MarkCallow
Copy link
Collaborator

About the current build failures:

What's surprising is that I also don't see these errors locally, and they don't happen in the Windows build. Iff I understood that correctly, very roughly: Windows builds are done by GitHub actions. Linux/MacOs are done by Travis - and the error only appears in https://app.travis-ci.com/github/KhronosGroup/KTX-Software/jobs/620089455#L866 but not in https://github.com/KhronosGroup/KTX-Software/actions/runs/8543283946/job/23406762987#step:14:48

I'll probably have to add some quirky printf-logging in the JNI layer to even be able to guess why CreateFromMemory might fail on Linux/MacOS...

Don't obsess about Travis vs. Actions. Both macOS and Linux builds are failing. These builds are run on different machines, even though both under Travis. Look instead at the environments on the runners, particularly the Java/JVM versions in use.

@MarkCallow
Copy link
Collaborator

About the current build failures:

There are 2 circumstances in which KTX_INVALID_OPERATION can be returned during ktxTexture2_CreateFromMemory:

  1. If the offset of the supercompression global data or of the first mip level is greater than the byte length specified to ktxTexture2_CreateFromMemory. This comes from ktxMemStream_setpos.
  2. If you are trying to load the data and it has already been loaded. For example calling ktxTexture2_LoadImageData after calling ktxTexture2_CreateFromMemory with KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT set. This comes from ktxTexture2_LoadImageData.

Not clear why either of these would be triggered on Linux and macOS but not Windows.

@javagl
Copy link
Contributor Author

javagl commented Apr 12, 2024

Thanks for the details @MarkCallow .

I added some printf-debug-log output (in the hope that it would show up in the build logs), but during that, noticed an embarrassing error in the JNI function for acquiring the input data for CreateFromMemory: A sign error for the computation of the size of the data caused a value like -1000 to be interpreted as a size_t, so the function was called with something like size=1844674407370955061 (~ uint64 max value).

This is fixed now. Let's see whether the build passes.

Two points remain open:

  • Why didn't this cause an error on Windows? (It looks like Windows is ignoring the size parameter in some way?)
  • It is related to the question that I already brought up in the (updated) bullet point list of the first post here a while ago:
    • TODO: ... We may have to think about how to handle the absence of unsigned in Java in some cases...

The latter is a more general question. Unsigned types are error-prone, particularly in interfaces - see e.g. https://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf , which points out exactly what happened here. And when crossing the language barrier between Java and C++, one has to be particularly careful about that...

# Conflicts:
#	interface/java_binding/src/main/cpp/libktx-jni.cpp
#	interface/java_binding/src/test/java/org/khronos/ktx/test/KtxTestLibraryLoader.java
#	interface/java_binding/src/test/java/org/khronos/ktx/test/KtxTexture2Test.java
# Conflicts:
#	interface/java_binding/src/main/cpp/KtxTexture2.cpp
#	interface/java_binding/src/main/java/org/khronos/ktx/KtxTexture2.java
@javagl
Copy link
Contributor Author

javagl commented Apr 17, 2024

@MarkCallow We'll have to think about a sensible strategy for reviewing and merging this.

Looking at the changes will hardly make sense. Nearly every file of the JNI bindings was modified. I tried to update the bullet point list in the first post with the progress. Beyond that, I can probably try to give a short summary here, with an overview of the most important changes/commits:

  • Added JavaDocs. This has been spread over many commits, only one example: 2810d2b
  • Update the KtxTestLibraryLoader to work on Windows, via bf0c644
  • [Breaking]: I updated all constants to match the name that they have in the native layer. For example, c3c8600#diff-7e22517bac1468d9f5cff73b4d8e6ecdb2cc11608424890e8f2ac6d9bb06fa19
  • Added stringFor-functions that take the respective enum constants, and return their string representation (e.g. 332510f ). Note that this only returns the string representation, and not the elaborate description that some of the functions in the native layer return
  • [Build process]: Added the step to generate the JavaDocs to the Maven pom.xml at b747849
  • The field- and method IDs that are used for accessing the Java structures from the JNI layer are now initialized once, when the library is loaded (via 2c86122 )
  • One of the most important points: Error handling:
    • Passing in null to any function will not cause a JVM crash any more, but throw a NullPointerException instead (e.g. 8f58109 )
    • When an OutOfMemoryError is caused in the native layer, then the functions return immediately (and throw this pending error), e.g. a5adb8a
    • When trying to use a texture on which destroy was already called, then this will not cause a JVM crash, but throw an IllegalStateException instead (mainly done in 3673c0a )

Some of this is just "baseline stuff" to follow the recommendations from https://developer.ibm.com/articles/j-jni/ . But some points still have subjective elements to them.


I have actively been using the resulting library in a small dummy application where you can drag-and-drop a PNG file, see a preview of the PNG and its KTX version, drag around some sliders for the compression/quality (and see a live preview). So I'm confident that this "mostly works". But there are still a few things to wrap up.

  • Some of the JNI implementations still have to be reviewed to ensure that they don't perform operations with pending exceptions, and maybe run with -verbose:jni to spot further possible pitfalls that already existed in the code
  • There should be a version of setImageFromMemory that does not accept a byte[] array, but a ByteBuffer as the last parameter (mentioned in JNI improvements #886 (comment) )
  • The open points of the bullet point list from the first post

But maybe this can already go through a first review pass, based on the current state.

@MarkCallow
Copy link
Collaborator

How do we make further progress on this?

I find having the full KTX_ prefixes rather clunky. Can't we follow the WebGL model which removed all the GL_ prefixes?

@javagl
Copy link
Contributor Author

javagl commented May 2, 2024

That's the question: What could be a sensible way to review this, given that it is ... nearly a "rewrite" of the JNI bindings?

Is the bullet point list above sufficient? Should I invite further potential reviewers? I'm pretty flexible here. If you wish, we could also arrange a short call to go through the changes, what's left to do, and what could be the next steps.

I find having the full KTX_ prefixes rather clunky. Can't we follow the WebGL model which removed all the GL_ prefixes?

From quickly skimming over some Python binding files, it looks like the Python bindings are also just omitting the KTX_, so the question is then: Should the naming be consistent with the C-layer or should it be consistent with the bindings? I think that changing
KtxTextureCreateStorage.KTX_TEXTURE_CREATE_NO_STORAGE to
KtxTextureCreateStorage.TEXTURE_CREATE_NO_STORAGE
does not make a crucial difference here, but when you say that the KTX_ prefixes should be omitted, I'll remove them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants