Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JNI improvements #886

Merged
merged 75 commits into from
Sep 14, 2024
Merged

JNI improvements #886

merged 75 commits into from
Sep 14, 2024

Conversation

javagl
Copy link
Contributor

@javagl javagl commented Mar 26, 2024

Addresses #880

I know, the title is too generic. It would be better to clearly lay out a work plan and create a clean sequence of self-contained, specific PRs. But there are just too many (unrelated) things to do. "You can't cross a chasm in two small jumps". The first pass has to be tackled somewhat holistically. The result may not be as 'good' as I'd like it to be, but I hope that it will be 'better' along many dimensions. If this is not considered to be a viable approach, then that's OK. Then I'll just create my own repo with JNI bindings for KTX, with blackjack and error handling.


The preliminary task list that I tried to create in the linked issue is now replicated here, and I'll try to update it and extend it as we move on:

  • Wrap up Properly convert Java char to C char in inputSwizzle #876
  • Wrap up Expose supercompression functions via JNI #879
  • Expose CreateFromMemory
  • Improve the use of constants
    • There should be documentation for these constants, and stringFor ... methods to convert them to strings
    • Updates:
      • Aligned the names of all constants with the native version

      • Aligned the names of constants with the other bindings (see JNI improvements #886 (comment) )

      • Added documentation and string functions for all constants, except for...

        • KtxInternalformat
        • VkFormat

        (See Synchronizing auto-generated files with bindings #887 )

      • NOTES:

        • The ktxTexture2.supercompressionScheme does not appear in the native documentation
        • The constants for KtxPackAstcEncoderMode are not documented on the native side
  • Fix the KtxTestLibraryLoader to also work on Windows
  • The JNI class- and field initializations should only be done once. Right now, things like env->GetFieldID(ktx_texture_class, "instance", "J"); are called in each function. The jfieldID should be obtained once initially.
  • (continuously) Error checks, error checks, error checks ... (no JVM crashes, period!)
    • Updates (I did a first pass for this, but will keep an eye on places where checks have to be added):
      • Passing null to any method should throw a NullPointerException.
      • Trying to use a texture after destroy() was called should throw an IllegalStateException.
      • When a JNI function causes an error (like an out-of-memory error), then the native functions have to return immediately, because there is a pending exception. This is not done in all places yet...
      • Java methods that do not return error codes (like the create... methods) now generally throw a KtxException when the implementation caused an error code that was not KTX_SUCCESS
  • (continuously) Comments, comments, comments...
    • Updates:
      • I have added basic JavaDocs, and enabled the creation of the JavaDocs in the POM.
  • Add bindings for ktxErrorString

Not (yet) supposed to be addressed here:

  • Try to find a clean solution for JAR release to Maven Central #624

  • There are a few cases where we might have to think about how to handle the absence of unsigned types in Java. For example, when calling createInfo.setBaseWidth(-123); // INVALID!, then this value is implicitly converted into a ktx_uint32_t.

  • Update the documentation of deflateZstd and deflateZLIB
    EDIT: This was now done as part of Expose supercompression functions via JNI #879, but only for the native side and for the Java bindings (other bindings are still missing the doc update or the functions)

    • As pointed out in a comment, this documentation should include the information that 'the data pointer is updated'. This will go into the native documentation first, but it will also have to be passed through to the python bindings, the Java binding, and maybe to the JS binding. But... the deflateZLIB function is still missing in the Python binding (and the deflate* functions are just about to be added to the Java binding), so ... this may have to be tracked and addressed separately
  • Improve thoroughness of deflate* tests by comparing the deflated values

    • As pointed out in another comment, the unit test for the deflate* functions should compare the result of inflating the data again to the original input. The inflate* methods are not yet exposed (anywhere), so this may have to be deferred for now
  • Add test for input swizzling in ASTC

    • The PR for fixing the swizzling in JNI includes a test for the swizzling for Basis. It does not include one for ASTC. These are different code paths, and should both be tested. But right now, I'm not sure what could be checked for the ASTC result (see this PR comment for context).
  • Consider consistent formatting. I know, it's a detail, but just auto-formatting everything with the Eclipse default could already be an improvement. I do have strong preferences for my own projects, but here, the main point is that the formatting should be consistent...

  • The ktxHash... functionalities are not yet mapped through from the native side.

    • Consider following the new JS binding here. It adds findKeyValue, addKVPair and deleteKVPair methods to the ktxTexture2 object rather than exposing ktxHashList.

@javagl
Copy link
Contributor Author

javagl commented Mar 26, 2024

I locally started a few updates, including some basic error handling, so that things like KtxTexture2.create(null, ...) will no longer crash the JVM, but throw a NullPointerException instead. (Many more error checks to add...).

One thing that could/should be done next: The KtxTexture class has a destroy() method, documented like this:

    /**
     * Destroy the KTX texture and free memory image resources
     *
     * NOTE: If you try to use a {@link KTXTexture} after it's destroyed, that will cause a segmentation
     * fault (the texture pointer is set to NULL).
     */
    public native void destroy();

Now, the goal is: No segmentation faults. And we'll probably not be able to completely avoid this destroy() method. So the question is how to handle that.

I think that trying to use a KtxTexture after destroy() was called should also throw an exception - maybe an IllegalStateException. I'll probably draft this (with the type of the exception being easy to change). But it will affect basically each and every native function, which has to check whether the thiz parameter has been destroyed...

@javagl
Copy link
Contributor Author

javagl commented Mar 26, 2024

There currently are two functions in the KtxTexture... class that receive "data" in the form of byte arrays:

  • public native int setImageFromMemory(int level, int layer, int faceSlice, byte[] src);
  • public static native KtxTexture2 createFromMemory(ByteBuffer byteBuffer, int createFlags);

(The latter has been added as part of this PR)

Many libraries that are using JNI expect direct ByteBuffer instances as the input for all functions, for a variety of reasons. But I'm strongly in favor of additionally offering the conveniece of a byte[] array. This does come at a cost for the implementor. The different cases of arrays and direct (or non-direct...) buffer have to be handled, but ... this is doable (some utility functions to fetch and release the data are shown in the draft for createFromMemory).

On the Java side, the ByteBuffer-based signature is more versatile. Even if someone has a byte[] array, this can trivially be wrapped into a ByteBuffer at the call site:

byte array[] = ...;
t = KtxTexture2.createFromMemory(ByteBuffer.wrap(array), ...);

But as a middle-ground, I'd suggest to...

  • introduce an additional setImageFromMemory(..., ByteBuffer src) method (keeping the one that accepted the byte[] array, for backward compatibility and convenience)
  • Add a createFromMemory(byte array[]...) method, for consistency with the two flavors of setImageFromMemory

An aside: The documentation of the setImageFromMemory currently says

* Set the image data - the passed data should be not modified after passing it! The bytes
* will be kept in memory until {@link KTXTexture#destroy} is called.

This statement is not true, and will be updated accordingly.


Remotely related: There currently is a function public native byte[] writeToMemory(); which returns a byte[] array. This probably makes sense, but the function itself has to be reviewed and tested.

@MarkCallow
Copy link
Collaborator

* Set the image data - the passed data should be not modified after passing it! The bytes
* will be kept in memory until {@link KTXTexture#destroy} is called.

What does "passed data" refer to here? Is it the data at the pointer location that was passed to setImageFromMemory? Not that it matters since you say it isn't true. Indeed the data is copied into the ktxTexture object.

@javagl
Copy link
Contributor Author

javagl commented Mar 27, 2024

What does "passed data" refer to here? Is it the data at the pointer location that was passed to setImageFromMemory? Not that it matters since you say it isn't true. Indeed the data is copied into the ktxTexture object.

There are multiple "layers" here, and this revolves exactly about the question where data is copied. Imagine the following Java pseudo code, similar to what could appear in the KTX JNI bindings:

byte input[] = new byte[10];

// Create a texture from the input
KtxTexture2 texture = KtxTexture2.createFrom(input);

// XXX Will this affect the result?
input[5] = 123;

// Encode the input data that was given earlier
texture.encode();

The comment suggested that this input[5] = 123 would be "visible" for the 'encode' function that is called later. This would mean that the native library would store "a pointer" to the Java byte[] array. Something like this is actually possible, but with many, many, caveats - and here, it certainly is not the case.

@javagl
Copy link
Contributor Author

javagl commented Mar 27, 2024

A detail @ShukantPal : Currently, the bindings are set up for Java 11. I don't see a strong reason for that. Unless there is a reason, I'd suggest lowering the version requirement to "the lowest version that works" (and I think that 8 should do it).

@javagl
Copy link
Contributor Author

javagl commented Aug 24, 2024

The last sequence of commits are...

  • "Update ... names" commits: These are changing the names of constants, to omit any redundant prefixes. For example, KtxPackAstcBlockDimension.KTX_PACK_ASTC_BLOCK_DIMENSION_D4x4 is now just called KtxPackAstcBlockDimension.D4x4, which is aligned with the naming in the JavaScript- and Python bindings
  • Added a binding for ktxErrorString. It's a judgement call where and how to add this. In the native layer, it's a "free-floating" function that does not belong anywhere. On the Java side, I thought that it would make sense to offer it as KtxErrorCode.createString(int).
  • A minor fix for ktxErrorString on the native side

The last one appears to be related to the (open) question about how to handle the absence of unsigned types in Java. But strictly speaking, it isn't: Even in the native layer, something like
const char* s = ktxErrorString(-1234);
would have caused ... Undefined Behavior, I guess.

Visual Studio even points this out:

Khronos Ktx Error enum

So I changed that check to

    if (error < 0 || error > KTX_ERROR_MAX_ENUM)
        return "Unrecognized error code";
    ...

@javagl
Copy link
Contributor Author

javagl commented Aug 24, 2024

I think that the build failure should already have been sorted out - right now, it marks some actions as 'failed' because they have been cancelled (maybe because I edited the first post while the build was running?).

In doubt, I could push ... "some change" ... to trigger a re-run (because I apparently cannot request a re-run manually).

Apart from that, this PR should be 'Ready for review' (which might be short, because intermediate states have already been reviewed and discussed).

Miscellaneous:

  • At some point, the change log may have to include a hint that the Java bindings have been ~"incompatibly refactored". I think this is not a big deal, because there probably haven't been so many people been using the previous state (and those who did hopefully appreciate the changes, even though some of them may be "breaking" in the strictest sense)
  • The open points from the first comment of this PR (Maven release, formatting, ....) will be promoted into dedicated issues where appropriate

@javagl javagl marked this pull request as ready for review August 24, 2024 17:07
@MarkCallow
Copy link
Collaborator

In GitHub Actions when a job in a workflow fails the remaining jobs are cancelled. The VS2019 job in the Windows workflow is failing. I just tried re-running this job but it still fails in the same place: problems creating the Javadoc. I will try rerunning the cancelled jobs. Some of the Linux build jobs are failing for a similar reason. I did not try to re-run them. I'm don't know yet why some macOS jobs are failing.

When you add more commits to a PR, existing queued builds are cancelled, running builds may be cancelled. A new build is started for the new commit when an existing job is cancelled or finishes. All this means in this case that the errors remain in the code of the most recent commit because a build was done using it.

@MarkCallow
Copy link
Collaborator

The failures in the Emscripten jobs look to be due to an updated clang in the latest Emscripten Docker image. Research is needed.

@MarkCallow
Copy link
Collaborator

MarkCallow commented Aug 26, 2024

The failures in the Emscripten jobs look to be due to an updated clang

Yup. It appears clang 20 now warns about conflictingduelling floating point options and we build with -Werror so the warning stops the build. The conflictingduelling options are in the ASTC encoder build. Don't you worry about this @javagl, I'll deal with it.

@javagl
Copy link
Contributor Author

javagl commented Aug 26, 2024

I just tried re-running this job but it still fails in the same place: problems creating the Javadoc.

Yes, I didn't dive deeply enough into the travis output for that one. This was caused by a {@link #...} in the JavaDoc that referred to a renamed constant. (I probably should have run that step locally to rule that out). Should be fixed now, but I'll keep an eye on possible further errors.

@MarkCallow
Copy link
Collaborator

Status update.

I worked around the new spam warning from clang 20 fixing the Emscripten build but GitHub just deployed a new version of its Windows runner which breaks builds with python due to python-cffi/cffi#117. I don't think I can do anything until a new version of cffi with PR python-cffi/cffi#118 merged is released. There is no way to tell GitHub Actions to use a particular version of the runner otherwise I'd make it use the previous working version.

This is incredibly frustrating and is adding to the issues that have stalled successful builds for going on 4 weeks now. I have at least 5 fixes backed up in local branches waiting until CI is unblocked. The time taken to deal with all of this is why I have not completed review of this PR.

@javagl
Copy link
Contributor Author

javagl commented Aug 31, 2024

Thanks for the update.

I've recently heard about a bunch of trouble in CIs, due to ~"updates of the GitHub Windows/MSVC version". It should really be possible to pinpoint that to a specific version in CI.
(Even though I'm well aware of the other issues that this might cause - if this was available since the dawn of time, GitHub would still have to maintain a Windows 3.1 build infrastructure 😆 )

That reminds me of one thing regarding the JNI that is also related to versioning (mentioned as a side note in an earlier comment ) : Right now, the JNI part assumes Java 11. Unless there is a strong reason to require Java 11, I'd change that to "the lowest version that works" (likely Java 8).

However, I assume that this PR will eventually be merged without many modifications (because the current hassle is pretty unrelated to JNI). So I'll probably start addressing things like the Maven issue locally, based on the state of this PR, and ... when the dust has settled here, open a PR for that.

@MarkCallow
Copy link
Collaborator

I reviewed all the recent commits. They look good except that I was expecting you to use the new ktxErrorString function in the stringFor method of KtxErrorCode. Why not use it?

I don't understand why you saw the warning or error about "Unchecked lower bound" and we didn't see it in CI and I don't see it in Intellicode in the IDE. We have warning level W4 set and warnings as errors. I have Microsoft Visual Studio Community 2022 Version 17.10.1 VisualStudio.17.Release/17.10.1+34928.147 and Visual C++ 2022 00482-90000-00000-AA364. Anyway thank you for the fix.

I think I have found a workaround for the python issue on GitHub. Build in progress. Still having issues with Travis.

I am appalled to see this PR has been open for 6 months. Sorry for the loooong delay.

@javagl
Copy link
Contributor Author

javagl commented Sep 4, 2024

I reviewed all the recent commits. They look good except that I was expecting you to use the new ktxErrorString function in the stringFor method of KtxErrorCode. Why not use it?

At this point, it's only a matter of consistency. The stringFor function exists in all "enum-classes". And it always returns the string representation of the enum value. I didn't want to deviate from that pattern, insofar that this would imply that there is no (reasonable) way to translate an error code like 42 into the string that actually reflects the "NAME_OF_THE_CONSTANT". So I thought that

  • stringFor consistently returning the constant name
  • KtxErrorCode.createString returning the human-readable string

made sense. But if you prefer, I can change it to replace the stringFor function here.

@MarkCallow
Copy link
Collaborator

Please merge the latest changes from main, which includes the fix for clang 20 in the Emscripten build. After a successful build this PR will be ready for merge.

Please provide a summary of the changes, particularly any breaking changes, that I can use as the commit message and can include in the Release Notes.

@javagl
Copy link
Contributor Author

javagl commented Sep 9, 2024

Technically, there is a whole bunch of "breaking changes", because certain parts have nearly been a "rewrite". But as I mentioned earlier: I think that hardly anybody really used the previous state, so it should not cause too much trouble. The build is currently running. I'll try to write a summary of the changes soon.

@javagl
Copy link
Contributor Author

javagl commented Sep 10, 2024

I had a short look at the current RELEASE_NOTES.md, but am not sure in how far any existing "pattern" is applicable here. This is not a single bugfix or feature. But something like the following might be added in the release notes, as a summary:


Java Wrapper

There was a larger refactoring of the Java Wrapper, focussing on error handling, documentation, completeness, and usability improvements ( #886 ). This includes several breaking changes. Not all breaking changes can be listed here explicitly. But in all cases, it should be easy for clients to update their code to the new state. The most important changes for the Java Wrapper are:

  • Made deflateZstd, deflateZLIB, and createFromMemory available in the KtxTexture2 class
  • Fixed a bug where the KtxBasisParams#setInputSwizzle function caused data corruption
  • Increased the consistency of the naming and handling of constants:
    • All constant names are UPPER_SNAKE_CASE, omitting common prefixes
    • Classes that define constants offer a stringFor function that returns the string representation of a constant
    • The KtxCreateStorage class was renamed to KtxTextureCreateStorage
    • The KtxPackUASTCFlagBits class was renamed to KtxPackUastcFlagBits
  • Improved error handling:
    • When functions receive a parameter that is null, then this will no longer crash the JVM, but throw a NullPointerException instead
    • When setInputSwizzle receives invalid arguments, then an IllegalArgumentException will be thrown
    • When one of the create... family of functions causes an error, meaning that the respective KtxTexture2 object cannot be created, then a KtxException will be thrown
  • Added JavaDocs, and enabled the generation of JavaDocs as part of the Maven build process
  • Internal improvements (like JNI field caching, and avoiding to call JNI functions with pending exceptions)

(Feel free to edit or adjust that (or request changes) as you see fit)


An aside: Is there a reason why the RELEASE_NOTES.md contains lines like

Allow A8B8G8R8 formats. (#861) (5ac5cf126) (@MarkCallow)

that are not linked, as in

Allow A8B8G8R8 formats. (#861) (5ac5cf126) (@MarkCallow)

?

@MarkCallow
Copy link
Collaborator

Thanks Marco. Your summary looks good.

An aside: Is there a reason why the RELEASE_NOTES.md contains lines like

Allow A8B8G8R8 formats. (#861) (5ac5cf1) (@MarkCallow)

that are not linked, as in

Allow A8B8G8R8 formats. (#861) (5ac5cf126) (@MarkCallow)

Because that information is not in Git. It is only on GitHub. The script that builds the list of commits operates on the Git repo. When you view the Release Notes that have been copied to the Releases section of the GitHub repo GitHub renders the page with links. For some reason when it renders the MarkDown page it does not. I have not been bothered enough by this to spend time seeing if it is possible to update the script to add links. I usually read the release notes in Releases .

@MarkCallow MarkCallow merged commit 2ee4332 into KhronosGroup:main Sep 14, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants