Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect pointer after calling release methods #19

Open
blueberry opened this issue Jan 15, 2018 · 3 comments
Open

Protect pointer after calling release methods #19

blueberry opened this issue Jan 15, 2018 · 3 comments

Comments

@blueberry
Copy link

Scenarios:

  1. some of the enqueueReleaseX methods is called twice on the same objects
  2. an operation that needs valid context is called after the context has been released

Result: JVM crashes

This is expected behavior, but I am wondering whether there is a way to keep a flag in all NativePointerObject instances, that tracks whether the pointer is "valid"? This might result in exceptions, but at least the majority of JVM crashes could be avoided.

I also believe it could be backwards compatible, since the flag setting and checking would be an internal thing that does not have to be exposed to clients.

This issue is relevant to JCuda, too.

@gpu
Copy link
Owner

gpu commented Jan 16, 2018

A JVM crash can be "expected", to some extent, but of course, it's highly undesirable. And it is always nice when there is a simple, straightforward way to prevent it and turn it into a healthy, gracious exception.

Introducing a boolean flag in the pointer objects seems like a reasonable approach at the first glance. Particularly, for the objects that are released immediately. (Really enqueueing the task to release something is only done for the GL objects).

But generally speaking, such a flag induces a state. And considering the goal of multi-threaded use, we're at the point that (in similar flavors) has already caused headaches.

One of the crucial questions would be how this will be checked beforehand. Again, a straightforward approach would be to add a "guard" to each method call that involves "invalidatable" objects:

void clDoSomething(cl_context context, cl_mem mem0, cl_mem mem1...) {
    if (!context.isValid()) throw up;
    if (!mem0.isValid()) throw up;
    if (!mem1.isValid()) throw up;
    doSomethingNative(...);
}

but considering that this, in turn, might be an operation that only enqueues something, this may simply not be enough: A cl_mem could still be valid at this point, but become invalid before the (enqueued) operation is actually executed...

I'll definitely try to see whether there is an option to do this safely and sensibly. Any suggestions would be welcome.

@blueberry
Copy link
Author

There are two ways to approach it IMO:

  1. "Total" protect, where you try to stop those things completely, which as you suggest is rather difficult or impossible.
  2. "Lighter" approaches.

For example, a light approach would be to at least enable the user to check whether the pointer is valid at all whenever in the user's application that might be sensible. Currently, there is no way to check such things. Additionally, we cannot protect the user from trying to use memory after its context has been released, but at least we can stop users from crashing the VM by calling nqReleaseContext twice by making every subsequent call to the release method a no-op; this, I guess, does not have downsides since it can be done in each release method separately.

@gpu
Copy link
Owner

gpu commented Jan 17, 2018

You may have brought up something that can be considered as a more general (and more basic) issue. Interestingly, the docs at https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/clReleaseContext.html don't seem to state this clearly, but I'll have to review this and look for further infos.

The point is: The docs don't seem to say what happens to a context handle after calling clReleaseContext. So from a first glance at the docs, it's not clear what happens when calling clReleaseContext twice with the same context.

I'll have to try this with the AMD/NVIDIA implementations. If they modify the given context, it might be that the second call just returns CL_INVALID_CONTEXT, but this is currently not routed through the Java layer. (The docs/specs do not state this explicitly).

This, in fact, as a far higher priority than any additional "safety net" that could be achieved with boolean flags, because it is necessary in order to achieve a behavior that is "as close as possible" to the underlying OpenCL implementations.

I'll test and review this ASAP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants