Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] xmp thread safety #3011

Closed
wants to merge 1 commit into from
Closed

Conversation

serghov
Copy link

@serghov serghov commented Jul 8, 2024

The XMP subsystem used in Exiv2 employs a global state and is not thread-safe. While Exiv2 has some locking mechanisms, they seem inadequate. Currently, if you try to read or write XMP metadata from multiple files in separate threads, you will quickly encounter threading issues, and at best, your app will segfault.

This happens because, in several places, Exiv2 writes under locks but performs reads without locking.

The worst offenders are the XmpParser::decode and XmpParser::encode functions. Both of these functions attempt to register XMP namespaces if they do not exist, which is not a thread-safe operation.

I would also like to mention that XmpParser::decode is not a const operation, it will try to add namespaces if they do not exist, so even reading multiple images from different threads results in undefined behaviors.

This pull request does not fix all of these issues.

It adds global locking to the encode, decode, and initialize functions; however, I am sure there are more threading issues in XMP.

As is, this is a quick and dirty solution for my particular use case.

It very much is a work in progress for general case.

@serghov serghov changed the title WIP: xmp thread safety [WIP] xmp thread safety Jul 8, 2024
@serghov serghov marked this pull request as draft July 8, 2024 14:03
@kevinbackhouse
Copy link
Collaborator

Thank you for looking into this! The XMP subsystem is a copy of an old version of Adobe's XMP-Toolkit-SDK. Ideally we'd like to get rid of it completely and just use a library. But last time I looked into it, I wasn't able to build their code.

@kevinbackhouse
Copy link
Collaborator

LOL, I just noticed that a 3-year-old PR of mine is still open: adobe/XMP-Toolkit-SDK#49

@serghov
Copy link
Author

serghov commented Jul 8, 2024

I didn't even notice that the version of XMP-Toolkit-SDK used is that old. I will look into updating it, however this might require refactoring a whole bunch of code.

Does exiv2 have api breakage rules?

@kevinbackhouse
Copy link
Collaborator

Does exiv2 have api breakage rules?

We try not to break our public API if we don't have to. It would mean doing a major release. For example 0.27 -> 0.28 had some API changes.

@kevinbackhouse
Copy link
Collaborator

By the way, there is also a discussion thread about XMP here: #2671.

@serghov
Copy link
Author

serghov commented Jul 9, 2024

I took a look at the XMP-Toolkit-SDK, and it is quite unusual. The build system is... weird: it uses make to run a cmake command, and then uses the generated makefiles to build the project with make again. Additionally, the whole thing is built in-source.

They have rust bindings here, and as far as I can tell, they avoid using the build system in XMP-Toolkit-SDK altogether, and bootstrap their own instead.

It appears that distributions don't package XMP-Toolkit-SDK, so having it as an external dependency might cause packaging problems downstream.

I have managed to build exiv2 with the latest version of XMP-Toolkit-SDK. I can try to clean this up and make another pull request. It seems to fix the threading issues.

@kmilos
Copy link
Collaborator

kmilos commented Jul 9, 2024

It appears that distributions don't package XMP-Toolkit-SDK, so having it as an external dependency might cause packaging problems downstream.

Correct.

One (potentially better than today?) way to do this is via a git submodule, and then use our own CMake build system, as w/ the in-tree copy currently.

@serghov
Copy link
Author

serghov commented Jul 9, 2024

I will try to make a submodule happen.

There are some issues with hardcoded paths to dependencies such as expat, but I think I can work around those somehow.
And they seem to be using an md5 implementation that is a modified version from something called adobe source library.

I wonder if it would be easier to fork the whole thing, rewrite the build and dependency system and use the fork.

@neheb
Copy link
Collaborator

neheb commented Jul 9, 2024

I believe libexempi is the replacement for this. I wasn't successful at replacing xmp when I tried.

@kmilos
Copy link
Collaborator

kmilos commented Jul 10, 2024

Not sure libexempi builds on Windows, not very cross-platform friendly IIRC last time I had a look...

Edit: Yep, autotools build system only (though I see maybe meson on master), and then

checking for library containing dlopen... no
configure: error: unable to find the dlopen() function

There is a dlfcn wrapper for MSYS2 to work around this, but not sure about MSVC builds then...

Looks like we'll have to wait for the next release w/ meson to re-evaluate - master doesn't build for me on MSYS2 currently.

@serghov
Copy link
Author

serghov commented Jul 10, 2024

I have opened another pull request at #3014. I think it would be better to move this discussion over there.

I don't think there is much point in continuing to work on fixing thread safety with the older version of XMP-Toolkit-SDK, since newer versions seem to fix almost all issues.

I am closing this pull request.

@serghov serghov closed this Jul 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants