Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prevent crash when metadata time is undefined #28

Merged
merged 3 commits into from
Dec 5, 2021

Conversation

guillaumechauvat
Copy link
Contributor

@guillaumechauvat guillaumechauvat commented May 9, 2021

This fixes a crash on my machine, using a Fuji X-T30. I don't know why mdatatime is None here yet.

@damonlynch
Copy link
Owner

If I understand your patch correctly, this does not fix the root cause, but works around one of the symptoms. Is that correct? If so, then it's better to try to fix the real cause of the problem.

Please attach a complete tar file with the log files, which you can generate following these instructions:

https://damonlynch.net/rapid/documentation/#reportproblem

@guillaumechauvat
Copy link
Contributor Author

That's correct. Still, I was thinking it might be better to do this on top of fixing the root cause.

@guillaumechauvat
Copy link
Contributor Author

guillaumechauvat commented May 9, 2021

Here is the log file.
rpd-bug-report-20210507.tar.gz

With my change, this appears in the the log with --debug:

INFO: Scanning X-T30
DEBUG: Scanning /store_10000001/DCIM on X-T30
INFO: Extracting sample photo metadata for X-T30
DEBUG: https://www.damonlynch.net:443 "GET /rapid/version.json HTTP/1.1" 301 178
DEBUG: Starting new HTTPS connection (1): damonlynch.net:443
WARNING: Scanner failed to load metadata from DSCF5166.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5166.RAF on X-T30
DEBUG: Could not determine Device timezone setting for X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5167.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5167.RAF on X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5168.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5168.RAF on X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5169.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5169.RAF on X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5172.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5172.RAF on X-T30
DEBUG: Sending 11 scanned files from X-T30 to sink
DEBUG: Sending 11 scanned files from X-T30 to sink
INFO: 11 total files scanned on X-T30

There seems to always be a warning for exactly 5 files, unless fewer are present on the camera. I am still trying to investigate this.

Edit:
The first time the program is launched with new files, this happens:

DEBUG: Working on task load_from_bytes for DSCF5177.RAF
WARNING: Could not read metadata from /store_10000001/DCIM/105_FUJI/DSCF5177.RAF. GExiv2: unsupported format (501)
DEBUG: Saving thumbnail for gphoto2://Fuji%20Fujifilm%20X-T30//store_10000001/DCIM/105_FUJI/DSCF5177.RAF in RPD thumbnail cache

It seems like it could be related to Exiv2/exiv2#1416 ?

@damonlynch
Copy link
Owner

That's because scan.py always tries to extract metadata from up to 5 files, so it can determine the device's approach to timezones.

In your case it seems like either your version of exiv2 simply doesn't support the file format revision of your camera, or there is a bug in the use of exiv2.

I'd start by checking if the exiv2 command line binary can read the RAF files your camera produces. Then, take it from there.

@guillaumechauvat
Copy link
Contributor Author

guillaumechauvat commented May 10, 2021

It works if I set all_tags_offset['raf'] to something very large like 8000000. No value below a few million bytes works.
The warnings like

WARNING: Could not read metadata from /store_10000001/DCIM/105_FUJI/DSCF5184.RAF. GExiv2: unsupported format (501)

also disappears if orientation_offset['raf'] is set to a large value.

exiv2 can read the files. The metadata extracted with exiv2 ex varies around 12-13 kB. You can find an example file here if it helps.

@damonlynch
Copy link
Owner

Thanks for your tests. That indicates something like 8MB the file needs to be read in before its metadata can be extracted. It also indicates RAF files have probably changed since I last tested this (I tested every file format I could get my hands on to see how much of the file needed to be read in order to extract its metadata).

As RAW files continue to evolve, this problem of increasingly larger chunks of a file needing to be read before its metadata can be successfully extracted is not a problem that will go away on its own. A more robust solution needs to be developed, in which the program learns to recognize the correct size of the chunk it needs to extract. Are you willing to help develop a proper fix to this bug, rather than a superficial fix?

@guillaumechauvat
Copy link
Contributor Author

I can spend time doing a proper fix, but I would need some context: what is the reason for using open_buf() with just the exif portion of the file, as opposed to using open_path() for example? Is it important for performance?

I have never looked into how raw files were encoded, but I guess the problem might be the presence of a full resolution developed JPEG in the middle of the metadata. That would explain why it requires such a large buffer and why the size varies between pictures. It doesn't seems easy or robust to try to guess in advance how much will be needed. One possible approach could be try doubling the size as long as it fails (or until the buffer is larger than the file size) to estimate the buffer size required, and store the largest value encountered for each camera model / file format in the user configuration.

What do you think?

@damonlynch
Copy link
Owner

My sincere apologies for letting this sit for so long without resolution. For now I have simply adjusted the code to read in 8000000 bytes, as your code change suggests. This is a brute force method, and is far from ideal, but will need to do for now. A better long-term solution would be to investigate using Fuse to temporarily mount the camera file system. I already did the code for this with iOS devices (see raphodo/scan.py), where it is necessary. It seems to be working very well for iOS devices. It may well work just as well for PTP cameras too. The advantage of using Fuse is that there is no need for offsets. The code can simply pretty much treat the file as it would any other a file on a local file system. However there may be disadvantages too, e.g. performance or reliability. I don't know. It needs to be investigated. Thus far I've put my time into other aspects of the code. Is this something you are interested in investigating yourself, or do you already have experience using it?

@damonlynch damonlynch merged commit 1fde469 into damonlynch:main Dec 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants