prevent crash when metadata time is undefined #28

guillaumechauvat · 2021-05-09T14:59:55Z

This fixes a crash on my machine, using a Fuji X-T30. I don't know why mdatatime is None here yet.

damonlynch · 2021-05-09T19:06:17Z

If I understand your patch correctly, this does not fix the root cause, but works around one of the symptoms. Is that correct? If so, then it's better to try to fix the real cause of the problem.

Please attach a complete tar file with the log files, which you can generate following these instructions:

https://damonlynch.net/rapid/documentation/#reportproblem

guillaumechauvat · 2021-05-09T19:27:33Z

That's correct. Still, I was thinking it might be better to do this on top of fixing the root cause.

guillaumechauvat · 2021-05-09T19:34:23Z

Here is the log file.
rpd-bug-report-20210507.tar.gz

With my change, this appears in the the log with --debug:

INFO: Scanning X-T30
DEBUG: Scanning /store_10000001/DCIM on X-T30
INFO: Extracting sample photo metadata for X-T30
DEBUG: https://www.damonlynch.net:443 "GET /rapid/version.json HTTP/1.1" 301 178
DEBUG: Starting new HTTPS connection (1): damonlynch.net:443
WARNING: Scanner failed to load metadata from DSCF5166.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5166.RAF on X-T30
DEBUG: Could not determine Device timezone setting for X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5167.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5167.RAF on X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5168.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5168.RAF on X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5169.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5169.RAF on X-T30
INFO: Extracting sample photo metadata for X-T30
WARNING: Scanner failed to load metadata from DSCF5172.RAF on X-T30
WARNING: Scanner failed to extract date time metadata from DSCF5172.RAF on X-T30
DEBUG: Sending 11 scanned files from X-T30 to sink
DEBUG: Sending 11 scanned files from X-T30 to sink
INFO: 11 total files scanned on X-T30

There seems to always be a warning for exactly 5 files, unless fewer are present on the camera. I am still trying to investigate this.

Edit:
The first time the program is launched with new files, this happens:

DEBUG: Working on task load_from_bytes for DSCF5177.RAF
WARNING: Could not read metadata from /store_10000001/DCIM/105_FUJI/DSCF5177.RAF. GExiv2: unsupported format (501)
DEBUG: Saving thumbnail for gphoto2://Fuji%20Fujifilm%20X-T30//store_10000001/DCIM/105_FUJI/DSCF5177.RAF in RPD thumbnail cache

It seems like it could be related to Exiv2/exiv2#1416 ?

damonlynch · 2021-05-09T19:56:56Z

That's because scan.py always tries to extract metadata from up to 5 files, so it can determine the device's approach to timezones.

In your case it seems like either your version of exiv2 simply doesn't support the file format revision of your camera, or there is a bug in the use of exiv2.

I'd start by checking if the exiv2 command line binary can read the RAF files your camera produces. Then, take it from there.

guillaumechauvat · 2021-05-10T10:34:47Z

It works if I set all_tags_offset['raf'] to something very large like 8000000. No value below a few million bytes works.
The warnings like

WARNING: Could not read metadata from /store_10000001/DCIM/105_FUJI/DSCF5184.RAF. GExiv2: unsupported format (501)

also disappears if orientation_offset['raf'] is set to a large value.

exiv2 can read the files. The metadata extracted with exiv2 ex varies around 12-13 kB. You can find an example file here if it helps.

damonlynch · 2021-05-12T13:56:52Z

Thanks for your tests. That indicates something like 8MB the file needs to be read in before its metadata can be extracted. It also indicates RAF files have probably changed since I last tested this (I tested every file format I could get my hands on to see how much of the file needed to be read in order to extract its metadata).

As RAW files continue to evolve, this problem of increasingly larger chunks of a file needing to be read before its metadata can be successfully extracted is not a problem that will go away on its own. A more robust solution needs to be developed, in which the program learns to recognize the correct size of the chunk it needs to extract. Are you willing to help develop a proper fix to this bug, rather than a superficial fix?

guillaumechauvat · 2021-05-13T09:23:54Z

I can spend time doing a proper fix, but I would need some context: what is the reason for using open_buf() with just the exif portion of the file, as opposed to using open_path() for example? Is it important for performance?

I have never looked into how raw files were encoded, but I guess the problem might be the presence of a full resolution developed JPEG in the middle of the metadata. That would explain why it requires such a large buffer and why the size varies between pictures. It doesn't seems easy or robust to try to guess in advance how much will be needed. One possible approach could be try doubling the size as long as it fails (or until the buffer is larger than the file size) to estimate the buffer size required, and store the largest value encountered for each camera model / file format in the user configuration.

What do you think?

damonlynch · 2021-12-05T16:05:21Z

My sincere apologies for letting this sit for so long without resolution. For now I have simply adjusted the code to read in 8000000 bytes, as your code change suggests. This is a brute force method, and is far from ideal, but will need to do for now. A better long-term solution would be to investigate using Fuse to temporarily mount the camera file system. I already did the code for this with iOS devices (see raphodo/scan.py), where it is necessary. It seems to be working very well for iOS devices. It may well work just as well for PTP cameras too. The advantage of using Fuse is that there is no need for offsets. The code can simply pretty much treat the file as it would any other a file on a local file system. However there may be disadvantages too, e.g. performance or reliability. I don't know. It needs to be investigated. Thus far I've put my time into other aspects of the code. Is this something you are interested in investigating yourself, or do you already have experience using it?

prevent crash when metadata time is undefined

0c966bc

increase buffer size for Fujifilm metadata

35e53f4

Merge branch 'main' into main

8c3716e

damonlynch merged commit 1fde469 into damonlynch:main Dec 5, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

prevent crash when metadata time is undefined #28

prevent crash when metadata time is undefined #28

guillaumechauvat commented May 9, 2021 •

edited

Loading

damonlynch commented May 9, 2021

guillaumechauvat commented May 9, 2021

guillaumechauvat commented May 9, 2021 •

edited

Loading

damonlynch commented May 9, 2021

guillaumechauvat commented May 10, 2021 •

edited

Loading

damonlynch commented May 12, 2021

guillaumechauvat commented May 13, 2021

damonlynch commented Dec 5, 2021

prevent crash when metadata time is undefined #28

prevent crash when metadata time is undefined #28

Conversation

guillaumechauvat commented May 9, 2021 • edited Loading

damonlynch commented May 9, 2021

guillaumechauvat commented May 9, 2021

guillaumechauvat commented May 9, 2021 • edited Loading

damonlynch commented May 9, 2021

guillaumechauvat commented May 10, 2021 • edited Loading

damonlynch commented May 12, 2021

guillaumechauvat commented May 13, 2021

damonlynch commented Dec 5, 2021

guillaumechauvat commented May 9, 2021 •

edited

Loading

guillaumechauvat commented May 9, 2021 •

edited

Loading

guillaumechauvat commented May 10, 2021 •

edited

Loading