Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Commits an alternative Zeiss CZI Reader #4092

Open
wants to merge 20 commits into
base: develop
Choose a base branch
from

Conversation

NicoKiaru
Copy link
Contributor

Hello,

This PR is a draft which ultimate goal would be to rewrite the logic of the CZI reader.

The goal is to achieve a compatilibility as good as the original reader while begin more efficient in terms of initial metadata parsing speed as well as being more memory efficient for big files (Tb).

I clearly do not want to get this merged right away - rather, I'd like to know if you could run some tests on your internal set of files and report if some file types are problematic or not.

Related links:

@NicoKiaru NicoKiaru marked this pull request as draft September 8, 2023 13:20
@dgault dgault marked this pull request as ready for review September 12, 2023 13:32
@dgault dgault added the include label Sep 18, 2023
@dgault
Copy link
Member

dgault commented Sep 19, 2023

@NicoKiaru, I included this PR in the full CI test suite last night, though the initialisation of the tests failed with a number of different exceptions: https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-test-folder/lastFailedBuild/console

I believe the below are all of the unique exceptions that are see:

NullPointerException: null:

   [testng] java.lang.NullPointerException: null
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer$Corner.fromBlocks(ZeissCZIReader.java:3124)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.setSpaceAndTimeInformation(ZeissCZIReader.java:3310)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.initializeMetadata(ZeissCZIReader.java:2127)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1286)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

NullPointerException: null:

   [testng] java.lang.NullPointerException: null
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.setSpaceAndTimeInformation(ZeissCZIReader.java:3256)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.initializeMetadata(ZeissCZIReader.java:2127)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1286)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

IndexOutOfBoundsException: Index 0 out of bounds for length 0:

   [testng] java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
   [testng] 	at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
   [testng] 	at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
   [testng] 	at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
   [testng] 	at java.base/java.util.Objects.checkIndex(Objects.java:372)
   [testng] 	at java.base/java.util.ArrayList.get(ArrayList.java:459)
   [testng] 	at ome.xml.model.OME.getImage(OME.java:680)
   [testng] 	at ome.xml.meta.OMEXMLMetadataImpl.setImageROIRef(OMEXMLMetadataImpl.java:7699)
   [testng] 	at ome.xml.meta.FilterMetadata.setImageROIRef(FilterMetadata.java:1216)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.translateLayers(ZeissCZIReader.java:3827)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.translateMetadata(ZeissCZIReader.java:2320)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.readXMLMetadata(ZeissCZIReader.java:2284)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.initializeMetadata(ZeissCZIReader.java:2116)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1286)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.in.ZeissCZIReader.addSlidePreviewIfExists(ZeissCZIReader.java:1346)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1223)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

IOException: ZISRAWDIRECTORY segment expected, found ZISRAWFILE instead:

   [testng] java.io.IOException: ZISRAWDIRECTORY segment expected, found ZISRAWFILE instead.
   [testng] 	at loci.formats.in.libczi.LibCZI.getSubBlockDirectorySegment(LibCZI.java:123)
   [testng] 	at loci.formats.in.ZeissCZIReader$CZISegments.<init>(ZeissCZIReader.java:1994)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:971)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

ArithmeticException: / by zero:

   [testng] java.lang.ArithmeticException: / by zero
   [testng] 	at loci.formats.in.ZeissCZIReader.setOriginAndSize(ZeissCZIReader.java:1436)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1155)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

EOFException: Attempting to read beyond end of file.:

   [testng] java.io.EOFException: Attempting to read beyond end of file.
   [testng] 	at loci.common.NIOFileHandle.readInt(NIOFileHandle.java:415)
   [testng] 	at loci.common.RandomAccessInputStream.readInt(RandomAccessInputStream.java:564)
   [testng] 	at loci.formats.in.libczi.LibCZI.getBlock(LibCZI.java:416)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.getSubBlockMeta(ZeissCZIReader.java:3076)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.setSpaceAndTimeInformation(ZeissCZIReader.java:3373)
   [testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer.initializeMetadata(ZeissCZIReader.java:2127)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1286)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

IllegalArgumentException: No dimension C found:

   [testng] java.lang.IllegalArgumentException: No dimension C found
   [testng] 	at loci.formats.in.ZeissCZIReader$ModuloDimensionEntries.getDimension(ZeissCZIReader.java:1836)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1270)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.in.ZeissCZIReader.addSlidePreviewIfExists(ZeissCZIReader.java:1346)
   [testng] 	at loci.formats.in.ZeissCZIReader.initFile(ZeissCZIReader.java:1223)
   [testng] 	at loci.formats.FormatReader.setId(FormatReader.java:1496)
   [testng] 	at loci.formats.ImageReader.setId(ImageReader.java:875)

@NicoKiaru
Copy link
Contributor Author

Ok, 35 files not working, and I cound a bit more unique errors - some of them may be multiple (arithmetic exception) because there's no stack trace:

  • -no dim C found:

  • NPE at loci.formats.in.ZeissCZIReader$MetadataInitializer.setSpaceAndTimeInformation(ZeissCZIReader.java:3442)

    • "/data/greg/Experiment-519-export.czi"
    • "/data/greg/Experiment-519.czi"
    • "/data/qa-7601/AxioCamICc5_Gray16_12ValidBits_1C_8Z.czi"
    • "/data/qa-7601/AxioCamICc5_bgr24_24ValidBits_1C_8Z.czi"
    • "/data/qa-7601/AxioCamICc5_bgr48_36ValidBits_1C_8Z.czi"
    • "/data/qa-7601/AxioCamMRc_Gray16_12ValidBits_1C_9Z.czi"
    • "/data/qa-7601/AxioCamMRc_bgr48_36ValidBits_1C_9Z.czi"
  • read beyond EOF at loci.formats.in.libczi.LibCZI.getBlock(LibCZI.java:416)

    • "/data/qa-10301/141016_OP43adult_Worm3_DAPI_head.czi"
  • / by zero at loci.formats.in.ZeissCZIReader.setOriginAndSize(ZeissCZIReader.java:1436)

    • "/data/qa-11007/SCC47 EGF647 LAMP1488 - EGF only 3D with 405_PALM.czi"
    • "/data/qa-11007/SCC47 EGF647 LAMP1488 - LAMP1 only 3D with 405 2_PALM - KAthrin.czi"
  • NPE at loci.formats.in.ZeissCZIReader$MetadataInitializer.setSpaceAndTimeInformation(ZeissCZIReader.java:3173)

    • "/data/qa-11104/Jumping_dom_brain3_slide1.czi"
    • "/data/stephane/AO5.czi"
    • "/data/zeiss/zen-2012/Axio Scan.Z1/Intestine_3color_RAC.czi"
    • "/data/zeiss/zen-2012/Kidney_RAC_3color.czi"
  • java.lang.ArithmeticException: null unspecified

    • "/data/qa-17616/abeta647-dilution1milliontimesnanomolar.czi"
    • "/data/qa-9410/cav_PALM_drift_correct_Convert to Image.czi"
    • "/data/zeiss/ZEN 2.1 black LSM 880 -May 2015/Image 5_PALM_verrechnet.czi"
    • "/data/zeiss/ZEN 2.1 black LSM 880 -May 2015/PALM_OnlineVerrechnet.czi"
    • "/data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/PALM/Palm_mitDrift.czi"
  • ZISRAWDIRECTORY segment expected, found ZISRAWFILE instead at loci.formats.in.libczi.LibCZI.getSubBlockDirectorySegment(LibCZI.java:123)

    • "/data/qa-7271/bioformats_multifile.czi"
    • "/data/qa-7276/Experiment-297.czi"
  • NPE at loci.formats.in.ZeissCZIReader$MetadataInitializer.setSpaceAndTimeInformation(ZeissCZIReader.java:3256)

    • "/data/vito/36bit cast to 48.czi"
    • "/data/vito/36bit.czi"
  • NPE unspecified

    • "/data/vito/lena 10k x 10k bnw 16bit.czi"
    • "/data/vito/lena 10k x 10k bnw 8bit.czi"
  • NPE at loci.formats.in.ZeissCZIReader$MetadataInitializer$Corner.fromBlocks(ZeissCZIReader.java:3124)

    • "/data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/Misc/Channel-ZStack-LineScan-Bidirectional-Averaging.czi"
    • "/data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/ScanModes/LineScan_T3500.czi"
    • "/data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/ScanModes/LineScan_T80_Z25.czi"
    • "/data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/ScanModes/LineScan_Z200.czi"
  • xml issue

    • "/data/zeiss/dvds/wf-disk1/CZI Test WF/CZI-Example Data WF/Annotation/winnt.czi"
  • java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0

    • "/data/zeiss/zen-2012/Young_mouse.czi"

So, it's not perfect but, by looking at these erros, it does not seem impossible. That leaves of course the crucial question of accessing similar files for me. Are there any of these which could be shared ?

I see some Axiocam issue - these kind of files I could try to generate with one of our systems. However there are some PALM dataset that I could not access and I see that they cause errors. The data are from 2012, so for these sort of files I'll clearly have a hard time finding a new sample file.

@dgault dgault added exclude and removed include labels Sep 20, 2023
@dgault
Copy link
Member

dgault commented Sep 22, 2023

I will have to take some time to go through the datasets and see if there is any way we can provide you with a way to reproduce and test, it may take me a few weeks to do so though.

@NicoKiaru
Copy link
Contributor Author

Don't you think there's a possibility to directly share these files ? I will not use them for anything else than fixing bugs, and will not updload them publicly anywhere.

@NicoKiaru
Copy link
Contributor Author

I know you're not super keen on changing the reader, but without the possibility for me to get the failing image, I just can't move forward. I'd just like to point out that these issues:

#4110
#4103
#4102
#3785

(and probably these ones)
#3919
#3790

are not happenning in the quick start reader.

I do not know what's the best way forward, but either:

  • you having a look at the alternative reader
  • or me being able to be able to fix these remaining issues
    are options maybe worth considering ?

@JavascriptMick
Copy link

JavascriptMick commented Oct 23, 2023

Hi Guys,

Thank you so much for working on this issue. FWIW, we went to a lot of trouble to ensure that this file we uploaded (https://zenodo.org/record/8423633) as an example is completely free of I.P and we are happy for it to be public, part of your test suite etc....

@NicoKiaru
Copy link
Contributor Author

Hello @JavascriptMick ,

I'll try to answer in the forum, but I think that what you request is a new feature, and it's not a bug of the reader. I think your file behaves correctly in bio-formats.

There's no way to stitch all scenes as a single image in the reader currently (only all tiles from a scene are stitched). And even if it was, it's a problem on the OMERO side... How to deal with a czi file which can be opened in two different ways ? That breaks somehow the 'immutability' concept of what is an image raw data.

@dgault
Copy link
Member

dgault commented Oct 23, 2023

@NicoKiaru unfortunately the sample files that are currently failing the tests are from files provided private that we are able to share. I realise that it will make it very difficult to move forward with resolving the failures. As a next step I will try and spend some time reviewing the PR over the next week to see if I can resolve the initial failures and unlock the next steps.

Thanks to @JavascriptMick for making that sample data publicly available, that is always a tremendous help. For reference the relevant forum post for that data is https://forum.image.sc/t/zoom-from-overview-to-detailed-scan-for-imported-czi-files/85002/1

Copy link

@JavascriptMick JavascriptMick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Michael Dausmann from CMRI here. I raised an issue in the forums and I think maybe this PR is a result so I spent some time to do a review. Hope you don't mind. I was only able to pickout things that are maybe mistakes rather than really review the structure of what is done here. Thanks again for looking at our issue.

cacheLock.lock();
subBlockLRUCache.clear();
subBlocksCurrentlyLoading.clear();
cacheLock.unlock();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be in a finally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Notes for me later:

lut[1][i] = (byte) (greenMax * (i / 255.0));
lut[2][i] = (byte) (blueMax * (i / 255.0));
}
public static boolean allowAutoStitch = false; // TODO CHANGE THIS TO METADATA OPTIONS!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the relationship between 'allowAutoStitch' and 'allowAutoStitching' ? seems like there might be room for confusion...should this be allowAutoStitchingDefault?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. At minimum this should be private. I think I let this field to simplify testing, but that should be changed.

int redOffset = bpp * 2;
int index = 0;
int nloops=buf.length/pixel;
for (int i=0; i<nloops; i++) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't appear to use i at all, is this on purpose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, there's probably a way to loop over index with the proper step


core.add(core_i);
core_i.orderCertain = true;
core_i.dimensionOrder = "XYCZT";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this order seems to disagree with other areas including 'priority' which go XYZCT, is this on purpose?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not sure I understand. Could you elaborate ? Which other areas ?

ByteArrayInputStream s =
new ByteArrayInputStream(xml.getBytes(Constants.ENCODING));
root = parser.parse(s).getDocumentElement();
s.close();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally?

s.close();
public Length getMinPosYInMicrons() {
if (pos.size()==0) {
return null;//new Length(0, UNITS.MICROMETER);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this comment deliberately left in place or the null is for debugging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh!! Good catch... Maybe that's a reason why we get some NPE in the test files. But I'm not sure this case should ever happen. Logging an error would probably be the first thing to test

@NicoKiaru
Copy link
Contributor Author

@JavascriptMick thanks for the modifications. This work is on hold! I'll try not to forget your comments, but maybe the best would be to add them to the other repo where this reader currently is quick-start-czi-reader (biop github, sorry I have no time to put the link now)
@dgault thanks for the headsup. But I realized that a single subblock can contain multiple timepoint (for xt scans) and that breaks the current logic. I need to rewrite some of the logic to handle this...

@dgault dgault added include and removed exclude labels Nov 1, 2023
@snoopycrimecop
Copy link
Member

snoopycrimecop commented Nov 2, 2023

Conflicting PR. Removed from build BIOFORMATS-push#71. See the console output for more details.
Possible conflicts:

  • Upstream changes
    • components/formats-gpl/src/loci/formats/in/ZeissCZIReader.java

--conflicts Conflict resolved in build BIOFORMATS-push#72. See the console output for more details.

@dgault
Copy link
Member

dgault commented Nov 6, 2023

I had started investigating the initial test failures last week but have not yet got them all resolved. Once I have more progress I will likely push some small fixes to this PR. I will be away the rest of the week so this will likely be next week, excluding the PR from CI again until next week.

@dgault dgault added exclude and removed include labels Nov 6, 2023
@dgault dgault added include and removed exclude labels Nov 14, 2023
Small changes to fix some initFile failures
zStep is always initialised as micrometer so no need to attempt unit conversion
@dgault
Copy link
Member

dgault commented Nov 17, 2023

Last 2 commits have reduced the number of test failures (https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-test-folder/lastFailedBuild/console)

Some of the remaining failures seem to be related to PALM data. Temporarily excluding for now, I will try to get back to looking at these remaining failures the week of the 27th.

@dgault dgault added exclude and removed include labels Nov 17, 2023
@dgault
Copy link
Member

dgault commented Feb 22, 2024

I haven't managed to get the 2 failing samples completely working yet, both are multifile datasets. I have pushed a few quick commits that should at least avoid the exceptions in those files and allow the files to initizatialize. Hopefully that will at least let the test suite run and give us an idea of the full test results. These commits can likely be reverted back out after..

Attempt to close open file handles
@dgault
Copy link
Member

dgault commented Feb 23, 2024

Made some progress though still seeing a failure for one dateset due to open file handles (https://merge-ci.openmicroscopy.org/jenkins/job/BIOFORMATS-test-folder/lastFailedBuild/console). I haven't been able to reproduce this with the single dataset in isolation but I've pushed a commit which I hope will tackle it.

@NicoKiaru
Copy link
Contributor Author

I had a quick look, and did not sort by thread, so I guess the order is a bit random. But, in brief, here are some typical issues I found:

  • There are 4 NPEs here:
[testng] 	at loci.formats.in.ZeissCZIReader$MetadataInitializer$Corner.fromSubBlockMeta(ZeissCZIReader.java:3339)
  • Many XML issues
  • This Invalid table name:
   [testng] java.lang.IndexOutOfBoundsException: Invalid table name: Cross_12M_3x4tiles_5%_Meander.czi series_1
   [testng] 	at loci.tests.testng.Configuration.setSeries(Configuration.java:504)
  • Some image names which are different (testUnflattenedImageNames):
   [testng] [2024-02-26 01:05:37,873] [pool-1-thread-6] 	testUnflattenedImageNames: FAILED (Series 0 (got 'Scene #1', expected 'Plate1-Blue-A-14-Scene-3-P2-E3-03.czi #1' or ''))
  • Little differences in plane positions, DeltaT, :
   [testng] [2024-02-26 01:05:40,866] [pool-1-thread-1] 	PlanePositions: FAILED (Z position series 0, plane 2 (expected -3351.2580000000003, actual -3354.7580000000003))
  • Some 'Indexed' issues I have no idea what this means:
   [testng] [2024-02-26 01:05:44,141] [pool-1-thread-9] Initializing /data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/Misc/Channel-3-Channels-ZStack-FrameScan-Integration-And-PhotonCounting.czi: 
   [testng] [2024-02-26 01:05:44,423] [pool-1-thread-9] 	Indexed: FAILED (Series 0 (expected true, actual false))
  • testPixelsHashes failures
  • some null PhysicalSizeX
testUnflattenedImageNames: FAILED (Series 3 (got 'label image', expected 'macro image' or 'null'))
  • differences in some SizeX and SizeY!
    etc.

I think many problematic files are provided by Zeiss:

  • /data/zeiss/dvds/wf-disk1/CZI Test WF/CZI-Example Data WF/ApoTome/40_Dual.czi
  • /data/zeiss/jpeg-xr/j_FL_3channel_2scenes_Zstack_EDF_compression85%.czi
  • /data/zeiss/dvds/lsm-disk1/CZI-Example Data LSM/Misc/ztacktimeposition16bit.czi

So maybe I could ask them to put them on Zenodo and have a look.

@dgault
Copy link
Member

dgault commented Feb 26, 2024

Yeah the order is a bit random and can be tricky to parse through. I will put together a full list of the failures and associated files over the next day or two so we can better gauge the scope of files impacted and see what the next steps are. I am going to temporarily exclude this from the CI for now so that we can test some other PR's.

@dgault dgault added exclude and removed include labels Feb 26, 2024
@swg08
Copy link
Contributor

swg08 commented Feb 26, 2024

@NicoKiaru
I found two of those until now. I'm looking a bit longer...
I'll pack up the two if I don't find the third and upload it via Zenodo.

@NicoKiaru
Copy link
Contributor Author

@swg08
Copy link
Contributor

swg08 commented Feb 26, 2024

@dgault
Can we (as ZEISS) release OME from the non-sharing agreement for the files we contributed before?

@NicoKiaru
Copy link
Contributor Author

@swg08 thanks! (I did not list extensively all the failing files, but that's already something to work on)

Can you enlighten me regarding 40_Dual.czi ? The first (and probably other) subblock has a compression value of 121. I have no idea what it refers to, and also the pixel data, when decompressed, gives 3 times the expected size in bytes. That could makes sense if it was rgb, but not even: there's only two channels. Only my reader fails on it (Zen 3.9 and the current bf reader works).

The file seems to be 3 times bigger than needed

@dgault
Copy link
Member

dgault commented Feb 27, 2024

I have put together a list fo all of the failing tests and associated files here (some of the failures that threw exceptions may not have been captured).

Quite a number of the failures relate to datasets which are already public (anything starting with /data/public) and can be found here https://downloads.openmicroscopy.org/images/Zeiss-CZI/

@swg08 I will look into getting you a list of the Zeiss files to review and follow up on the process for potentially making those public

@NicoKiaru
Copy link
Contributor Author

@dgault @swg08 any news regarding "freeing" some of the Zeiss contributed files ?

@dgault
Copy link
Member

dgault commented Apr 8, 2024

Hi @NicoKiaru, apologies for the delay response as I had been away for a few weeks. I have not heard any updates as of yet but I will follow up to see what the status is.

@NicoKiaru
Copy link
Contributor Author

Sure, no problem. I'm put this work on hold for a while, but I may have some time before ELMI 2024 to get back to it and understand a bit more what's going on with the differences in the gsheets.

@imagesc-bot
Copy link

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/incorrect-pixel-scaling-on-czi-image-series/87149/11

@imagesc-bot
Copy link

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/why-is-bioformats-reading-large-oir-files-so-much-slower-than-the-microscope-companion-software-provided-by-olympus/66922/8

@joshmoore
Copy link
Member

@NicoKiaru, a heads up that with David's departure there's going to be reduced capacity for reviews & testing of substantial changes like this one. Happy to use this PR to periodically swap testing on for you, but for transparency, it's unlikely that we will be able to move forward with the reader changes in the immediate future.


@NicoKiaru please try accessing this Zenodo record:

P.S.: @swg08, would you like to additionally submit that to https://zenodo.org/communities/bio-formats ?

@NicoKiaru
Copy link
Contributor Author

Thanks for the update, @joshmoore, and best wishes to you, @dgault!

Regarding this PR (and others, like this one), I understand that resource constraints are a significant issue. However, another major challenge is the strict non-sharing policy on private files, which hinders meaningful contributions from outside the OME team.

Bio-Formats is so central to the community that it should be leveraging its position to encourage more external contributions. Unfortunately, this isn't happening as effectively as it could.

With the adoption of NGFF, where Bio-Formats is doing much of the core conversion work, I'm still puzzled by the lack of a solution to the private files policy. This remains a major blocking point.

@joshmoore
Copy link
Member

However, another major challenge is the strict non-sharing policy on private files, which hinders meaningful contributions from outside the OME team. .. I'm still puzzled by the lack of a solution to the private files policy. This remains a major blocking point.

To be clear, the policy isn't "we won't share testing data". The policy is: "we must have permission to share data". This is why all image.sc posts now ask for an upload to, e.g., https://zenodo.org/communities/bio-formats, so that the permissions (along with a DOI) are explicitly defined up front. (Of course, a @BioImage-Archive link would be equally as welcome.)

This is very much mea culpa from our side, but we've been collecting data since before "FAIR" was a term. All the more kudos to @zenodo et al. for providing such invaluable resources that make this easier today.

We welcome all help to retroactively get permission to share the many filesets we have (as @swg08 did above), but as you can imagine, that's again a large job.

@NicoKiaru
Copy link
Contributor Author

Yes the future looks better, and Zenodo is awesome, but what about these old files ?

  • Then let's drop the support of these old files. When something break somebody has to make it public ?
  • Or let some willing contributor sign a contract where they guarantee they don't share the file ? (something similar to your contribution agreement)
  • Or let an institution (EPFL) be part of the OME github org / consortium so they get access right to these files ?
  • Or give a way to run a job remotely on these files and drop error messages ?

I don't know honestly.

We welcome all help to retroactively get permission to share the many filesets we have

But how could we even help ? We don't know what's in there.

@joshmoore
Copy link
Member

Then let's drop the support of these old files. When something break somebody has to make it public ?

This would be a very substantial deviation from the Bio-Formats ethos.

But how could we even help ? We don't know what's in there.

Unfortunately, I think only the original submitters can help. And as a heads up, members of the team have ever tried to do this retroactively with only small single-digit numbers of filesets being retrievable for a given reader.


I don't have immediate answers to your other questions, but I can tell you they all take time as well! So since that's what we have less of at the moment, I'd ask that you give us a chance to adjust to the new situation, but then let's keep the conversation going.

Thanks as ever for the enthusiasm! 😄

@swg08
Copy link
Contributor

swg08 commented Sep 4, 2024

@NicoKiaru @joshmoore
Hey there, maybe we can help and try to close some gaps regarding the images in a bit of a creative way:
If we would be able to understand what "kind" the images are - e.g. time-series with z-stack of a CellDiscoverer from ZEN 2.x - we could try to get or create some similar dataset.
@NicoKiaru if you could point me to the list again with the files and results and what fails I could also try to figure this out myself. Sorry, too many things running in parallel - I lost track of the topic!

Another idea: @joshmoore we have a tool called czicheck that would be great to be run on the czi files in the OME repository to see if those files "comply to the CZI specification" - testing the files against a greater subset of the specification (general file spec including some metadata).

@swg08
Copy link
Contributor

swg08 commented Sep 4, 2024

@joshmoore
Regarding one Davids last messages:

Hi @NicoKiaru, apologies for the delay response as I had been away for a few weeks. I have not heard any updates as of yet but I will follow up to see what the status is.

What about I hand-over a agreement to enable OME to publish all ZEISS CZI files we added over time? Would that work for OME?

@NicoKiaru
Copy link
Contributor Author

So since that's what we have less of at the moment,

Agree, and that's exactly my point, I'm hoping there's a better way for external contributors to give some of their time.

I can tell you they all take time as well!

Unfortunately, of course. Let's dream and hope that's a small enough energy barrier with a smoother ride from here...

I'd ask that you give us a chance to adjust to the new situation

Yes of course! Best of luck!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants