idr0013-neumann-mitocheck S-BIAD865 #644

Open
will-moore opened this issue Feb 22, 2023 · 49 comments

Comments

@will-moore
Member

idr0013-neumann-mitocheck

@pwalczysko

Reimport still in progress - cancelled once because of a long wait on FILESET_UPLOAD_PREP.
The new import has been in progress since 8 March, also at FILESET_UPLOAD_PREP (with parallel-upload=10).

@will-moore
Member Author

As discussed today, it is probably worth trying to import without chunks, then adding the chunks back by symlinking to the full plate from the ManagedRepository.

This workflow has allowed me to import big plates from idr0125.
In that case, I created a "metadata only" plate (no chunks) by downloading from s3 using a sync command that ignored chunks.

In a single-image case, I recently achieved the same thing by making a copy of the NGFF Image, then deleting chunks
by deleting files by name, e.g. all files named "0": #652 (comment)
If you have files named "0", "1" and "2" you will have to delete each name in turn, although there is probably a way to do it in one command.

Then, import the metadata only Plate. E.g. for idr0125 - 384-well plate, 9 fields per Well - took ~2 hours.

Then, try to view images in the Plate - they should appear as black.

Then you can delete the metadata-only plate in the Managed Repository and replace it with a symlink to the full plate.
In the case of idr0125 I was able to do this by running https://github.com/IDR/idr0125-way-cellpainting/blob/main/scripts/symlinks.bash as the omero-server user:

sudo -u omero-server -s
symlinks.bash

But on pilot-idrtesting I needed to use a different user to do the delete and symlinking: #652 (comment)
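
The delete-and-symlink step can be sketched roughly as below. This is not the linked symlinks.bash script, only a minimal illustration of the idea; the function name `replace_with_symlink` and both paths are hypothetical, and it has to run as a user with write access to the Managed Repository.

```shell
#!/usr/bin/env bash
# replace_with_symlink MANAGED FULL
#   MANAGED: metadata-only plate directory in the ManagedRepository (hypothetical path)
#   FULL:    full plate, including chunks (hypothetical path)
replace_with_symlink() {
  local managed="$1" full="$2"
  # Refuse to delete anything unless the full plate is really there.
  [ -d "$full" ] || { echo "full plate not found: $full" >&2; return 1; }
  # Only replace a real directory, never an existing symlink.
  [ -d "$managed" ] && [ ! -L "$managed" ] || { echo "not a plain directory: $managed" >&2; return 1; }
  rm -rf "$managed" && ln -s "$full" "$managed"
}
```

Checking that the full plate exists before the `rm -rf` means a mistyped path cannot leave the Managed Repository without either copy.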

@pwalczysko

Thanks @will-moore

ls -lah LT0008_31.ome.zarr/
total 36K
drwxrwxr-x. 19 dlindner dlindner 191 Feb 16 12:26 .
drwxrwxr-x.  3 dlindner dlindner  94 Feb 16 12:02 ..
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 A
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 B
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 C
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 D
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 E
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 F
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 G
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 H
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 I
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 J
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 K
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 L
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 M
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 N
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 O
drwxrwxr-x.  2 dlindner dlindner  60 Feb 16 12:02 OME
drwxrwxr-x. 26 dlindner dlindner 252 Feb 16 12:26 P
-rw-rw-r--.  1 dlindner dlindner 31K Feb 16 12:26 .zattrs
-rw-rw-r--.  1 dlindner dlindner  23 Feb 16 12:02 .zgroup

So would you recommend deleting all the A-P files?

@will-moore
Member Author

will-moore commented Apr 3, 2023

No, those A-P are directories that contain important files etc. You only want to delete the chunks, which are files named 0, 1 etc.

You can list them with e.g.

find . -type f -name '0'

count them:

find . -type f -name '0' | wc -l

@will-moore
Member Author

And only delete the chunks from a copy of the Plate - Don't delete the originals.

Delete chunks with e.g:

sudo find . -type f -name '0' -delete
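
If the plate has chunk files with several different names (0, 1, 2, ...), copying and deleting can be combined in one pass. The sketch below assumes that chunk files are exactly the regular files whose names consist only of digits, and that no metadata file is named that way; `strip_chunks` is a made-up helper name.

```shell
#!/usr/bin/env bash
# strip_chunks SRC DEST: copy the plate SRC to DEST, then delete chunk files
# from the copy only. SRC is never modified.
strip_chunks() {
  local src="$1" dest="$2"
  # Do not copy into an existing path (cp -r would nest SRC inside it).
  [ ! -e "$dest" ] || { echo "destination exists: $dest" >&2; return 1; }
  cp -r "$src" "$dest" || return 1
  # Regular files with all-digit names: start with a digit and contain no
  # non-digit. Directories named 0, 1, ... (wells, fields) are kept by -type f.
  find "$dest" -type f -name '[0-9]*' ! -name '*[!0-9]*' -delete
}
```

For example, `strip_chunks LT0008_31.ome.zarr LT0008_31-nochunks.ome.zarr` would leave the .zattrs, .zgroup and .zarray files and the OME directory in place in the copy.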

@pwalczysko

After following the workflow suggested by @will-moore I get a "no imports found" response. I deleted the chunks with:

sudo find -type f -name '0' -delete
sudo find -type f -name '1' -delete

Then tried

  1. Pointing the importer at the .../OME/METADATA... file: omero import --parallel-upload=10 --transfer=ln_s --skip=all --depth 10 --name "idr0013-nochunks" /data/ngff/idr0013/LT0008_31.ome.zarr-copy/OME/METADATA.ome.xml --file /tmp/idr0013-nochun.log --errs /tmp/idr0013-nochun.err
  2. Pointing the importer at the whole copied and trimmed folder: omero import --parallel-upload=10 --transfer=ln_s --skip=all --depth 10 --name "idr0013-nochunks" /data/ngff/idr0013/LT0008_31.ome.zarr-copy

Both attempts above end in no imports found

@will-moore
Member Author

@pwalczysko it might be that the plate name has to end with .zarr extension?
Also, I presume that --depth 10 and --depth=10 are the same?

@pwalczysko

it might be that the plate name has to end with .zarr extension?

Indeed, thank you @will-moore , this did the trick. The data are now imported as http://localhost:1080/webclient/?show=plate-253 (idr0013-nochunks). Also, I have replaced the file in the ManagedRepo as instructed with the symlink to the original chunks and the images in the plate http://localhost:1080/webclient/?show=plate-253 are displaying correctly in iviewer, the timelapse is playing okay too.

@will-moore
Member Author

Looks great! I adjusted rendering settings and "Saved to all" so the thumbnails are clearer - they all regenerated fine 👍

@will-moore
Member Author

Try to guess how much space is needed for conversion.
Raw data is 8bit (1 byte per pixel), single Z & C timelapse

ScreenA 1344 x 1024 x 93 x 384 x 510 plates = 25TB
ScreenB 25 plates (slightly sparse) ~ 1.2 TB
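
The ScreenA figure can be reproduced with shell arithmetic. The reading of the factors (width x height x 93 time points x 384 wells x 510 plates, at 1 byte per pixel) is an interpretation of the line above, and `estimate_bytes` is a made-up helper name.

```shell
#!/usr/bin/env bash
# estimate_bytes WIDTH HEIGHT TIMEPOINTS WELLS PLATES
# Uncompressed size in bytes for 8-bit data (1 byte per pixel), single Z and C.
estimate_bytes() {
  echo $(( $1 * $2 * $3 * $4 * $5 ))
}

bytes=$(estimate_bytes 1344 1024 93 384 510)
echo "$bytes bytes"                        # 25065915678720 bytes
echo "$(( bytes / 1000000000000 )) TB"     # 25 TB (decimal terabytes, rounded down)
```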

@dominikl
Member

dominikl commented Jul 11, 2023

On pilot-zarr2-dev:

Converting one plate takes ~30 min, zipping ~50 min (without compression 7 min!). Converted plate size 36 GB, zipped 28 GB.

7zip (p7zip): 5 min (also 28 GB), (without compression 4 min)

There are 538 plates in total.

@dominikl
Member

Created batch directories of 10 plates each under /data/ngff/idr0013. Trying to do 10 conversions and 10 zip/upload/delete runs at a time, due to the disk space limitation.

@dominikl
Member

dominikl commented Jul 14, 2023

For conversion:

cd /data/ngff/idr0013/batch_XX
for i in `cat ../batch_XX.txt`; do ~/bioformats2raw/bin/bioformats2raw --memo-directory ../../memo /uod/idr/metadata/idr0013-neumann-mitocheck/screens/$i ${i%.*}.ome.zarr; done
# Note: The input file batch_XX.txt is one directory up in /data/ngff/idr0013 !

For zipping:
Each batch directory contains a zip.sh which zips and deletes the original if successful

cd /data/ngff/idr0013/batch_XX
for i in `ls | grep zarr`; do ./zip.sh $i; done

For upload:

mv *.zip idr0013/   # each batch dir already has an empty idr0013 subdir
ascp -P33001 -i ~/.aspera/cli/etc/asperaweb_id_dsa.openssh -d idr0013 [email protected]:<SECRET_DIDR>

Then append to idr0013_files.tsv and delete:

ls idr0013 >> ../idr0013_files.tsv
rm idr0013/*.zip

@dominikl
Member

Failing plate:

(base) [dlindner@pilot-zarr2-dev batch_3]$ ~/bioformats2raw/bin/bioformats2raw --memo-directory ../../memo  /uod/idr/metadata/idr0013-neumann-mitocheck/screens/LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.screen LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.ome.zarr
OpenJDK 64-Bit Server VM warning: You have loaded library /tmp/opencv_openpnp3633973597553018286/nu/pattern/opencv/linux/x86_64/libopencv_java342.so which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" picocli.CommandLine$ExecutionException: Error while calling command (com.glencoesoftware.bioformats2raw.Converter@63a65a25): java.lang.NullPointerException
        at picocli.CommandLine.executeUserObject(CommandLine.java:1962)
        at picocli.CommandLine.access$1300(CommandLine.java:145)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
        at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:2172)
        at picocli.CommandLine.parseWithHandlers(CommandLine.java:2550)
        at picocli.CommandLine.parseWithHandler(CommandLine.java:2485)
        at picocli.CommandLine.call(CommandLine.java:2761)
        at com.glencoesoftware.bioformats2raw.Converter.main(Converter.java:2192)
Caused by: java.lang.NullPointerException
        at ome.xml.meta.OMEXMLMetadataImpl.getWellSampleImageRef(OMEXMLMetadataImpl.java:5205)
        at com.glencoesoftware.bioformats2raw.Converter.hasValidPlate(Converter.java:2055)
        at com.glencoesoftware.bioformats2raw.Converter.convert(Converter.java:604)
        at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:516)
        at com.glencoesoftware.bioformats2raw.Converter.call(Converter.java:107)
        at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
        ... 9 more

I guess there will be more. I'll start and append to this list here to keep track of them:

  • LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.screen

@dominikl
Member

Wrapped it all into one script:

#!/bin/bash

# Usage: ./run.sh screens.txt log.txt

# Disable all output (send stdout to /dev/null first, then point stderr at the same place)
exec 1>/dev/null 2>&1

for i in `cat $1`;
do
	date >> $2
	echo "Converting $i" >> $2
	zarr_file=${i%.*}.ome.zarr
	~/bioformats2raw/bin/bioformats2raw --memo-directory /data/ngff/memo /uod/idr/metadata/idr0013-neumann-mitocheck/screens/$i $zarr_file
	if [ $? -eq 0 ]
	then
		echo "Zipping ${zarr_file}" >> $2
		7za -mmt8 a ${zarr_file}.zip ${zarr_file}
		if [ $? -eq 0 ]
		then
			rm -rf ${zarr_file}
			mv ${zarr_file}.zip idr0013/
			echo "Uploading ${zarr_file}.zip" >> $2
			ascp -P33001 -i ~/.aspera/cli/etc/asperaweb_id_dsa.openssh -d idr0013 [email protected]:/<SECRET_DIR>
			if [ $? -eq 0 ]
			then
				echo ${zarr_file}.zip >> files.tsv
				rm idr0013/${zarr_file}.zip
			else
				echo "ERR Upload failed." >> $2
			fi
		else
			echo "ERR Zipping failed." >> $2
		fi
	else
		echo "ERR Converting failed." >> $2
	fi
done

It's running now in three sessions (screens) in /data/ngff/idr0013_new/run_1, run_2 and run_3 (there is a run_4 as well, but that might be a bit too much).

@dominikl
Member

This is currently doing 3 conversions in a bit more than an hour. So should all be done in ~8 days.

@dominikl
Member

dominikl commented Aug 1, 2023

Finished. Only LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.screen failed conversion (see above).

@dominikl dominikl removed their assignment Aug 1, 2023
@dominikl
Member

dominikl commented Aug 2, 2023

Really finished now, exported the LT0012_29 plate with omero-cli-zarr (LT0012_29.ome.zarr.zip).

@will-moore
Member Author

will-moore commented Aug 28, 2023

Looking into submission error with file names in idr0013_files.tsv.

It looks like the problem is that each row is missing the directory prefix, i.e. it should start with idr0013/...
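
One way to add the missing prefix is sketched below. It assumes each data row is a bare file name; the `add_prefix` helper and the handling of a first line equal to "Files" as a header are assumptions for illustration, not something taken from the submission tooling.

```shell
#!/usr/bin/env bash
# add_prefix FILE PREFIX: print FILE with PREFIX prepended to every data row.
# Rows that already start with PREFIX are printed unchanged.
add_prefix() {
  awk -v p="$2" '
    NR == 1 && $0 == "Files" { print; next }   # keep an assumed header row
    NF == 0 { next }                            # drop empty lines
    { print (index($0, p) == 1 ? $0 : p $0) }
  ' "$1"
}
```

For example, `add_prefix idr0013_files.tsv idr0013/ > idr0013_files.fixed.tsv` writes the corrected list to a new file, so the original can be compared before it is replaced.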

But I also noticed a zip called LT0012_29.ome.zarr.zip which looks wrong (different from the others).
Now I see above that this was generated via omero-cli-zarr so that it matches the Plate name in IDR, whereas all the others have much longer names.

To try and make this consistent with the others, I downloaded it (via web page), renamed it and uploaded via Aspera...

$ ./ascp -P33001 -i ../etc/asperaweb_id_dsa.openssh -d ~/Downloads/LT0012_29--ex2005_06_10--sp2005_04_08--tt16--c3.ome.zarr.zip [email protected]:/5f/136e8d-e575-4755-9ac2-aa7fc10cae67-a26596/idr0013/

Checked on https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0013 that the file sizes of renamed file matched the old file, then deleted LT0012_29.ome.zarr.zip.

Uploaded the new idr0013_files.tsv.

@will-moore
Member Author

Failed with ResourceError. Checked Blitz logs:

2023-09-20 11:23:14,170 DEBUG [                   loci.formats.Memoizer] (l.Server-6) start[1695208695185] time[298985] tag[loci.formats.Memoizer.setId]
2023-09-20 11:23:14,171 ERROR [         ome.io.bioformats.BfPixelBuffer] (l.Server-6) Failed to instantiate BfPixelsWrapper with /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr/OME/METADATA.ome.xml
2023-09-20 11:23:14,172 ERROR [                ome.io.nio.PixelsService] (l.Server-6) Error instantiating pixel buffer: /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr/OME/METADATA.ome.xml
java.lang.RuntimeException: java.io.IOException: Path '/bia-integrator-data/S-BIAD865/011c38fb-c3d0-4d1d-82d8-9147a5060d88/011c38fb-c3d0-4d1d-82d8-9147a5060d88.zarr/M/23' is not a valid path or not a directory.
        at ome.io.bioformats.BfPixelBuffer.reader(BfPixelBuffer.java:79)
        at ome.io.bioformats.BfPixelBuffer.setSeries(BfPixelBuffer.java:124)
        at ome.io.nio.PixelsService.createBfPixelBuffer(PixelsService.java:898)

/M/23 is a missing Well for this plate, so we shouldn't be trying to read from that dir.

@will-moore
Member Author

Viewing a different Plate from idr0004 with missing Wells gives same error:

        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Path '/bia-integrator-data/S-BIAD867/103d9428-b86b-4f4e-84d8-966b5d89aae1/103d9428-b86b-4f4e-84d8-966b5d89aae1.zarr/A/1' is not a valid path or not a directory.
        at com.bc.zarr.ZarrUtils.ensureDirectory(ZarrUtils.java:158)
        at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:95)
        at com.bc.zarr.ZarrGroup.open(ZarrGroup.java:88)

@will-moore
Member Author

will-moore commented Sep 20, 2023

To see if a non-sparse Plate would work, updated one with:

$ psql -U omero -d idr -h $DBHOST -f 18460.sql 
UPDATE 384
BEGIN
 mkngff_fileset 
----------------
        6312002
(1 row)
COMMIT

http://localhost:1080/webclient/?show=well-802140

... but this failed due to goofys:
#671 (comment)

@will-moore
Member Author

Goofys failed again, when re-running mkngff sql...

  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero_mkngff/__init__.py", line 185, in sql
    if not symlink_path.exists():
  File "/usr/lib64/python3.6/pathlib.py", line 1336, in exists
    self.stat()
  File "/usr/lib64/python3.6/pathlib.py", line 1158, in stat
    return self._accessor.stat(self)
  File "/usr/lib64/python3.6/pathlib.py", line 387, in wrapped
    return strfunc(str(pathobj), *args)
OSError: [Errno 107] Transport endpoint is not connected: '/bia-integrator-data/S-BIAD865/ffe4bcd6-a5dd-4c7f-ace2-751f67921207/ffe4bcd6-a5dd-4c7f-ace2-751f67921207.zarr'

@will-moore
Member Author

A big problem with goofys failing (twice above) is that we need to restart the server to re-mount, and this means that previously generated sql becomes invalid because a different $SECRET is generated.

Need to move to a workflow of creating and executing the sql immediately...

for row in csv:
  omero mkngff sql > fileset.sql
  psql -f fileset.sql

@will-moore
Member Author

for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3)
  omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
  psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
done

@will-moore
Member Author

http://localhost:1080/webclient/?show=well-802140 eventually viewable...

$ grep -A 2 "22.251_mkngff/04c70c80" /opt/omero/server/OMERO.server/var/log/Blitz-0.log | grep -A 2 "saved memo"
2023-09-20 15:27:16,224 DEBUG [                   loci.formats.Memoizer] (l.Server-9) saved memo file: /data/OMERO/BioFormatsCache/data/OMERO/ManagedRepository/demo_2/2016-04/30/15-54-22.251_mkngff/04c70c80-bc2e-4210-a21f-d2f02108b829.zarr/OME/.METADATA.ome.xml.bfmemo (529578 bytes)
2023-09-20 15:27:16,224 DEBUG [                   loci.formats.Memoizer] (l.Server-9) start[1695222972274] time[663949] tag[loci.formats.Memoizer.setId]
2023-09-20 15:27:16,224 INFO  [                ome.io.nio.PixelsService] (l.Server-9) Creating BfPixelBuffer: /data/OMERO/ManagedRepository/demo_2/2016-04/30/15-54-22.251_mkngff/04c70c80-bc2e-4210-a21f-d2f02108b829.zarr/OME/METADATA.ome.xml Series: 0

663949 ms is 11 minutes

@will-moore
Member Author

will-moore commented Sep 21, 2023

mkngff sql failed again with goofys mount

Got about 40 complete - most others are 0 bytes.

$ ls -alh idr0013 | grep "r 4" 
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 14:41 18376
.sqlr--r--.  1 omero-server omero-server 484K Sep 20 14:29 18379
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 14:19 18392
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 14:38 18421
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 14:51 18456
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 14:16 18460
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 15:34 18476
.sqlr--r--.  1 omero-server omero-server 463K Sep 20 15:37 18478
.sqlr--r--.  1 omero-server omero-server 482K Sep 20 14:25 18532
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 15:59 18533
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 15:49 18538
.sqlr--r--.  1 omero-server omero-server 484K Sep 20 15:11 18543
.sqlr--r--.  1 omero-server omero-server 484K Sep 20 15:21 18545
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:32 18561
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:07 18562
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:35 18567
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:31 18598
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:13 18654
.sqlr--r--.  1 omero-server omero-server 478K Sep 20 15:46 18660
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:43 18667
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:22 18704
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:54 18705
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:25 18717
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 13:58 18727
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:44 18729
.sqlr--r--.  1 omero-server omero-server 486K Sep 20 15:18 18735
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:06 18741
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:57 18749
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 13:55 18761
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:53 18813
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:56 18822
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:10 18838
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:28 18840
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:47 18841
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:00 18852
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:15 18911
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:04 18914
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 14:01 18933
.sqlr--r--.  1 omero-server omero-server 481K Sep 20 15:40 18935
.sqlr--r--.  1 omero-server omero-server 433K Sep 20 15:03 22203

Kinda painful to pick up where we left off with mkngff sql, since we don't have a good way to skip all the filesets that have been successfully processed.
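
One likely source of the 0-byte files is that `cmd > file` creates the file before the command runs, so a failed run still leaves an empty .sql behind. A possible guard, sketched with a made-up helper `run_to_file`, is to write to a temporary file and only move it into place when the command succeeds with non-empty output:

```shell
#!/usr/bin/env bash
# run_to_file OUT CMD [ARGS...]
# Run CMD and store its stdout in OUT, but only if CMD exits 0 and prints
# something. On failure OUT is not created and the function returns 1.
run_to_file() {
  local out="$1" tmp
  shift
  tmp=$(mktemp "${out}.XXXXXX") || return 1
  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Possible use inside the loop over the csv rows (not run here):
#   [ -s "$IDRID/$fsid.sql" ] && continue    # already generated, skip
#   run_to_file "$IDRID/$fsid.sql" omero mkngff sql ... || break
```

With this, any .sql file that exists is complete, so a re-run after a goofys failure can skip on file existence alone.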

Updated omero-mkngff with IDR/omero-mkngff@a2d0aee
So now we output nothing if we have previously successfully generated sql output (as known by the existence of the symlink_dir in managed repo, which is now created after sql output).

Now we just need to update the command to append to the sql file instead of writing to it, to avoid overwriting the existing files.

We also want to use the old SECRET from those existing sql files, so that the new ones are the same and we can do a global replace when needed.

export SECRET=b76bb9c5-92b7-42c7-809e-97c808b4598a
for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3)
  omero mkngff sql --symlink_repo /data/OMERO/ManagedRepository --secret=$SECRET $fsid "/bia-integrator-data/$biapath/$uuid.zarr" >> "$IDRID/$fsid.sql"
done

Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/09/05-00-41.632 for fileset 18761
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632
Symlink dir exists at /data/OMERO/ManagedRepository/demo_2/2016-05/09/05-00-41.632_mkngff - skipping sql output
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/08/16-44-06.910 for fileset 18727
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/16-44-06.910
Symlink dir exists at /data/OMERO/ManagedRepository/demo_2/2016-05/08/16-44-06.910_mkngff - skipping sql output
...
# last fileset where symlink found - NB: this probably didn't output sql before!
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-04/30/22-03-36.052 for fileset 18469
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-04/30/22-03-36.052
Symlink dir exists at /data/OMERO/ManagedRepository/demo_2/2016-04/30/22-03-36.052_mkngff - skipping sql output

# first fileset to generate sql in this round...
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix demo_2/2016-05/01/05-35-47.122 for fileset 18479
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/01/05-35-47.122

@will-moore
Member Author

Needed another server restart to re-mount goofys...
Re-ran again as above...
First fileset of this round 18800...

@will-moore
Member Author

Needed another server restart to re-mount goofys...
Re-ran again as above...
First fileset of this round 18386...

@will-moore
Member Author

Since running the mkngff for this and idr0016 at the same time on idr-testing is causing goofys issues, going to pause on this one now until idr0016 is done....

@will-moore
Member Author

Picking up where we left off...
Work out where to start....

for r in $(cat $IDRID.csv); do
  fsid=$(echo $r | cut -d',' -f3)
  ls -alh "$IDRID/$fsid.sql"
done

Kept these 4 rows (no sql exported) and deleted the other completed rows from idr0013.csv on idr-testing:
18761?.sql 18727?.sql 18933?.sql 18469?.sql 18458?.sql

for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3)
  omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/09/05-00-41.632 for fileset: 18761

@will-moore
Member Author

Repeated several times, each time processing 20 - 40 Filesets...

@will-moore
Member Author

Restarted again... seems to be 39 or 40 each time.

(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do   biapath=$(echo $r | cut -d',' -f2);   uuid=$(echo $biapath | cut -d'/' -f2);   fsid=$(echo $r | cut -d',' -f3);   omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"; done
Using session for [email protected]:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-09/20/20-36-59.899 for fileset: 22207
Using session for [email protected]:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/11/09-33-21.804 for fileset: 18867
...

@will-moore
Member Author

Restarted again after another 39...

(venv3) bash-4.2$ for r in $(cat $IDRID.csv); do   biapath=$(echo $r | cut -d',' -f2);   uuid=$(echo $biapath | cut -d'/' -f2);   fsid=$(echo $r | cut -d',' -f3);   omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"; done
Using session for [email protected]:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/07/19-34-24.204 for fileset: 18687
...

@will-moore
Member Author

Need to fix the naming of the sql files. Using fsid=$(echo $r | cut -d',' -f3) includes a trailing carriage return (\r) in the ID if the csv has been downloaded with wget https://raw.githubusercontent.com/IDR/idr-utils/cac35aa0d1731afb5db0ab6b60e10bdf03c591fd/scripts/ngff_filesets/idr0013.csv
We can use | tr -d '[:space:]' to strip this off.

for r in $(cat $IDRID.csv); do
  fsid=$(echo $r | cut -d',' -f3)
  newid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  mv "$IDRID/$fsid.sql" "$IDRID/$newid.sql"
done
mv: cannot stat ‘idr0013/18351\r.sql’: No such file or directory
mv: cannot stat ‘idr0013/18353.sql’: No such file or directory
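
An alternative to stripping whitespace from each field is to remove the carriage returns once, while reading the file. The sketch below assumes the three-column layout used above (name, BIA path, fileset ID); `list_fsids` is a made-up helper name.

```shell
#!/usr/bin/env bash
# list_fsids CSV: print the third column (fileset ID) of every row, with any
# carriage returns from CRLF line endings removed first.
list_fsids() {
  local name biapath fsid
  tr -d '\r' < "$1" | while IFS=, read -r name biapath fsid || [ -n "$name" ]; do
    # The "|| [ -n ... ]" keeps a last row that has no trailing newline.
    if [ -n "$fsid" ]; then
      printf '%s\n' "$fsid"
    fi
  done
}
```

Reading with `IFS=, read -r` also avoids calling `cut` three times per row.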

@will-moore
Member Author

will-moore commented Sep 27, 2023

Check for .zarray files...

for r in $(cat $IDRID.csv); do
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  echo "$IDRID/$fsid.sql $(grep -c 'zarray' $IDRID/$fsid.sql)"
done
idr0013/18761.sql 1520
idr0013/18727.sql 1520
idr0013/18933.sql 1520
idr0013/18914.sql 0
idr0013/18562.sql 0
idr0013/18838.sql 0
idr0013/18654.sql 0
idr0013/18460.sql 0
idr0013/18392.sql 0
idr0013/18704.sql 0
idr0013/18532.sql 0
idr0013/18379.sql 0
idr0013/18561.sql 0
idr0013/18567.sql 0
idr0013/18421.sql 0
idr0013/18376.sql 0
idr0013/18729.sql 0
idr0013/18841.sql 0
idr0013/18456.sql 0
idr0013/18705.sql 0
idr0013/18749.sql 0
idr0013/18852.sql 0
idr0013/22203.sql 0
idr0013/18741.sql 0
idr0013/18823.sql 0
idr0013/18543.sql 0
idr0013/18911.sql 0
idr0013/18735.sql 0
idr0013/18545.sql 0
idr0013/18717.sql 0
idr0013/18840.sql 0
idr0013/18598.sql 0
idr0013/18476.sql 0
idr0013/18478.sql 0
idr0013/18935.sql 0
idr0013/18667.sql 0
idr0013/18660.sql 0
idr0013/18538.sql 0
idr0013/18813.sql 0
idr0013/18822.sql 0
idr0013/18533.sql 0
idr0013/18469.sql 1536
idr0013/18479.sql 0
idr0013/22216.sql 0
idr0013/18906.sql 0
idr0013/22223.sql 0
idr0013/18797.sql 0
idr0013/18352.sql 0
idr0013/18355.sql 0
idr0013/22206.sql 0
idr0013/18578.sql 0
idr0013/18707.sql 0
idr0013/18766.sql 0
idr0013/18855.sql 0
idr0013/18802.sql 0
idr0013/18462.sql 0
idr0013/18601.sql 0
idr0013/18775.sql 0
idr0013/18381.sql 0
idr0013/18800.sql 0
idr0013/18763.sql 0
idr0013/18767.sql 0
idr0013/18915.sql 0
idr0013/18520.sql 0
idr0013/18725.sql 0
idr0013/18777.sql 0
idr0013/18869.sql 0
idr0013/18411.sql 0
idr0013/18512.sql 0
idr0013/18383.sql 0
idr0013/18737.sql 0
idr0013/18839.sql 0
idr0013/18701.sql 0
idr0013/18662.sql 0
idr0013/18833.sql 0
idr0013/18836.sql 0
idr0013/18784.sql 0
idr0013/18472.sql 0
idr0013/18923.sql 0
idr0013/18594.sql 0
idr0013/18529.sql 0
idr0013/18361.sql 0
idr0013/18528.sql 0
idr0013/18747.sql 0
idr0013/18464.sql 0
idr0013/18848.sql 0
idr0013/18765.sql 0
idr0013/18826.sql 0
idr0013/18799.sql 0
idr0013/18661.sql 0
idr0013/18470.sql 0
idr0013/18948.sql 0
idr0013/18864.sql 0
idr0013/18732.sql 0
idr0013/18790.sql 0
idr0013/18953.sql 0
idr0013/18386.sql 0
idr0013/18716.sql 0
idr0013/18787.sql 0
idr0013/18461.sql 0
idr0013/18384.sql 0
idr0013/22227.sql 0
idr0013/18947.sql 0
idr0013/18566.sql 0
idr0013/22222.sql 0
idr0013/18774.sql 0
idr0013/18924.sql 0
idr0013/18391.sql 0
idr0013/18401.sql 0
idr0013/18858.sql 0
idr0013/22204.sql 0
idr0013/18580.sql 0
idr0013/18862.sql 0
idr0013/18490.sql 0
idr0013/18936.sql 0
idr0013/18870.sql 0
idr0013/22211.sql 0
idr0013/18828.sql 0
idr0013/22209.sql 0
idr0013/18754.sql 0
idr0013/18465.sql 0
idr0013/18523.sql 0
idr0013/18670.sql 0
idr0013/18579.sql 0
idr0013/18473.sql 0
idr0013/18958.sql 0
idr0013/18577.sql 0
idr0013/18957.sql 0
idr0013/18463.sql 0
idr0013/18589.sql 0
idr0013/18748.sql 0
idr0013/18359.sql 0
idr0013/18354.sql 0
idr0013/18752.sql 0
idr0013/18454.sql 0
idr0013/18824.sql 0
idr0013/18909.sql 0
idr0013/18542.sql 0
idr0013/18403.sql 0
idr0013/18931.sql 0
idr0013/18695.sql 0
idr0013/18489.sql 0
idr0013/18853.sql 0
idr0013/18718.sql 0
idr0013/18358.sql 0
idr0013/18902.sql 0
idr0013/18771.sql 0
idr0013/18604.sql 0
idr0013/18788.sql 0
idr0013/18491.sql 0
idr0013/18700.sql 0
idr0013/18943.sql 0
idr0013/18683.sql 0
idr0013/18846.sql 0
idr0013/22210.sql 0
idr0013/18803.sql 0
idr0013/18918.sql 0
idr0013/18455.sql 0
idr0013/18521.sql 0
idr0013/18844.sql 0
idr0013/18926.sql 0
idr0013/18863.sql 0
idr0013/18843.sql 0
idr0013/18730.sql 0
idr0013/18920.sql 0
idr0013/18585.sql 0
idr0013/18366.sql 0
idr0013/18458.sql 1536
idr0013/18760.sql 1520
idr0013/18804.sql 1520
idr0013/18574.sql 1520

Edited idr0013.csv to contain just the 163 rows with 0 above. Re-ran...

for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  omero mkngff sql $fsid "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done

Using session for [email protected]:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/12/13-09-25.587 for fileset: 18914
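
The manual edit of idr0013.csv could also be scripted. The sketch below prints only the rows whose sql file has no 'zarray' line (or does not exist yet), assuming the fileset ID is the last comma-separated field; `rows_missing_zarray` is a made-up helper name.

```shell
#!/usr/bin/env bash
# rows_missing_zarray CSV SQLDIR: print the rows of CSV whose SQLDIR/<fsid>.sql
# is missing or contains no line mentioning 'zarray'.
rows_missing_zarray() {
  local csv="$1" dir="$2" row fsid
  tr -d '\r' < "$csv" | while IFS= read -r row || [ -n "$row" ]; do
    [ -n "$row" ] || continue          # skip blank lines
    fsid=${row##*,}                    # text after the last comma
    if ! grep -q 'zarray' "$dir/$fsid.sql" 2>/dev/null; then
      printf '%s\n' "$row"
    fi
  done
}
```

For example, `rows_missing_zarray idr0013.csv idr0013 > idr0013_todo.csv` would produce the reduced list without editing the original csv.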

@will-moore
Member Author

Since idr0138-pilot seems to have much more stable goofys mount, move remaining generation there....

Still to do "idr0013.csv"... on idr0138-pilot... as wmoore user...

idr0013/LT0099_16.ome.zarr,S-BIAD865/2fddf4f4-bbad-490e-9d1a-64f10a911f5f,18716
idr0013/LT0121_09.ome.zarr,S-BIAD865/30078617-8947-451e-b4fc-b5459f8d787d,18787
idr0013/LT0025_54.ome.zarr,S-BIAD865/3068778b-ca4a-409f-8a91-a436aaefd539,18461
idr0013/LT0011_30.ome.zarr,S-BIAD865/3092cb82-f48f-4918-9a2d-a159ff420623,18384
idr0013/LTValidMitosisSon384Plate01_02.ome.zarr,S-BIAD865/3215294d-e302-43e8-a96f-0a0dd44f10a6,22227
idr0013/LT0601_01.ome.zarr,S-BIAD865/32f78fc1-3cb0-4ef5-96ff-a7521a1c5d28,18947
idr0013/LT0066_02.ome.zarr,S-BIAD865/333b0032-273f-470d-be49-b944b4191327,18566
idr0013/LTValidMitosisSon384Plate02_04.ome.zarr,S-BIAD865/33bd6e90-8597-445f-a6a1-6f03216902c1,22222
idr0013/LT0116_47.ome.zarr,S-BIAD865/340f3f55-2286-4fa2-8c01-e049bbd86d5d,18774
idr0013/LT0153_06.ome.zarr,S-BIAD865/34eea383-ae3d-4c39-ad85-127571a58957,18924
idr0013/LT0014_01.ome.zarr,S-BIAD865/350edb2c-befa-4ebd-b130-4f5d88fd18b8,18391
idr0013/LT0016_18.ome.zarr,S-BIAD865/364084b9-d7af-4600-b6e3-0621bd50c563,18401
idr0013/LT0142_01.ome.zarr,S-BIAD865/364309c6-4bd0-469d-ad6b-981cb86ac9c0,18858
idr0013/LTValidMitosisSon384Plate07_01.ome.zarr,S-BIAD865/369313de-98e2-44d7-9362-d9c710ade6dd,22204
idr0013/LT0070_41.ome.zarr,S-BIAD865/36a2e3d5-72e4-4652-af7f-929161e2322d,18580
idr0013/LT0143_02.ome.zarr,S-BIAD865/3726dab9-e7a3-4df2-a594-aaeaa9f94d95,18862
idr0013/LT0033_42.ome.zarr,S-BIAD865/376cff9a-a923-4f19-957e-1c4c644b39c5,18490
idr0013/LT0157_07.ome.zarr,S-BIAD865/381f57c9-d2cc-4e33-a0da-cfff6357d9ae,18936
idr0013/LT0145_02.ome.zarr,S-BIAD865/38689649-4f4c-4983-9840-25e2d5f058a5,18870
idr0013/LTValidMitosisSon384Plate05_02.ome.zarr,S-BIAD865/386a44d6-1132-4d0d-abf7-180764320c63,22211
idr0013/LT0133_19.ome.zarr,S-BIAD865/38e77549-b1d0-4559-918b-85da280e9949,18828
idr0013/LTValidMitosisSon384Plate05_04.ome.zarr,S-BIAD865/3932254b-22f9-487d-80b1-b9c2daa7bf46,22209
idr0013/LT0110_09.ome.zarr,S-BIAD865/39885715-f764-46a1-b045-dd423db83c63,18754
idr0013/LT0026_21.ome.zarr,S-BIAD865/3a0f0b01-39aa-4745-aebd-1719c1796206,18465
idr0013/LT0042_28.ome.zarr,S-BIAD865/3a54eeb7-9e0a-438b-8993-926a9ad10689,18523
idr0013/LT0085_07.ome.zarr,S-BIAD865/3b4f9774-4a00-489d-89a3-0d2aeca87835,18670
idr0013/LT0069_52.ome.zarr,S-BIAD865/3c36a642-5b4f-4c11-8e6d-baa0f4178c9b,18579
idr0013/LT0029_01.ome.zarr,S-BIAD865/3ca89c81-1eea-49ac-b7da-fee5f5f945af,18473
idr0013/LT0603_05.ome.zarr,S-BIAD865/3caeca4e-c69c-4a1a-a98e-bb0f83ee6a0c,18958
idr0013/LT0069_51.ome.zarr,S-BIAD865/3cc6b15c-13b0-417b-a249-57932368b51e,18577
idr0013/LT0603_06.ome.zarr,S-BIAD865/3d461dd5-bec1-43fc-8fc3-2406f1d2bb72,18957
idr0013/LT0025_56.ome.zarr,S-BIAD865/3d4a9c7f-944a-40f0-b872-3da8ff3557ff,18463
idr0013/LT0073_02.ome.zarr,S-BIAD865/3d5ac001-823d-4c7a-83a4-e29c826f81e0,18589
idr0013/LT0108_47.ome.zarr,S-BIAD865/3e550b11-5e87-4587-8b8a-f7653fadab9b,18748
idr0013/LT0003_40.ome.zarr,S-BIAD865/3e7ad301-3cad-413a-9e79-571a691712bf,18359
idr0013/LT0002_02.ome.zarr,S-BIAD865/3e7aeaeb-4de8-42b9-bed3-2c4af89a0bf7,18354
idr0013/LT0110_01.ome.zarr,S-BIAD865/3edb1d3a-91da-48a9-b6a4-592328ea5f1c,18752
idr0013/LT0023_01.ome.zarr,S-BIAD865/40aadbcb-77df-4663-a5f8-29177971b58b,18454
idr0013/LT0132_04.ome.zarr,S-BIAD865/40e83a42-6bc0-4f3b-80f9-80a865ac5424,18824
idr0013/LT0148_37.ome.zarr,S-BIAD865/4251a3eb-043c-4abe-9326-3e2afb9f6e97,18909
idr0013/LT0049_02.ome.zarr,S-BIAD865/427c1e16-5bee-425f-ae65-163a4db18e54,18542
idr0013/LT0016_28.ome.zarr,S-BIAD865/42a137f5-6f48-4873-9d66-fac6367a802b,18403
idr0013/LT0156_07.ome.zarr,S-BIAD865/4399d284-8a5c-47f3-9169-007d2f0cad27,18931
idr0013/LT0093_16.ome.zarr,S-BIAD865/4421634f-208a-4d43-88c8-80b5c8caa056,18695
idr0013/LT0033_11.ome.zarr,S-BIAD865/448ecf99-dba9-4e72-8edb-f8e03453c292,18489
idr0013/LT0140_06.ome.zarr,S-BIAD865/44bff916-3cc4-4f8b-a185-fabcf82b5e01,18853
idr0013/LT0100_09.ome.zarr,S-BIAD865/44c62d0b-c9e8-42e7-97e8-37592f26ba75,18718
idr0013/LT0003_15.ome.zarr,S-BIAD865/44e8361b-2bbd-4f01-ba02-f3333a34a5c4,18358
idr0013/LT0146_06.ome.zarr,S-BIAD865/44f07347-2f3d-4d65-ad1c-6c376577862a,18902
idr0013/LT0116_43.ome.zarr,S-BIAD865/44f3c26d-65e2-43d0-9d2e-75ac4673f210,18771
idr0013/LT0077_01.ome.zarr,S-BIAD865/45eb9b6b-f72f-42cc-b0c8-19923f2c6d92,18604
idr0013/LT0121_37.ome.zarr,S-BIAD865/46b7571c-679a-4aba-ba66-c1e608eb803d,18788
idr0013/LT0034_01.ome.zarr,S-BIAD865/4706dd97-c751-447d-b8ef-6dc9ea68dea7,18491
idr0013/LT0094_44.ome.zarr,S-BIAD865/497cb9a3-4e13-4498-aae5-c4b291515352,18700
idr0013/LT0170_01.ome.zarr,S-BIAD865/4a33abd2-9f15-4ddb-9cc6-faf7cffb4960,18943
idr0013/LT0089_02.ome.zarr,S-BIAD865/4a3ace35-8cb0-459a-8609-c78f99cb79a5,18683
idr0013/LT0138_03.ome.zarr,S-BIAD865/4a96176c-6d36-4ce5-a9d6-ed5cca52cbeb,18846
idr0013/LTValidMitosisSon384Plate05_03.ome.zarr,S-BIAD865/4acc4a36-2066-43ee-9a7f-756733f1e379,22210
idr0013/LT0125_41.ome.zarr,S-BIAD865/4b271b6d-1dd3-4079-9e40-4153e13f56ae,18803
idr0013/LT0151_08.ome.zarr,S-BIAD865/4b390ccd-714f-4452-aae7-5db76302337b,18918
idr0013/LT0023_04.ome.zarr,S-BIAD865/4c512657-5553-41b4-a77c-6df1f562ff05,18455
idr0013/LT0042_10.ome.zarr,S-BIAD865/4c5e7b2b-f19d-4bdd-ae88-1d1bb0c3c869,18521
idr0013/LT0138_01.ome.zarr,S-BIAD865/4f0ab5bc-90f9-474d-8b0d-0f2303f94593,18844
idr0013/LT0154_02.ome.zarr,S-BIAD865/4f84d491-654e-4c1b-b39e-16258cbb7056,18926
idr0013/LT0143_05.ome.zarr,S-BIAD865/4fd3c599-6c09-4a63-bfe8-cc345ea99002,18863
idr0013/LT0137_44.ome.zarr,S-BIAD865/50991552-7af6-40b0-813a-e03bc6590cd1,18843
idr0013/LT0104_04.ome.zarr,S-BIAD865/50be2b3c-b163-4363-9bdd-5be0651f2b03,18730
idr0013/LT0152_04.ome.zarr,S-BIAD865/50f78452-8396-401b-9aeb-d9982ddbca0b,18920
idr0013/LT0072_02.ome.zarr,S-BIAD865/513a062e-2307-40c5-8f6b-57761e9b502f,18585
idr0013/LT0006_10.ome.zarr,S-BIAD865/5147e4d3-bec9-4166-b63a-dbe5f5008f52,18366

@will-moore
Member Author

will-moore commented Nov 14, 2023

The following Images/Filesets were found to be incomplete when regenerating memo files on idr-testing...

On pilot-zarr1-dev, screen

$ screen -r idr0015_ngff
$ cd /data/idr0013
$ conda activate bioformats2raw2
$ for i in LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3 LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4 LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4; do
~/bioformats2raw-0.6.0-24/bin/bioformats2raw --memo-directory /../memo  /uod/idr/metadata/idr0013-neumann-mitocheck/screens/$i.screen $i.ome.zarr; done

Can't seem to read the data...

$ sudo ls /uod/idr/filesets/idr0013-neumann-mitocheck/
ls: cannot open directory /uod/idr/filesets/idr0013-neumann-mitocheck/: Permission denied

EDIT: seems to work when I'm not in that old screen.
Created screen -S idr0013_bf2raw and ran again... 10:35...

@will-moore
Member Author

will-moore commented Nov 14, 2023

Checking that files missing from previous plates are present in newly-generated ones...

This plate was missing Well M/1 before, but it now seems to have the same number of files as the other Wells...

(base) [wmoore@pilot-zarr1-dev idr0013]$ find LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/M/1 -type f | wc
    478     478   36242
(base) [wmoore@pilot-zarr1-dev idr0013]$ find LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/M/2 -type f | wc
    478     478   36242
(base) [wmoore@pilot-zarr1-dev idr0013]$ find LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/A/1 -type f | wc
    478     478   36242

Similar checks with the other plates for .zattrs etc and /A all look good...
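
The per-Well counts above were compared by eye. A minimal sketch of automating that comparison, assuming the `<plate>.ome.zarr/<row>/<column>/` layout seen in the `find` commands; `count_files` and `check_plate_wells` are hypothetical helper names, not part of any tool used here:

```shell
# Sketch: flag Wells whose file count differs from a reference Well.
# Assumes the <plate>.ome.zarr/<row>/<column>/ layout shown above.

count_files() {
  # Print the number of regular files under directory $1.
  find "$1" -type f | wc -l | tr -d '[:space:]'
}

check_plate_wells() {
  # $1: path to a plate; $2: reference Well, e.g. A/1.
  # Prints one line per mismatching Well and returns 1 if any were found.
  plate=$1
  expected=$(count_files "$plate/$2")
  status=0
  for well in "$plate"/*/*/; do
    [ -d "$well" ] || continue
    n=$(count_files "$well")
    if [ "$n" != "$expected" ]; then
      echo "MISMATCH ${well%/}: $n files (expected $expected)"
      status=1
    fi
  done
  return "$status"
}
```

Usage would be something like `check_plate_wells LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr A/1`. Equal counts do not prove the chunks are intact, only that no Well is obviously short.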

Renamed to shorten names...

(base) [wmoore@pilot-zarr1-dev idr0013]$ ls -lh 
total 0
drwxrwxr-x. 19 wmoore wmoore 271 Nov 14 14:37 LT0066_23.ome.zarr
drwxrwxr-x. 19 wmoore wmoore 271 Nov 14 12:18 LT0080_37.ome.zarr
drwxrwxr-x. 19 wmoore wmoore 271 Nov 14 13:32 LT0103_13.ome.zarr
$ for i in $(ls); do zip -r $i.zip $i; done
...

EDIT:
Oops, realised that previous idr0013 plates have full names, not shortened as above. Renamed back to full names and zipped them.

$ md5sum ./*
2dc74001d737bf48841ea4a186391574  LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr.zip
bc35cca08c935c765df6a3d1b1198732  LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr.zip
5ad963825e2e3c5ccc5c2a5060819e7f  LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr.zip

Delete these 3 from https://www.ebi.ac.uk/biostudies/submissions/files?path=%2Fuser%2Fidr0013

Upload...

$ cd .aspera/cli/bin
$ ./ascp -P33001 -i ~/.aspera/cli/etc/asperaweb_id_dsa.openssh -d /data/idr0013/idr0013 [email protected]:/5f/13xxxxx

LT0066_23--ex2005_08_03--sp2005_06_07--tt17--           100%   24GB  128Mb/s    12:05    
LT0080_37--ex2005_07_20--sp2005_07_04--tt17--            100%   25GB  247Mb/s    26:43    
LT0103_13--ex2006_11_22--sp2005_08_16--tt19--               100%   24GB  377Mb/s    48:59 

@will-moore
Member Author

Let's host those 3 plates on our s3 for testing mkngff etc.

$ aws --endpoint-url https://uk1s3.embassy.ebi.ac.uk s3 mb s3://idr0013
make_bucket: idr0013
(base) [wmoore@pilot-zarr1-dev idr0013]$ /home/wmoore/mc cp -r idr0013/ uk1s3/idr0013
...tt19--c4.ome.zarr/P/9/0/3/92/0/0/0/0: 102.77 GiB / 102.77 GiB ━━━━━━━━━━━━━━━ 18.86 MiB/s 1h32m59s

Looking good:

On idr0125-pilot...

ssh -A -o 'ProxyCommand ssh idr-pilot.openmicroscopy.org -W %h:%p' idr0125-omeroreadwrite -L 1080:localhost:80

sudo mkdir /idr0013 && sudo /opt/goofys --endpoint https://uk1s3.embassy.ebi.ac.uk/ -o allow_other idr0013 /idr0013

ls /idr0013
LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr  LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr  LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr

As omero-server user...

idr0013.csv

LT0066_23,LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr,18568
LT0080_37,LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr,18655
LT0103_13,LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr,18728

screen -r mkngff

for r in $(cat $IDRID.csv); do
  zarrpath=$(echo $r | cut -d',' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  omero mkngff sql $fsid --clientpath="https://uk1s3.embassy.ebi.ac.uk/idr0013/$zarrpath" "/idr0013/$zarrpath" > "$IDRID/$fsid.sql"
done
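
The loop above relies on `$(cat ...)` word-splitting and calls `cut` once per field. A sketch of an alternative that reads the three columns directly, assuming the same column order as idr0013.csv (name, zarr path, Fileset ID). `each_csv_row` and `print_mkngff_sql_cmd` are hypothetical helpers, and the second one only prints the command as a dry run:

```shell
# Sketch: read the CSV columns with IFS=, instead of cut.
# Column order (name, zarr path, Fileset ID) follows idr0013.csv above.

each_csv_row() {
  # $1: CSV file; $2: name of a function called as: fn <zarrpath> <fsid>
  csv=$1
  fn=$2
  while IFS=, read -r _name zarrpath fsid || [ -n "$_name" ]; do
    fsid=$(printf '%s' "$fsid" | tr -d '[:space:]')  # drop stray CR/spaces
    [ -n "$fsid" ] || continue                       # skip blank lines
    # Redirect stdin so the called command cannot consume the CSV.
    "$fn" "$zarrpath" "$fsid" < /dev/null
  done < "$csv"
}

print_mkngff_sql_cmd() {
  # Dry run: print the command from the loop above rather than running it.
  echo "omero mkngff sql $2 --clientpath=https://uk1s3.embassy.ebi.ac.uk/idr0013/$1 /idr0013/$1"
}
```

`each_csv_row idr0013.csv print_mkngff_sql_cmd` would then print one command per row for review before anything is run.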

@will-moore
Member Author

will-moore commented Jan 3, 2024

Check sql output - all have .zarr/.zattrs...

(venv3) (base) bash-4.2$ for i in 18568.sql 18655.sql 18728.sql; do echo $i; cat $i | grep ".zarr/.zattrs" | wc; cat $i | grep ".zattrs" | wc; done
18568.sql
      1       4     258
    762    3048  205148
18655.sql
      1       4     258
    762    3048  205148
18728.sql
      1       4     258
    738    2952  198688
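
In the check above, `grep ".zarr/.zattrs"` treats each `.` as a regex wildcard. A sketch of the same count using fixed-string matching, so only the literal text is counted; `count_matches` and `report_zattrs` are hypothetical helper names:

```shell
# Sketch: count matching lines per SQL file with fixed-string grep.
# grep -F stops "." from matching any character; -c prints the line count.

count_matches() {
  # $1: fixed string to look for; $2: file. Prints the number of matching lines.
  grep -cF -- "$1" "$2"
}

report_zattrs() {
  # Print "<file> <plate-level .zattrs rows> <all .zattrs rows>" per file.
  for f in "$@"; do
    echo "$f $(count_matches '.zarr/.zattrs' "$f") $(count_matches '.zattrs' "$f")"
  done
}
```

`report_zattrs 18568.sql 18655.sql 18728.sql` gives one line per file, which is easier to scan than three `wc` triples.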

$ less 18568.sql...

UPDATE pixels SET name = 'METADATA.ome.xml', path = 'demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/OME' where image in (select id from Image where fileset = 18568);

begin;
    select mkngff_fileset(
      18568,
      'SECRETUUID',
      'cdf35825-def1-4580-8d0b-9c349b8f78d6',
      'demo_2/2016-05/03/23-33-31.705_mkngff/',
      array[
          ['demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/', '.zattrs', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/.zattrs'],
          ['demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/', '.zgroup', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/.zgroup'],
          ['demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/A/', '.zgroup', 'application/octet-stream', 'https://uk1s3.embassy.ebi.ac.uk/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr/A/.zgroup'],
...

Updated SECRET to 9630ba1e-ed3a-42e3-9296-xxxxxxxx then ran

for r in $(cat $IDRID.csv); do
  zarrpath=$(echo $r | cut -d',' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
  omero mkngff symlink /data/OMERO/ManagedRepository "/idr0013/$zarrpath" --bfoptions
done

UPDATE 380
BEGIN
 mkngff_fileset
----------------
        5289227
(1 row)

COMMIT
usage: /opt/omero/server/venv3/bin/omero mkngff symlink [-h] [--bfoptions]
                                                        symlink_repo
                                                        fileset_id
                                                        symlink_target
/opt/omero/server/venv3/bin/omero mkngff symlink: error: argument fileset_id: invalid int value: '/idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr'
UPDATE 380
BEGIN
 mkngff_fileset
----------------
        5289228
(1 row)

COMMIT
usage: /opt/omero/server/venv3/bin/omero mkngff symlink [-h] [--bfoptions]
                                                        symlink_repo
                                                        fileset_id
                                                        symlink_target
/opt/omero/server/venv3/bin/omero mkngff symlink: error: argument fileset_id: invalid int value: '/idr0013/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr'
UPDATE 368
BEGIN
 mkngff_fileset
----------------
        5289229
(1 row)

COMMIT
usage: /opt/omero/server/venv3/bin/omero mkngff symlink [-h] [--bfoptions]
                                                        symlink_repo
                                                        fileset_id
                                                        symlink_target
/opt/omero/server/venv3/bin/omero mkngff symlink: error: argument fileset_id: invalid int value: '/idr0013/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr'

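Oops: the symlink command above was missing the Fileset ID argument, so the zarr path was parsed as fileset_id.
Re-ran symlinks, this time passing $fsid...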

$ for r in $(cat $IDRID.csv); do
>   zarrpath=$(echo $r | cut -d',' -f2)
>   fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
>   echo $zarrpath
>   echo $fsid
>   omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/idr0013/$zarrpath" --bfoptions
> done
LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
18568
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr -> /idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr.bfoptions
LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr
18655
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr -> /idr0013/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/LT0080_37--ex2005_07_20--sp2005_07_04--tt17--c4.ome.zarr.bfoptions
LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr
18728
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr -> /idr0013/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/LT0103_13--ex2006_11_22--sp2005_08_16--tt19--c4.ome.zarr.bfoptions
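
The first symlink attempt failed because the zarr path landed in the fileset_id position. A sketch of a guard that would catch that before calling `omero`, assuming Fileset IDs are always plain integers; `is_fileset_id` and `symlink_cmd` are hypothetical helpers, and `symlink_cmd` only prints the command:

```shell
# Sketch: refuse to build the symlink command unless the ID is a plain integer.

is_fileset_id() {
  # Succeeds only for a non-empty string made of digits.
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

symlink_cmd() {
  # $1: Fileset ID; $2: zarr path. Prints the command; returns 1 on a bad ID.
  if ! is_fileset_id "$1"; then
    echo "not a Fileset ID: $1" >&2
    return 1
  fi
  echo "omero mkngff symlink /data/OMERO/ManagedRepository $1 $2 --bfoptions"
}
```

The argument order (repo, Fileset ID, target) follows the usage message printed by the failed runs above.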

Fileset info looks good...

(base) [wmoore@pilot-idr0125-omeroreadwrite ~]$ ls -alh /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
total 12K
drwxr-xr-x.  2 omero-server omero-server  144 Jan  3 11:26 .
drwxr-xr-x. 63 omero-server omero-server 4.0K Jan  3 11:26 ..
lrwxrwxrwx.  1 omero-server omero-server   65 Jan  3 11:26 LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr -> /idr0013/LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr
-rw-r--r--.  1 omero-server omero-server   49 Jan  3 11:26 LT0066_23--ex2005_08_03--sp2005_06_07--tt17--c3.ome.zarr.bfoptions

Checking http://localhost:1080/webclient/?show=image-1556033 - view image....
Looks good.
Other plates: http://localhost:1080/webclient/?show=image-1573071...
and LT0103_13

@will-moore
Member Author

Let's run check_pixels...

for i in 3669 3669 3828; do
  python check_pixels.py Plate:$i --max-planes=sizeC --max-images=10 >> /tmp/check_pix_20240301_idr0013.log;
done

$ grep Error /tmp/check_pix_20240301_idr0013.log | wc
      0       0       0

@will-moore
Member Author

will-moore commented Jan 16, 2024

The re-submitted data is now available on EBI s3...

Test on idr-testing, using Fileset IDs from idr-testing!

Install IDR/omero-mkngff#14 to create new Filesets without extra _mkngff suffix...
And use --fs_suffix=None below...

pip install 'omero-mkngff @ git+https://github.com/will-moore/omero-mkngff@fs_suffix'

idr0013.csv

idr0013/LT0080_37.ome.zarr.zip,S-BIAD865/aea4aa32-60c2-4a38-8a91-9f303381e562,6312927
idr0013/LT0066_23.ome.zarr.zip,S-BIAD865/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a,6313098
idr0013/LT0103_13.ome.zarr.zip,S-BIAD865/eae9bb4c-9504-4f88-9931-dbf234f86023,6313107

export IDRID=idr0013
for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  omero mkngff sql $fsid --fs_suffix=None --clientpath="https://uk1s3.embassy.ebi.ac.uk/bia-integrator-data/$biapath/$uuid.zarr" "/bia-integrator-data/$biapath/$uuid.zarr" > "$IDRID/$fsid.sql"
done

Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/07/02-36-52.924_mkngff for fileset: 6312927
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/03/23-33-31.705_mkngff for fileset: 6313098
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Found prefix: demo_2/2016-05/08/17-02-05.805_mkngff for fileset: 6313107

Then, update SECRET and... (again using --fs_suffix=None)...

for i in $(ls); do sed -i 's/SECRETUUID/f464e059-16b5-4013-b9a2-417e5976371c/g' $i; done
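
The `sed` loop above touches every file in the current directory. A sketch that limits the substitution to `.sql` files and then checks that no placeholder is left, assuming GNU `sed -i` as used above; `set_secret` is a hypothetical helper name:

```shell
# Sketch: replace SECRETUUID only in .sql files, then confirm none remains.
# Assumes GNU sed (in-place editing with a bare -i).

set_secret() {
  # $1: directory holding the generated .sql files; $2: the secret value.
  for f in "$1"/*.sql; do
    [ -f "$f" ] || continue
    sed -i "s/SECRETUUID/$2/g" "$f"
  done
  # grep -l lists files that still contain the placeholder; fail if any do.
  if grep -lF SECRETUUID "$1"/*.sql 2>/dev/null; then
    return 1
  fi
}
```

Running `set_secret "$IDRID" <secret>` before the psql loop would stop a file with an unreplaced placeholder from being applied.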

for r in $(cat $IDRID.csv); do
  biapath=$(echo $r | cut -d',' -f2)
  uuid=$(echo $biapath | cut -d'/' -f2)
  fsid=$(echo $r | cut -d',' -f3 | tr -d '[:space:]')
  psql -U omero -d idr -h $DBHOST -f "$IDRID/$fsid.sql"
  omero mkngff symlink /data/OMERO/ManagedRepository $fsid "/bia-integrator-data/$biapath/$uuid.zarr" --fs_suffix=None --bfoptions
done

UPDATE 380
BEGIN
 mkngff_fileset 
----------------
        6314896
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/aea4aa32-60c2-4a38-8a91-9f303381e562.zarr -> /bia-integrator-data/S-BIAD865/aea4aa32-60c2-4a38-8a91-9f303381e562/aea4aa32-60c2-4a38-8a91-9f303381e562.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/07/02-36-52.924_mkngff/aea4aa32-60c2-4a38-8a91-9f303381e562.zarr.bfoptions
UPDATE 380
BEGIN
 mkngff_fileset 
----------------
        6314897
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a.zarr -> /bia-integrator-data/S-BIAD865/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/03/23-33-31.705_mkngff/c1d9f06e-cfd0-43cd-be2f-3e5f39c3b62a.zarr.bfoptions
UPDATE 368
BEGIN
 mkngff_fileset 
----------------
        6314898
(1 row)

COMMIT
Using session for demo@localhost:4064. Idle timeout: 10 min. Current group: Public
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
Creating dir at /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
Creating symlink /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/eae9bb4c-9504-4f88-9931-dbf234f86023.zarr -> /bia-integrator-data/S-BIAD865/eae9bb4c-9504-4f88-9931-dbf234f86023/eae9bb4c-9504-4f88-9931-dbf234f86023.zarr
Checking for prefix_dir /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff
write bfoptions to: /data/OMERO/ManagedRepository/demo_2/2016-05/08/17-02-05.805_mkngff/eae9bb4c-9504-4f88-9931-dbf234f86023.zarr.bfoptions

http://localhost:1080/webclient/?show=image-1600787...

@will-moore
Member Author

Updated sql scripts to use original Fileset IDs in IDR/mkngff_upgrade_scripts@3f8e169
