Pipeline-runnable code to "decollage" the combined FlowCam images #21

metazool · 2024-08-12T09:03:20Z

This code currently exists as a command-line script repurposed by @Kzra (based on https://sarigiering.co/posts/extract-individual-particle-images-from-flowcam/) and is run by hand on a VM, with internal storage available as a volume mount.

#20 (Automating file transfer off the FlowCam) suggests we may be able to change this workflow beneficially and take the VM out of the loop.

This issue covers the setup work to add the code, with tests, to the decollage package, and to preserve more metadata while doing it. That is a step towards replacing #4 (minimal metadata, currently just a file listing!) and recording more detail (coordinates, date, depth, and also image size).

Not end-to-end as per #11 (diagrams of the instrument-to-storage workflow), but it could be a chance to get a handle on the Luigi package in passing (a Python analogue to R's {targets}, recommended by @albags; rapid prototyping for work with an Airflow destination?)


metazool · 2024-08-12T11:55:46Z

See also https://github.com/NERC-CEH/cyto-ML/issues/6 (private to internal contributors)

There's a comment there about embedding the metadata in the output image EXIF headers, which makes a lot of sense here: the files are still in transit to cloud storage, where they'll get fixed addresses, and it saves rewriting external metadata.

It is also better for compatibility with scivision, if that's still actively being supported by the Turing Institute (who does one ask about that?)
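The EXIF idea above can be sketched roughly as follows. This assumes Pillow is available and, purely for convenience, stores the metadata as a JSON string in the standard ImageDescription tag. The function names, the JPEG output format and the example values are all hypothetical, not what the decollage package does.

```python
import io
import json

from PIL import Image

IMAGE_DESCRIPTION = 0x010E  # standard EXIF/TIFF tag for a free-text description


def embed_metadata(img: Image.Image, metadata: dict) -> bytes:
    """Return JPEG bytes for `img` with `metadata` stored as JSON in EXIF."""
    exif = Image.Exif()
    exif[IMAGE_DESCRIPTION] = json.dumps(metadata, sort_keys=True)
    buffer = io.BytesIO()
    img.save(buffer, format="JPEG", exif=exif.tobytes())
    return buffer.getvalue()


def read_metadata(jpeg_bytes: bytes) -> dict:
    """Recover the JSON metadata written by `embed_metadata`."""
    with Image.open(io.BytesIO(jpeg_bytes)) as img:
        return json.loads(img.getexif()[IMAGE_DESCRIPTION])


# Hypothetical values, for illustration only.
example = {"latitude": 54.0, "longitude": -2.8, "date": "2024-08-12", "depth_m": 1.5}
vignette = Image.new("RGB", (32, 24))
payload = embed_metadata(vignette, example)
```

Because the metadata travels inside the image bytes, it survives a move to cloud storage without a separate sidecar file needing to be rewritten.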

metazool · 2024-08-12T12:17:22Z

This is interesting! The decollage script depends on the presence of a .lst file (which we have available). It only uses the pixel coordinates to fish the image out, but all the geometry analytics based on the binary mask are included in this file too.
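As a minimal sketch of "fishing the image out" by pixel coordinates, assuming Pillow and assuming each .lst row carries fields named `image_x`, `image_y`, `image_w` and `image_h` (these names and the `crop_particle` helper are assumptions for illustration, not confirmed here):

```python
from PIL import Image


def crop_particle(collage: Image.Image, row: dict) -> Image.Image:
    """Cut one particle out of a collage using pixel coordinates from a .lst row."""
    left, top = int(row["image_x"]), int(row["image_y"])
    width, height = int(row["image_w"]), int(row["image_h"])
    # Pillow's crop box is (left, upper, right, lower), in pixels.
    return collage.crop((left, top, left + width, top + height))


# Synthetic collage: black background with one white 30x25 "particle".
collage = Image.new("L", (100, 80), color=0)
collage.paste(255, (10, 20, 40, 45))

# Values arrive as strings when read from a delimited text file.
row = {"image_x": "10", "image_y": "20", "image_w": "30", "image_h": "25"}
particle = crop_particle(collage, row)
```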

(There is a screen photo in the FlowCam walkthrough, about 2/3 of the way down.)

It's an "almost but not quite CSV" with 53 lines of column_name|type headers, and there's bound to be a lot we can get out of these metrics alone. (I collaborated on a paper about this once, on characterisation of grains in sandstone, comparing shape-based to deep learning approaches, but it got boiled down to "does the deep learning approach work" rather than digging into the tradeoffs between the approaches...)
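A rough sketch of reading a file shaped like this, using only the standard library. It assumes the layout is some optional preamble lines, then one `column_name|type` line per field, then pipe-delimited data rows. The `parse_lst` name, its parameters and the three-field sample are hypothetical, and the real file would need `n_fields=53` plus whatever preamble it actually has.

```python
import csv


def parse_lst(text: str, n_fields: int, preamble_lines: int = 0) -> list[dict]:
    """Parse header-then-rows text into a list of {column_name: value} dicts."""
    lines = text.splitlines()
    header = lines[preamble_lines : preamble_lines + n_fields]
    # Each header line is "column_name|type"; keep only the name.
    names = [line.split("|", 1)[0].strip() for line in header]
    body = lines[preamble_lines + n_fields :]
    reader = csv.reader(body, delimiter="|")
    # Values are left as strings; blank lines are skipped.
    return [dict(zip(names, row)) for row in reader if row]


# Tiny made-up sample with three fields instead of 53.
sample = "\n".join(
    [
        "id|int",
        "image_x|int",
        "image_y|int",
        "1|10|20",
        "2|45|20",
    ]
)
rows = parse_lst(sample, n_fields=3)
```

Keeping every column, not just the pixel coordinates, is what would let the shape metrics be carried along with each extracted image.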

metazool · 2024-08-28T04:28:26Z

Glad to close this. There's a next step, preserving the analytic metadata in the .lst files off the FlowCam, but that's out of scope here.

metazool mentioned this issue Aug 15, 2024

Add the "decollage" process for raw microscope output to the package #22

Merged

metazool closed this as completed Aug 28, 2024
