Files and directories
pyifcb provides several ways to access and locate IFCB data files.
If you know the pathnames of a set of three raw data files (or the pathname of any one of them), you can access the data from the files using the open_raw function. For example:
import ifcb
PATHNAME = '/mnt/ifcb/data/D20150101T123456_IFCB102.adc'
sample_bin = ifcb.open_raw(PATHNAME)
open_raw also works as a context manager, which is the recommended way to use it if you are going to access image data.
with ifcb.open_raw(PATHNAME) as sample_bin:
    ...
For more on how to use the context manager support, see Opening and closing bins.
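The "set of three raw data files" mentioned above can be pictured with a small sketch. It assumes the three files of a bin share a basename and differ only in their extensions (.adc, .hdr, and .roi); sibling_raw_paths is a hypothetical helper for illustration and is not part of pyifcb.

```python
from pathlib import Path

# Assumption: the three raw files of a bin differ only in extension.
RAW_EXTENSIONS = ('.adc', '.hdr', '.roi')

def sibling_raw_paths(pathname):
    """Hypothetical helper: given any one raw file, return the expected
    paths of all three raw files, keyed by extension."""
    path = Path(pathname)
    if path.suffix not in RAW_EXTENSIONS:
        raise ValueError('not a raw IFCB file: {}'.format(pathname))
    return {ext: path.with_suffix(ext) for ext in RAW_EXTENSIONS}

paths = sibling_raw_paths('/mnt/ifcb/data/D20150101T123456_IFCB102.adc')
for ext, path in paths.items():
    print(ext, path)
```

This only computes where the sibling files would be; it does not check that they exist or open them, which is what open_raw is for.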
An IFCB data collection that includes many samples comprises a large number of files. It can become inconvenient to keep all of these files in a single directory, so pyifcb provides ways of accessing files from a set of sample bins even if the files are organized into a directory hierarchy.
pyifcb assumes that directory hierarchies are organized based on date and time. The best practice is to use directory names that are prefixes of the filenames in them. For example, a directory called D2016 should contain only data from 2016 (i.e., files whose names start with D2016). Inside that directory there could be a directory called D20161020, which would contain only data from October 20, 2016 (i.e., files whose names start with D20161020). pyifcb does not require this year/day layout; you could also organize files just by year, or by year/month/day, or even by year/month/day/hour.
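The prefix convention described above can be sketched in a few lines. This is an illustration only: nested_directory and PREFIX_LENGTHS are hypothetical names, not pyifcb functions, and the prefix lengths assume filenames of the form shown in the examples (D, then YYYYMMDD, then T, then HHMMSS).

```python
from pathlib import PurePosixPath

# Assumed prefix lengths for names like 'D20161020T123456_IFCB102':
# 'D' + YYYY = 5 characters, + MM = 7, + DD = 9, + 'T' + HH = 12.
PREFIX_LENGTHS = {'year': 5, 'month': 7, 'day': 9, 'hour': 12}

def nested_directory(lid, levels=('year', 'day')):
    """Hypothetical helper: build a relative directory path in which
    every directory name is a prefix of the bin's filename."""
    parts = [lid[:PREFIX_LENGTHS[level]] for level in levels]
    return PurePosixPath(*parts)

lid = 'D20161020T123456_IFCB102'
print(nested_directory(lid))                    # D2016/D20161020
print(nested_directory(lid, levels=('year',)))  # D2016
```

Changing the levels argument gives the other layouts mentioned above, such as year only or year/month/day/hour.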
For small data collections, organizing files into directories is usually overkill.
If you know the LID of a file set that is located in a directory structure, you can access it using DataDirectory:
import ifcb
data_dir = ifcb.DataDirectory('/mnt/ifcb/data')
sample_bin = data_dir['D20150101T123456_IFCB102']
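To make the structure of an identifier such as the one above concrete, here is a minimal sketch that splits it into a timestamp and an instrument number. It assumes identifiers always follow the pattern seen in this page's examples (D, date, T, time, _IFCB, number); parse_lid is a hypothetical helper, not a pyifcb function.

```python
import re
from datetime import datetime

# Assumed pattern, inferred from examples like 'D20150101T123456_IFCB102'.
LID_PATTERN = re.compile(r'^D(\d{8}T\d{6})_IFCB(\d+)$')

def parse_lid(lid):
    """Hypothetical helper: return (timestamp, instrument number)."""
    match = LID_PATTERN.match(lid)
    if match is None:
        raise ValueError('unrecognized identifier: {}'.format(lid))
    timestamp = datetime.strptime(match.group(1), '%Y%m%dT%H%M%S')
    instrument = int(match.group(2))
    return timestamp, instrument

timestamp, instrument = parse_lid('D20150101T123456_IFCB102')
print(timestamp.isoformat(), instrument)  # 2015-01-01T12:34:56 102
```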
If you want to access all the file sets in a data directory, you can iterate over a DataDirectory:
for sample_bin in data_dir:
    number_of_images = len(sample_bin.images)
    lid = sample_bin.lid
    print('{} has {} image(s)'.format(lid, number_of_images))
Note that if you are iterating over the bins in a data directory and want to access images from each bin, it is still best to use the context manager interface for each bin. For example, this computes the average image intensity for each image in all samples in a data directory:
import numpy as np

for sample_bin in data_dir:
    lid = sample_bin.lid
    # open the bin once so its image data stays accessible inside the block
    with sample_bin:
        for roi_number in sample_bin.images:
            avg_intensity = np.mean(sample_bin.images[roi_number])
            print('{} ROI #{} has average intensity {}'.format(lid, roi_number, avg_intensity))