Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading filtered data? #3470

Open
jazlynntan opened this issue Oct 9, 2024 · 3 comments
Open

Loading filtered data? #3470

jazlynntan opened this issue Oct 9, 2024 · 3 comments
Labels
question General question regarding SI

Comments

@jazlynntan
Copy link

Hi, I used kilosort4 outside of spikeinterface for preprocessing and spike sorting. I did some brief checks and curation in Phy. I now want to load the data into spikeinterface to do other visualizations and calculate additional quality metrics. However, when I loaded the recording using read_openephys, the sorting using read_kilosort and read_phy, the waveforms extracted looked wrong despite them looking fine in templates.npy.

The units visualized with the templates.npy
Screenshot 2024-10-09 at 5 01 18 PM

The units visualized using the waveformextractor on the sortinganalyzer created from the sorting from read_phy and recording from read_openephys.
Screenshot 2024-10-09 at 5 02 59 PM

In addition, when I ran compute_quality_metrics(), the median amplitudes and snr were odd - all units had the same median amplitude and the noise level estimated for snr calculation were mostly negative.

I suspect that it has to do with the recording in the sortinganalyzer being the unprocessed raw data. Is this the case? If so, how can I load the preprocessed recording from kilosort so that I can perform visualizations and quality metric computation in spikeinterface?

Thank you.

@zm711
Copy link
Collaborator

zm711 commented Oct 10, 2024

Hey @jazlynntan, you would regenerate the same filtering and referencing inside of spikeinterface. We can't really take the exact preprocessing of kilosort, but instead you could do something like:

import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
recording = se.read_openephys(xx)
recording_f= spre.bandpass_filter(recording)
recording_cmr = spre.common_reference(recording_f)

You could also do other preprcessing/look up what kilosort is doing to replicate the same preprocessing in spikeinterface.

@zm711 zm711 added the question General question regarding SI label Oct 10, 2024
@jazlynntan
Copy link
Author

jazlynntan commented Oct 11, 2024

Thanks for the prompt response.

Kilosort4 has an option to save preprocessed data to a temp_wh.dat file. Does spikeinterface have extractor modules that can read the dat file? edit: I tried the read_binary function as per below using the information from rec0903_seg3 which is the raw recording loaded using read_openephys. The processed_path refers to the temp_wh.dat file generated by kilosort4. May I know whether I am reading in the processed file appropriately?

processed_0903 = si.extractors.read_binary(processed_path, sampling_frequency=rec0903_seg3.sampling_frequency, num_channels=rec0903_seg3.get_num_channels(), dtype='int16', channel_ids=rec0903_seg3.channel_ids, gain_to_uV=rec0903_seg3.get_channel_gains(), offset_to_uV=rec0903_seg3.get_channel_offsets(), is_filtered=True)

Also, just so that I understand correctly, spikeinterface allows the use of multiple spike sorters (eg. kilosort, spykingcircus, etc.) but this is only for the sorting stage itself? If I wanted to use the preprocessing or other features of these spike sorters I would have to directly use the software outside of spikeinterface?

Thanks!

@JoeZiminski
Copy link
Collaborator

Hi @jazlynntan, unfortunately there is not an extractor to load the temp_wh.dat, it was attempted here, but it's not possible. Note that this file has some zero padding around the edges, so reading it directly as a binary will not work. While this is relatively easy to handle, a deeper problem is that this data is whitened, but the unwhitening matrix is not available for some KS versions. As whitening has a massive impact on the waveform shape, it is not possible to easily assess this preprocessed data. I think there are also some additional issues around data scaling, please see the PR for full details.

In general you are touching on an important point in the preprocessing chain, in particular with kilosort. To compute waveforms, postprocessing etc. in SpikeInterface you need access to the preprocessed recording. However, it is not possible to access the preprocessed kilosort recording (for the reasons above). Therefore there are two options. One is to do all preprocessing in spikeinterface (including drift correction and whitening) and pass kilosort the preprocessed data, turning off preprocessing in kilosort with the 'skip_kilosort_preprocessing' argument. However, it will be worth checking the outputs carefully as the kilosort_like motion correction is close-to but not identical to the kilosort matlab implementation, and there may be subtle differences in the whitening approach, but I am not sure as I have not looked at the code in detail. A middle-ground is to perform the basic preprocessing steps in spikeinterface (filtering, CAR) and turn these off in kilosort by setting the min frequency on the filter to very low and turning off CAR in kilosort. Then you can perform postprocessing on this data in spikeinterface. However, it is important to note the postprocessing will be computed on data that is not drift-corrected, while the kilosort results will be on the drift-corrected data. If you don't have much drift, it should make little difference. On that PR there is a comparison for a test dataset.

For (I think all the sorters) the sorters arguments are exposed, and I guess by default (depending on the sorter, for sure with kilosort) will run their own preprocessing steps. So, if you want to run preprocessing steps in spikeinterface it's good to use the sorter arguments to turn these off at the sorter level, so you don't duplicate the preprocessing. Let me know if anything is not clear!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question General question regarding SI
Projects
None yet
Development

No branches or pull requests

3 participants