Coherence Sorting Method for Improved TF Estimates #119

kkappler · 2021-10-12T20:45:01Z

A basic coherence sorting has been implemented. This improves the highest frequency estimates at for example IAK34, but there are still some issues with phase, and a few bad periods in aurora that are not showing as bad in EMTF. The existing method is a sort-of jackknife that iteratively removes a segment of data and recomputes the coherence of the remaining ensembles, either local of remote. The coherence is only between two channels though (not multiple coherence)

To do multiple coherence we should make:

a properly sized segment weights placeholder vector (currently it is a NoneType)
a method to specify the "multiple coherence estimate band", but for now, we could use a table prefilled, or a simple let of rules, like "let the 'multiple coherence estimate band' have 3x the FCs as the TF estimation band
a variation on get_band_for_tf_estimate that extracts the coherence band
"preliminary Z-estimator", can be OLS, or a simple jackknife
mutliple-coherence computation (PSD(E_p)/PSD(E_obs)

There are several issues here:

Need to automate the choice of coherence threshold (at each decimation level this can be different, in some cases it can differ between periods as well)
May want to rerun leverage points downweighting after restricting to the coherence filtered FCs
Decide if we drop the low coherence or just downweight them

References:

Egbert, Gary D. New approaches to estimation of magnetotelluric parameters. Final technical report, 1 August 1989--31 July 1991. No. DOE/ER/14057-2. Oregon State Univ., Corvallis, OR (United States). Coll. of Oceanography, 1991a.

Egbert, Gary D., and Dean W. Livelybrooks. "Single station magnetotelluric impedance estimation: Coherence weighting and the regression M-estimate." Geophysics 61.4 (1996): 964-970.

Jones, Alan G., and Hartmut Jödicke. "Magnetotelluric transfer function estimation improvement by a coherence-based rejection technique." SEG Technical Program Expanded Abstracts 1984. Society of Exploration Geophysicists, 1984. 51-55.

Smirnov, M. Yu. "Magnetotelluric data processing with a robust statistical procedure having a high breakdown point." Geophysical Journal International 152.1 (2003): 1-7.

The text was updated successfully, but these errors were encountered:

kkappler · 2021-10-12T21:29:54Z

Notes from a chat with Gary

This is really a segment rejection method

Standard Multiple Coherence:
-R2 is just the fraction of the E field signal power that is Fit
-Use the residuals to weight the data --

Could also use magnetic vs magnetic from remote station. This would just identify segments, it is done before we do the processing pipeline.

You use a much wider band than your estimation band to do these estimations.

The FORTRAN implementation is aimed at single station data where there is no remote.

One way to implement this:

For each band that will be processed, select a representative frequency (center frequency)
Select a few harmonics from about the center frequency (note that even if your band is only one harmonic wide, we will still need to select some neighboring harmonics, otherwise the coherence for a segment will be degenerate with value = 1)
use these "coherence bands" to compute the cross-spectral powers, and those cross spectral powers to in turn give preliminary estimates of the impedance tensor (following for example Sims et al 1971), and use the RR estimates if RR is available
Now you can use the impedance tensor to estimate the cross powers (not the components), but estimate it once using auto-powers and once using cross-powers.
The cross power estimate will in general be smaller than the auto-power estimate, and the ratio of these estimates is a form of coherence, as it is bounded by [0,1] in absolute value.
That is, estimate
<ExEx*>_1 = Zxx <ExHx*> + Zxy <ExHy*> [Equation 1] where <> denotes cross-power estimates from the fcs
But we can also estimate ExEx* another way,
...
Consider that an autopower-based estimate of ExHx can be obtained via:
<ExHx*>' ~= Zxx <HxHx*> + Zxy <HxHy*>, [Equation 2]
and similarly we can get an autopower-based estimate of ExHy* from
<ExHy*>' ~= Zxx <HxHy*> + Zxy <HyHy*>,[Equation 3]
The estimates from Equations 2 & 3 can then be fed into Equation 1, yielding a second estimate of <ExEx*>_2.

Note that in some implementations, a direct ExEx is multiplied into the "diminished estimate" and the sqrt of their product is taken. In either case, 0 < [<ExEx*>_1 / <ExEx*>_2] < 1.0 and ranking these and selecting a precentile cut from below will increase SNR in the dataset fed to RME.

Wrote and tested a simple coherence sorting. Mostly tested on IAK34 dataset but did not implement -- i.e. commented it out in transfer_function_helpers.py for this commit. If this commit passes I will test moving mth5, mt_metadata to master branches [Issue: #119]

kkappler · 2022-12-02T23:43:15Z

The task of extracting features (cross-powers and impedance estimates per time-window) should be treated as separate tasks from the coherence sorting. These are necessary, but not sufficient. The features may also find use for other weighting or data QC/viz schemes in future.

Go forward plan:

Decide where in the flow this will be implemented
A good place would be in process_mth5 right after the merged FC objects are collected and before the TF estimation process begins. This would make compute_coherence_weights a method that would be associated with STFTs rather than TFs, since coherence analysis is a process that should be available whether or not TFs will be computed.
The only change this would force is that the weights will need to be shared to process_tf_decimation_level function.
Note also, that the coherence measurements do not depend on having all the data in memory, but are computed separately on each segment (time-element of the STFT) so there is room to paralellize

Probably we can make STFT into a class, and then coherence_weights can be a method of that class.

A "CrossPower" class should be developed. The parent data come from STFT. The cross power class can manage the "coherency bands"
The CrossPower" class estimates the cross power for each time series window
(there may be some tricky bits if the window length to estimate coherecy differs from that for estimating TFs)
A tool for generating the "coherency bands". These can be diffferent from the estimation bands but it is required that there is a mapping from each "estimation band" to a unique "coherence band". Probably use a central frequency and follow Borah et al. 2015 for a "spectral radius"
Initial Z-estimates
The multiple coherences require estimates of the TF in order to take the ratio of predicted to observed power. If the initial estimates are terrible, then multiple coherences may not be much help.
- OLS
- Jackknife

Note that issue #316 or a similar solution will likely need to be addressed.
The SigMT version of this seems similar to the method that Gary described.

- tidied up workflow in process_transfer_functions - created placeholders for segment weights - factored out some methods applied to X, Y, RR for readability

Add some placeholders for coherence sorting - frequency_band_helpers.py - add some placeholder methods - especially Spectrogram class -- a candidate to replace stft_obj in main flow - kernel_dataset.py: - Factor update_survey_metadata method into its own method for better readability - coherence_weights.py - experimental code for Jackkknife coherence weights - xarray_helpers.py - Deprecate unusued cast_3d_stft_to_2d_observations method from xarray_helpers - move notes from this method into stack_fcs()

kkappler · 2024-01-23T03:18:25Z

Consider these two distributions of simple coherence from the same band:

It turs out that the first one seems like a reasonable candidate for jaackknife coherence, but the second one has some fairly non-intuitive results, in that the ensemble that gives the lowest partial coherence is not necesarily one with particularly poor simple coherence. Something to think about ...
Probably, application of jackknife prematurely not a good idea ... probably should apply simple and or multiple coherence before jackknife.

- added a note on issue #119 about how jackknife may be better applied after simple and multiple coherences - misc notes/doc in coherence_weights and factoring - adjust_band_for_coherence_sorting added to frequency_band_helpers

- have been experimenting with a few approaches to coherence sorting - have decided on a method using Spectrogram() class _crosspower method - most of what is in coherence sorting now is just messy, half tested, experimental muck - but it has a form of what I would like later, and there may be a few nuggets - so, against my better judgement I am committing it and tagging associated with issue #119

kkappler self-assigned this Dec 22, 2023

kkappler mentioned this issue Jan 11, 2024

consider channel_nomenclature decorator #316

Open

kkappler added a commit that referenced this issue Jan 19, 2024

Towards issue #119

93e5785

- tidied up workflow in process_transfer_functions - created placeholders for segment weights - factored out some methods applied to X, Y, RR for readability

kkappler added a commit that referenced this issue Feb 2, 2024

Bug looks to be fixed - need to decruft #119

0be04c1

kkappler added a commit that referenced this issue Feb 10, 2024

towards issue #119 - adding placeholders for coherence sorting

f0a9d9d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Coherence Sorting Method for Improved TF Estimates #119

Coherence Sorting Method for Improved TF Estimates #119

kkappler commented Oct 12, 2021 •

edited

Loading

kkappler commented Oct 12, 2021 •

edited

Loading

kkappler commented Dec 2, 2022 •

edited

Loading

kkappler commented Jan 23, 2024

Coherence Sorting Method for Improved TF Estimates #119

Coherence Sorting Method for Improved TF Estimates #119

Comments

kkappler commented Oct 12, 2021 • edited Loading

kkappler commented Oct 12, 2021 • edited Loading

kkappler commented Dec 2, 2022 • edited Loading

kkappler commented Jan 23, 2024

kkappler commented Oct 12, 2021 •

edited

Loading

kkappler commented Oct 12, 2021 •

edited

Loading

kkappler commented Dec 2, 2022 •

edited

Loading