Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: show example code for custom reader functions (was: Subclassing BaseRaw is too difficult) #12854

Open
larsoner opened this issue Sep 16, 2024 · 11 comments
Labels

Comments

@larsoner
Copy link
Member

I directly set this [filenames] attribute [...] because I find setting up a new reader by adding a new class a bit cumbersome. I mostly do the file-specific stuff and then create a RawArray, so maybe adding a public setter in that class might be an option?

Originally posted by @cbrnr in #12843 (comment)

@larsoner larsoner changed the title > I think the only cases where we should worry about this are in MNE itself. End users should not mess with private attributes like this. And if they do, they should expect things to break silently in unpredictable ways at any time. If we start trying to protect against this, we end up in an unsustainable rabbit hole and break a policy/contract we've always tried to follow. Subclassing BaseRaw is too difficult Sep 16, 2024
@larsoner
Copy link
Member Author

larsoner commented Sep 16, 2024

I find setting up a new reader by adding a new class a bit cumbersome

@cbrnr I opened this to try to figure out if we could make the process easier, but this already works:

import numpy as np
from mne import create_info
from mne.io import BaseRaw


class MyRaw(BaseRaw):

    def __init__(self):
        data = np.zeros((1, 1000))
        info = create_info(1, 1000., 'eeg')
        filenames = ["test.xyz"]
        super().__init__(preload=data, info=info, filenames=filenames)


x = MyRaw()
x.plot()

image

I don't see this as too onerous, and gives you the advantage of nice repr and such

>>> x
<MyRaw | test.xyz, 1 x 1000 (1.0 s), ~13 kB, data loaded>

So I would rather have this little subclass wrapper than add setters for the filenames (which would likely lead to other problems anyway).

@cbrnr
Copy link
Contributor

cbrnr commented Sep 16, 2024

Perfect, that's really great indeed! I'll switch to this approach then for my readers. Thanks!

@mscheltienne
Copy link
Member

mscheltienne commented Sep 17, 2024

Maybe it could be turned into an example?

@cbrnr
Copy link
Contributor

cbrnr commented Sep 17, 2024

Maybe it could be turned into an example?

I'll see what it'll look like, I have readers for .npz, .mat, and .xdf. The .npz one could be short enough to serve as an example, but we'll see.

@cbrnr
Copy link
Contributor

cbrnr commented Sep 17, 2024

FWIW, what's really missing in the example is a function read_raw_xxx(), because that's what's user-facing. This generates a bit of an overhead and redundancy (and this was the original reason why I took the shortcut), but it's perfectly OK having to implement both a class and a function (or maybe not perfectly OK, but changing the whole reader implementation would not be worth it I guess 🤷).

@drammock
Copy link
Member

Maybe it could be turned into an example?

exactly what I was thinking; I'll re-open and re-title this issue as an issue about doc improvement.

FWIW, what's really missing in the example is a function read_raw_xxx(), because that's what's user-facing.

Agreed. I think the new/edited example would be most useful if it were something like:

# If no reader exists yet for the kind of file you're working with,
# here's a template function that should get you up and running:

def read_raw_npz(...):
    # load the data (this part will usually be different for each file type)
    # create fake info (details of how the channel names/types are stored in the file may vary)
    # use RawArray to instantiate
    # return RawArray

# (now show it in action)
foo = read_raw_npz(...)

# If you want to contribute your file reader to MNE-Python, it would need a few more pieces,
# like supporting `preload=False`. Here's an expanded reader for `.npz` to illustrate what a
# full-fledged reader function usually looks like:

def read_raw_npz(...):
    ...

# (now show it in action)
foo = read_raw_npz(..., preload=False)

@drammock drammock reopened this Sep 17, 2024
@drammock drammock changed the title Subclassing BaseRaw is too difficult DOC: show example code for custom reader functions (was: Subclassing BaseRaw is too difficult) Sep 17, 2024
@cbrnr cbrnr added the DOC label Sep 17, 2024
@cbrnr
Copy link
Contributor

cbrnr commented Sep 17, 2024

I'm working on this here: cbrnr/mnelab#434

I think the .npy reader could be used for the example, but I don't think I will have time to add the "few more pieces". I'm not sure if it is really necessary to show everything, because then we might as well add the reader to our codebase, or no?

@cbrnr
Copy link
Contributor

cbrnr commented Sep 17, 2024

Also, in your example I think subclassing BaseRaw is really essential, otherwise there's no way to set the .filenames attribute. This is what brought us here in the first place.

@drammock
Copy link
Member

I think the .npy reader could be used for the example, but I don't think I will have time to add the "few more pieces". I'm not sure if it is really necessary to show everything, because then we might as well add the reader to our codebase, or no?

I think .npz (or .hdf5 for that matter) are good candidates for "example in our docs, but not in our API" because they're user-customizable formats that aren't guaranteed to follow a proper "spec" in terms of what the different elements are named, their order, etc. So in that sense even building out a "full" reader function for them in the docs only seems fine to me.

Aside: I think .npy is not a good candidate for an example because it can only contain one array (e.g., only the data, not sfreq, ch_names, ch_types, etc; unless you did an object array I guess).

@drammock
Copy link
Member

Also, in your example I think subclassing BaseRaw is really essential, otherwise there's no way to set the .filenames attribute. This is what brought us here in the first place.

ah, right you are.

@cbrnr
Copy link
Contributor

cbrnr commented Sep 17, 2024

I think .npz (or .hdf5 for that matter) are good candidates for "example in our docs, but not in our API" because they're user-customizable formats that aren't guaranteed to follow a proper "spec" in terms of what the different elements are named, their order, etc. So in that sense even building out a "full" reader function for them in the docs only seems fine to me.

I agree!

Aside: I think .npy is not a good candidate for an example because it can only contain one array (e.g., only the data, not sfreq, ch_names, ch_types, etc; unless you did an object array I guess).

Right, .npz would be a better candidate for an example. I've still implemented a reader for .npy, because it was requested and I guess some people use it because it's so simple to just dump an array into it. Of course, additional information needs to be provided elsewhere (it's in a dialog in my case).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants