Should there always be a sample dimension? #127
Comments
I'm looking at this again, and I think the way to understand my PoV is that I want the least friction between the
Additionally, as per @weiji14's comments a while back, it's a much lower cognitive load to return an array with an extra sample dimension and use |
As it so happens, there is a bug with |
It seems the xarray documentation is partially at fault, at least, according to them. Apparently, there is no combination of |
Thanks for your thoughts! I commented on the other thread about the expand_dims/transpose behavior, but more generally you're correct that xbatcher will need to be responsible for the correct ordering of dimensions, as xarray's data model is generally agnostic to axis order.
TBH I actually blame Keras/PyTorch for caring about axis order. So passé!
Thinking about this some more, the current behavior does make sense if we're not considering an ML context. Like, if you wanted to sample a bunch of patches and average each of them, a sample dimension wouldn't make sense. I'm thinking that we could have BatchGenerator wrappers for the ML libs, and then we can append a sample dimension there. I had a look at the existing ones, but I think they don't have this.
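For what it's worth, here is a rough sketch of what such a wrapper could do to each batch before handing it to an ML library. It only uses plain xarray calls on a `DataArray`; the helper name `ensure_sample_dim` and the dimension name `sample` are assumptions for illustration, not part of xbatcher's API.

```python
import numpy as np
import xarray as xr


def ensure_sample_dim(batch: xr.DataArray, sample_dim: str = "sample") -> xr.DataArray:
    """Hypothetical helper: return `batch` with `sample_dim` as its first axis."""
    if sample_dim not in batch.dims:
        # Add a new length-1 dimension when the batch has no sample dimension.
        batch = batch.expand_dims(sample_dim)
    # The ellipsis keeps the remaining dimensions in their existing order.
    return batch.transpose(sample_dim, ...)


# A single 2D patch without a sample dimension.
patch = xr.DataArray(np.zeros((4, 5)), dims=("x", "y"))
with_sample = ensure_sample_dim(patch)

# A batch whose sample dimension exists but is not the leading axis.
trailing = xr.DataArray(np.zeros((4, 5, 3)), dims=("x", "y", "sample"))
reordered = ensure_sample_dim(trailing)

print(with_sample.dims, reordered.dims)
```

This keeps the generator's non-ML behavior untouched and confines the axis-order requirement to the wrapper, which seems to be the direction suggested above.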
What is your issue?
As shown in the section below, there are a couple of cases in which the batch generator will not include a `sample` dimension in the returned dataset/dataarray:

1. `input_dims` does not exceed the number of dimensions in the original dataset by more than one. In this case, the original dimension names will be retained as long as `concat_input_dims=False`.
2. `input_dims` and `concat_input_dims=True`. In this case, an extra dimension called `input_batch` is created along with the original dimensions appended with `_input`.

xbatcher/xbatcher/testing.py, lines 126 to 142 in 59df776
Would it be better to always include a sample dimension in the output batches, for example by renaming the excluded dimension to `sample` for case 1?
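To make the renaming idea for case 1 concrete, here is a minimal sketch using plain xarray on a `DataArray`. The helper `rename_excluded_to_sample` is hypothetical and is not how xbatcher currently behaves; it assumes `input_dims` is a mapping of dimension name to patch size, and that exactly one dimension of the batch is left out of it.

```python
import numpy as np
import xarray as xr


def rename_excluded_to_sample(batch, input_dims, sample_dim="sample"):
    """Hypothetical helper: rename the single non-input dimension to `sample_dim`."""
    if sample_dim in batch.dims:
        return batch  # Nothing to do, a sample dimension already exists.
    excluded = [d for d in batch.dims if d not in input_dims]
    if len(excluded) != 1:
        return batch  # Only the one-excluded-dimension case is handled here.
    # Rename the excluded dimension and move it to the front.
    return batch.rename({excluded[0]: sample_dim}).transpose(sample_dim, ...)


# A batch where only `time` is not listed in input_dims.
batch = xr.DataArray(np.zeros((4, 5, 10)), dims=("x", "y", "time"))
renamed = rename_excluded_to_sample(batch, input_dims={"x": 4, "y": 5})

print(renamed.dims)
```

One trade-off with this approach is that the original dimension name (`time` here) is no longer visible on the batch, so a real implementation might need to record it somewhere.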