error when providing initial_indices for sparse array data #14

Open
chschroeder opened this issue Jul 12, 2020 · 2 comments

@chschroeder

Hi,

At first glance, this library looks really nice (with regard to API, code, and docs), and I really like it. Kudos for that!

The first steps were easy to follow using the examples.
However, when I switched from dense to sparse arrays, I ran into trouble:

Is FeatureBasedSelection in combination with the initial_subset argument intended to work on sparse arrays?
According to the documentation, scipy's csr_matrix should be supported, right?

(1) without initial_subset

selector = FeatureBasedSelection(n, concave_func='sqrt')
selector.fit(x)

(2) with initial_subset

selector = FeatureBasedSelection(n, concave_func='sqrt', initial_subset=initial_subset)
selector.fit(x)

Whenever x is an ndarray (dense), both variants work fine.
However, for a csr_matrix (sparse), only the former works; for the latter I get the following error:

```
  File "<my_workspace>/my_script.py", line 86, in my_func
    selector.fit(x)
  File "<site-packages>/apricot/functions/featureBased.py", line 265, in fit
    return super(FeatureBasedSelection, self).fit(X, y=y,
  File "<site-packages>/apricot/functions/base.py", line 251, in fit
    optimizer.select(X, self.n_samples, sample_cost=sample_cost)
  File "<site-packages>/apricot/optimizers.py", line 491, in select
    optimizer1.select(X, self.n_first_selections, sample_cost=sample_cost)
  File "<site-packages>/apricot/optimizers.py", line 234, in select
    gains = self.function._calculate_gains(X) / sample_cost[self.function.idxs]
  File "<site-packages>/apricot/functions/featureBased.py", line 321, in _calculate_gains
    concave_func(X.data, X.indices, X.indptr, gains,
  File "<site-packages>/numba/core/dispatcher.py", line 608, in _explain_matching_error
    raise TypeError(msg)
TypeError: No matching definition for argument type(s) array(float64, 1d, C), array(int32, 1d, C), array(int32, 1d, C), array(float64, 1d, C), array(float64, 2d, C), array(float64, 2d, C), array(int64, 1d, C)
```
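
The error message lists two 2-D float64 arrays among the argument types. One possible source worth checking (this is an assumption, not something confirmed in this thread) is a column sum over the initial rows: on a scipy `csr_matrix`, `.sum(axis=0)` returns a 2-D `numpy.matrix`, whereas the same call on a dense ndarray returns a 1-D array. The sketch below only demonstrates that numpy/scipy behaviour on toy data; the names `dense`, `csr`, and `rows` are illustrative and it does not call apricot.

```python
import numpy as np
from scipy import sparse

# Toy data; shapes and values are illustrative only.
dense = np.arange(12, dtype=np.float64).reshape(4, 3)
csr = sparse.csr_matrix(dense)
rows = np.array([0, 2])  # hypothetical stand-in for an initial subset

# Column sums over the chosen rows, once per representation.
dense_sum = dense[rows].sum(axis=0)   # 1-D ndarray
sparse_sum = csr[rows].sum(axis=0)    # 2-D numpy.matrix with a single row

print(type(dense_sum).__name__, dense_sum.ndim)
print(type(sparse_sum).__name__, sparse_sum.ndim)

# Flattening the sparse result gives a 1-D array with the same values.
flat = np.asarray(sparse_sum).ravel()
```

If the two `ndim` values differ for your own data as well, that would be consistent with the 2-D arguments shown in the error, though it does not by itself prove where the mismatch arises inside the library.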
@jmschrei
Owner

jmschrei commented Jul 13, 2020

Howdy

Thanks for reporting this. It does look like a bug on my end. The selectors are supposed to work with both dense and sparse arrays, even when using an initial subset. I'll try to fix it in the next week or two. Sorry about that! If you need a fix sooner than that, you should go into the FeatureBasedSelection code and just hard-code the gain _select_next function that you want to use.
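
A different stopgap from the hard-coding route above, based only on the report that dense input works with both variants: convert the sparse matrix to a dense ndarray before fitting. This is a minimal sketch, not a fix for the bug itself; `to_dense` is a hypothetical helper name, and the conversion is only practical when the dense matrix fits in memory.

```python
import numpy as np
from scipy import sparse


def to_dense(x):
    """Return a dense ndarray for either sparse or dense input."""
    if sparse.issparse(x):
        return x.toarray()
    return np.asarray(x)


# Toy sparse input; in practice this would be your own feature matrix.
x_sparse = sparse.csr_matrix(np.eye(3))
x_dense = to_dense(x_sparse)

# With apricot installed, the call from the report would then become:
# selector = FeatureBasedSelection(n, concave_func='sqrt', initial_subset=initial_subset)
# selector.fit(x_dense)
```

Going dense gives up the memory savings of the sparse format, so it is best treated as temporary until the sparse path is fixed.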

@chschroeder
Author

Thanks for the quick response! There is no hurry at all. I am happy to hear that there will be a fix.
