Hi, I'm having trouble using statsample to run PCA on a large data set. Does anyone have any good ideas?
I want to run PCA on a very large data set (3000 variables, 50 samples).
So I wrote this code:

    data_raw = IO.readlines('data1.txt').map { |v| v.split }[1..-1]
    hash_tmp = {}
    data_raw[1..3000].each do |ary|
      hash_tmp[ary[0]] = ary[1..-1].map(&:to_i).to_scale
    end
    ds = hash_tmp.to_dataset
    puts "Input data done!"
    cor_matrix = Statsample::Bivariate.correlation_matrix(ds)
    puts "cor_matrix was prepared."
    pca = Statsample::Factor::PCA.new(cor_matrix)
    binding.pry

But Ruby on my Mac never prints "cor_matrix was prepared.".
To investigate the cause, I wrote some more code:

    # Opening Class to investigate where is bottleneck
    module Statsample
      module Bivariate
        class << self
          def covariance_matrix_optimized(ds)
            x = ds.to_gsl
            n = x.row_size
            m = x.column_size
            puts "calculating means..."
            means = ((1 / n.to_f) * GSL::Matrix.ones(1, n) * x).row(0)
            puts "centering matrix..."
            centered = x - (GSL::Matrix.ones(n, m) * GSL::Matrix.diag(means))
            puts "calculating covariance matrix..."
            ss = centered.transpose * centered
            puts "calculating n..."
            s = ((1 / (n - 1).to_f)) * ss
            puts "done!" #<= This line has executed
            s
          end

          def correlation_matrix(ds)
            vars, cases = ds.fields.size, ds.cases
            if !ds.has_missing_data? and Statsample.has_gsl? and prediction_optimized(vars, cases) < prediction_pairwise(vars, cases)
              binding.pry
              cm = correlation_matrix_optimized(ds)
              binding.pry #<= This line hasn't executed. :(
            else
              cm = correlation_matrix_pairwise(ds)
            end
            binding.pry
            cm.extend(Statsample::CovariateMatrix)
            binding.pry
            cm.fields = ds.fields
            binding.pry
            cm
          end
        end
      end
    end

With this code, Ruby prints everything up to "done!" but never returns from the Statsample::Bivariate#covariance_matrix_optimized method.
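For reference, the arithmetic inside `covariance_matrix_optimized` (column means, centering, cross-product, scaling by 1/(n-1)) can be checked on a tiny matrix with Ruby's standard `matrix` library instead of GSL. This is only a sketch: `covariance_matrix_sketch` is a hypothetical helper name, not part of statsample, and it is not expected to reproduce the hang.

```ruby
require 'matrix'

# Hypothetical helper (not part of statsample): covariance matrix of the
# columns of x, where rows are cases and columns are variables.
def covariance_matrix_sketch(x)
  n = x.row_count
  m = x.column_count
  # Mean of each column
  means = (0...m).map { |j| x.column(j).to_a.sum / n.to_f }
  # Subtract each column's mean from its entries
  centered = Matrix.build(n, m) { |i, j| x[i, j] - means[j] }
  # Cross-product of the centered data, scaled by 1 / (n - 1)
  (centered.transpose * centered) * (1.0 / (n - 1))
end

# Three cases, two variables; the second variable is twice the first.
x = Matrix[[1, 2], [3, 6], [5, 10]]
cov = covariance_matrix_sketch(x)
p cov # => Matrix[[4.0, 8.0], [8.0, 16.0]]
```

With 3000 variables the same product yields a 3000 x 3000 matrix, so a small case like this only confirms that the steps themselves are sound.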
I have never seen a Ruby method that doesn't return.
If anyone knows how to solve this problem, or how to investigate the cause more deeply, please tell me.
(Original: clbustos/statsample#17)