
Remove data from dataset #31

Open
alex7tula opened this issue Feb 8, 2018 · 1 comment

@alex7tula

alex7tula commented Feb 8, 2018

In dataset.R, instead of:

# Remove data from dataset if necessary
# TODO better name?
postProcessDataSet <- function(dataSet = get("dataSet", envir = parent.frame()),
  darch = get("darch", envir = parent.frame()))
{
  if (!getParameter(".retainData", F))
  {
    dataSet@data <- NULL
    dataSet@targets <- NULL
  }
  
  dataSet
}

I propose:

# Remove data from dataset if necessary
# TODO better name?
postProcessDataSet <- function(dataSet = get("dataSet", envir = parent.frame()),
  darch = get("darch", envir = parent.frame()))
{
  if (!getParameter(".retainData", F))
  {
    dataSet@data <- NULL
    dataSet@targets <- NULL
    dataSet@parameters <- list()
  }
  
  dataSet
}

The added line, dataSet@parameters <- list(), clears the training data matrix. With this change, my saved file shrinks from 2 MB to 50 kB.

But I'm not sure whether this data is needed somewhere else.

@saviola777
Collaborator

Thanks for your feedback. Yes, this huge parameters list has been a thorn in my side for a while (I do not generate it myself; it is generated by the caret library), but removing it also removes information that is necessary for pre-processing. So after you remove it, the network may not be able to predict on new data that needs to be pre-processed.
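To illustrate why the stored parameters matter at prediction time, here is a minimal sketch in base R. It does not use caret or darch; the `params` list is a simplified stand-in for whatever caret stores, and `applyPreProcess` is a hypothetical helper written only for this example.

```r
# Training data and the statistics derived from it.
# `params` is an assumed, simplified stand-in for the stored pre-processing info.
train <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)
params <- list(center = colMeans(train), scale = apply(train, 2, sd))

# Hypothetical helper: apply the stored centering and scaling to unseen rows.
applyPreProcess <- function(newData, params)
{
  if (length(params) == 0)
  {
    stop("pre-processing parameters were removed; cannot transform new data")
  }

  centered <- sweep(newData, 2, params$center, "-")
  sweep(centered, 2, params$scale, "/")
}

newData <- matrix(c(2.5, 25), ncol = 2)

# Works while the parameters are retained.
scaled <- applyPreProcess(newData, params)

# Fails once the list has been emptied, as in the proposed change.
failed <- inherits(try(applyPreProcess(newData, list()), silent = TRUE), "try-error")
```

If new data is passed in already pre-processed, or no pre-processing was configured, the empty list would presumably do no harm, which is why making the removal optional seems reasonable.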

This is currently implemented for autosaving: if you pass the parameter autosave.trim, both the dataset and the network itself are reset.

I will keep this open for now. Maybe I can find a way to reduce the size of this list without breaking pre-processing, or I can add a parameter that lets the user choose to remove these parameters when they are not needed.
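A rough sketch of what such an opt-in could look like follows. It is not darch code: `ToyDataSet` is a hypothetical class with only the three slots mentioned in this issue, and `trimDataSet` with its `retainParameters` flag is an assumed name, using explicit arguments in place of getParameter().

```r
library(methods)

# Hypothetical stand-in for the real DataSet class (slots taken from this issue).
setClassUnion("matrixOrNULL", c("matrix", "NULL"))
setClass("ToyDataSet", representation(
  data = "matrixOrNULL",
  targets = "matrixOrNULL",
  parameters = "list"
))

# Assumed variant of postProcessDataSet: the parameters list is only
# cleared when the caller explicitly asks for it.
trimDataSet <- function(dataSet, retainData = FALSE, retainParameters = TRUE)
{
  if (!retainData)
  {
    dataSet@data <- NULL
    dataSet@targets <- NULL
  }

  if (!retainParameters)
  {
    dataSet@parameters <- list()
  }

  dataSet
}

ds <- new("ToyDataSet",
  data = matrix(runif(1000), ncol = 10),
  targets = matrix(runif(100), ncol = 1),
  parameters = list(preProcess = list(center = rep(0, 10))))

kept <- trimDataSet(ds)                               # default keeps parameters
dropped <- trimDataSet(ds, retainParameters = FALSE)  # explicit opt-in to the small file
```

Defaulting to retaining the parameters keeps prediction with pre-processing working unless the user deliberately trades it for a smaller saved file.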

@saviola777 saviola777 self-assigned this Feb 9, 2018