Our project relies mainly on sklearn, pandas, PyTorch, and matplotlib. Running the code requires the following files:
./requirements.txt <-- Python package requirements
./main.py <-- entry point; runs the entire project
Our code utilities for classification, regression, and the novelty component require the following files:
./Models/import_data.py <-- script to read raw data files
./Models/modelling.py <-- script to train and evaluate models
./Models/plotting.py <-- script to create plots
./Models/export_data.py <-- script to export plots and model evaluations
./Models/Training_parameters/*.py <-- model parameters for each dataset
Our code utilities for classifier interpretability require the following files:
./Models/import_batches.py <-- script to import batches
./Models/Classifier_interpretability/classifier_interpretability.py <-- script to run models
./Models/Classifier_interpretability/*.pkl <-- pickled trained models
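Before the first run it can be worth checking that the core dependencies are importable. A minimal sketch (the package list mirrors the dependencies named above; exact versions live in requirements.txt):

```python
import importlib.util

def missing_packages(names):
    """Return the subset of names that cannot be imported in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# The project's core dependencies (note: PyTorch is imported as "torch"):
print(missing_packages(["sklearn", "pandas", "torch", "matplotlib"]))
```

If the printed list is non-empty, install the missing packages from requirements.txt before running main.py.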
- Only main.py needs to be run directly; it invokes the other scripts.
- GPU is not required.
- The main.py script creates a "Results" directory (for CL, REGR) and an "out_img" directory (for DTC, CNN) and saves results there.
- Training takes ~1 day (4 cores @ 3.5 GHz).
- To lower training time to ~2 min:
  - Open ./Models/import_data.py, go to the read_files_param_grid variable at the bottom, and comment out every dataset (and its parameters) except one. Note which dataset you kept.
  - Open ./Models/Training_parameters/{dataset_in_read_files_param_grid}.py and comment out most of the models.
  - Open ./main.py and comment out the last line: classifier_interpretability.initialize_ci()
  - Execute ./main.py
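The first quick-run step amounts to shrinking the dataset grid in ./Models/import_data.py to a single entry. A sketch of what read_files_param_grid might look like afterwards (the dataset names and parameter keys here are hypothetical; the actual entries in the file will differ):

```python
# Hypothetical shape of read_files_param_grid after commenting out all but one
# dataset; only the remaining entry will be loaded and trained on.
read_files_param_grid = {
    "dataset_a": {"sep": ",", "target_col": "label"},
    # "dataset_b": {"sep": ";", "target_col": "y"},       # commented out for the quick run
    # "dataset_c": {"sep": "\t", "target_col": "class"},  # commented out for the quick run
}

assert len(read_files_param_grid) == 1  # exactly one dataset remains
```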