Dataset search engine prototype
Instructions:
- Run Search.py with D2C = false
- Enter the directory of the dataset database (e.g. ./ConvertedDatasets/)
- Enter the file path of the dataset to use as the search term (e.g. ./ConvertedDatasets/somefilename.csv)
- The filename of the database dataset closest to the search term will be printed.
Dataset search uses faiss (https://github.com/facebookresearch/faiss); metafeature extraction uses https://github.com/Seris370/Test
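A minimal sketch of the search step, for illustration only. It is not the code in Search.py: it swaps faiss for a plain-Python brute-force scan over squared L2 distances, which gives the same result as an exact L2 index on small inputs. The function name closest_dataset, the filenames, and the three-value metafeature vectors are all hypothetical.

```python
import math


def closest_dataset(database, query):
    """Return (filename, squared L2 distance) of the entry nearest to query.

    database maps a filename to its metafeature vector (a list of floats).
    This is a brute-force stand-in for an exact L2 nearest-neighbour search.
    """
    best_name, best_dist = None, math.inf
    for name, vector in database.items():
        if len(vector) != len(query):
            raise ValueError(
                f"{name}: expected {len(query)} metafeatures, got {len(vector)}"
            )
        # Squared Euclidean distance between the two metafeature vectors.
        dist = sum((a - b) ** 2 for a, b in zip(vector, query))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Hypothetical metafeature vectors, made up for this example.
database = {
    "a.csv": [150.0, 4.0, 3.0],
    "b.csv": [1000.0, 20.0, 2.0],
    "c.csv": [178.0, 13.0, 3.0],
}
query = [160.0, 5.0, 3.0]

name, dist = closest_dataset(database, query)
print(name)
```

With raw metafeatures on very different scales, the largest-valued feature dominates the distance, so normalising each feature first is usually worth considering.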
Hyperparameter Prediction
Instructions: This is separated into 2 parts.
Hyperparameter Training:
- Change the relevant variables in Search.py (number of hyperparameters, etc.)
- Run Search.py with readDatabaseInput
- Enter the directory containing the metafeatures + hyperparameters dataset
- Wait for the training to finish.
- The model will be saved in the current directory under the name set by the model name variable
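A rough sketch of how a row of the metafeatures + hyperparameters dataset might be split into model inputs and targets once the number of hyperparameters is set. This is an assumption about the layout, not the format Search.py actually expects: it assumes a CSV with a header row and the hyperparameter columns placed last. NUM_HYPERPARAMS, split_row, load_training_pairs, and the column names are hypothetical.

```python
import csv
import io

# Hypothetical setting: how many trailing columns are hyperparameters.
NUM_HYPERPARAMS = 2


def split_row(row, num_hyperparams):
    """Split one row into (metafeatures, hyperparameters) as float lists."""
    if not 0 < num_hyperparams < len(row):
        raise ValueError("num_hyperparams must be between 1 and len(row) - 1")
    values = [float(v) for v in row]
    return values[:-num_hyperparams], values[-num_hyperparams:]


def load_training_pairs(csv_text, num_hyperparams):
    """Parse CSV text with a header row into a list of (inputs, targets)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return [split_row(row, num_hyperparams) for row in reader if row]


# Made-up sample data: three metafeatures followed by two hyperparameters.
sample = (
    "n_rows,n_cols,n_classes,learning_rate,max_depth\n"
    "150,4,3,0.1,5\n"
    "1000,20,2,0.05,8\n"
)

pairs = load_training_pairs(sample, NUM_HYPERPARAMS)
for metafeatures, hyperparams in pairs:
    print(metafeatures, hyperparams)
```

The metafeature lists would be the inputs and the hyperparameter lists the targets of the predictor trained in this step.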
Prediction & Training:
- Run Search.py with readTestInput.
- Enter the directory containing the datasets to train on. A model will be trained on them using the hyperparameters predicted by the saved model (selected by the model name variable).
- Wait for the training to finish.
- The accuracy of the resulting model will be saved (default: TestingAcc or TrainingAcc folder).